Re: [DISCUSS] KIP-1002: Fetch remote segment indexes at once

Divij Vaidya Mon, 13 Nov 2023 06:52:00 -0800

Hi Jorge

1. I don't think we need a new API here because alternative solutions
exist even with the current API. As an example, when the first index is
fetched, the RSM plugin can choose to download all indexes and cache them
locally. On the next call to fetch an index from the remote tier, we will
hit the cache and retrieve the index from there.

2. The KIP assumes that all indexes are required at all times. However,
indexes such as the transaction index are only required for read_committed
fetches, and the time index is only required when a fetch call wants to
search for an offset by timestamp. As a future step in Tiered Storage, I
would actually prefer that we move towards lazily fetching indexes on
demand instead of fetching them together as proposed in the KIP.
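
To make point 1 concrete, here is a minimal sketch of the caching idea. It
is not the Kafka API: CachingIndexFetcher, RemoteTier, downloadAllIndexes
and the trimmed-down IndexType enum are hypothetical stand-ins, and a real
plugin would implement RemoteStorageManager and key the cache on its own
segment metadata. The cache here is also unbounded, which a real plugin
would probably not want.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only, not Kafka classes.
class CachingIndexFetcher {
    enum IndexType { OFFSET, TIMESTAMP, TRANSACTION }

    /** Hypothetical remote tier client: one round trip returns all indexes of a segment. */
    interface RemoteTier {
        Map<IndexType, byte[]> downloadAllIndexes(String segmentId);
    }

    private final RemoteTier remote;
    // segment id -> all indexes downloaded for that segment (unbounded in this sketch)
    private final Map<String, Map<IndexType, byte[]>> cache = new HashMap<>();

    CachingIndexFetcher(RemoteTier remote) {
        this.remote = remote;
    }

    /** The first call for a segment downloads every index; later calls are served from the cache. */
    synchronized byte[] fetchIndex(String segmentId, IndexType type) {
        Map<IndexType, byte[]> indexes =
            cache.computeIfAbsent(segmentId, remote::downloadAllIndexes);
        return indexes.get(type);
    }

    public static void main(String[] args) {
        int[] remoteCalls = {0};
        RemoteTier fake = segmentId -> {
            remoteCalls[0]++;
            Map<IndexType, byte[]> all = new HashMap<>();
            for (IndexType t : IndexType.values()) {
                all.put(t, new byte[] {(byte) t.ordinal()});
            }
            return all;
        };
        CachingIndexFetcher fetcher = new CachingIndexFetcher(fake);
        fetcher.fetchIndex("segment-0", IndexType.OFFSET);
        fetcher.fetchIndex("segment-0", IndexType.TIMESTAMP);
        System.out.println("remote calls: " + remoteCalls[0]);
    }
}
```

With something along these lines, the broker keeps calling the existing
per-index fetch, and only the first call per segment goes to the remote
tier.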

--
Divij Vaidya

On Fri, Nov 10, 2023 at 4:00 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hello everyone,
>
> I would like to start the discussion on a KIP for Tiered Storage. It's
> about improving cross-segment latencies by reducing calls to fetch indexes
> individually.
> Have a look:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1002%3A+Fetch+remote+segment+indexes+at+once
>
> Cheers,
> Jorge
>
