>So that means a consumer which gets behind by half an hour will find its reads being served from remote storage. And, if I understand the proposed algorithm, each such consumer fetch request could result in a separate fetch request from the remote storage. I.e. there's no mechanism to amortize the cost of the fetching between multiple consumers fetching similar ranges?
Local log segments are deleted according to the local log.retention.ms/.bytes settings, even though they may already have been copied to remote storage. Consumers can still fetch messages from local storage as long as those segments have not yet been deleted under the local retention policy. They are served from remote storage only when the data is no longer available locally.

Thanks,
Satish.

On Thu, Nov 7, 2019 at 7:58 AM Tom Bentley <tbent...@redhat.com> wrote:

> Hi Ying,
>
> > Because only inactive segments can be shipped to remote storage, to be
> > able to ship log data as soon as possible, we will roll log segment
> > very fast (e.g. every half hour).
>
> So that means a consumer which gets behind by half an hour will find its
> reads being served from remote storage. And, if I understand the proposed
> algorithm, each such consumer fetch request could result in a separate
> fetch request from the remote storage. I.e. there's no mechanism to
> amortize the cost of the fetching between multiple consumers fetching
> similar ranges?
>
> (Actually the doc for RemoteStorageManager.read() says "It will read at
> least one batch, if the 1st batch size is larger than maxBytes." Does
> that mean the broker might have to retry with increased maxBytes if the
> first request fails to read a batch? If so, how does it know how much to
> increase maxBytes by?)
>
> Thanks,
>
> Tom
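
For readers following the thread, here is a minimal sketch in Java of the routing Satish describes: a fetch is served from the local log while the requested offset is still within local retention, and falls back to remote storage once the local copy has been deleted. LocalLog, Records, and both read() signatures are hypothetical stand-ins; RemoteStorageManager is the KIP-405 plug-in name, but its real method signatures differ.

    // Hypothetical stand-in types; the real KIP-405 interfaces differ.
    interface Records {}

    interface LocalLog {
        long logStartOffset();                    // first offset still on local disk
        Records read(long offset, int maxBytes);  // read from local segments
    }

    interface RemoteStorageManager {              // KIP-405 plug-in (signature simplified here)
        Records read(long offset, int maxBytes);
    }

    class TieredFetchRouter {
        private final LocalLog localLog;
        private final RemoteStorageManager remoteStorage;

        TieredFetchRouter(LocalLog localLog, RemoteStorageManager remoteStorage) {
            this.localLog = localLog;
            this.remoteStorage = remoteStorage;
        }

        Records read(long fetchOffset, int maxBytes) {
            // Segments may already have been copied to remote storage, but
            // they are only deleted locally once log.retention.ms/.bytes
            // expire, so a fetch within local retention stays on local disk.
            if (fetchOffset >= localLog.logStartOffset()) {
                return localLog.read(fetchOffset, maxBytes);
            }
            // The offset has been deleted locally: serve it from remote storage.
            return remoteStorage.read(fetchOffset, maxBytes);
        }
    }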
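
On Tom's maxBytes question: one reading of "It will read at least one batch, if the 1st batch size is larger than maxBytes" is that the first batch is always returned whole even when it exceeds maxBytes, so the broker never needs to retry with a larger maxBytes. That would mirror how the ordinary consumer fetch treats fetch.max.bytes / max.partition.fetch.bytes, where the first record batch is returned regardless of size so the consumer can make progress. A sketch of that contract, with hypothetical Batch/BatchReader types:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical types for illustration only.
    interface Batch { int sizeInBytes(); }

    interface BatchReader {
        Batch firstBatchAt(long offset);  // first complete batch at/after offset, or null
        Batch nextBatch(Batch current);   // following batch, or null at end of log
    }

    class AtLeastOneBatchRead {
        static List<Batch> read(BatchReader reader, long fromOffset, int maxBytes) {
            List<Batch> result = new ArrayList<>();
            int bytes = 0;
            for (Batch b = reader.firstBatchAt(fromOffset); b != null; b = reader.nextBatch(b)) {
                // The first batch is always included, however large, which is
                // what would make a retry with a bigger maxBytes unnecessary;
                // later batches are only added while they fit under maxBytes.
                if (!result.isEmpty() && bytes + b.sizeInBytes() > maxBytes) {
                    break;
                }
                result.add(b);
                bytes += b.sizeInBytes();
            }
            return result;
        }
    }

Under that reading the answer to Tom's question would be that no retry is needed, though only the KIP authors can confirm the intended contract.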