Hi all,

If there are no more comments, I'll start a vote soon.

Thanks,
Kamal

On Wed, Aug 14, 2024 at 5:05 PM Luke Chen <show...@gmail.com> wrote:

> Hi Kamal,
>
> Thanks for the update.
> LGTM.
>
> Luke
>
> On Wed, Aug 14, 2024 at 7:25 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Hi all,
> >
> > > I saw we added some new configs/metrics.
> >
> > I have removed the recent changes to the public interfaces to limit the
> > scope of the KIP to minimum. PTAL.
> >
> > Thanks,
> > Kamal
> >
> > On Wed, Aug 14, 2024 at 9:58 AM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Hi Luke,
> > >
> > > > LC5
> > > Agree, this is a rare scenario. Given that we have a common pool of
> > > request handler threads to accept
> > > all the incoming requests and there are no quotas to handle for each
> > > request. I'm OK with reusing the
> > > same remote-log-reader threads for LIST_OFFSETS requests. There may be
> > > noisy neighbor issues
> > > in handling the LIST_OFFSETS and FETCH remote requests when we read
> from
> > > remote storage
> > > aggressively and all the remote-log-reader threads are busy.
> > >
> > > >  If so, maybe the additional config/metrics are not necessary?
> > > Do you mean to have a separate thread pool and hardcode the num
> threads?
> > >
> > > > LC6 and LC7
> > > Updated the KIP.
> > >
> > > Thanks,
> > > Kamal
> > >
> > > On Wed, Aug 14, 2024 at 8:05 AM Luke Chen <show...@gmail.com> wrote:
> > >
> > >> Hi Kamal,
> > >>
> > >> Thanks for the response.
> > >>
> > >> I saw we added some new configs/metrics. Comments:
> > >>
> > >> LC5: Do you think this is a commonly happened issue that we need to
> add
> > a
> > >> separate `remote.log.offset.reader.threads` for it?
> > >> I thought this rarely happened. If so, maybe the additional
> > config/metrics
> > >> are not necessary? It makes the config more complicated.
> > >>
> > >> LC6: The config name:
> > >> `remote.log.offset.read.max.pending.tasks` , should we be consistent
> to
> > >> use
> > >> `reader`, instead of `read`?
> > >>
> > >> LC7: We should set a default value for the newly introduced configs
> and
> > >> written in KIP.
> > >>
> > >> Thanks.
> > >> Luke
> > >>
> > >> On Tue, Aug 13, 2024 at 8:47 PM Kamal Chandraprakash <
> > >> kamal.chandraprak...@gmail.com> wrote:
> > >>
> > >> > Hi Luke,
> > >> >
> > >> > Thanks for the review!
> > >> >
> > >> > > LC2:
> > >> > a. If the time taken to process the request is less than 5 mins,
> then
> > >> the
> > >> > Admin client will get a response.
> > >> > b. If the time taken to process the request is more than 5 mins,
> then
> > >> the
> > >> > Admin client will itself timeout the request due to the
> > >> > default-api-timeout.
> > >> > c. If the time taken to process the request is more than 6 mins,
> then
> > >> > the server will cancel the request in the DelayedRemoteListOffsets
> > >> > purgatory (to be implemented) and
> > >> >     send TimeoutException back to the client if the client is
> waiting
> > >> for
> > >> > the response.
> > >> >
> > >> > > LC3:
> > >> > Updated the KIP.
> > >> >
> > >> > > LC4:
> > >> > The consumer retries the LIST_OFFSETS request incase of
> > failures/timeout
> > >> > but not the AdminClient. So, I think this is a retry feature in the
> > >> > Consumer.
> > >> >
> > >> > Updated the "Public Interfaces" section in the KIP
> > >> > <
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1075%3A+Introduce+delayed+remote+list+offsets+purgatory+to+make+LIST_OFFSETS+async
> > >> > >
> > >> > by adding more details. PTAL.
> > >> >
> > >> > Thanks,
> > >> > Kamal
> > >> >
> > >> >
> > >> > On Tue, Aug 13, 2024 at 2:03 PM Luke Chen <show...@gmail.com>
> wrote:
> > >> >
> > >> > > Hi Kamal,
> > >> > >
> > >> > > Thanks for the update.
> > >> > > LC1: I see. Thanks.
> > >> > > LC2: What I still don't understand is what is the relationship
> > between
> > >> > > remote.list.offsets.request.timeout.ms V.S.
> > >> > > request.timeout/default.api.timeout.
> > >> > > Suppose we set request timeout to 30 seconds,
> default.api.timeout=5
> > >> mins
> > >> > > and remote.list.offsets.request.timeout.ms = 6 mins.
> > >> > > So, when Admin sends a list offset request that needs to query
> > remote
> > >> > > storage, when will it throw timeout exception? 30 secs or 5 mins
> or
> > 6
> > >> > mins?
> > >> > > We might need to make it clear in the KIP.
> > >> > >
> > >> > > LC3:
> > >> > > "Admin sends only one request and wait for upto
> default-api-timeout.
> > >> (eg)
> > >> > > If the admin is configured with default-api-timeout as 5 mins and
> > >> > > request-timeout as 30 seconds. And, the server takes 50 seconds to
> > >> > process
> > >> > > the LIST_OFFSETS request, then the admin sends only one
> LIST_OFFSETS
> > >> > > request, then receives the request from server after 50 seconds."
> > >> > >
> > >> > > In the end of the section, it should be receives the "response"
> from
> > >> > > server?
> > >> > >
> > >> > > LC4: I found the different consumer and admin behavior when
> setting
> > >> > > "request.timeout" and "default.api.timeout" is confusing. Are they
> > >> > expected
> > >> > > or a bug?
> > >> > >
> > >> > > Thank you.
> > >> > > Luke
> > >> > >
> > >> > >
> > >> > > On Thu, Aug 8, 2024 at 10:06 PM Kamal Chandraprakash <
> > >> > > kamal.chandraprak...@gmail.com> wrote:
> > >> > >
> > >> > > > Hi Luke,
> > >> > > >
> > >> > > > Thanks for the review!
> > >> > > >
> > >> > > > LC1: When the consumer starts to read data, then it might need
> the
> > >> > below
> > >> > > > offsets:
> > >> > > > earliest, latest, and last-committed-offset based on the
> > >> > > > "auto.offset.reset" config.
> > >> > > >
> > >> > > > The earliest and latest offsets have special timestamps -2 and
> -1,
> > >> > those
> > >> > > > timestamp corresponding offsets are cached in the
> > >> > > > broker memory and get served immediately. The
> > last-committed-offset
> > >> is
> > >> > > also
> > >> > > > cached in the GroupMetadata and
> > >> > > > gets served in the "OFFSET_FETCH" request. Unless the consumer
> > >> > > > explicitly uses the KafkaConsumer#offsetForTimes API,
> > >> > > > there won't be any delay in serving the data from the local log.
> > >> > > >
> > >> > > > In this KIP, we are trying to address the case in which multiple
> > >> > > consumers
> > >> > > > start at the same time and use the 'offsetForTimes` LIST_OFFSETS
> > >> API,
> > >> > > > assuming the remote requests are slow, then it should not block
> > >> other
> > >> > > > PRODUCE/FETCH requests.
> > >> > > >
> > >> > > > LC2: Sorry for the confusion. We were planning to introduce only
> > the
> > >> > > broker
> > >> > > > config "remote.list.offsets.timeout.ms".
> > >> > > > If we add the timeout in the ListOffsetsRequest.json, then when
> > old
> > >> > > clients
> > >> > > > talk with the new broker, we don't have
> > >> > > > a timeout to set on the server. Moved adding the timeout to the
> > >> > > > ListOffsetsRequest to the rejected alternatives section.
> > >> > > >
> > >> > > > --
> > >> > > > Kamal
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>

Reply via email to