Hi, Rajini,

Thanks for the explanation. The KIP looks good to me now.

Also, it seems that metadata.recovery.strategy is missing for
AdminClientConfigs in
https://kafka.apache.org/documentation/#adminclientconfigs. Not sure why
this is happening since it's autogenerated.

Jun

On Thu, Nov 7, 2024 at 12:56 AM Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Hi Jun,
>
> Thanks for reviewing the KIP.
>
> 1) When we need metadata, leastLoadedNode() attempts connection to node1
> and if that fails, node1 is in reconnection backoff for a second. So
> leastLoadedNode() attempts connection to node2. If the connection attempt
> to node2 takes 5 seconds and fails, we try leastLoadedNode again. This
> time, node1 is no longer in reconnection backoff because its backoff period
> of one second has elapsed. With the metadata recovery strategy in KIP-899,
> all nodes have to be unavailable to trigger rebootstrap. Since node1 is no
> longer in backoff, we will attempt node1 again rather than rebootstrap.
>
> 2) Good point. Since we are changing the default rebootstrap strategy, it
> makes sense to include share consumer and KStreams as well in this KIP.
> Looks like we have WorkerConfig in Connect too. Will add them to the KIP.
>
> Regards,
>
> Rajini
>
> On Thu, Nov 7, 2024 at 1:32 AM Jun Rao <j...@confluent.io.invalid> wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the KIP. Looks good to me overall. Just a couple of minor
> > comments.
> >
> > 1. "So if a connection attempt is stuck for over a second, other nodes
> that
> > are in reconnection backoff state will no longer be in backoff and as a
> > result, rebootstrap will never be attempted." Not sure that I understand
> > this. Why would a stuck connection to one node affect the reconnection
> > backoff state to other nodes?
> >
> > 2. metadata.recovery.strategy and
> metadata.recovery.rebootstrap.trigger.ms
> > :
> > Should we add them to share consumer and kstreams too or leave that for
> > future KIPs?
> >
> > Jun
> >
> > On Mon, Nov 4, 2024 at 7:38 AM Rajini Sivaram <rajinisiva...@gmail.com>
> > wrote:
> >
> > > Hi Apoorv,
> > >
> > > Thanks for the review.  Good idea to prefix with metadata.recovery, I
> > have
> > > renamed the config to metadata.recovery.rebootstrap.trigger.ms as you
> > > suggested.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > > On Mon, Nov 4, 2024 at 2:43 PM Apoorv Mittal <apoorvmitta...@gmail.com
> >
> > > wrote:
> > >
> > > > Hi Rajni,
> > > > Thanks for the KIP. It would be a good addition.
> > > >
> > > > If I understand it correctly then the client will trigger the
> > > re-bootstrap
> > > > process when the client is unable to obtain metadata in a defined
> > period
> > > of
> > > > time. The proposed configuration "rebootstrap.timeout.ms" seems to
> me
> > > more
> > > > of a timeout for the re-bootstrap process, rather than the time to
> > > trigger
> > > > the re-bootstrap process.
> > > >
> > > > Do you think "metadata.recovery.rebootstrap.trigger.ms" would be a
> > > better
> > > > name?
> > > >
> > > >    -
> > > >
> > > >    *metadata.recovery.rebootstrap.trigger.ms
> > > >    <http://metadata.recovery.rebootstrap.trigger.ms>:* This is a
> clear
> > > and
> > > >    concise option. It directly conveys the purpose of the
> > configuration,
> > > > which
> > > >    is to define the timeout for triggering a re-bootstrap when
> metadata
> > > > cannot
> > > >    be obtained. It specifies not only the trigger but also the
> recovery
> > > >    strategy (re-bootstrap) associated with it.
> > > >
> > > > Additionally, the other option I was thinking of was "
> > > > metadata.rebootstrap.trigger.ms".
> > > >
> > > > Regards,
> > > > Apoorv Mittal
> > > >
> > > >
> > > > On Mon, Nov 4, 2024 at 10:17 AM Rajini Sivaram <
> > rajinisiva...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Any other feedback or suggestions? If there are no concerns, I will
> > > start
> > > > > the vote tomorrow.
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Rajini
> > > > >
> > > > >
> > > > > On Wed, Oct 30, 2024 at 4:27 PM Rajini Sivaram <
> > > rajinisiva...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Andrew,
> > > > > >
> > > > > > Thanks for reviewing the KIP.
> > > > > >
> > > > > > AS1: Updated to KIP-559. Thanks for pointing that out.
> > > > > > AS2: Yes, that is correct. The error code is a performance
> > > optimization
> > > > > to
> > > > > > avoid unavailability for 5 minutes when a proxy may be able to
> tell
> > > the
> > > > > > client to rebootstrap immediately.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Oct 30, 2024 at 2:51 PM Andrew Schofield <
> > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > >
> > > > > >> Hi Rajini,
> > > > > >> Thanks for the KIP. It looks like a useful improvement for
> > > > > >> rebootstrapping.
> > > > > >> I really like using the new major release as a way to change the
> > > > default
> > > > > >> so
> > > > > >> rebootstrapping can be taken for granted in the future.
> > > > > >>
> > > > > >> A couple of comments.
> > > > > >> AS1: (nit) KIP-559, not KIP-599, was the proxy-friendliness KIP.
> > > > > >> AS2: I suppose that the new error code is really just a
> > performance
> > > > > >> optimisation for situations where the proxy can tell that
> > > > > rebootstrapping
> > > > > >> is required and it can avoid the clients waiting the full 5
> > minutes.
> > > > Is
> > > > > >> this
> > > > > >> correct or is there more to it?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Andrew
> > > > > >>
> > > > > >> ________________________________________
> > > > > >> From: Rajini Sivaram <rajinisiva...@gmail.com>
> > > > > >> Sent: 28 October 2024 19:31
> > > > > >> To: dev <dev@kafka.apache.org>
> > > > > >> Subject: [DISCUSS] KIP-1102: Enable clients to rebootstrap based
> > on
> > > > > >> timeout or error code
> > > > > >>
> > > > > >> Hi everyone,
> > > > > >>
> > > > > >> I would like to start discussion on KIP-1102 (
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1102%3A+Enable+clients+to+rebootstrap+based+on+timeout+or+error+code
> > > > > >> ).
> > > > > >> This KIP extends KIP-899 by introducing a timeout configuration
> > and
> > > > > error
> > > > > >> code that can be used to trigger rebootstrapping in clients. The
> > KIP
> > > > > also
> > > > > >> proposes to enable rebootstrapping by default in 4.0.0.
> > > > > >>
> > > > > >> Thank you,
> > > > > >>
> > > > > >> Rajini
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to