Hello Matthias, Thanks for the updated KIP, overall I'm +1 on this proposal. Some minor comments (I know gmail mixed that again for me so I'm leaving it as a combo for both DISCUSS and VOTE :)
1) There are some inconsistent statements in the proposal regarding what to deprecated: at the beginning it says "We propose to deprecate the retries configuration parameter for the producer and admin client" but later in compatibility we say "Producer and admin client behavior does not change; both still respect retries config." My understanding is that we will only deprecate the StreamsConfig#retries while not touch on ProducerConfig/AdminClientConfig#retries, AND we will always override the embedded producer / admin retries config to infinity so that we never rely on those configs but always bounded by the timeout config. Is that right, if yes could you clarify in the doc? 2) We should also document the related behavior change in PartitionAssignor that uses AdminClient. More specifically, the admin client's retries config is piggy-backed inside InternalTopicManager as an outer-loop retry logic in addition to AdminClient's own inner retry loop. There are some existing issues like KAFKA-9999 / 10006 that Sophie and Boyang has been working on. I exchanged some ideas with them, and generally we should consider if the outer-loop of InternalTopicManager should just be removed and if we got TimeoutException we should just trigger another rebalance etc. BTW as I mentioned in the previous statement, today throwing an exception that kills one thread but not the whole instance is still an issue for monitoring purposes, but I suppose this is not going to be in this KIP but addressed by another KIP, right? Guozhang On Fri, May 15, 2020 at 1:14 PM Boyang Chen <reluctanthero...@gmail.com> wrote: > Good, good. > > Read through the discussions, the KIP looks good to me, +1 (non-binding) > > On Fri, May 15, 2020 at 11:51 AM Sophie Blee-Goldman <sop...@confluent.io> > wrote: > > > Called out! > > > > Seems like gmail struggles with [...] prefixed subjects. You'd think they > > would adapt > > all their practices to conform to the Apache Kafka mailing list > standards, > > but no! > > > > +1 (non-binding) by the way > > > > On Fri, May 15, 2020 at 11:46 AM John Roesler <vvcep...@apache.org> > wrote: > > > > > Hi Boyang, > > > > > > It is a separate thread, and you have just revealed yourself as a gmail > > > user ;) > > > > > > (Gmail sometimes conflates vote and discuss threads for no apparent > > reason > > > ) > > > > > > -John > > > > > > On Fri, May 15, 2020, at 13:39, Boyang Chen wrote: > > > > Hey Matthias, should this be on a separate VOTE thread? > > > > > > > > On Fri, May 15, 2020 at 11:38 AM John Roesler <vvcep...@apache.org> > > > wrote: > > > > > > > > > Thanks, Matthias! > > > > > > > > > > I’m +1 (binding) > > > > > > > > > > -John > > > > > > > > > > On Fri, May 15, 2020, at 11:55, Matthias J. Sax wrote: > > > > > > Hi, > > > > > > > > > > > > I would like to start the vote on KIP-572: > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-572%3A+Improve+timeouts+and+retries+in+Kafka+Streams > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > Attachments: > > > > > > * signature.asc > > > > > > > > > > > > > > > -- -- Guozhang