Thanks for the feedback, Boyang. I will revise the KIP to include the mathematical relations, as you suggested. To address your feedback:

1. Currently, with the default retry backoff of 100 ms, we would have 10 retries in 1 second. With exponential backoff, we would have a total of 4 retries in 1 second. Thus we have fewer than half as many retries in the same timeframe, which lessens broker pressure. The calculation is as follows (using the formula laid out in the KIP):

Try 1 at time 0 ms, failures = 0, next retry in 100 ms (default retry ms is initially 100 ms)
Try 2 at time 100 ms, failures = 1, next retry in 200 ms
Try 3 at time 300 ms, failures = 2, next retry in 400 ms
Try 4 at time 700 ms, failures = 3, next retry in 800 ms
Try 5 at time 1500 ms, failures = 4, next retry in 1000 ms (default max retry ms is 1000 ms)

For 2 and 3, could you elaborate on what you mean with respect to client timeouts? I’m not very familiar with the Streams framework, so I would love more insight into how it currently works with respect to producer transactions, so that I can update the KIP appropriately to address these scenarios.

On Mar 13, 2020, 7:15 PM -0700, Boyang Chen <reluctanthero...@gmail.com>, wrote:
> Thanks for the KIP Sanjana. I think the motivation is good, but lack of
> more quantitative analysis. For instance:
>
> 1. How much retries we are saving by applying the exponential retry vs
> static retry? There should be some mathematical relations between the
> static retry ms, the initial exponential retry ms, the max exponential
> retry ms in a given time interval.
> 2. How does this affect the client timeout? With exponential retry, the
> client shall be getting easier to timeout on a parent level caller, for
> instance stream attempts to retry initializing producer transactions with
> given 5 minute interval. With exponential retry this mechanism could
> experience more frequent timeout which we should be careful with.
> 3. With regards to #2, we should have more detailed checklist of all the
> existing static retry scenarios, and adjust the initial exponential retry
> ms to make sure we won't get easily timeout in high level due to too few
> attempts.
>
> Boyang
>
> On Fri, Mar 13, 2020 at 4:38 PM Sanjana Kaundinya <skaundi...@gmail.com>
> wrote:
>
> > Hi Everyone,
> >
> > I’ve written a KIP about introducing exponential backoff for Kafka
> > clients. Would appreciate any feedback on this.
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients
> >
> > Thanks,
> > Sanjana
> >
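
For reference, the retry schedule in point 1 of the reply can be reproduced with a short sketch. It assumes the backoff is min(initial * 2^failures, max), which is inferred from the listed schedule rather than copied from the KIP, and it ignores any jitter. The names backoff_ms and attempt_times are hypothetical helpers, not part of any Kafka client API.

```python
def backoff_ms(failures, initial_ms=100, max_ms=1000):
    """Delay before the next try after `failures` prior tries (assumed formula, no jitter)."""
    return min(initial_ms * 2 ** failures, max_ms)


def attempt_times(window_ms, initial_ms=100, max_ms=1000, exponential=True):
    """Return the start time (ms) of every try that begins before window_ms."""
    times = []
    now = 0
    failures = 0
    while now < window_ms:
        times.append(now)
        # Static backoff always waits initial_ms; exponential doubles up to max_ms.
        delay = backoff_ms(failures, initial_ms, max_ms) if exponential else initial_ms
        now += delay
        failures += 1
    return times


static_times = attempt_times(1000, exponential=False)
exponential_times = attempt_times(1000)
print("static:     ", len(static_times), static_times)
print("exponential:", len(exponential_times), exponential_times)
```

Under these assumptions, the static schedule starts 10 tries within the first second and the exponential schedule starts 4 (at 0, 100, 300, and 700 ms), matching the numbers in point 1.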