I am also interested in learning how others are handling this. I also support several services where the average processing time is about 20 seconds per message but the p99 is about 20 minutes, so the stop-the-world rebalancing is very painful.
On Fri, Jul 19, 2019, 11:38 AM Raman Gupta <rocketra...@gmail.com> wrote:

> I've found
> https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing:+Support+and+Policies
> and
> https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing+for+Streams
> .
> This is *exactly* what I need, right down to the Kubernetes pod
> restart case. The number of issues with the current approach to
> rebalancing elucidated in these documents is downright scary, and now
> I am not surprised I am having tonnes of issues.
>
> Are there any plans to start implementing delayed imbalance and
> standby bootstrap?
>
> Are there any short-term best practices that can help alleviate these
> issues? My main problem right now is the "Instance Bounce" and
> "Instance Failover" scenarios, and according to this wiki page,
> num.standby.replicas should help with at least the former. Can someone
> explain what this does?
>
> Regards,
> Raman
>
> On Fri, Jul 19, 2019 at 12:53 PM Raman Gupta <rocketra...@gmail.com>
> wrote:
> >
> > I have a situation in which the current rebalancing algorithm seems to
> > be extremely sub-optimal.
> >
> > I have a topic with 100 partitions, and up to 100 separate consumers.
> > Processing each message on this topic takes between 1 and 20 minutes,
> > depending on the message.
> >
> > If any of the 100 consumers dies or drops out of the group, there is a
> > huge amount of idle time as many consumers (up to 99 of them) finish
> > their work and sit around idle, just waiting for the rebalance to
> > complete.
> >
> > In addition, with 100 consumers, its not unusual for one to die for
> > one reason or another, so these stop-the-world rebalances are
> > happening all the time, making the entire system slow to a snail's
> > pace.
> >
> > It surprises me that rebalance is so inefficient. I would have thought
> > that partitions would just be assigned/unassigned to consumers in
> > real-time without waiting for the entire consumer group to quiesce.
> >
> > Is there anything I can do to improve matters?
> >
> > Regards,
> > Raman
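
For what it's worth, here is a rough sketch of the settings I would experiment with for the "Instance Bounce" case. It is not a tested recommendation. The config keys are standard Kafka client and Streams settings, but every value is an illustrative assumption, and the class name, method names and the "worker-7" instance id are made up for the example. As I understand it, num.standby.replicas only applies to Kafka Streams applications with state stores: it keeps that many extra warm copies of each task's local state on other instances, so a task that moves does not have to rebuild its state from the changelog. For plain consumers, group.instance.id (static membership, which I believe needs 2.3+ clients and brokers) lets a restarted pod rejoin under the same id within session.timeout.ms without triggering a rebalance.

```java
import java.util.Properties;

public class RebalanceTuning {

    // Plain-consumer settings for handlers that can run for up to ~20 minutes.
    // All values are illustrative assumptions, not recommendations.
    static Properties consumerProps(String stableInstanceId) {
        Properties p = new Properties();
        // Fetch one record per poll() so a single slow message does not
        // delay a whole batch of records behind it.
        p.setProperty("max.poll.records", "1");
        // Must exceed the slowest handler, otherwise the consumer is treated
        // as failed and removed from the group (30 minutes assumed here).
        p.setProperty("max.poll.interval.ms", String.valueOf(30 * 60 * 1000));
        // Static membership: a stable id per pod (hypothetical name passed in).
        p.setProperty("group.instance.id", stableInstanceId);
        // How long a bounced static member may stay away before the group
        // rebalances (2 minutes assumed; the broker's session timeout
        // bounds must allow it).
        p.setProperty("session.timeout.ms", String.valueOf(2 * 60 * 1000));
        return p;
    }

    // Kafka Streams only: keep one warm standby copy of each task's state.
    static Properties streamsProps() {
        Properties p = new Properties();
        p.setProperty("num.standby.replicas", "1");
        return p;
    }

    public static void main(String[] args) {
        consumerProps("worker-7").forEach((k, v) -> System.out.println(k + "=" + v));
        streamsProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

One caveat: as far as I understand the current protocol, the group still waits up to max.poll.interval.ms for members to rejoin during a rebalance, so raising it protects slow handlers but can make each rebalance that does happen last longer. Static membership only helps by making bounces trigger fewer of them.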