Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-25 Thread Guozhang Wang
That seems to be a real bug -- and a pretty common one. We will look into it ASAP. Guozhang On Thu, Jul 25, 2019 at 7:26 AM Raman Gupta wrote: > I'm looking forward to the incremental rebalancing protocol. In the > meantime, I've updated to Kafka 2.3.0 to take advantage of the static > group m

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-25 Thread Raman Gupta
I'm looking forward to the incremental rebalancing protocol. In the meantime, I've updated to Kafka 2.3.0 to take advantage of static group membership, and this has already helped tremendously. Unfortunately, while it was working initially, some streams are now unable to start

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-22 Thread Guozhang Wang
Hello Raman, since you are using the Consumer and you are concerned about rebalances triggered by member failure, I think KIP-429 is most relevant to your scenario. As Matthias mentioned, we are working on getting it into the next release, 2.4. Guozhang On Sat, Jul 20, 2019 at 6:36 PM Matthias J. Sa

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-20 Thread Matthias J. Sax
Static-Group membership ships with AK 2.3 (the open tickets of the KIP are minor): https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances There is also KIP-415 for Kafka Connect in AK 2.3: https://cwiki.apache.org/confluenc
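
For readers who have not tried static membership yet, here is a minimal sketch of the consumer settings involved. The property names `group.instance.id` and `session.timeout.ms` are the standard consumer configs that KIP-345 relies on; the helper name `static_member_config`, the broker address, the group name, the instance ids, and the 10-minute timeout are all assumptions for illustration and do not come from this thread.

```python
def static_member_config(bootstrap_servers, group_id, instance_id,
                         session_timeout_ms=600_000):
    """Build a consumer config dict that opts in to static membership (KIP-345).

    Hypothetical helper: only the property names are real Kafka consumer configs.
    """
    if not instance_id:
        # Without group.instance.id the consumer joins as a dynamic member
        # and every restart triggers a rebalance.
        raise ValueError("instance_id must be a non-empty, stable identifier")
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Must be unique per consumer in the group and stable across restarts,
        # for example a Kubernetes StatefulSet pod name (assumed here).
        "group.instance.id": instance_id,
        # A restarted member that rejoins within this window keeps its
        # partitions without a rebalance. The value must fall inside the
        # broker's group.min/max.session.timeout.ms range.
        "session.timeout.ms": session_timeout_ms,
    }


# Example values are assumptions, not taken from the thread.
config = static_member_config("localhost:9092", "long-jobs", "worker-7")
```

The dict can be passed to a client that accepts dotted property names, or copied into a `Properties` object for the Java consumer. A longer session timeout trades slower detection of a truly dead consumer for fewer rebalances during restarts.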

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-19 Thread Jeff Widman
I am also interested in learning how others are handling this. I also support several services where the average processing time is 20 seconds per message but the p99 is about 20 minutes, and the stop-the-world rebalancing is very painful. On Fri, Jul 19, 2019, 11:38 AM Raman Gupta wrote:

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-19 Thread Raman Gupta
I've found https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing:+Support+and+Policies and https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing+for+Streams. This is *exactly* what I need, right down to the Kubernetes pod restart cas

Rebalancing algorithm is extremely suboptimal for long processing

2019-07-19 Thread Raman Gupta
I have a situation in which the current rebalancing algorithm seems to be extremely sub-optimal. I have a topic with 100 partitions, and up to 100 separate consumers. Processing each message on this topic takes between 1 and 20 minutes, depending on the message. If any of the 100 consumers dies o