Hey Guozhang, I just forced the error by killing one of my consumer JVMs and I am getting a consumer rebalance failure:

2013-11-18 22:46:54 k.c.ZookeeperConsumerConnector [ERROR] [bridgeTopology_host-1384493092466-7099d843], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: bridgeTopology_host-1384493092466-7099d843 can't rebalance after 10 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:428) ~[stormjar.jar:na]
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:355) ~[stormjar.jar:na]

These are the relevant lines in my consumer properties file:

rebalance.max.retries=10
rebalance.backoff.ms=10000

My topic has 128 partitions.

Are there some other configuration settings I should be using?


On Mon, Nov 18, 2013 at 2:37 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hello Drew,
>
> Do you see any rebalance failure exceptions in the consumer log?
>
> Guozhang
>
>
> On Mon, Nov 18, 2013 at 2:14 PM, Drew Goya <d...@gradientx.com> wrote:
>
> > So I've run into a problem where occasionally, some partitions within a
> > topic end up in a "none" owner state for a long time.
> >
> > I'm using the high-level consumer on several machines, each consumer has 4
> > threads.
> >
> > Normally when I run the ConsumerOffsetChecker, all partitions have owners
> > and similar lag.
> >
> > Occasionally I end up in this state:
> >
> > trackingGroup  Events2  32  552506856  569853398  17346542  none
> > trackingGroup  Events2  33  553649131  569775298  16126167  none
> > trackingGroup  Events2  34  552380321  569572719  17192398  none
> > trackingGroup  Events2  35  553206745  569448821  16242076  none
> > trackingGroup  Events2  36  553673576  570084283  16410707  none
> > trackingGroup  Events2  37  552669833  569765642  17095809  none
> > trackingGroup  Events2  38  553147178  569766985  16619807  none
> > trackingGroup  Events2  39  552495219  569837815  17342596  none
> > trackingGroup  Events2  40  570108655  570111080      2425  trackingGroup_host6-1384385417822-23157ae8-0
> > trackingGroup  Events2  41  570288505  570291068      2563  trackingGroup_host6-1384385417822-23157ae8-0
> > trackingGroup  Events2  42  569929870  569932330      2460  trackingGroup_host6-1384385417822-23157ae8-0
> >
> > I'm at the point where I'm considering writing my own client but hopefully
> > the user group has the answer!
> >
> > I am using this commit of 8.0 on both the brokers and clients:
> > d4553da609ea9af6db8a79faf116d1623c8a856f
> >
>
>
> --
> -- Guozhang
>
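
In case it helps anyone watching for the same symptom, here is a rough sketch of a script that scans ConsumerOffsetChecker output for partitions whose owner is "none". It assumes each row has seven whitespace-separated columns in the order seen in the listing above (group, topic, partition, offset, log size, lag, owner). That layout is inferred from the rows in this thread, not from any documented format, and the names `PartitionStatus`, `parse_offset_checker` and `unowned` are made up for illustration.

```python
from typing import List, NamedTuple


class PartitionStatus(NamedTuple):
    group: str
    topic: str
    partition: int
    offset: int
    log_size: int
    lag: int
    owner: str


def parse_offset_checker(text: str) -> List[PartitionStatus]:
    """Parse checker-style rows, skipping any line that does not fit."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: group topic partition offset logSize lag owner
        if len(fields) != 7:
            continue
        group, topic, pid, offset, log_size, lag, owner = fields
        numeric = (pid, offset, log_size, lag)
        if not all(value.isdigit() for value in numeric):
            continue  # header row or unrelated log line
        rows.append(PartitionStatus(group, topic, int(pid), int(offset),
                                    int(log_size), int(lag), owner))
    return rows


def unowned(rows: List[PartitionStatus]) -> List[PartitionStatus]:
    """Return the partitions that currently have no owning consumer thread."""
    return [row for row in rows if row.owner == "none"]


# Two rows copied from the listing in this thread.
SAMPLE = """\
trackingGroup Events2 39 552495219 569837815 17342596 none
trackingGroup Events2 40 570108655 570111080 2425 trackingGroup_host6-1384385417822-23157ae8-0
"""

if __name__ == "__main__":
    for row in unowned(parse_offset_checker(SAMPLE)):
        print(f"{row.topic}-{row.partition}: no owner, lag {row.lag}")
```

Piping the checker output into something like this on a schedule would at least flag ownerless partitions early. It does not address the rebalance failure itself.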