Here is a snapshot of our logs. We know that our three brokers somehow went offline, and that this caused the exception.

key LOAN.SMC.134096888 takes 2 ms
[06/02/14 08:54:06:006 AM EST] 198 INFO consumer.kafkaConsumerImpl: commitOddLots happen, topicName = credit.cache.smc.debt.topic, selector = null
[06/02/14 08:54:28:028 AM EST] 215 INFO mkconsumer.MKConsumer: Put to table Credit.SMC.LOAN info put gdm entity Credit.SMC.LOAN, key LOAN.SMC.134096887 takes 3 ms
[06/02/14 08:54:46:046 AM EST] 198 INFO consumer.kafkaConsumerImpl: commitOddLots happen, topicName = credit.cache.smc.debt.topic, selector = null
[06/02/14 08:56:03:003 AM EST] 102 ERROR producer.SyncProducer: Producer connection to host7:11934 unsuccessful
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:634)
    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
    at kafka.producer.SyncProducer.connect(SyncProducer.scala:146)
    at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:161)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:68)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:88)
    at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:64)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[06/02/14 08:56:03:003 AM EST] 102 ERROR producer.SyncProducer: Producer connection to host9:11934 unsuccessful
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:634)
    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
    at kafka.producer.SyncProducer.connect(SyncProducer.scala:146)
    at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:161)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:68)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:88)
    at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:64)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[06/02/14 08:56:20:020 AM EST] 102 ERROR consumer.ZookeeperConsumerConnector: [mluser_mlusergroup_host8-1391315687926-95bc2ef2], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: mluser_mlusergroup_host8-1391315687926-95bc2ef2 can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:397)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326)
[06/02/14 08:56:20:020 AM EST] 102 ERROR consumer.ZookeeperConsumerConnector: [mluser_mlusergroup_host8-1391315695110-e8e36bd0], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: mluser_mlusergroup_host8-1391315695110-e8e36bd0 can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:397)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326)
[06/02/14 08:56:20:020 AM EST] 102 ERROR consumer.ZookeeperConsumerConnector: [mluser_mlusergroup_host8-1391315699203-94c60ea8], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: mluser_mlusergroup_host8-1391315699203-94c60ea8 can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:397)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326)
[06/02/14 08:59:18:018 AM EST] 215 INFO mkconsumer.MKConsumer: Put to table Credit.SMC.MUNI info put gdm entity Credit.SMC.MUNI, key MUNI.SMC.134103391 takes 4 ms
[06/02/14 08:59:18:018 AM EST] 215 INFO mkconsumer.MKConsumer: Put to table Credit.SMC.

Regards,

Libo

-----Original Message-----
From: Neha Narkhede [mailto:neha.narkh...@gmail.com]
Sent: Tuesday, February 25, 2014 4:03 PM
To: users@kafka.apache.org
Subject: Re: ConsumerRebalanceFailedException

Could you send around the consumer log when it throws ConsumerRebalanceFailedException. It should state the reason for the failed rebalance attempts.

Thanks,
Neha

On Tue, Feb 25, 2014 at 12:01 PM, Yu, Libo <libo...@citi.com> wrote:

> Hi all,
>
> I tried to reproduce this exception. In case one, when no broker was
> running, I launched all consumers and got this exception. In case two,
> while the consumers and brokers were running, I shutdown all brokers
> one by one and did not see this exception. I wonder why in case two
> this exception did not occur. Thanks.
>
>
> Regards,
>
> Libo
>
>