I did not see any attachments in either email.

Guozhang

On Fri, Jun 13, 2014 at 1:36 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:

> Hi Guozhang,
>
> We have our own monitoring tool, which gets its data from ZooKeeper. In the
> attached screenshot you can see a partition listed with no owner, but the
> back-up consumers are not picking up that partition. I have restarted the
> Java process on the machine that had the above issue. Also, in my previous
> email I sent you the exception. Please review the image.
>
> Thanks,
>
> Bhavesh
>
>
> On Fri, Jun 13, 2014 at 1:28 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> They should automatically pick up the partition with no owner.
>>
>> Could you use kafka.tools.ConsumerOffsetChecker to verify which partition
>> does not have an owner, and check the logs of the back-up consumers for
>> the rebalance process? Are there any exceptions or warnings there?
>>
>> Guozhang
>>
>>
>> On Fri, Jun 13, 2014 at 12:34 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>>
>> > We have a 3-node cluster on separate physical boxes for the consumer
>> > group, and the consumer that died is
>> > "mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a".
>> > On that box, I see the above exception. What can I configure so that
>> > when a partition in the consumer group does not have an "Owner", other
>> > consumers in the group (the back-ups) can take over? Please let me know.
>> >
>> > Thanks in advance for your help.
>> >
>> > Thanks,
>> > Bhavesh
>> >
>> >
>> > On Fri, Jun 13, 2014 at 8:14 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>> >
>> > > From which consumer instance did you see these exceptions?
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Thu, Jun 12, 2014 at 4:39 PM, Bhavesh Mistry <mistry.p.bhav...@gmail.com> wrote:
>> > >
>> > > > Hi Kafka Dev Team/Users,
>> > > >
>> > > > We have a high-level consumer group consuming from 32 partitions of a
>> > > > topic. We have been running 48 consumers in this group across multiple
>> > > > servers. We have kept 16 as back-up consumers, hoping that when a
>> > > > consumer dies (meaning ZooKeeper no longer has an owner for a
>> > > > particular partition), a back-up consumer will take over. But I do not
>> > > > see this behavior: after an active consumer died, the back-up consumers
>> > > > did not pick up the partitions. Please let us know what I can do to
>> > > > achieve this. This is a very likely scenario when rolling out new code
>> > > > on the consumer side (we will be doing incremental code rollouts).
>> > > > Please see the exception below. We are using version 0.8 for now.
>> > > >
>> > > > [mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a], exception during rebalance
>> > > > org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/mupd_logmon_hb_events/ids/mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a
>> > > >     at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
>> > > >     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
>> > > >     at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
>> > > >     at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
>> > > >     at kafka.utils.ZkUtils$.readData(Unknown Source)
>> > > >     at kafka.consumer.TopicCount$.constructTopicCount(Unknown Source)
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(Unknown Source)
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(Unknown Source)
>> > > >     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(Unknown Source)
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(Unknown Source)
>> > > > *Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/mupd_logmon_hb_events/ids/mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a*
>> > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
>> > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>> > > >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>> > > >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
>> > > >     at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
>> > > >     at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
>> > > >     at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
>> > > >     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>> > > >
>> > > >
>> > > > 11 Jun 2014 14:12:16,710 ERROR [mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a_watcher_executor] (kafka.utils.Logging$class.error:?) - [mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a], error during syncedRebalance
>> > > > kafka.common.ConsumerRebalanceFailedException: mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a can't rebalance *after 4 retries*
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(Unknown Source)
>> > > >     at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(Unknown Source)
>> > > >
>> > > >
>> > > > Thanks,
>> > > > Bhavesh
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>>
>>
>> --
>> -- Guozhang

--
-- Guozhang