After running PreferredReplicaLeaderElectionCommand on a broker instance, we observed that one of the consumers could no longer find the partition leaders and stopped consuming. The following exception is all over the log file, and it appears that the consumer cannot recover from it:
2014-10-29 00:53:30,492 WARN surorouter-logsummary_surorouter-logsummary-i-eaef7107-1413327811303-4afb7b23-leader-finder-thread ConsumerFetcherManager$LeaderFinderThread - [surorouter-logsummary_surorouter-logsummary-i-eaef7107-1413327811303-4afb7b23-leader-finder-thread], Failed to find leader for Set([nf_errors_log,28], [nf_errors_log,29])
java.lang.NullPointerException
    at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
    at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
    at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
    at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
    at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
    at kafka.utils.ZkUtils$.getChildrenParentMayNotExist(ZkUtils.scala:487)
    at kafka.utils.ZkUtils$.getAllBrokersInCluster(ZkUtils.scala:84)
    at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:65)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

Except for this instance, other consumer instances are fine. Is there a workaround? Should we report it as a bug?

Thanks,
Allen