This is NOT a harmless race. My QA teammate is now encountering this issue under load. The result is a background thread spinning in a loop that hits a NullPointerException on every iteration.
I have implemented a variety of assurances in my application code to ensure that the high-level consumer I'm spinning up in Java stays alive for at least 10 seconds before being asked to shut down. Yet the issue still persists.

Suggestions?

________________________________________
From: Jun Rao [jun...@gmail.com]
Sent: Tuesday, June 25, 2013 11:58 PM
To: users@kafka.apache.org
Subject: Re: 0.8 throwing exception "Failed to find leader" and high-level consumer fails to make progress

The exception is likely due to a race condition between the logic in the ZK watcher and the closing of the ZK connection. It's harmless, except for the weird exception.

Thanks,

Jun


On Tue, Jun 25, 2013 at 10:07 AM, Hargett, Phil <phil.harg...@mirror-image.com> wrote:

> Possibly.
>
> I see evidence that it's being stopped / started every 30 seconds in some
> cases (due to my code). It's entirely possible that I have a race, too, in
> that 2 separate pieces of code could be triggering such a stop / start.
>
> Gives me something to track down. Thank you!!
>
> On Jun 25, 2013, at 12:45 PM, "Jun Rao" <jun...@gmail.com> wrote:
>
> > This typically only happens when the consumerConnector is shut down. Are
> > you restarting the consumerConnector often?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Tue, Jun 25, 2013 at 9:40 AM, Hargett, Phil <
> > phil.harg...@mirror-image.com> wrote:
> >
> >> Seeing this exception a LOT (3-4 times per second, same log topic).
> >>
> >> I'm using external code to feed data to about 50 different log topics over
> >> a cluster of 3 Kafka 0.8 brokers. There are 3 ZooKeeper instances as well;
> >> all of this is running on EC2. My application creates a high-level
> >> consumer (1 per topic) to consume data from each and do further processing.
> >>
> >> The problem is this exception is in the high-level consumer, so my code
> >> has no way of knowing that it's become stuck.
> >>
> >> This exception does not always appear, but as far as I can tell, once this
> >> happens, the only cure is to restart my application's process.
> >>
> >> I saw this in 0.8 built from source about 1 week ago, and also am seeing
> >> it today after pulling the latest 0.8 sources and rebuilding Kafka.
> >>
> >> Thoughts?
> >>
> >> Failed to find leader for Set([topic6,0]): java.lang.NullPointerException
> >>     at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
> >>     at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
> >>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
> >>     at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
> >>     at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
> >>     at kafka.utils.ZkUtils$.getChildrenParentMayNotExist(ZkUtils.scala:438)
> >>     at kafka.utils.ZkUtils$.getAllBrokersInCluster(ZkUtils.scala:75)
> >>     at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:63)
> >>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
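
For readers following the thread, the minimum-lifetime guard mentioned at the top of this message (keep the consumer alive for at least 10 seconds before shutting it down) could look roughly like the sketch below. This is not the actual application code: `MinLifetimeGuard` and its method names are hypothetical, and the Kafka consumer connector is stood in for by a plain `Runnable` shutdown action so the example is self-contained.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical guard (not part of Kafka): delays a shutdown action until a
 * minimum lifetime has elapsed since the guard was created, and runs the
 * action at most once even if shutdown() is called from several threads.
 */
final class MinLifetimeGuard {
    private final long startNanos = System.nanoTime();
    private final long minLifetimeNanos;
    // Assumption: in a real application this would wrap the connector's
    // shutdown call; here it is just an arbitrary Runnable.
    private final Runnable shutdownAction;
    private final AtomicBoolean shutDown = new AtomicBoolean(false);

    MinLifetimeGuard(long minLifetime, TimeUnit unit, Runnable shutdownAction) {
        this.minLifetimeNanos = unit.toNanos(minLifetime);
        this.shutdownAction = shutdownAction;
    }

    /** Nanoseconds that must still pass before shutdown is allowed (0 if none). */
    long remainingNanos() {
        long elapsed = System.nanoTime() - startNanos;
        return Math.max(0L, minLifetimeNanos - elapsed);
    }

    /**
     * Waits out the remaining minimum lifetime, then runs the shutdown action
     * if no other caller has already done so.
     *
     * @return true if this call ran the shutdown action, false otherwise
     */
    boolean shutdown() throws InterruptedException {
        long remaining = remainingNanos();
        if (remaining > 0) {
            TimeUnit.NANOSECONDS.sleep(remaining);
        }
        // Claim the shutdown only after waiting, so an interrupted wait
        // leaves the guard usable for a later attempt.
        if (!shutDown.compareAndSet(false, true)) {
            return false;
        }
        shutdownAction.run();
        return true;
    }
}
```

A guard like this only spaces out start and stop calls and collapses duplicate stop requests into one. As reported above, spacing the calls out did not make the NullPointerException loop go away.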