Did you see any fetcher threads in the thread dump? If not, it seems they may have exited for some reason, leaving the iterators blocked waiting for data.
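A minimal sketch of one way to run that check against a saved dump, rather than reading it by eye. It pulls the thread names out of the dump text and filters for fetchers. The class and method names here are hypothetical, and the "ConsumerFetcherThread" name prefix is an assumption about how the fetcher threads are named; adjust it to whatever your own dump shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kafka: scans the text of a JVM thread dump.
class ThreadDumpCheck {
    // Thread header lines look like: "name" prio=10 tid=0x... nid=0x... <state>
    private static final Pattern HEADER =
            Pattern.compile("\"([^\"\\n]+)\"[^\\n]*\\btid=");

    // Assumption: fetcher thread names start with this prefix; adjust to your dump.
    static final String FETCHER_PREFIX = "ConsumerFetcherThread";

    /** Returns the name of every thread that has a header line in the dump. */
    static List<String> threadNames(String dump) {
        List<String> names = new ArrayList<>();
        Matcher m = HEADER.matcher(dump);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Returns only the names that start with the given prefix. */
    static List<String> withPrefix(List<String> names, String prefix) {
        List<String> out = new ArrayList<>();
        for (String n : names) {
            if (n.startsWith(prefix)) {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened sample in the shape of the dump quoted in this thread.
        String dump =
            "2014/07/24 00:24:03 | \"pool-2-thread-20\" prio=10 tid=0x00007f55f4764800"
          + " nid=0x76b1 waiting on condition [0x00007f56526cf000]\n"
          + "2014/07/24 00:24:03 | at java.util.concurrent.LinkedBlockingQueue.take"
          + "(LinkedBlockingQueue.java:442)\n";

        List<String> names = threadNames(dump);
        List<String> fetchers = withPrefix(names, FETCHER_PREFIX);
        System.out.println("threads found:   " + names);
        System.out.println("fetcher threads: " + fetchers);
        if (fetchers.isEmpty()) {
            System.out.println("No fetcher threads in this dump.");
        }
    }
}
```

If the fetcher list comes back empty while the pool threads are all parked in LinkedBlockingQueue.take(), that would be consistent with the fetchers having exited.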

On Fri, Jul 25, 2014 at 12:50:13PM +0100, Pablo Picko wrote:
> Hey Joel
>
> I actually did issue a kill -3 to get a view on the consumer at the time of
> the issue. I have just found the output. I had 20 threads and all of them
> look like the following. I think it looks OK.
>
> 2014/07/24 00:24:03 | "pool-2-thread-20" prio=10 tid=0x00007f55f4764800 nid=0x76b1 waiting on condition [0x00007f56526cf000]
> 2014/07/24 00:24:03 | java.lang.Thread.State: WAITING (parking)
> 2014/07/24 00:24:03 | at sun.misc.Unsafe.park(Native Method)
> 2014/07/24 00:24:03 | - parking to wait for <0x00000000e0521508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> 2014/07/24 00:24:03 | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> 2014/07/24 00:24:03 | at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> 2014/07/24 00:24:03 | at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 2014/07/24 00:24:03 | at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:63)
> 2014/07/24 00:24:03 | at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> 2014/07/24 00:24:03 | at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> 2014/07/24 00:24:03 | at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> 2014/07/24 00:24:03 | at com.foor.bar.kafka.FooBarConsumerTask.run(FooBarConsumerTask.java:47)
> 2014/07/24 00:24:03 | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 2014/07/24 00:24:03 | at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 2014/07/24 00:24:03 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2014/07/24 00:24:03 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2014/07/24 00:24:03 | at java.lang.Thread.run(Thread.java:744)
> 2014/07/24 00:24:03 |
>
> Thanks
> Paul
>
>
> On Thu, Jul 24, 2014 at 9:56 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
> > Pablo, if you see this again, can you take a thread-dump of your
> > consumer and verify that the fetchers to all the brokers are still
> > alive as well as the corresponding iterator threads? It could be that
> > your consumer ran into some decoder error or some other exception
> > (although in general that should show up in the log).
> >
> > Thanks,
> >
> > Joel
> >
> > On Thu, Jul 24, 2014 at 03:47:58PM -0400, Joe Stein wrote:
> > > For the consumer you should see logs like
> > >
> > > "Connecting to zookeeper instance at " + config.zkConnect
> > > "begin registering consumer " + consumerIdString + " in ZK"
> > > consumerThreadId + " successfully owned partition " + partition + " for topic " + topic
> > > "starting auto committer every " + config.autoCommitIntervalMs + " ms"
> > >
> > > all coming from kafka.consumer.ZookeeperConsumerConnector
> > >
> > > /*******************************************
> > > Joe Stein
> > > Founder, Principal Consultant
> > > Big Data Open Source Security LLC
> > > http://www.stealth.ly
> > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > ********************************************/
> > >
> > >
> > > On Thu, Jul 24, 2014 at 3:30 PM, Pablo Picko <p...@pitchsider.com> wrote:
> > >
> > > > Hey guys..
> > > >
> > > > I have my log level set to info. That said, I am not seeing many logs
> > > > at all for Kafka. On startup I see detail about the serializer.class my
> > > > producer uses but very few consumer-related logs. Is there anything I
> > > > should always see if my log config is correct for the info level?
> > > >
> > > > In relation to my settings for the number of streams, here is my code.
> > > >
> > > > Map<String, Integer> topicCountMap = new HashMap<>();
> > > >
> > > > // I've 20 threads for the topic for the 20 partitions with no replicas
> > > >
> > > > topicCountMap.put(topicConsumer.getTopic(), topicConsumer.getNumThreads());
> > > >
> > > > consumerConnector.createMessageStreams(topicCountMap);
> > > >
> > > > Thanks
> > > >
> > > > Pablo
> > > >
> > > >
> > > > On 24 Jul 2014 18:34, "Joe Stein" <joe.st...@stealth.ly> wrote:
> > > >
> > > > > What value are you setting for your number of streams when
> > > > > calling createMessageStreamsByFilter or, if using createMessageStreams,
> > > > > for the TopicCount (topic -> numberOfStreams)?
> > > > >
> > > > > How are you threading the iterator on each stream?
> > > > >
> > > > > /*******************************************
> > > > > Joe Stein
> > > > > Founder, Principal Consultant
> > > > > Big Data Open Source Security LLC
> > > > > http://www.stealth.ly
> > > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > > ********************************************/
> > > > >
> > > > >
> > > > > On Thu, Jul 24, 2014 at 1:05 PM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > I didn't, no. I did spot other people with similar symptoms to my
> > > > > > problem mentioning your suggestion too, but I don't see anything in the
> > > > > > log to suggest it rebalanced. It could very well be the reason, but I
> > > > > > can't see anything suggesting it is yet.
> > > > > >
> > > > > > Thanks
> > > > > > Pablo
> > > > > > On 24 Jul 2014 17:57, "Guozhang Wang" <wangg...@gmail.com> wrote:
> > > > > >
> > > > > > > Pablo,
> > > > > > >
> > > > > > > Do you see any rebalance related logs in consumers?
> > > > > > >
> > > > > > > Guozhang
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jul 24, 2014 at 9:02 AM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > > > >
> > > > > > > > Hey Guozhang
> > > > > > > >
> > > > > > > > Thanks for the reply. No, nothing at all in the logs to suggest
> > > > > > > > anything went wrong.
> > > > > > > >
> > > > > > > > It's really puzzling as to what's happened. When I restarted the
> > > > > > > > consumer everything worked again.
> > > > > > > >
> > > > > > > > Prior to the restart I even stopped the producer for a bit. However,
> > > > > > > > any messages that got assigned to broker C never got processed. When
> > > > > > > > I ran the console consumer script before I restarted, it was able to
> > > > > > > > print all messages, including messages on broker C. It seems that for
> > > > > > > > my consumer's consumer group, one broker's messages just became
> > > > > > > > inaccessible.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Pablo
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jul 24, 2014 at 4:12 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Pablo,
> > > > > > > > >
> > > > > > > > > During that period, did you see any exceptions or errors in Broker
> > > > > > > > > C's logs or in the consumer logs?
> > > > > > > > >
> > > > > > > > > Guozhang
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jul 24, 2014 at 6:23 AM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hello all
> > > > > > > > > >
> > > > > > > > > > Some background.
> > > > > > > > > >
> > > > > > > > > > I have 3 Kafka brokers, A, B and C. There is a Kafka topic called
> > > > > > > > > > topic with 20 partitions (no replicas).
> > > > > > > > > >
> > > > > > > > > > Everything had been working fine for about a week when suddenly all
> > > > > > > > > > the data sent to partitions belonging to broker C was no longer seen
> > > > > > > > > > by the consumer. The consumer uses the high-level consumer and does
> > > > > > > > > > not look much different from the sample provided in the documentation.
> > > > > > > > > >
> > > > > > > > > > When I inspected the topic I can see that all the partitions are
> > > > > > > > > > lagging behind. A restart (of the consumer) seems to sort it out, but
> > > > > > > > > > I am stumped as to what's going on. Any help appreciated.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > Pablo
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > -- Guozhang
> > > > > > >
> > > > > > > --
> > > > > > > -- Guozhang