Hey Joel,

I can see a ConsumerFetcherThread in the dump. I've included the full dump this time as an attachment in case it proves useful to you.
Thanks for all the help,
Pablo

On Fri, Jul 25, 2014 at 7:30 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> Did you see any fetcher threads in the thread dump? If not, it seems
> they may have exited for some reason and the iterators are blocked on
> receiving data.
>
> On Fri, Jul 25, 2014 at 12:50:13PM +0100, Pablo Picko wrote:
> > Hey Joel
> >
> > I actually did issue a kill -3 to get a view of the consumer at the
> > time of the issue. I have just found the output: I had 20 threads, and
> > all of them look like the following. I think it looks OK.
> >
> > 2014/07/24 00:24:03 | "pool-2-thread-20" prio=10 tid=0x00007f55f4764800 nid=0x76b1 waiting on condition [0x00007f56526cf000]
> > 2014/07/24 00:24:03 |    java.lang.Thread.State: WAITING (parking)
> > 2014/07/24 00:24:03 |        at sun.misc.Unsafe.park(Native Method)
> > 2014/07/24 00:24:03 |        - parking to wait for <0x00000000e0521508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> > 2014/07/24 00:24:03 |        at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:63)
> > 2014/07/24 00:24:03 |        at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> > 2014/07/24 00:24:03 |        at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> > 2014/07/24 00:24:03 |        at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> > 2014/07/24 00:24:03 |        at com.foor.bar.kafka.FooBarConsumerTask.run(FooBarConsumerTask.java:47)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > 2014/07/24 00:24:03 |        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > 2014/07/24 00:24:03 |        at java.lang.Thread.run(Thread.java:744)
> >
> > Thanks
> > Paul
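(For context: each of those 20 parked threads is running a task along the lines of the sketch below. This is a paraphrase of our FooBarConsumerTask rather than the exact code, and process() stands in for the real message handling.)

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

// Paraphrased sketch of FooBarConsumerTask: one task per KafkaStream,
// looping on the stream's iterator. hasNext() parks on the stream's
// underlying LinkedBlockingQueue (the ConsumerIterator.makeNext frames
// in the dump) until a fetcher thread enqueues more data.
public class FooBarConsumerTask implements Runnable {

    private final KafkaStream<byte[], byte[]> stream;

    public FooBarConsumerTask(KafkaStream<byte[], byte[]> stream) {
        this.stream = stream;
    }

    @Override
    public void run() {
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {                  // blocks here while no data arrives
            byte[] message = it.next().message();
            process(message);                   // application-specific handling
        }
    }

    private void process(byte[] message) {
        // real handling lives here
    }
}

So if the fetchers to one broker had died, as you suggest, these threads would simply stay parked in hasNext() with nothing to consume, which matches the dump.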
> > On Thu, Jul 24, 2014 at 9:56 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > > Pablo, if you see this again, can you take a thread dump of your
> > > consumer and verify that the fetchers to all the brokers are still
> > > alive, as well as the corresponding iterator threads? It could be that
> > > your consumer ran into some decoder error or some other exception
> > > (although in general that should show up in the log).
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Thu, Jul 24, 2014 at 03:47:58PM -0400, Joe Stein wrote:
> > > > For the consumer you should see logs like:
> > > >
> > > > "Connecting to zookeeper instance at " + config.zkConnect
> > > > "begin registering consumer " + consumerIdString + " in ZK"
> > > > consumerThreadId + " successfully owned partition " + partition + " for topic " + topic
> > > > "starting auto committer every " + config.autoCommitIntervalMs + " ms"
> > > >
> > > > all coming from kafka.consumer.ZookeeperConsumerConnector
> > > >
> > > > /*******************************************
> > > > Joe Stein
> > > > Founder, Principal Consultant
> > > > Big Data Open Source Security LLC
> > > > http://www.stealth.ly
> > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > ********************************************/
> > > >
> > > > On Thu, Jul 24, 2014 at 3:30 PM, Pablo Picko <p...@pitchsider.com> wrote:
> > > >
> > > > > Hey guys
> > > > >
> > > > > I have my log level set to info. That said, I am not seeing many
> > > > > Kafka logs at all: on startup I see detail about the serializer.class
> > > > > my producer uses, but very few consumer-related logs. Is there
> > > > > anything I should always see if my log config is correct for the
> > > > > info level?
> > > > >
> > > > > In relation to my settings for the number of streams, here is my code:
> > > > >
> > > > > Map<String, Integer> topicCountMap = new HashMap<>();
> > > > >
> > > > > // I've 20 threads for the topic, for the 20 partitions with no replicas
> > > > > topicCountMap.put(topicConsumer.getTopic(), topicConsumer.getNumThreads());
> > > > >
> > > > > consumerConnector.createMessageStreams(topicCountMap);
> > > > >
> > > > > Thanks
> > > > >
> > > > > Pablo
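(To flesh out that snippet, since the threading question comes up just below: this is roughly how the streams get wired to the thread pool. A simplified sketch only; the connector setup, the property values, and the literal topic name and thread count are placeholders standing in for what topicConsumer provides in the real code.)

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ConsumerWiring {

    public static void main(String[] args) {
        // Placeholder connection properties; the real values come from our config.
        Properties props = new Properties();
        props.put("zookeeper.connect", "zkhost:2181");
        props.put("group.id", "my-consumer-group");

        ConsumerConnector consumerConnector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // 20 streams for the topic's 20 partitions (no replicas).
        Map<String, Integer> topicCountMap = new HashMap<>();
        topicCountMap.put("topic", 20);

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumerConnector.createMessageStreams(topicCountMap);

        // One task per stream: each FooBarConsumerTask owns one iterator
        // and blocks in hasNext() until its stream receives data.
        ExecutorService pool = Executors.newFixedThreadPool(20);
        for (KafkaStream<byte[], byte[]> stream : streams.get("topic")) {
            pool.submit(new FooBarConsumerTask(stream));
        }
    }
}

So there is exactly one iterator per stream and one thread per iterator, matching the 20 pool threads in the dump.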
> > > > > On 24 Jul 2014 18:34, "Joe Stein" <joe.st...@stealth.ly> wrote:
> > > > >
> > > > > > What value are you setting for your number of streams when calling
> > > > > > createMessageStreamsByFilter, or, if using createMessageStreams,
> > > > > > for the TopicCount (topic -> numberOfStreams)?
> > > > > >
> > > > > > How are you threading the iterator on each stream?
> > > > > >
> > > > > > /*******************************************
> > > > > > Joe Stein
> > > > > > Founder, Principal Consultant
> > > > > > Big Data Open Source Security LLC
> > > > > > http://www.stealth.ly
> > > > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> > > > > > ********************************************/
> > > > > >
> > > > > > On Thu, Jul 24, 2014 at 1:05 PM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > > >
> > > > > > > Guozhang
> > > > > > >
> > > > > > > I didn't, no. I did spot other people with similar symptoms to my
> > > > > > > problem mentioning your suggestion too, but I don't see anything
> > > > > > > in the log to suggest it rebalanced. It could very well be the
> > > > > > > reason, but I can't see anything suggesting it yet.
> > > > > > >
> > > > > > > Thanks
> > > > > > > Pablo
> > > > > > >
> > > > > > > On 24 Jul 2014 17:57, "Guozhang Wang" <wangg...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Pablo,
> > > > > > > >
> > > > > > > > Do you see any rebalance-related logs in consumers?
> > > > > > > >
> > > > > > > > Guozhang
> > > > > > > >
> > > > > > > > On Thu, Jul 24, 2014 at 9:02 AM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > > > > >
> > > > > > > > > Hey Guozhang
> > > > > > > > >
> > > > > > > > > Thanks for the reply. No, nothing at all in the logs to suggest
> > > > > > > > > anything went wrong.
> > > > > > > > >
> > > > > > > > > It's really puzzling as to what happened. When I restarted the
> > > > > > > > > consumer, everything worked again.
> > > > > > > > >
> > > > > > > > > Prior to the restart I even stopped the producer for a bit.
> > > > > > > > > However, any messages that got assigned to broker C never got
> > > > > > > > > processed. When I ran the console consumer script before I
> > > > > > > > > restarted, it was able to print all messages, including
> > > > > > > > > messages on broker C. It seems that, for my consumer's
> > > > > > > > > consumer group, one of the brokers' messages just became
> > > > > > > > > inaccessible.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Pablo
> > > > > > > > >
> > > > > > > > > On Thu, Jul 24, 2014 at 4:12 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Pablo,
> > > > > > > > > >
> > > > > > > > > > During the period, did you see any exceptions or errors in
> > > > > > > > > > Broker C's logs and also in the consumer logs?
> > > > > > > > > >
> > > > > > > > > > Guozhang
> > > > > > > > > >
> > > > > > > > > > On Thu, Jul 24, 2014 at 6:23 AM, Pablo Picko <p...@pitchsider.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello all
> > > > > > > > > > >
> > > > > > > > > > > Some background.
> > > > > > > > > > >
> > > > > > > > > > > I have 3 Kafka brokers, A, B and C, and a Kafka topic
> > > > > > > > > > > called "topic" with 20 partitions (no replicas).
> > > > > > > > > > >
> > > > > > > > > > > Everything had been working fine for about a week when
> > > > > > > > > > > suddenly all the data sent to partitions belonging to
> > > > > > > > > > > broker C stopped being seen by the consumer. The consumer
> > > > > > > > > > > uses the high-level consumer and does not look much
> > > > > > > > > > > different to the sample provided in the documentation.
> > > > > > > > > > >
> > > > > > > > > > > When I inspect the topic, I can see that all the
> > > > > > > > > > > partitions are lagging behind. A restart (of the consumer)
> > > > > > > > > > > seems to sort it out, but I am stumped as to what's going
> > > > > > > > > > > on. Any help appreciated.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > > Pablo
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > -- Guozhang
> > > > > > > >
> > > > > > > > --
> > > > > > > > -- Guozhang
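PS: for anyone reproducing this, the per-partition lag can be inspected with the stock offset checker, something along these lines (0.8.x tool; exact flags from memory, so worth double-checking against your version, and the ZooKeeper host and group id are placeholders):

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper <zk-host>:2181 --group <consumer-group> --topic topic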