[ 
https://issues.apache.org/jira/browse/KAFKA-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203901#comment-15203901
 ] 

Mikael Sundberg commented on KAFKA-3172:
----------------------------------------

Im seeing the same thing. A machine can suddenly stop consuming one or several 
partitions, no errors shown in log. Threaddump gives foe ex:
"pool-4-thread-1" #29 prio=5 os_prio=0 tid=0x00007f8d35a5e000 nid=0x21 runnable 
[0x00007f8d022cc000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000000eeaebf50> (a sun.nio.ch.Util$2)
        - locked <0x00000000eeaebf60> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000eeaebf08> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.kafka.common.network.Selector.select(Selector.java:425)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:254)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187)
        at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:877)
        at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
        at 
com.klarna.ordermanagement.messaging.kafka.KafkaMessageConsumer.consumeRecords(KafkaMessageConsumer.java:63)
        at 
com.klarna.ordermanagement.messaging.kafka.KafkaMessageConsumer$$Lambda$87/2015198349.run(Unknown
 Source)
        at 
com.klarna.ordermanagement.commons.LambdaUtils.repeat(LambdaUtils.java:81)
        at 
com.klarna.ordermanagement.messaging.kafka.KafkaMessageConsumer.run(KafkaMessageConsumer.java:79)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

> Consumer threads stay in 'Watiting' status and are blocked at consumer poll 
> method
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-3172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3172
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.0
>         Environment: linux
>            Reporter: Dany Benjamin
>            Assignee: Neha Narkhede
>            Priority: Critical
>             Fix For: 0.9.0.0
>
>         Attachments: jmx_info.png, jstack.png, lagSample.png
>
>
> When running multiple consumers on same group (400 - for a 400 partitioned 
> topic), the application for all threads blocks at consumer.poll() method. The 
> timeout parameter sent in is 1.
> Stack dump:
> "pool-1-thread-198" #424 prio=5 os_prio=0 tid=0x00007f6bb6d53800 nid=0xc349 
> waiting on condition [0x00007f63df8f7000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0000000605812710> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> "kafka-producer-network-thread | producer-198" #423 daemon prio=5 os_prio=0 
> tid=0x00007f6bb6d52000 nid=0xc348 runnable [0x00007f63df9f8000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000006058283e8> (a sun.nio.ch.Util$2)
>         - locked <0x00000006058283d8> (a 
> java.util.Collections$UnmodifiableSet)
>         - locked <0x0000000605828390> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.kafka.common.network.Selector.select(Selector.java:425)
>         at org.apache.kafka.common.network.Selector.poll(Selector.java:254)
>         at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>         at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>         at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to