On 2/13/16 6:16 AM, Ismael Juma wrote: > Hi Damian, > > KAFKA-2978, which would cause consumption to stop would happen after a > consumer group rebalance. Was this the case for you? > > It would be great if you could upgrade the client to 0.9.0.1 RC1 in order > to check if the problem still happens. There were other bugs fixed in the > 0.9.0 branch and it simplifies the analysis if we can rule them out. See > the following for the full list: > > https://github.com/apache/kafka/compare/0.9.0.0...0.9.0.1
I was having a similar problem with 0.9.0.0. I have a very simple setup with just one broker, clients on the same host and manual partition assignment. The clients use poll(100) in a loop. Sometimes there are periods of an hour or more when there are no messages to consume. With 0.9.0.0, this would result in a slew of "marking the coordinator ... dead" messages and then the clients would just stop consuming. With 0.9.0.1-RC1, I still get the log messages every 9 minutes; but the clients now seem to recover or are not affected. I notice that 9 minutes is the default for the new property, connections.max.idle.ms. Do I need to increase the value of this property? Phil > > Thanks, > Ismael > > On Sat, Feb 13, 2016 at 9:43 AM, Damian Guy <damian....@gmail.com> wrote: > >> I've been having some issues with the New Consumer. I'm aware there is a >> bug that has been fixed for 0.9.0.1, but is this the same thing? >> I'm using manual partition assignment due to latency issues making it near >> impossible to work with the group management features. >> >> So, my consumer was going along fine for most of the day - it just consumes >> from a topic with a single partition. However it has just stopped receiving >> messages and I can see there is a backlog of around 100k messages to get >> through. Since message consumption has stopped i get the below "Marking the >> coordinator dead" log messages every 9 minutes. I have done multiple stack >> dumps to see what is happening, one of which is below, and it is always >> appears to be in the consumer.poll >> >> So.. same bug as the one i believe is fixed on 0.9.0.1? In which case i'll >> upgrade my client to the latest from the branch. Or is this something >> different? >> >> Thanks, >> Damian >> >> 2016/02/13 00:07:57 131.73 MB/1.8 GB INFO >> org.apache.kafka.clients.consumer.internals.AbstractCoordinator - >> Marking the coordinator 2147479630 dead. >> 2016/02/13 00:16:57 151.75 MB/1.79 GB INFO >> org.apache.kafka.clients.consumer.internals.AbstractCoordinator - >> Marking the coordinator 2147479630 dead. >> 2016/02/13 00:25:57 181.07 MB/1.76 GB INFO >> org.apache.kafka.clients.consumer.internals.AbstractCoordinator - >> Marking the coordinator 2147479630 dead. >> org.apache.kafka.clients.consumer.internals.AbstractCoordinator - >> Marking the coordinator 2147479630 dead. >> >> "poll-kafka-1" #45 prio=5 os_prio=0 tid=0x00007f7dba9da800 nid=0x52fd >> runnable [0x00007f7cecbe3000] >> java.lang.Thread.State: RUNNABLE >> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) >> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) >> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79) >> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) >> - locked <0x00000000ac6df4c8> (a sun.nio.ch.Util$2) >> - locked <0x00000000ac6df4b0> (a >> java.util.Collections$UnmodifiableSet) >> - locked <0x00000000ac53db20> (a sun.nio.ch.EPollSelectorImpl) >> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) >> at >> org.apache.kafka.common.network.Selector.select(Selector.java:425) >> at org.apache.kafka.common.network.Selector.poll(Selector.java:254) >> at >> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:270) >> at >> >> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:303) >> at >> >> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:197) >> at >> >> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:187) >> at >> >> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:877) >> at >> >> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829) >>