- To understand what the stuck consumer is doing, it would be useful to collect the logs and a thread dump. I'd try to find out what the fetcher threads are doing. What about the handler/application/stream threads?
- Are the offsets committed? After a restart, could it be that the consumer is just re-fetching from the last checkpoint, as opposed to fetching messages produced after it stalled?
- Your ZK session timeout is huge: 1,000,000 ms, i.e. 16+ minutes. What is the ZK session state before the restart? If the consumer is DISCONNECTED from the ZK cluster, the partitions assigned to it won't be released until the session times out. During that time, the messages sent to these partitions will not be consumed. I believe the logs will show the ZK session state transitions.
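
To illustrate the session timeout point, here is a minimal sketch of the posted settings with a shorter timeout. It uses plain java.util.Properties and no Kafka classes. The class and helper names, the localhost:2181 address and the 6000 ms value are my own assumptions for illustration. I recall 6000 ms being the 0.8.x default, so please check it against the documentation for your version.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds the consumer properties
// from the original post with a configurable ZooKeeper session timeout.
class ConsumerConfigSketch {

    static Properties buildProps(String zookeeperConnect, int sessionTimeoutMs) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeperConnect);
        props.put("group.id", "test");
        // The only setting changed relative to the original post.
        props.put("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        // The remaining values are copied unchanged from the original post.
        props.put("zookeeper.sync.time.ms", "20");
        props.put("auto.commit.enable", "true");
        props.put("auto.commit.interval.ms", "100");
        props.put("fetch.wait.max.ms", "100");
        props.put("rebalance.backoff.ms", "20000");
        return props;
    }

    // Converts the configured session timeout from milliseconds to minutes.
    static double timeoutMinutes(Properties props) {
        long ms = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        return ms / 60000.0;
    }

    public static void main(String[] args) {
        // "localhost:2181" is a placeholder for ZOOKEEPER_HOST_ADDR.
        Properties posted = buildProps("localhost:2181", 1_000_000);
        Properties shorter = buildProps("localhost:2181", 6_000); // assumed default
        System.out.printf("posted timeout:  %.1f min%n", timeoutMinutes(posted));
        System.out.printf("shorter timeout: %.1f min%n", timeoutMinutes(shorter));
    }
}
```

With a shorter timeout, a consumer that loses its ZK connection should have its session expire sooner, so its partitions can be released and picked up by a rebalance instead of sitting unconsumed for many minutes.
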

On Thu, Mar 10, 2016 at 5:31 AM Abhishek Chawla <abhi...@gmail.com> wrote:

> Hi,
> I'm using kafka version 0.8.2.1 with old high level consumer api.
> I'm following this guide:
> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
>
> The client is working fine but after a few hours of inactivity it gets
> inactive and stops receiving messages.
> But if I restart my client again then it starts fetching all the messages
> which were not consumed during the time of its inactivity.
>
> I'm using this configuration:
> props.put("zookeeper.connect", ZOOKEEPER_HOST_ADDR);
> props.put("group.id", "test");
> props.put("zookeeper.session.timeout.ms","1000000");
> props.put("zookeeper.sync.time.ms","20");
> props.put("auto.commit.enable","true");
> props.put("auto.commit.interval.ms", "100");
> props.put("fetch.wait.max.ms", "100");
> props.put("rebalance.backoff.ms", "20000");
>
> Can someone please help me in figuring out what's going wrong here?
>
> --
> regards,
> Abhishek Chawla