The consumer instances close, i.e. leave the group, only if they have been idle for a long time. We have expiration threads that monitor this and remove any consumer instances that keep sitting idle. Consumers are also closed when the application is shut down.

The second poll() does receive around 481 records, but we process only 10 messages at a time, so the processing time is not very large.

________________________________
From: Ewen Cheslack-Postava <e...@confluent.io>
Sent: 20 June 2016 10:52:29
To: users@kafka.apache.org
Subject: Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api

Rohit,

The 30s number sounds very suspicious because it is exactly the value of the session timeout. But if you are driving the consumer correctly, you shouldn't normally hit this timeout. Dana was asking about consumers leaving gracefully because that is one case where you can inadvertently trigger the 30s timeout, requiring *all* group members to wait that long before they decide one of the previous members has ungracefully left the group and they move on without it.

It sounds like something you are doing is causing the group to wait for the session timeout. Is it possible any of your processes are exiting without calling consumer.close()? Or that any of your processes are not calling consumer.poll() within the session timeout of 30s? This can sometimes happen if they receive too much data and take too long to process it (0.10 introduced max.poll.records to help users control this, and we're making further refinements to the consumer to provide better application control over number of messages fetched vs total processing time).

-Ewen

On Sun, Jun 19, 2016 at 10:01 PM, Rohit Sardesai <rohit.sarde...@outlook.com> wrote:

> Can anybody help out on this?
> ________________________________
> From: Rohit Sardesai
> Sent: 19 June 2016 11:47:01
> To: users@kafka.apache.org
> Subject: Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api
>
> In my tests, I am using around 24 consumer groups. I never call consumer.close() or consumer.unsubscribe() until the application is shutting down.
>
> So the consumers never leave, but new consumer instances do get created as the parallel requests pile up. Also, I am reusing consumer instances if they are idle (i.e. not serving any consume request). So with 9 partitions, I do 9 consume requests in parallel every second under the same consumer group.
>
> So to summarize, I have the following test setup: 3 Kafka brokers, 2 zookeeper nodes, 1 topic, 9 partitions, 24 consumer groups and 9 consume requests at a time.
>
> ________________________________
> From: Dana Powers <dana.pow...@gmail.com>
> Sent: 19 June 2016 10:45
> To: users@kafka.apache.org
> Subject: Re: consumer.poll() takes approx. 30 seconds - 0.9 new consumer api
>
> Is your test reusing a group name? And if so, are your consumer instances gracefully leaving? This may cause subsequent 'rebalance' operations to block until those old consumers check in or the session timeout happens (30secs)
>
> -Dana
>
> On Jun 18, 2016 8:56 PM, "Rohit Sardesai" <rohit.sarde...@outlook.com> wrote:
>
> > I am using the group management feature of Kafka 0.9 to handle partition assignment to consumer instances. I use the subscribe() API to subscribe to the topic I am interested in reading data from. I have an environment where I have 3 Kafka brokers with a couple of Zookeeper nodes. I created a topic with 9 partitions. The performance tests attempt to send 9 parallel poll() requests to the Kafka brokers every second. The results show that each poll() operation takes around 30 seconds the first time it polls and returns 0 records. Also, when I print the partition assignment to this consumer instance, I see no partitions assigned to it. The next poll() does return quickly (~10-20 ms) with data and some partitions assigned to it.
> >
> > With each consumer taking 30 seconds, the performance tests report very low throughput, since I run the tests for around 1000 seconds, out of which I produce messages on the topic for the complete duration and I start the parallel consume requests after 400 seconds. So out of 400 seconds, with 9 consumers taking 30 seconds each, around 270 seconds are spent in the first poll without any data.
> > Is this because of the re-balance operation that the consumers are blocked on the poll()? What is the best way to use poll() if I have to serve many parallel requests per second? Should I prefer manual assignment of partitions in this case instead of relying on re-balance?
> >
> > Regards,
> >
> > Rohit Sardesai

--
Thanks,
Ewen
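
For reference, here is a rough sketch of the two knobs Ewen mentions, written as plain java.util.Properties so it runs without the Kafka client on the classpath. It is not code from anyone on this thread. The keys session.timeout.ms and max.poll.records are standard consumer config names (the latter only from 0.10 onward, as noted above); the class name, the group id "example-group" and the 100 ms per-record processing cost are hypothetical values chosen for illustration. The helper only checks the arithmetic behind "are you calling poll() again within the session timeout?"; it does not show consumer.close(), which would still be needed on every exit path.

```java
import java.util.Properties;

public class ConsumerSettingsSketch {

    // Builds consumer settings as plain Properties. The keys are standard
    // Kafka consumer config names; the values passed in are illustrative.
    static Properties consumerProps(String groupId, int sessionTimeoutMs, int maxPollRecords) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        // Introduced in 0.10; a 0.9 client does not support this setting.
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    // True if one batch can be processed before the session timeout expires,
    // i.e. the next poll() can be issued in time. perRecordMs is an assumed
    // average processing cost per record.
    static boolean batchFitsInSession(int batchSize, long perRecordMs, long sessionTimeoutMs) {
        return (long) batchSize * perRecordMs < sessionTimeoutMs;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("example-group", 30_000, 10);
        System.out.println("max.poll.records = " + props.getProperty("max.poll.records"));

        // 481 records (the batch size reported in this thread) at an assumed
        // 100 ms each is 48,100 ms, which exceeds a 30 s session timeout.
        System.out.println(batchFitsInSession(481, 100, 30_000));

        // 10 records at the same assumed cost is 1,000 ms, well within it.
        System.out.println(batchFitsInSession(10, 100, 30_000));
    }
}
```

Whether a batch "fits" depends entirely on the real per-record cost, so the numbers here only show how to reason about it: if everything returned by one poll() has to be handled before the next poll(), the batch size times the per-record cost has to stay under the session timeout.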