A few thoughts from a non-expert: connections are also processed asynchronously in the poll loop. If you are polling without a timeout, you may be seeing a few initial iterations spent setting up the channel connections. You also probably need a few loop iterations to get through the initial metadata request/response.
Also, if I recall correctly, records should be returned in batches per topic-partition, not one by one. So if/when records are ready, you would get as many as were received via completed FetchRequests. How many that is depends on message size and the fetch configs max.partition.fetch.bytes, fetch.min.bytes, and fetch.max.wait.ms. So you shouldn't expect to poll 500x.

I'd suggest using a small but non-zero timeout when polling. 100ms is used in the docs quite a bit.

-Dana

On Wed, Dec 30, 2015 at 10:03 AM, Franco Giacosa <fgiac...@gmail.com> wrote:

> Hi,
>
> I am running kafka 0.9.0 locally.
>
> I am having a particular situation in the following scenario.
>
> (1) 1 Producer inserts 500 records (300bytes each aprox) to 1 topic 0
> partition (or 1 as you prefer)
> (2) After the producer finished inserting the 500 records, 1 Consumer reads
> in a loop from this topic with consumer.poll(0)
> and max.partition.fetch.bytes=500, sometimes that call brings records and
> something the loop has to go over a few times until it brings something.
> Can someone explain me why it doesn't fetch a record each time that it
> polls? can a poll operation affect another poll operation?
> why if I've inserted 500 records I have to poll more than 500 times?
>
> I have tried using poll(0), because in the documentation it says, "if 0,
> returns with any records that are available now".
>
> Thanks
>
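
To put rough numbers on the point that batch size depends on message size and max.partition.fetch.bytes, here is a back-of-the-envelope sketch in plain Java (no Kafka client involved). The class name FetchEstimate, its methods, and the 30-byte per-message overhead are assumptions made up for illustration and are not part of Kafka's API. The real overhead depends on the message format, and the model ignores fetch.min.bytes, fetch.max.wait.ms, and any polls that return nothing while a fetch is still in flight.

```java
public class FetchEstimate {
    // ASSUMPTION: rough per-message framing overhead in bytes (offset, size, CRC, ...).
    // The true value depends on the Kafka message format version.
    public static final int ASSUMED_OVERHEAD_BYTES = 30;

    // How many whole records fit under the per-partition fetch limit.
    public static int recordsPerFetch(int maxPartitionFetchBytes, int payloadBytes) {
        return maxPartitionFetchBytes / (payloadBytes + ASSUMED_OVERHEAD_BYTES);
    }

    // Lower bound on the number of non-empty fetches needed to read all records.
    public static int fetchesNeeded(int totalRecords, int maxPartitionFetchBytes, int payloadBytes) {
        int perFetch = recordsPerFetch(maxPartitionFetchBytes, payloadBytes);
        if (perFetch == 0) {
            throw new IllegalArgumentException("fetch limit is smaller than a single record");
        }
        // Ceiling division: a partially filled last fetch still counts as one fetch.
        return (totalRecords + perFetch - 1) / perFetch;
    }

    public static void main(String[] args) {
        // Settings from the quoted question: 500 records of ~300 bytes, limit of 500 bytes.
        System.out.println(fetchesNeeded(500, 500, 300));
        // Same data with a 1 MB limit.
        System.out.println(fetchesNeeded(500, 1048576, 300));
    }
}
```

Under these assumptions, a 500-byte limit leaves room for only about one 300-byte record per fetch, so the estimate comes out at roughly 500 fetches, while a 1 MB limit would cover all 500 records in a single fetch. Treat this as an estimate of the trend only, not as a description of what the consumer does internally.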