A few thoughts from a non-expert:

Connections are also processed asynchronously in the poll loop. If you are
not setting any timeout, you may be seeing a few initial iterations spent
on setting up the channel connections. You also probably need a few loop
iterations to get through an initial metadata request / response.
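
To make the "few initial iterations" point concrete, here is a toy model in
Java. It is not Kafka client code: the class ToyConsumer, its states, and the
one-step-per-call rule are all made up for illustration, assuming each
zero-timeout poll can only advance one piece of pending non-blocking work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: NOT the Kafka client. Names and states are hypothetical.
final class ToyConsumer {
    private enum State { CONNECTING, AWAITING_METADATA, FETCH_IN_FLIGHT, READY }

    private State state = State.CONNECTING;
    private final List<String> pending;

    ToyConsumer(List<String> records) {
        this.pending = new ArrayList<>(records);
    }

    // Mimics a zero-timeout poll: advance one step, then return whatever is
    // available right now (possibly nothing).
    List<String> pollNow() {
        switch (state) {
            case CONNECTING:
                state = State.AWAITING_METADATA;   // connection set up
                return Collections.emptyList();
            case AWAITING_METADATA:
                state = State.FETCH_IN_FLIGHT;     // metadata response handled
                return Collections.emptyList();
            case FETCH_IN_FLIGHT:
                state = State.READY;               // fetch response has arrived
                return Collections.emptyList();
            default:
                List<String> batch = new ArrayList<>(pending);
                pending.clear();                   // whole batch handed out at once
                return batch;
        }
    }

    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer(Arrays.asList("r1", "r2", "r3"));
        for (int i = 1; i <= 5; i++) {
            System.out.println("poll #" + i + " returned "
                    + consumer.pollNow().size() + " record(s)");
        }
    }
}
```

In this model the first three calls come back empty and the fourth returns
the whole batch, which is roughly the pattern I would expect from a tight
poll(0) loop right after startup.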

Also, if I recall correctly, records should be returned in batches per
topic-partition, not one by one. So if/when records are ready, you would
get as many as were received via completed FetchRequests. How many that is
depends on message size and on the fetch configs max.partition.fetch.bytes,
fetch.min.bytes, and fetch.max.wait.ms. So you shouldn't expect to poll 500x.
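
A rough way to see how message size and max.partition.fetch.bytes interact
is the back-of-the-envelope estimate below. It is only a sketch: the class
FetchEstimate and its methods are hypothetical helpers, the 26-byte
per-record framing overhead is my assumption for an uncompressed 0.9-era
message with no key, and fetch.min.bytes / fetch.max.wait.ms are ignored.

```java
// Back-of-the-envelope estimate only; not part of the Kafka client API.
final class FetchEstimate {
    // ASSUMPTION: about 26 bytes of framing per uncompressed, key-less record
    // (offset, size, CRC, magic, attributes, key length, value length).
    static final int ASSUMED_OVERHEAD_BYTES = 26;

    // Upper bound on complete records that fit in one partition fetch.
    static int wholeRecordsPerFetch(int maxPartitionFetchBytes, int valueBytes) {
        return maxPartitionFetchBytes / (valueBytes + ASSUMED_OVERHEAD_BYTES);
    }

    // Lower bound on fetches needed to drain totalRecords (ceiling division).
    static int fetchesNeeded(int totalRecords, int recordsPerFetch) {
        if (recordsPerFetch <= 0) {
            throw new IllegalArgumentException("fetch size too small for one record");
        }
        return (totalRecords + recordsPerFetch - 1) / recordsPerFetch;
    }

    public static void main(String[] args) {
        int totalRecords = 500;
        int valueBytes = 300;
        int[] fetchSizes = {500, 1024 * 1024}; // value from the question, 1 MB default

        for (int fetchBytes : fetchSizes) {
            int perFetch = wholeRecordsPerFetch(fetchBytes, valueBytes);
            System.out.println("max.partition.fetch.bytes=" + fetchBytes
                    + ": ~" + perFetch + " record(s) per fetch, >= "
                    + fetchesNeeded(totalRecords, perFetch) + " fetch(es)");
        }
    }
}
```

Under these assumptions, max.partition.fetch.bytes=500 leaves room for about
one 300-byte record per fetch, while the 1 MB default would fit all 500
records in a single fetch. So the batch size you see is tied closely to that
setting.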

I'd suggest using a small but non-zero timeout when polling. 100ms is used
in the docs quite a bit.

-Dana

On Wed, Dec 30, 2015 at 10:03 AM, Franco Giacosa <fgiac...@gmail.com> wrote:

> Hi,
>
> I am running kafka 0.9.0 locally.
>
> I am having a particular situation in the following scenario.
>
> (1) 1 Producer inserts 500 records (approx. 300 bytes each) into 1 topic,
> partition 0 (or 1, as you prefer)
> (2) After the producer has finished inserting the 500 records, 1 Consumer
> reads in a loop from this topic with consumer.poll(0)
> and max.partition.fetch.bytes=500. Sometimes that call brings records and
> sometimes the loop has to go over a few times until it brings something.
> Can someone explain to me why it doesn't fetch a record each time that it
> polls? Can a poll operation affect another poll operation?
> Why, if I've inserted 500 records, do I have to poll more than 500 times?
>
> I have tried using poll(0), because in the documentation it says, "if 0,
> returns with any records that are available now".
>
> Thanks
>
