Hi,

I am using kafka-0.9.0.1 and have configured the Kafka consumer to fetch at
most 8192 bytes by setting max.partition.fetch.bytes.

Here are the properties I am using:

props.put("bootstrap.servers", servers);
props.put("group.id", "perf-test");
props.put("offset.storage", "kafka");
props.put("enable.auto.commit", "false");
props.put("session.timeout.ms", 60000);
props.put("request.timeout.ms", 70000);
props.put("heartbeat.interval.ms", 50000);
props.put("auto.offset.reset", "latest");
props.put("max.partition.fetch.bytes", "8192");
props.put("key.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");

I am setting up 12 consumers with 4 workers each to listen on a topic with
200 partitions. I have also enabled compression when sending to Kafka.

The problem I am seeing is that, even though the fetch size is small, each
poll returns too many records. If the topic has a lot of messages and the
consumer is behind, it fetches a much larger batch; if the consumer is not
behind, it fetches around 45 records. Either way, if I set
max.partition.fetch.bytes, shouldn't the fetch size have an upper limit? Is
there any other setting I am missing here? I am controlling the message size
myself, so it's not that some bigger messages are coming through; each
message should be only around 200-300 bytes.
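
This is roughly how I am observing the per-poll counts (the poll timeout
below is just an example value):

import org.apache.kafka.clients.consumer.ConsumerRecords;

ConsumerRecords<String, String> records = consumer.poll(1000);
System.out.println("polled " + records.count() + " records"); // far more than I expect from 8192 bytes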

Because of the large number of messages each poll returns, the processing
sometimes cannot finish within the heartbeat interval, which makes consumer
rebalancing kick in again and again. This only happens when the consumer is
far behind on offsets, e.g. when there are 100,000 messages waiting to be
processed in the topic.
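
To make the flow concrete, each consumer's loop looks roughly like the
sketch below (simplified; process() stands in for my actual handler). The
next poll() only happens after every record from the previous poll has been
processed, so a large batch delays the heartbeat:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

ExecutorService workers = Executors.newFixedThreadPool(4); // 4 workers per consumer
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(1000);
    List<Future<?>> futures = new ArrayList<>();
    for (ConsumerRecord<String, String> record : records) {
        futures.add(workers.submit(() -> process(record))); // process() = my handler
    }
    for (Future<?> f : futures) {
        try {
            f.get(); // wait for every record from this poll before polling again
        } catch (Exception e) {
            // error handling omitted in this sketch
        }
    }
    consumer.commitSync(); // manual commit since enable.auto.commit=false
}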

Thanks
