Hi, I am using Kafka 0.9.0.1 and have configured the consumer to fetch at most 8192 bytes per partition by setting max.partition.fetch.bytes.
Here are the properties I am using:

```java
props.put("bootstrap.servers", servers);
props.put("group.id", "perf-test");
props.put("offset.storage", "kafka");
props.put("enable.auto.commit", "false");
props.put("session.timeout.ms", 60000);
props.put("request.timeout.ms", 70000);
props.put("heartbeat.interval.ms", 50000);
props.put("auto.offset.reset", "latest");
props.put("max.partition.fetch.bytes", "8192");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
```

I am setting up 12 consumers with 4 workers each to listen on a topic with 200 partitions, and compression is enabled when sending to Kafka.

The problem is that even though the fetch size is small, each poll returns far too many records. If the topic has a large backlog and the consumer is behind, poll() returns a much bigger batch; if the consumer is caught up, it fetches around 45 records. Either way, if I set max.partition.fetch.bytes, shouldn't the fetch size have an upper limit? Is there another setting I am missing here?

I control the message size myself, so it is not that occasional larger messages are coming through; each message is only around 200-300 bytes.

Because of the large number of messages returned per poll, the processing sometimes cannot finish within the heartbeat interval, which makes consumer rebalancing kick in again and again. This only happens when the consumer is far behind on offsets, e.g. when there are 100,000 messages waiting in the topic.

Thanks
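For reference, here is roughly how each of the 12 consumer instances is structured (a minimal sketch, not my exact code; the bootstrap servers, the topic name perf-topic, and the process() body are placeholders, and the property list is the one shown above):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PerfTestConsumer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder
        props.put("group.id", "perf-test");
        props.put("enable.auto.commit", "false");
        props.put("session.timeout.ms", 60000);
        props.put("request.timeout.ms", 70000);
        props.put("heartbeat.interval.ms", 50000);
        props.put("auto.offset.reset", "latest");
        props.put("max.partition.fetch.bytes", "8192");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList("perf-topic")); // illustrative topic name

        ExecutorService workers = Executors.newFixedThreadPool(4); // 4 workers per consumer

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            // This count is what surprises me: it can be in the thousands when the
            // consumer is far behind, despite max.partition.fetch.bytes=8192.
            System.out.println("polled " + records.count() + " records");

            List<Callable<Void>> tasks = new ArrayList<>();
            for (final ConsumerRecord<String, String> record : records) {
                tasks.add(new Callable<Void>() {
                    public Void call() {
                        process(record); // application-specific work on a 200-300 byte message
                        return null;
                    }
                });
            }
            // Wait for all workers to finish the batch before committing; with a
            // very large batch this is the step that overruns the heartbeat
            // interval and triggers the rebalance.
            workers.invokeAll(tasks);
            consumer.commitSync();
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // placeholder for the real processing logic
    }
}
```

The commit only happens after all 4 workers have drained the batch, so when poll() returns a huge batch the loop stays away from the broker for longer than the session timeout, which is when the repeated rebalances start in my setup.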