I've been investigating consumer group rebalances that happen when I don't think they should, and I've noticed an issue. In a nutshell, if a consumer receives messages in response to every fetch request, it never runs delayed tasks, most notably heartbeats and automatic commits. With no heartbeats being sent, the session eventually times out, which in turn causes a rebalance.
Note that in the situation I'm describing the consumer is polling regularly, well within the session timeout, so a rebalance is not expected.
In KafkaConsumer::pollOnce there is a check for already-fetched records, and if any are found it skips running client.poll. Then, up in KafkaConsumer::poll, if records are returned it initiates the next fetches and does a quick poll, which won't run delayed tasks but will receive fetched records. So if fetch responses arrive during every quick poll, the consumer gets into a state where it never calls client.poll and never runs delayed tasks (until it stops receiving records in response to its fetches).
I can provide detailed reproduction steps if need be. The key parameters are that there must be at least 2 brokers involved, and the max fetch size should be reduced to limit the size of the fetch batches.
If anyone can verify what I'm seeing I'll file a bug, and if anyone has any ideas on how to prevent this from happening I would appreciate them.
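To make the control flow described above easier to follow, here is a minimal sketch of it as a toy model in plain Java. It does not use any Kafka classes and is not the actual consumer code. The class ToyConsumer, the methods clientPoll and quickPoll, and the brokerAlwaysHasData flag are hypothetical stand-ins, and the model assumes a fetch response is always waiting whenever the topic is busy.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StarvedDelayedTasks {

    /** Toy stand-in for the consumer; hypothetical, not real Kafka code. */
    static class ToyConsumer {
        private final Deque<String> fetched = new ArrayDeque<>();
        private final boolean brokerAlwaysHasData; // assumption: busy topic
        int delayedTaskRuns = 0; // simulated heartbeat / auto-commit runs

        ToyConsumer(boolean brokerAlwaysHasData) {
            this.brokerAlwaysHasData = brokerAlwaysHasData;
        }

        // Simulates a fetch response arriving from a broker.
        private void receiveFetchResponse() {
            if (brokerAlwaysHasData) {
                fetched.add("record");
            }
        }

        // Stand-in for client.poll: the only place delayed tasks run.
        private void clientPoll() {
            delayedTaskRuns++;
            receiveFetchResponse();
        }

        // Stand-in for the quick poll: receives responses, skips delayed tasks.
        private void quickPoll() {
            receiveFetchResponse();
        }

        // Stand-in for pollOnce: skips clientPoll when records are already there.
        private String pollOnce() {
            if (!fetched.isEmpty()) {
                return fetched.poll();
            }
            clientPoll();
            return fetched.poll(); // null when nothing was fetched
        }

        // Stand-in for poll: after returning records, only a quick poll happens.
        String poll() {
            String record = pollOnce();
            if (record != null) {
                quickPoll();
            }
            return record;
        }
    }

    /** Runs the given number of polls and reports how often delayed tasks ran. */
    static int simulate(boolean brokerAlwaysHasData, int polls) {
        ToyConsumer consumer = new ToyConsumer(brokerAlwaysHasData);
        for (int i = 0; i < polls; i++) {
            consumer.poll();
        }
        return consumer.delayedTaskRuns;
    }

    public static void main(String[] args) {
        System.out.println("busy topic: " + simulate(true, 1000)); // prints "busy topic: 1"
        System.out.println("idle topic: " + simulate(false, 1000)); // prints "idle topic: 1000"
    }
}
```

In this model the busy consumer runs its delayed tasks once, on the very first poll, and never again, while the idle consumer runs them on every poll. That matches the behaviour described above only to the extent that the model's assumptions hold for the real client.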