Hi,

We are considering an architecture with a Kafka cluster of 3 nodes and a high number of consumers. We see that with a low number of partitions, e.g. 3, and a higher number of consumers in the same group, e.g. 16, only 3 consumers actually consume data, because only the owner of a partition can consume its messages. To see the owners we run the following:
$ bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group consumer_group
Group           Topic       Pid  Offset  logSize  Lag  Owner
consumer_group  statistics  0    5335    5373     38   consumer_group_balthasar-1449651803301-63a1d620-0
consumer_group  statistics  1    5335    5374     39   consumer_group_balthasar-1449651803820-35a84426-0
consumer_group  statistics  2    5335    5374     39   consumer_group_balthasar-1449651803934-2b3cc1bd-0

One way to support many consumers is to increase the number of partitions to a high value, e.g. 1024. This would put more load on the machines running Kafka, but would that load be excessive? The machines that will run Kafka have 64GB RAM and a Xeon E5-2620 CPU (6 cores clocked at 2GHz, 24 hardware threads in total).

Are there any other reasons not to use such a high number of partitions?

Kind regards,

Balthasar Schopman
Software Developer
LeaseWeb Technologies B.V.
T: +31 20 316 0232
M:
E: b.schop...@tech.leaseweb.com
W: http://www.leaseweb.com
Luttenbergweg 8, 1101 EC Amsterdam, Netherlands
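
P.S. To make the situation concrete, here is a minimal sketch of the behaviour we are seeing. It is a simplified model, not Kafka's actual assignment code: the function name `assign_partitions` and the consumer names are made up for illustration, and the only rule it encodes is that each partition has exactly one owner within a consumer group.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers so that each partition has one owner.

    Simplified model (assumption): contiguous ranges handed out in sorted
    consumer order. Real Kafka assignors may distribute differently.
    """
    owners = sorted(consumers)
    assignment = {}
    # Every consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(num_partitions, len(owners))
    next_partition = 0
    for i, consumer in enumerate(owners):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = list(range(next_partition, next_partition + count))
        next_partition += count
    return assignment


# Hypothetical group of 16 consumers, as in our setup.
consumers = [f"consumer-{i:02d}" for i in range(16)]

few = assign_partitions(3, consumers)      # current topic: 3 partitions
many = assign_partitions(1024, consumers)  # proposed topic: 1024 partitions

# Consumers that own no partition receive no messages.
idle_few = [c for c, parts in few.items() if not parts]
idle_many = [c for c, parts in many.items() if not parts]

print(len(idle_few), len(idle_many))
```

With 3 partitions, 13 of the 16 consumers in this model own nothing; with 1024 partitions every consumer owns 64. It says nothing about the broker-side cost of 1024 partitions, which is what we are asking about.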