Hi,

We are considering an architecture with a 3-node Kafka cluster and a large 
number of consumers. With a low number of partitions, e.g. 3, and a higher 
number of consumers in the same group, e.g. 16, only 3 consumers actually 
consume data, because each partition is owned by a single consumer in the 
group and only the owners of partitions receive messages. To see the owners 
we do the following:

$ bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group consumer_group
Group          Topic      Pid Offset logSize Lag Owner
consumer_group statistics 0   5335   5373    38  consumer_group_balthasar-1449651803301-63a1d620-0
consumer_group statistics 1   5335   5374    39  consumer_group_balthasar-1449651803820-35a84426-0
consumer_group statistics 2   5335   5374    39  consumer_group_balthasar-1449651803934-2b3cc1bd-0
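
To illustrate the ownership behaviour, here is a minimal sketch in Python. It 
is not Kafka code: it assumes a range-style assignment in which sorted 
consumers each receive a contiguous block of partitions, and the function 
name assign_range and the consumer ids are made up for this example.

```python
def assign_range(num_partitions, consumers):
    """Split partitions 0..num_partitions-1 into contiguous blocks, one per consumer.

    Simplified model of a range-style assignment (an assumption, not Kafka's code).
    """
    consumers = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + count))
        start += count
    return assignment


# Hypothetical consumer ids, 16 consumers in one group.
consumers = [f"consumer-{i:02d}" for i in range(16)]

few = assign_range(3, consumers)
active = [c for c, parts in few.items() if parts]
print(len(active), "of", len(consumers), "consumers own a partition")

many = assign_range(1024, consumers)
print(sorted({len(parts) for parts in many.values()}), "partitions per consumer")
```

Under this model, 3 partitions leave 13 of the 16 consumers idle, while 1024 
partitions give every consumer 64 partitions.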

One way to support many active consumers is to increase the number of 
partitions to a high value, e.g. 1024. This would put more load on the 
machines running Kafka, but would that load be excessive? The machines that'll be 
running Kafka have 64GB RAM and a Xeon E5-2620 CPU (6 cores clocked at 2GHz, 24 
hardware threads in total).
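
As a rough way to size that load, the sketch below counts how many partition 
replicas each broker would host, assuming replicas are spread evenly. The 
replication factor is not decided above, so the values 1 to 3 are assumptions, 
and the function name replicas_per_broker is made up for this example.

```python
import math


def replicas_per_broker(partitions, replication_factor, brokers):
    """Upper bound on partition replicas per broker under an even spread."""
    total_replicas = partitions * replication_factor
    return math.ceil(total_replicas / brokers)


# 1024 partitions on the 3-node cluster; replication factors are assumed values.
for rf in (1, 2, 3):
    print("replication factor", rf, "->", replicas_per_broker(1024, rf, 3), "replicas per broker")
```

Each replica is a separate log directory with its own segment and index files 
on the broker, so this count is a first indication of the number of open file 
handles and the amount of per-partition bookkeeping to expect.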

Are there any other reasons not to use such a high number of partitions?

Kind regards,
Balthasar Schopman

Software Developer
LeaseWeb Technologies B.V.

T: +31 20 316 0232
E: b.schop...@tech.leaseweb.com
W: http://www.leaseweb.com

Luttenbergweg 8, 1101 EC Amsterdam, Netherlands

