Hi Scott,

What version of Kafka is this?

In general, our throughput will scale linearly with the number of machines, or more specifically with the number of disks. Our bottleneck will really be the number of partitions. With thousands of partitions, leader election can get slower (seconds), and if you have consumers that consume all partitions, the rebalancing in these consumers can get slow (minutes). We hope to fix these issues, but that is the current state up through 0.8.

-Jay

On Fri, Aug 2, 2013 at 2:27 PM, Scott Arthur <sart...@salesforce.com> wrote:

> Hi,
>
> I have a question about scaling the broker count of a Kafka cluster. We
> have a scenario where we'll have two clusters replicating data into a
> third. We're wondering how we should size that third cluster so that it
> can handle the volume of messages from the two source clusters. Should we
> just make the number of brokers match? e.g. five brokers in the two source
> clusters, therefore 10 in the destination cluster. In general, what is the
> horizontal scaling model we should use? Also, is there an upper limit to
> the number of brokers you should put in a cluster, after which you get
> diminishing returns on throughput?
>
> Thanks,
> Scott Arthur
>
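
If throughput really does scale roughly linearly with disk count, as the reply above says, one way to size the destination cluster is from aggregate write rate rather than by matching broker counts. The sketch below is only a back-of-the-envelope illustration in Python: the function name `brokers_needed`, the per-disk throughput figure, the disks-per-broker count, and the `headroom` factor are all hypothetical assumptions, not measured Kafka numbers, and it ignores replication traffic and the partition-count limits mentioned above.

```python
import math


def brokers_needed(source_rates_mb_s, per_disk_mb_s, disks_per_broker, headroom=0.5):
    """Rough broker count for a destination cluster.

    Assumes (hypothetically) that cluster throughput is linear in the
    number of disks. `headroom` is the fraction of raw disk throughput
    we are willing to plan against (0 < headroom <= 1).
    """
    if per_disk_mb_s <= 0 or disks_per_broker <= 0:
        raise ValueError("per_disk_mb_s and disks_per_broker must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")

    # Total inbound write rate from all source clusters, in MB/s.
    total_mb_s = sum(source_rates_mb_s)

    # Planned capacity of one broker under the linear-in-disks assumption.
    usable_per_broker = per_disk_mb_s * disks_per_broker * headroom

    return math.ceil(total_mb_s / usable_per_broker)


if __name__ == "__main__":
    # Two source clusters, each producing a made-up 100 MB/s; brokers with
    # two disks at a made-up 50 MB/s each, planned at 50% utilization.
    print(brokers_needed([100, 100], per_disk_mb_s=50, disks_per_broker=2))
```

With these illustrative inputs the estimate depends only on the combined message volume and the disks available per broker, so the destination cluster need not have exactly as many brokers as the two sources combined.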