Hi,

This will be with Kafka 0.8.  That is good guidance, thank you.  To
summarize: we can scale the number of hosts/disks as high as we want,
but we should keep an eye on the total number of partitions being
handled.  We've currently configured a default of 4 partitions per
topic, so we'll watch closely once we pass 250 topics, i.e. 1,000
total partitions.  That should give us plenty to work with.  Thanks!
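
For reference, that default is presumably just the broker-side
num.partitions setting, so the relevant server.properties excerpt
would look something like this (comments are mine):

    # Default partition count used when a topic is created without an
    # explicit partition count (e.g. auto-created topics).
    num.partitions=4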

Scott Arthur


On Fri, Aug 2, 2013 at 10:49 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hi Scott,
>
> What version of Kafka is this?
>
> In general, throughput will scale linearly with the number of machines,
> or more specifically with the number of disks. The real bottleneck is
> the number of partitions. With thousands of partitions, leader election
> can get slower (seconds), and if you have consumers that consume all
> partitions, the rebalancing in those consumers can get slow (minutes).
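>
> If it helps to keep an eye on the total, you can sum partition counts
> with the topic tool. A rough sketch (the script name varies across 0.8
> builds: newer ones ship kafka-topics.sh, older ones kafka-list-topic.sh;
> the ZooKeeper address here is a placeholder):
>
>     # --describe emits one "Partition: N" line per partition across
>     # all topics; counting those lines gives the cluster-wide total
>     bin/kafka-topics.sh --describe --zookeeper localhost:2181 \
>       | grep -c "Partition: "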
>
> We hope to fix these issues but that is the current state up through 0.8.
>
> -Jay
>
>
> On Fri, Aug 2, 2013 at 2:27 PM, Scott Arthur <sart...@salesforce.com>
> wrote:
>
> > Hi,
> >
> > I have a question about scaling the broker count of a Kafka cluster.  We
> > have a scenario where we'll have two clusters replicating data into a
> > third.  We're wondering how we should size that third cluster so that it
> > can handle the volume of messages from the two source clusters.  Should
> > we just match the broker counts, e.g. five brokers in each of the two
> > source clusters, therefore ten in the destination cluster?  In general,
> > what horizontal scaling model should we use?  Also, is there an upper
> > limit to the number of brokers you should put in a cluster, after which
> > you get diminishing returns on throughput?
> >
> > Thanks,
> > Scott Arthur
> >
>
