Going to stand with Jay here :)

I just posted an email yesterday about how we size clusters and topics.
Basically, have at least as many partitions as you have consumers in your
consumer group (preferably a multiple). If you want to balance it across
the cluster, also have it be a multiple of the number of brokers you have.
We tend to ignore the second one on most clusters, but we will expand a
topic (as long as it is not keyed) if the retention on disk exceeds 50 GB.
That's just a guideline we have so it's easier to balance the traffic and
move partitions around when needed.
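
A minimal sketch of the sizing guidelines above, in Python. The function name, its parameters, and the reading of the 50 GB figure as a per-partition size target are assumptions made for illustration; none of them are stated in this thread, so treat it as a starting point rather than a definitive rule.

```python
import math


def suggest_partition_count(consumers, brokers=None, retained_gb=0.0,
                            max_gb_per_partition=50.0):
    """Suggest a partition count for a topic (hypothetical helper).

    consumers: number of consumers in the consumer group.
    brokers: optional broker count; if given, the result is also a
        multiple of it so partitions can spread evenly over the cluster.
    retained_gb: total data retained on disk for the topic.
    max_gb_per_partition: ASSUMPTION: the 50 GB guideline is treated
        here as a per-partition size target.
    """
    if consumers < 1:
        raise ValueError("consumers must be at least 1")
    if brokers is not None and brokers < 1:
        raise ValueError("brokers must be at least 1")

    # Start with at least one partition per consumer.
    count = consumers

    # Grow the count so that no partition exceeds the size target.
    if retained_gb > 0:
        count = max(count, math.ceil(retained_gb / max_gb_per_partition))

    # Round up to a multiple of the consumer count, and of the broker
    # count as well when one is supplied (least common multiple).
    step = consumers
    if brokers is not None:
        step = consumers * brokers // math.gcd(consumers, brokers)
    return math.ceil(count / step) * step


if __name__ == "__main__":
    print(suggest_partition_count(4))
    print(suggest_partition_count(4, brokers=10))
    print(suggest_partition_count(4, retained_gb=500))
```

Keep in mind that adding partitions to a keyed topic changes which partition a given key maps to, which is why the expansion guideline applies only to topics that are not keyed.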

-Todd


On Tue, Apr 7, 2015 at 10:28 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> I think the blog post was giving that as an upper bound, not a recommended
> size. I think that blog goes through some of the trade-offs of having more
> or fewer partitions.
>
> -Jay
>
> On Tue, Apr 7, 2015 at 10:13 AM, François Méthot <fmetho...@gmail.com>
> wrote:
>
> > Hi,
> >
> >   We initially had configured our topics to have between 8 and 16
> > partitions each on a cluster of 10 brokers (VMs with 2 cores, 16 MB RAM,
> > a few TB of SAN disk).
> >
> > Then I came across the rule-of-thumb formula *100 x b x r*
> > (http://blog.confluent.io/2015/03/12/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/).
> >
> > 100 x 10 brokers x 2 Replication = 2000 partitions.
> >
> > We gave it a try, but our single-threaded Kafka producer's performance
> > dropped by 80%.
> >
> > What are the benefits of having that many partitions?
> >
> > Is there any problem in the long run with using a topic with as few as 16
> > partitions?
> >
> >
> > Francois
> >
>
