Going to stand with Jay here :) I just posted an email yesterday about how we size clusters and topics. Basically, have at least as many partitions as you have consumers in your consumer group (preferably a multiple of that number). If you want the topic balanced across the cluster, also make the partition count a multiple of the number of brokers you have. We tend to ignore that second rule on most clusters, but we will expand a topic (as long as it is not keyed) if its retention on disk exceeds 50 GB. That's just a guideline we use so that it's easier to balance the traffic and move partitions around when needed.
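To make that concrete, here is a rough sketch of the guideline in Python. The function names and parameters are made up for illustration and are not part of Kafka or any of its tools. The 50 GB limit is just the guideline figure from above, and the sketch assumes you pass in whatever on-disk retention measurement you apply that limit to.

```python
from math import gcd


def suggest_partitions(consumers, brokers=None, multiple=1):
    """Hypothetical helper: pick a partition count for a topic.

    Starts from a multiple of the consumer count, so there are at least
    as many partitions as consumers. If `brokers` is given, the result
    is rounded up to the least common multiple with the broker count.
    """
    if consumers < 1 or multiple < 1:
        raise ValueError("consumers and multiple must be >= 1")
    count = consumers * multiple
    if brokers:
        # least common multiple of count and brokers
        count = count * brokers // gcd(count, brokers)
    return count


def should_expand(retained_bytes, keyed, limit_bytes=50 * 10**9):
    """Hypothetical check for the 50 GB guideline.

    Keyed topics are never expanded here, because adding partitions
    changes which partition a given key maps to.
    """
    return (not keyed) and retained_bytes > limit_bytes


if __name__ == "__main__":
    print(suggest_partitions(6))              # consumers only
    print(suggest_partitions(6, brokers=10))  # also a multiple of 10 brokers
    print(should_expand(60 * 10**9, keyed=False))
```

With 6 consumers and 10 brokers this gives 30 partitions, which is nowhere near the 2000 you get from treating 100 x b x r as a target.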
-Todd

On Tue, Apr 7, 2015 at 10:28 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> I think the blog post was giving that as an upper bound not a recommended
> size. I think that blog goes through some of the trade offs of having more
> or fewer partitions.
>
> -Jay
>
> On Tue, Apr 7, 2015 at 10:13 AM, François Méthot <fmetho...@gmail.com>
> wrote:
>
> > Hi,
> >
> > We initially had configured our topics to have between 8 to 16 partitions
> > each on a cluster of 10 brokers (vm with 2 cores, 16 MB ram, Few TB of SAN
> > Disk).
> >
> > Then I came across the rule of thump formula *100 x b x r.*
> > (
> > http://blog.confluent.io/2015/03/12/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
> > )
> >
> > 100 x 10 brokers x 2 Replication = 2000 partitions.
> >
> > We gave it try and but our single threaded kafka producer performance
> > dropped by 80%.
> >
> > What is the benefits of having that much partitions?
> >
> > Is there any problem in the long run with using a topic with as few as 16
> > partitions?
> >
> >
> > Francois
> >