This may be of some help to you --> http://grokbase.com/t/kafka/users/13a6xxp29n/managing-millions-of-paritions-in-kafka
Kiran

On Tue, Apr 8, 2014 at 12:29 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) <skada...@bloomberg.net> wrote:

> Ah, thanks. The intent of my question though was to better understand how
> a large number of partitions affects Kafka itself.
>
> ----- Original Message -----
> From: balaji.sesha...@dish.com
> To: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN), users@kafka.apache.org
> At: Apr 8 2014 15:26:49
>
> We have 131 partitions and run 6 tomcat instances each spawning 5 threads.
>
> Depending on the number of partitions you have, you have to parallelize
> your consumers horizontally to scale.
>
> Maybe start with 10-20 consumer instances with 4-5 threads each;
> processing more than one partition per thread might help.
>
> 20 instances * 10 threads = 200
>
> If you have 1000 partitions, then the distribution would be 5 partitions
> consumed by each thread.
>
> This is just a rough estimate based on my understanding.
>
> -----Original Message-----
> From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) [mailto:skada...@bloomberg.net]
> Sent: Tuesday, April 08, 2014 1:00 PM
> To: Seshadri, Balaji; users@kafka.apache.org
> Subject: RE: Single thread, Multiple partitions
>
> Ah, thanks, figured it out now.
>
> What kind of bottlenecks should I expect to run into if I'm looking at 10s
> of 1000s of partitions for a topic? The amount of data passing through each
> partition or in aggregate is somewhat small (few 100 GB per day across all
> partitions). The high partition count is because it simplifies application
> semantics.
>
> ----- Original Message -----
> From: balaji.sesha...@dish.com
> To: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN), users@kafka.apache.org
> At: Apr 8 2014 14:08:41
>
> I think you are looking to access messages from a set of partitions by
> your own policy. You should use simple consumers in 0.8 and maintain the
> offsets you have read.
>
> https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
>
> If it is 0.9, I'm yet to come up to speed.
>
> Thanks,
>
> Balaji
>
> -----Original Message-----
> From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) [mailto:skada...@bloomberg.net]
> Sent: Tuesday, April 08, 2014 11:58 AM
> To: users@kafka.apache.org
> Subject: Single thread, Multiple partitions
>
> Let's say I've a single consumer thread reading off multiple partitions
> (I'll have around 10K partitions). As per the documentation on
> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example,
> there are no guarantees on the order in which messages are read off the set
> of partitions. If I wanted to enforce priority-weighted round robin reads
> off the partitions, could I get a pointer on what code to fiddle with?
> Thanks!
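
For the priority-weighted round robin reads asked about in the original message, here is a minimal sketch of the scheduling policy alone, in Python. It is not Kafka API code and not something the thread's authors proposed. The function name, the partition ids and the weights are all made up for illustration. It uses the "smooth" weighted round-robin scheme, which is one possible policy among several. The actual fetch from each chosen partition would still go through a simple consumer that tracks its own offsets, as Balaji suggests above.

```python
from itertools import islice
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[int, int]) -> Iterator[int]:
    """Yield partition ids forever, each in proportion to its weight.

    Smooth weighted round robin: on every step each partition's running
    score grows by its weight, the partition with the highest score is
    picked, and the picked partition's score drops by the total weight.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty map of positive integers")
    total = sum(weights.values())
    current = {partition: 0 for partition in weights}
    while True:
        for partition, weight in weights.items():
            current[partition] += weight
        # Ties go to the partition listed first in `weights`.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


# Hypothetical priorities: partition 0 is read three times as often
# as partition 1 or partition 2.
priorities = {0: 3, 1: 1, 2: 1}
order = list(islice(weighted_round_robin(priorities), 10))
print(order)
```

With these example weights the read order is 0, 1, 0, 2, 0 and then repeats, so the high-priority partition is interleaved with the others instead of being read three times in a row. With around 10K partitions, the linear scan on each step could be replaced by a heap if it ever shows up in profiles.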