The number of files the OS has to manage, I suppose. Why wouldn't you use consistent hashing with deliberately engineered collisions to generate a limited number of topics/partitions and filter at the consumer level?
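A minimal sketch of that bucketing idea, in plain Java with illustrative names (this is not a Kafka API): hash each session id into a small fixed set of buckets, map one bucket to one topic/partition, and let each consumer filter for the session ids it actually cares about.

    // Hash session ids into a bounded set of buckets ("engineered
    // collisions"), so millions of sessions share a fixed number of
    // topics/partitions. Illustrative sketch, not a Kafka API.
    public final class SessionBucketer {

        private final int numBuckets;

        public SessionBucketer(int numBuckets) {
            this.numBuckets = numBuckets;
        }

        // Maps a session id to a stable bucket in [0, numBuckets).
        public int bucketFor(String sessionId) {
            // Mask the sign bit rather than using Math.abs(), which
            // returns a negative value for Integer.MIN_VALUE.
            return (sessionId.hashCode() & 0x7fffffff) % numBuckets;
        }

        public static void main(String[] args) {
            SessionBucketer bucketer = new SessionBucketer(32);
            // The same session id always lands in the same bucket, so the
            // consumer of that bucket sees every event for that session
            // and simply drops the sessions it doesn't care about.
            System.out.println(bucketer.bucketFor("session-12345"));
            System.out.println(bucketer.bucketFor("session-12345"));
        }
    }

The trade-off is consumer-side work: every consumer reads all events in its bucket and discards the ones it doesn't need.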
Regards,
Milind

On May 23, 2013 4:22 PM, "Timothy Chen" <tnac...@gmail.com> wrote:

> Hi Neha,
>
> Not sure if this sounds crazy, but if we'd like the events for the same
> session id to go to the same partition, one way would be for each session
> key to create its own topic with a single partition; there could
> therefore be millions of topics, each with a single partition.
>
> I wonder what the bottleneck of doing this would be?
>
> Thanks,
>
> Tim
>
> On Wed, May 22, 2013 at 4:32 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > Not automatically as of today. You have to run the reassign-partitions
> > tool and explicitly move selected partitions to the new brokers. If you
> > use this tool, you can move partitions to the new broker without any
> > downtime.
> >
> > Thanks,
> > Neha
> >
> > On Wed, May 22, 2013 at 2:20 PM, Timothy Chen <tnac...@gmail.com> wrote:
> >
> > > Hi Neha/Chris,
> > >
> > > Thanks for the reply. If I set a fixed number of partitions and just
> > > add brokers to the broker pool, does it rebalance the load to the new
> > > brokers (along with the data)?
> > >
> > > Tim
> > >
> > > On Wed, May 22, 2013 at 1:15 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > > > - I see that Kafka server.properties allows one to specify the
> > > > number of partitions it supports. However, when we want to scale, I
> > > > wonder if we add # of partitions or # of brokers, will the same
> > > > partitioner start distributing the messages to different partitions?
> > > > And if it does, how can that same consumer continue to read off the
> > > > messages of those ids if it was interrupted in the middle?
> > > >
> > > > The num.partitions config in server.properties is used only for
> > > > topics that are auto created (controlled by
> > > > auto.create.topics.enable). For topics that you create using the
> > > > admin tool, you can specify the number of partitions that you want.
> > > > After that, there is currently no way to change it. For that reason,
> > > > it is a good idea to over-partition your topic, which also helps
> > > > load-balance partitions onto the brokers. You are right that if you
> > > > change the number of partitions later, messages that previously
> > > > stuck to a certain partition would get routed to a different
> > > > partition, which is undesirable for applications that want to use
> > > > sticky partitioning.
> > > >
> > > > - I'd like to create a consumer per partition, and for each one to
> > > > subscribe to the changes of that one. How can this be done in Kafka?
> > > >
> > > > For your use case, it seems like SimpleConsumer might be a better
> > > > fit. However, it will require you to write code to handle discovery
> > > > of the leader for the partition your consumer is consuming. Chris
> > > > has written up a great example that you can follow -
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
> > > >
> > > > Thanks,
> > > > Neha
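The leader discovery Neha mentions is the main extra work SimpleConsumer imposes. A condensed sketch, closely following the 0.8 SimpleConsumer wiki example she links to (the broker host, port, and topic name are placeholders):

    import java.util.Collections;

    import kafka.javaapi.PartitionMetadata;
    import kafka.javaapi.TopicMetadata;
    import kafka.javaapi.TopicMetadataRequest;
    import kafka.javaapi.TopicMetadataResponse;
    import kafka.javaapi.consumer.SimpleConsumer;

    public class LeaderLookup {
        public static void main(String[] args) {
            // Any broker can answer a metadata request; the response names
            // the current leader for every partition of the topic.
            SimpleConsumer consumer = new SimpleConsumer("broker1.example.com",
                    9092, 100000, 64 * 1024, "leaderLookup");
            try {
                TopicMetadataRequest request =
                        new TopicMetadataRequest(Collections.singletonList("sessions"));
                TopicMetadataResponse response = consumer.send(request);
                for (TopicMetadata topic : response.topicsMetadata()) {
                    for (PartitionMetadata partition : topic.partitionsMetadata()) {
                        if (partition.leader() != null) {
                            System.out.printf("partition %d is led by %s:%d%n",
                                    partition.partitionId(),
                                    partition.leader().host(),
                                    partition.leader().port());
                        }
                    }
                }
            } finally {
                consumer.close();
            }
        }
    }

Your consumer then fetches from the reported leader and repeats the lookup whenever the leader moves.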
> > > > On Wed, May 22, 2013 at 12:37 PM, Chris Curtin <curtin.ch...@gmail.com> wrote:
> > > >
> > > > > Hi Tim,
> > > > >
> > > > > On Wed, May 22, 2013 at 3:25 PM, Timothy Chen <tnac...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm currently trying to understand how Kafka (0.8) can scale
> > > > > > with our usage pattern and how to set up the partitioning.
> > > > > >
> > > > > > We want to route messages belonging to the same id to the same
> > > > > > queue, so its consumer will be able to consume all the messages
> > > > > > of that id.
> > > > > >
> > > > > > My questions:
> > > > > >
> > > > > > - From my understanding, in Kafka we would need a custom
> > > > > > partitioner that routes messages with the same id to the same
> > > > > > partition, right? I'm trying to find examples of writing this
> > > > > > partitioner logic, but I can't find any. Can someone point me to
> > > > > > an example?
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
> > > > >
> > > > > The partitioner here does a simple mod on the IP address and the
> > > > > # of partitions. You'd need to define your own logic, but this is
> > > > > a start.
> > > > >
> > > > > > - I see that Kafka server.properties allows one to specify the
> > > > > > number of partitions it supports. However, when we want to
> > > > > > scale, I wonder if we add # of partitions or # of brokers, will
> > > > > > the same partitioner start distributing the messages to
> > > > > > different partitions? And if it does, how can that same consumer
> > > > > > continue to read off the messages of those ids if it was
> > > > > > interrupted in the middle?
> > > > >
> > > > > I'll let someone else answer this.
> > > > >
> > > > > > - I'd like to create a consumer per partition, and for each one
> > > > > > to subscribe to the changes of that one. How can this be done in
> > > > > > Kafka?
> > > > >
> > > > > Two ways: Simple Consumer or Consumer Groups. Which to use depends
> > > > > on the level of control you want over processing a specific
> > > > > partition vs. getting one assigned to you, and over offset
> > > > > management:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
> > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Tim
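For Tim's first question, here is a minimal session-id partitioner in the spirit of the SimplePartitioner from the 0.8 Producer Example Chris links to. The class name is illustrative, and note that some 0.8 builds declare the interface generically as Partitioner<T>:

    import kafka.producer.Partitioner;
    import kafka.utils.VerifiableProperties;

    public class SessionIdPartitioner implements Partitioner {

        // Kafka instantiates the partitioner reflectively and hands it the
        // producer properties, so this constructor signature is required.
        public SessionIdPartitioner(VerifiableProperties props) {
        }

        @Override
        public int partition(Object key, int numPartitions) {
            // Same key, same partition: mask the sign bit so the result
            // always falls in [0, numPartitions).
            return (key.hashCode() & 0x7fffffff) % numPartitions;
        }
    }

You wire it in via partitioner.class in the producer properties and send each event as a KeyedMessage keyed by its session id; as long as the partition count doesn't change, every event for a session lands on the same partition, which is exactly the sticky-partitioning caveat Neha raises above.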