The number of files the OS has to manage, I suppose.

Why wouldn't you use consistent hashing with deliberately engineered
collisions to generate a limited number of topics / partitions and filter
at the consumer level?
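
As a rough illustration of what I mean, here is a minimal sketch (the bucket
count, class name and key format are made up for the example, not anything
Kafka prescribes): hash every session id into a small fixed set of buckets so
that many sessions deliberately collide, use the bucket as the partition key,
and have each consumer filter for the session ids it actually cares about.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Collapses millions of session ids onto a small, fixed number of buckets.
// The bucket becomes the partitioning key, so all events for one session
// land in the same partition; consumers then filter by session id.
public class SessionBucketer {
    private final int numBuckets;

    public SessionBucketer(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    public int bucketFor(String sessionId) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(sessionId.getBytes(StandardCharsets.UTF_8));
            // First four digest bytes give a stable, well-spread hash.
            int hash = ((digest[0] & 0xFF) << 24) | ((digest[1] & 0xFF) << 16)
                     | ((digest[2] & 0xFF) << 8)  |  (digest[3] & 0xFF);
            return Math.abs(hash % numBuckets);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SessionBucketer bucketer = new SessionBucketer(256);
        System.out.println(bucketer.bucketFor("session-12345")); // same bucket every run
    }
}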

Regards
Milind
On May 23, 2013 4:22 PM, "Timothy Chen" <tnac...@gmail.com> wrote:

> Hi Neha,
>
> Not sure if this sounds crazy, but if we'd like to have the events for the
> same session id go to the same partition, one way could be for each session
> key to create its own topic with a single partition, so there could be
> millions of single-partition topics.
>
> I wonder what would be the bottleneck of doing this?
>
> Thanks,
>
> Tim
>
>
> On Wed, May 22, 2013 at 4:32 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > Not automatically as of today. You have to run the reassign-partitions tool
> > and explicitly move selected partitions to the new brokers. If you use this
> > tool, you can move partitions to the new broker without any downtime.
> >
> > Thanks,
> > Neha
> >
> >
> > On Wed, May 22, 2013 at 2:20 PM, Timothy Chen <tnac...@gmail.com> wrote:
> >
> > > Hi Neha/Chris,
> > >
> > > Thanks for the reply. So if I set a fixed number of partitions and just
> > > add brokers to the broker pool, does it rebalance the load to the new
> > > brokers (along with the data)?
> > >
> > > Tim
> > >
> > >
> > > On Wed, May 22, 2013 at 1:15 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > > > - I see that Kafka server.properties allows one to specify the number
> > > > of partitions it supports. However, when we want to scale I wonder if
> > > > we add # of partitions or # of brokers, will the same partitioner
> > > > start distributing the messages to different partitions?
> > > > And if it does, how can that same consumer continue to read off the
> > > > messages of those ids if it was interrupted in the middle?
> > > >
> > > > The num.partitions config in server.properties is used only for
> > > > topics that are auto created (controlled by auto.create.topics.enable).
> > > > For topics that you create using the admin tool, you can specify the
> > > > number of partitions that you want. After that, there is currently no
> > > > way to change it. For that reason, it is a good idea to over-partition
> > > > your topic, which also helps load balance partitions onto the brokers.
> > > > You are right that if you change the number of partitions later,
> > > > messages that previously stuck to a certain partition would now get
> > > > routed to a different partition, which is undesirable for applications
> > > > that want to use sticky partitioning.
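> > > >
> > > > As a toy illustration of why that breaks stickiness (the hash-mod
> > > > routing below is just an assumption standing in for whatever
> > > > partitioner is actually configured):
> > > >
> > > > public class RepartitionDemo {
> > > >     public static void main(String[] args) {
> > > >         String key = "session-42";
> > > >         // Partition chosen with 4 partitions vs. with 8 partitions.
> > > >         int before = Math.abs(key.hashCode() % 4);
> > > >         int after  = Math.abs(key.hashCode() % 8);
> > > >         // The two values usually differ, so a key that "stuck" to one
> > > >         // partition starts landing on another after the count changes.
> > > >         System.out.println(before + " -> " + after);
> > > >     }
> > > > }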
> > > >
> > > > - I'd like to create a consumer per partition, and for each one to
> > > > subscribe to the changes of that one. How can this be done in kafka?
> > > >
> > > > For your use case, it seems like SimpleConsumer might be a better
> > > > fit. However, it will require you to write code to handle discovery of
> > > > the leader for the partition that your consumer is consuming. Chris
> > > > has written up a great example that you can follow -
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
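> > > >
> > > > For flavor only, a condensed sketch along the lines of that example
> > > > (broker host/port, topic, partition and client id are placeholders,
> > > > and leader discovery plus error handling are omitted):
> > > >
> > > > import java.nio.ByteBuffer;
> > > > import kafka.api.FetchRequest;
> > > > import kafka.api.FetchRequestBuilder;
> > > > import kafka.javaapi.FetchResponse;
> > > > import kafka.javaapi.consumer.SimpleConsumer;
> > > > import kafka.message.MessageAndOffset;
> > > >
> > > > public class PartitionReader {
> > > >     public static void main(String[] args) throws Exception {
> > > >         // Connect directly to the (assumed) leader for the partition.
> > > >         SimpleConsumer consumer =
> > > >                 new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "sessionReader");
> > > >         FetchRequest req = new FetchRequestBuilder()
> > > >                 .clientId("sessionReader")
> > > >                 .addFetch("events", 0 /* partition */, 0L /* offset */, 100000)
> > > >                 .build();
> > > >         FetchResponse response = consumer.fetch(req);
> > > >         for (MessageAndOffset mao : response.messageSet("events", 0)) {
> > > >             ByteBuffer payload = mao.message().payload();
> > > >             byte[] bytes = new byte[payload.limit()];
> > > >             payload.get(bytes);
> > > >             System.out.println(mao.offset() + ": " + new String(bytes, "UTF-8"));
> > > >         }
> > > >         consumer.close();
> > > >     }
> > > > }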
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > > > On Wed, May 22, 2013 at 12:37 PM, Chris Curtin <curtin.ch...@gmail.com> wrote:
> > > >
> > > > > Hi Tim,
> > > > >
> > > > >
> > > > > On Wed, May 22, 2013 at 3:25 PM, Timothy Chen <tnac...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm currently trying to understand how Kafka (0.8) can scale with
> > > > > > our usage pattern and how to set up the partitioning.
> > > > > >
> > > > > > We want to route messages belonging to the same id to the same
> > > > > > queue, so its consumer will be able to consume all the messages of
> > > > > > that id.
> > > > > >
> > > > > > My questions:
> > > > > >
> > > > > > - From my understanding, in Kafka we would need to have a custom
> > > > > > partitioner that routes messages with the same id to the same
> > > > > > partition, right? I'm trying to find examples of writing this
> > > > > > partitioner logic, but I can't find any. Can someone point me to
> > > > > > an example?
> > > > > >
> > > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
> > > > >
> > > > > The partitioner here does a simple mod on the IP address and the #
> > > > > of partitions. You'd need to define your own logic, but this is a
> > > > > start.
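> > > > >
> > > > > A minimal sketch of that idea keyed on a session id (class and
> > > > > key names here are invented, and the exact Partitioner signature
> > > > > shifted slightly across the 0.8 releases):
> > > > >
> > > > > import kafka.producer.Partitioner;
> > > > > import kafka.utils.VerifiableProperties;
> > > > >
> > > > > // Routes every message whose key is the same session id to the same
> > > > > // partition. Register it via the producer's "partitioner.class" property.
> > > > > public class SessionPartitioner implements Partitioner {
> > > > >     public SessionPartitioner(VerifiableProperties props) { }
> > > > >
> > > > >     public int partition(Object key, int numPartitions) {
> > > > >         return Math.abs(key.hashCode() % numPartitions);
> > > > >     }
> > > > > }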
> > > > >
> > > > >
> > > > > > - I see that Kafka server.properties allows one to specify the
> > > > > > number of partitions it supports. However, when we want to scale I
> > > > > > wonder if we add # of partitions or # of brokers, will the same
> > > > > > partitioner start distributing the messages to different
> > > > > > partitions?
> > > > > > And if it does, how can that same consumer continue to read off
> > > > > > the messages of those ids if it was interrupted in the middle?
> > > > > >
> > > > >
> > > > > I'll let someone else answer this.
> > > > >
> > > > >
> > > > > >
> > > > > > - I'd like to create a consumer per partition, and for each one to
> > > > > > subscribe to the changes of that one. How can this be done in
> > > > > > Kafka?
> > > > > >
> > > > >
> > > > > Two ways: SimpleConsumer or Consumer Groups.
> > > > >
> > > > > It depends on the level of control you want over code processing a
> > > > > specific partition vs. getting one assigned to it (and the level of
> > > > > control over offset management).
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
> > > > >
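> > > > > If the Consumer Group route fits, a minimal high-level consumer
> > > > > sketch looks roughly like this (ZooKeeper address, group id and
> > > > > topic name are placeholders):
> > > > >
> > > > > import java.util.HashMap;
> > > > > import java.util.List;
> > > > > import java.util.Map;
> > > > > import java.util.Properties;
> > > > > import kafka.consumer.Consumer;
> > > > > import kafka.consumer.ConsumerConfig;
> > > > > import kafka.consumer.ConsumerIterator;
> > > > > import kafka.consumer.KafkaStream;
> > > > > import kafka.javaapi.consumer.ConsumerConnector;
> > > > >
> > > > > public class GroupReader {
> > > > >     public static void main(String[] args) {
> > > > >         Properties props = new Properties();
> > > > >         props.put("zookeeper.connect", "localhost:2181");
> > > > >         props.put("group.id", "session-readers");
> > > > >         props.put("auto.commit.interval.ms", "1000");
> > > > >         ConsumerConnector connector =
> > > > >                 Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
> > > > >
> > > > >         // One stream per consumer; run as many consumers as partitions
> > > > >         // and each ends up owning roughly one partition.
> > > > >         Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
> > > > >         topicCountMap.put("events", 1);
> > > > >         Map<String, List<KafkaStream<byte[], byte[]>>> streams =
> > > > >                 connector.createMessageStreams(topicCountMap);
> > > > >
> > > > >         ConsumerIterator<byte[], byte[]> it = streams.get("events").get(0).iterator();
> > > > >         while (it.hasNext()) {
> > > > >             System.out.println(new String(it.next().message()));
> > > > >         }
> > > > >     }
> > > > > }
> > > > >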
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Tim
> > > > > >
> > > > >
> > > >
> > >
> >
>
