Great, that's reassuring!

What's the time frame for having a more or less stable version to try out?

Jason


On Mon, Jul 7, 2014 at 12:59 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> I see your point now. The old consumer does have a hard-coded
> "round-robin-per-topic" logic, which has this issue. In the new consumer,
> we will make the assignment logic customizable so that people can specify
> whichever rebalance algorithm they like.
>
> Also, I will soon send out a new consumer design summary email for more
> comments. Feel free to share any further thoughts you have about the new
> consumer design.
>
> Guozhang
>
>
> On Mon, Jul 7, 2014 at 8:44 AM, Jason Rosenberg <j...@squareup.com> wrote:
>
> > Guozhang,
> >
> > I'm not suggesting we parallelize within a partition....
> >
> > The problem with the current high-level consumer is that if you use a
> > regex to select multiple topics, and then have multiple consumers in the
> > same group, usually the first consumer will 'own' all the topics, and no
> > amount of subsequent rebalancing will allow other consumers in the group
> > to own some of the topics.  Rebalancing does allow other consumers to own
> > multiple partitions, but if a topic has only 1 partition, the first
> > consumer to initialize will get all the work.
> >
> > So, I'm wondering if the new api will be better about re-balancing the
> > work at the partition level, and not the topic level, as such.
> >
> > Jason
> >
> >
> > On Mon, Jul 7, 2014 at 11:17 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Hi Jason,
> > >
> > > In the new design the consumption is still at the per-partition
> > > granularity. The main rationale for doing this is ordering: within a
> > > partition we want to preserve the ordering such that message B produced
> > > after message A will also be consumed and processed after message A.
> > > And producers can use keys to make sure messages in the same ordering
> > > group will be in the same partition. To do this we have to ensure that
> > > each partition is consumed by only a single client at a time. On the
> > > other hand, when one wants to increase the number of consumers beyond
> > > the number of partitions, one can always use the topic tool to
> > > dynamically add more partitions to the topic.
> > >
> > > Do you have a specific scenario in mind that would require
> > > single-partition topics?
> > >
> > > Guozhang
> > >
> > >
> > >
> > > On Mon, Jul 7, 2014 at 7:43 AM, Jason Rosenberg <j...@squareup.com> wrote:
> > >
> > > > I've been looking at the new consumer api outlined here:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design
> > > >
> > > > One issue in the current high-level consumer is that it does not do a
> > > > good job of distributing a set of topics between multiple consumers
> > > > unless each topic has multiple partitions.  This has always seemed
> > > > strange to me, since at the end of the day, even for single-partition
> > > > topics, the basic unit of consumption is still the partition (so
> > > > you'd expect rebalancing to try to distribute partitions evenly,
> > > > regardless of the topic).
> > > >
> > > > It's not clearly spelled out in the new consumer api wiki, so I'll
> > > > just ask: will this issue be addressed in the new api?  I think I've
> > > > asked this before, but I wanted to check again, and am not seeing
> > > > this explicitly addressed in the design.
> > > >
> > > > Thanks
> > > >
> > > > Jason
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
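
To make the imbalance discussed in this thread concrete, here is a minimal sketch in Python. It is a simplified model of the two strategies, not Kafka's actual assignment code: the function names `assign_per_topic` and `assign_across_partitions`, the topic names, and the consumer ids are all hypothetical and chosen only for illustration. The first function restarts the round-robin for every topic, which is roughly the "round-robin-per-topic" behavior described above; the second pools all partitions of all topics before distributing them.

```python
def assign_per_topic(partitions_by_topic, consumers):
    """Round-robin within each topic; every topic starts again at the first consumer."""
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    for topic in sorted(partitions_by_topic):
        for partition in range(partitions_by_topic[topic]):
            owner = ordered[partition % len(ordered)]
            assignment[owner].append((topic, partition))
    return assignment


def assign_across_partitions(partitions_by_topic, consumers):
    """Round-robin over the pooled (topic, partition) pairs of all topics."""
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    pooled = [
        (topic, partition)
        for topic in sorted(partitions_by_topic)
        for partition in range(partitions_by_topic[topic])
    ]
    for index, topic_partition in enumerate(pooled):
        assignment[ordered[index % len(ordered)]].append(topic_partition)
    return assignment


# Four hypothetical single-partition topics, two consumers in one group.
topics = {"a": 1, "b": 1, "c": 1, "d": 1}
per_topic = assign_per_topic(topics, ["c1", "c2"])
pooled = assign_across_partitions(topics, ["c1", "c2"])
print(per_topic)
print(pooled)
```

With single-partition topics, the per-topic variant hands every partition to `c1` and leaves `c2` idle, while the pooled variant splits the four partitions two and two. Each partition still has exactly one owner in both cases, so per-partition ordering is unaffected by the choice.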
