You can take a look at the "consumer rebalancing algorithm" part in
http://kafka.apache.org/documentation.html. Basically, partitions are
evenly distributed among all consumers in the same group. If there are more
consumers in a group than partitions, the surplus consumers are assigned no
partitions and will never get any data.
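
To make the idea concrete, here is a rough sketch of a range-style
assignment in Python. It is only an illustration of the idea, not the actual
Kafka code: the function name assign_partitions and the consumer ids are
made up for the example, and it assumes both lists are simply sorted before
being split into contiguous ranges.

```python
def assign_partitions(partitions, consumers):
    """Illustrative range-style assignment (hypothetical helper, not Kafka's API).

    Sorts partitions and consumer ids, then hands each consumer a
    contiguous slice. Consumers beyond the partition count get nothing.
    """
    parts = sorted(partitions)
    members = sorted(consumers)
    # base: partitions every consumer gets; extra: how many consumers get one more
    base, extra = divmod(len(parts), len(members))
    assignment = {}
    for i, member in enumerate(members):
        start = base * i + min(i, extra)
        count = base + (1 if i < extra else 0)
        assignment[member] = parts[start:start + count]
    return assignment


# 4 partitions, 6 consumers: the last two consumers (in sort order) stay idle.
print(assign_partitions(range(4), ["c1", "c2", "c3", "c4", "c5", "c6"]))
```

With 4 partitions and 2 consumers, the same function gives each consumer two
partitions, so which consumers end up idle depends only on the sort order of
the ids in this sketch.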

Thanks,

Jun

On Mon, Oct 27, 2014 at 4:14 AM, Shlomi Hazan <shl...@viber.com> wrote:

> Hi All,
>
> Using Kafka's high-level consumer API, I have bumped into a situation where
> launching a consumer process P with X consuming threads on a topic with X
> partitions kicks out all other existing consumer threads that were consuming
> before process P was launched.
> That is, consumer process P is stealing all partitions from all other
> consumer processes.
>
> While understandable, it makes it hard to size & deploy a cluster with a
> number of partitions that will both allow balancing of consumption across
> consuming processes, dividing the partitions across consumers by setting
> each consumer with its share of the total number of partitions on the
> consumed topic, and on the other hand provide room for growth and addition
> of new consumers to help with increasing traffic into the cluster and the
> topic.
>
> This stealing effect forces me to either have more partitions than really
> needed at the moment, planning for future growth, or stick to what I need
> and trust the option to add partitions, which comes with a price in terms of
> restarting consumers, bumping into out-of-order messages (hash
> partitioning), etc.
>
> Is this policy of stealing intended, or did I just jump to conclusions?
> What is the way to cope with the sizing question?
>
> Shlomi
>
