Shlomi, If you are on trunk, and your consumer subscriptions are identical then you can try a slightly different partition assignment strategy. Try setting partition.assignment.strategy="roundrobin" in your consumer config.
Thanks, Joel On Wed, Oct 29, 2014 at 06:29:30PM -0700, Jun Rao wrote: > By consumer, I actually mean consumer threads (the thread # you used when > creating consumer streams). So, if you have 4 consumers, each with 4 > threads, 4 of the threads will not get any data with 12 partitions. It > sounds like that's not what you get? What's the output of the > ConsumerOffsetChecker (see http://kafka.apache.org/documentation.html)? > > For consumer.id, you don't need to set it in general. We generate some uuid > automatically. > > Thanks, > > Jun > > On Tue, Oct 28, 2014 at 4:59 AM, Shlomi Hazan <shl...@viber.com> wrote: > > > Jun, > > > > I hear you say "partitions are evenly distributed among all consumers in > > the same group", yet I did bump into a case where launching a process with > > X high level consumer API threads took over all partitions, sending > > existing consumers to be unemployed. > > > > According to the claim above, and if I am not mistaken: > > on a topic T with 12 partitions and 3 consumers C1-C3 on the same group > > with 4 threads each, > > adding a new consumer C4 with 12 threads should yield the following > > balance: > > C1-C3 each relinquish a single partition holding only 3 partitions each. > > C4 holds the 3 partitions relinquished by C1-C3. > > Yet, in the case I described what happened is that C4 gained all 12 > > partitions and sent C1-C3 out of business with 0 partitions each. > > Now maybe I overlooked something but I think I did see that happen. > > > > BTW > > What key is used to distinguish one consumer from another? "consumer.id"? > > docs for "consumer.id" are "Generated automatically if not set." > > What is the best practice for setting it's value? leave empty? is server > > host name good enough? what are the considerations? > > When using the high level consumer API, are all threads identified as the > > same consumer? I guess they are, right?... > > > > Thanks, > > Shlomi > > > > > > On Tue, Oct 28, 2014 at 4:21 AM, Jun Rao <jun...@gmail.com> wrote: > > > > > You can take a look at the "consumer rebalancing algorithm" part in > > > http://kafka.apache.org/documentation.html. Basically, partitions are > > > evenly distributed among all consumers in the same group. If there are > > more > > > consumers in a group than partitions, some consumers will never get any > > > data. > > > > > > Thanks, > > > > > > Jun > > > > > > On Mon, Oct 27, 2014 at 4:14 AM, Shlomi Hazan <shl...@viber.com> wrote: > > > > > > > Hi All, > > > > > > > > Using Kafka's high consumer API I have bumped into a situation where > > > > launching a consumer process P1 with X consuming threads on a topic > > with > > > X > > > > partition kicks out all other existing consumer threads that consumed > > > prior > > > > to launching the process P. > > > > That is, consumer process P is stealing all partitions from all other > > > > consumer processes. > > > > > > > > While understandable, it makes it hard to size & deploy a cluster with > > a > > > > number of partitions that will both allow balancing of consumption > > across > > > > consuming processes, dividing the partitions across consumers by > > setting > > > > each consumer with it's share of the total number of partitions on the > > > > consumed topic, and on the other hand provide room for growth and > > > addition > > > > of new consumers to help with increasing traffic into the cluster and > > the > > > > topic. > > > > > > > > This stealing effect forces me to have more partitions then really > > needed > > > > at the moment, planning for future growth, or stick to what I need and > > > > trust the option to add partitions which comes with a price in terms of > > > > restarting consumers, bumping into out of order messages (hash > > > > partitioning) etc. > > > > > > > > Is this policy of stealing is intended, or did I just jump to > > > conclusions? > > > > what is the way to cope with the sizing question? > > > > > > > > Shlomi > > > > > > > > >