Hi Jason,

You might find this blog post useful:
http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
- Luke

On Sat, Jan 16, 2016 at 1:30 PM, Jason Williams <jasonjwwilli...@gmail.com> wrote:

> Hi Franco,
>
> Thank you for the info! It is my reading of the docs that adding
> partitions would require manual rebalancing to spread the existing load?
>
> Would it be advisable instead to start out with a partition count that is
> 10x the initial consumer count? For example, if we anticipate starting with
> 5 consumers, use a partition count of 50 or 100. That way we could just add
> consumers during a load spike and remove them when it passes?
>
> Sorry for all the questions. I just am not able to find a lot of
> information about how folks are handling auto-scaling kind of situations
> like this.
>
> -J
>
> Sent via iPhone
>
> > On Jan 16, 2016, at 10:52, Franco Giacosa <fgiac...@gmail.com> wrote:
> >
> > Hi Jason,
> >
> > You can try to repartition and assign more consumers.
> >
> > *Documentation*
> > "Modifying topics
> >
> > You can change the configuration or partitioning of a topic using the
> > same topic tool.
> > To add partitions you can do
> >
> >> bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic my_topic_name --partitions 40
> >
> > Be aware that one use case for partitions is to semantically partition
> > data, and adding partitions doesn't change the partitioning of existing
> > data so this may disturb consumers if they rely on that partition. That
> > is if data is partitioned by hash(key) % number_of_partitions then this
> > partitioning will potentially be shuffled by adding partitions but Kafka
> > will not attempt to automatically redistribute data in any way."
> >
> > Franco.
> >
> > 2016-01-16 19:04 GMT+01:00 Jason Williams <jasonjwwilli...@gmail.com>:
> >
> >> Hi Franco,
> >>
> >> Thanks for responding but that doesn't really answer my question.
> >>
> >> The situation described is that I have N partitions and already N
> >> consumers in the CG, and then I receive a spike in messages. How is it
> >> suggested to handle adding more consumption capacity to deal with the
> >> spike...since adding consumers when I'm already at N will just add idle
> >> consumers?
> >>
> >> -J
> >>
> >> Sent via iPhone
> >>
> >>> On Jan 16, 2016, at 03:21, Franco Giacosa <fgiac...@gmail.com> wrote:
> >>>
> >>> 1 consumer group can have many partitions. If the consumer group has 1
> >>> consumer and there are N partitions, it will consume from N. If you
> >>> have a spike you can add up to N more consumers to that consumer group.
> >>>
> >>> 2016-01-16 11:32 GMT+01:00 Jason Williams <jasonjwwilli...@gmail.com>:
> >>>
> >>>> Thanks Jens!
> >>>>
> >>>> So assuming you've already paired your partition count to the consumer
> >>>> count...if you experience a spike in messages and want to spin up more
> >>>> consumers to add temporary processing capacity, what's the suggested
> >>>> way to handle this? (since it would seem you can't just add consumers
> >>>> as they would remain idle...and adding partitions appears to require
> >>>> manual rebalancing).
> >>>>
> >>>> -J
> >>>>
> >>>> Sent via iPhone
> >>>>
> >>>>> On Jan 16, 2016, at 01:54, Jens Rantil <jens.ran...@tink.se> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> You are correct. The others will remain idle. This is why you
> >>>>> generally want to have at least the same number of partitions as
> >>>>> consumers.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Jens
> >>>>>
> >>>>> --
> >>>>> Sent from Mailbox
> >>>>>
> >>>>> On Sat, Jan 16, 2016 at 2:34 AM, Jason J. W. Williams
> >>>>> <jasonjwwilli...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm trying to make sure I understand this statement in the docs:
> >>>>>>
> >>>>>> "Each broker partition is consumed by a single consumer within a
> >>>>>> given consumer group. The consumer must establish its ownership of a
> >>>>>> given partition before any consumption can begin."
> >>>>>>
> >>>>>> If I have:
> >>>>>>
> >>>>>> * a topic with 1 partition
> >>>>>> * subscribe a consumer group to the topic
> >>>>>> * the consumer group has 10 consumers belonging to it
> >>>>>>
> >>>>>> Will only 1 consumer of the 10 ever receive messages from the topic,
> >>>>>> and the other 9 remain idle? Or does this mean only 1 consumer at a
> >>>>>> time from the group will be consuming...in a round-robin fashion?
> >>>>>>
> >>>>>> -J
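
To make the partition/consumer arithmetic in this thread concrete, here is a minimal Python sketch. It is a simplified model and not Kafka's actual assignor or partitioner: the helper names (assign_partitions, idle_consumers, partition_for, moved_keys) are hypothetical, the round-robin spread is an assumption, and CRC32 only stands in for whatever key hash a real producer uses. It shows why extra consumers sit idle once every partition has an owner, why over-partitioning leaves headroom, and why changing the partition count can move keys.

```python
import zlib


def assign_partitions(num_partitions, consumers):
    """Give each partition exactly one owner, spreading them round-robin.

    Simplified model (assumption): real Kafka assignors differ in detail,
    but the one-owner-per-partition rule within a group is the same.
    """
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


def idle_consumers(assignment):
    """Consumers that ended up owning no partition at all."""
    return [c for c, parts in assignment.items() if not parts]


def partition_for(key, num_partitions):
    """hash(key) % number_of_partitions, with CRC32 as a stand-in hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count, new_count):
    """Keys whose target partition changes when the count changes."""
    return [k for k in keys
            if partition_for(k, old_count) != partition_for(k, new_count)]


consumers = [f"consumer-{i}" for i in range(10)]

# 1 partition, 10 consumers: one owner, nine idle.
print(len(idle_consumers(assign_partitions(1, consumers))))

# 50 partitions: 5 consumers own 10 each; scaling to 10 consumers gives 5 each.
print({c: len(p) for c, p in assign_partitions(50, consumers[:5]).items()})
print({c: len(p) for c, p in assign_partitions(50, consumers).items()})

# Growing a keyed topic from 40 to 50 partitions remaps some keys.
keys = [f"user-{i}" for i in range(1000)]
print(len(moved_keys(keys, 40, 50)), "of", len(keys), "keys would move")
```

In this model, starting with more partitions than consumers means a spike can be absorbed by adding consumers alone, without touching the partition count and therefore without disturbing the key-to-partition mapping.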