1. Yes, you may have to overprovision the number of partitions to handle load peaks. Refer to this document for guidance on choosing the number of partitions: <https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster>

2. KIP-429 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol> proposes an incremental rebalance protocol that reduces the time consumption is paused when a consumer instance is added to or removed from the group.
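For point 2, a minimal sketch of how a consumer would opt into the cooperative rebalancing from KIP-429, assuming Kafka clients 2.4+ (where `CooperativeStickyAssignor` shipped); the broker address and group id below are placeholders, and only the `Properties` object is built here so the snippet runs without a broker:

```java
import java.util.Properties;

public class CooperativeConsumerConfig {

    // Builds consumer properties that enable the KIP-429 incremental
    // (cooperative) rebalance protocol. With the default eager protocol,
    // every rebalance revokes all partitions from all members; with
    // CooperativeStickyAssignor, only the partitions that actually move
    // are revoked, so the rest keep being consumed during the rebalance.
    static Properties cooperativeConsumerProps(String bootstrapServers,
                                               String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers); // placeholder address
        props.put("group.id", groupId);                   // placeholder group id
        props.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        Properties p = cooperativeConsumerProps("localhost:9092", "my-service");
        System.out.println(p.getProperty("partition.assignment.strategy"));
    }
}
```

You would pass these properties to `new KafkaConsumer<>(props)` as usual; note that switching an existing group from the eager to the cooperative protocol requires a rolling upgrade of all members.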
On Mon, May 6, 2019 at 7:47 PM Moritz Petersen <mpete...@adobe.com.invalid>
wrote:

> Hi all,
>
> I’m new to Kafka and have a very basic question:
>
> We build a cloud-scale platform and are evaluating whether we can use
> Kafka for pub-sub messaging between our services. Most of our services
> scale dynamically based on load (number of requests, CPU load, etc.). In
> our current architecture, services are both producers and consumers,
> since all services listen to some kind of events.
>
> With Kafka, I assume we have two restrictions or issues:
>
> 1. The number of consumers is restricted to the number of partitions of
> a topic. Changing the number of partitions is a relatively expensive
> operation (at least compared to scaling services). Is it necessary to
> overprovision the number of partitions in order to be prepared for load
> peaks?
>
> 2. Adding or removing consumers halts processing of the related
> partition for a short period of time. Is it possible to avoid or
> significantly minimize this lag?
>
> Are there any additional best practices for implementing Kafka consumers
> in a cloud-scale environment?
>
> Thanks,
> Moritz