(From http://kafka.apache.org/design.html) One potential benefit of the
existing rebalancing logic is to reduce the number of connections to
brokers per consumer instance. However, if you have a large number of
partitions and few brokers and/or consumer instances then it wouldn't
really help, so I filed a JIRA ticket:
Jira ticket https://issues.apache.org/jira/browse/KAFKA-687
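The connection-count argument can be checked with a quick sketch. The broker layout and all names below are made up for illustration (not real Kafka metadata or API): with only a couple of brokers, any consumer that owns more than a few partitions ends up connected to every broker anyway, no matter how partitions are grouped by topic.

```python
# Illustrative only: a fake 5-topic, 2-broker layout (not real cluster state).
partition_broker = {f"t{t}-p{p}": f"broker{(t + p) % 2}"
                    for t in range(5) for p in range(2)}

def connections(parts):
    """Distinct brokers a consumer holding these partitions must connect to."""
    return len({partition_broker[p] for p in parts})

# A consumer owning partitions from three different topics still only
# reaches the cluster's two brokers, so per-topic grouping saves nothing here.
print(connections(["t0-p0", "t1-p0", "t2-p0"]))  # -> 2
```

With few brokers the connection count per consumer is capped by the broker count, not by the assignment scheme, which is the point being made above.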
2013/1/7 Pablo Barrera González
Thank you Jun and Neha
I was trying to avoid adding more partitions. I have enough partitions if
you count all partitions in all topics. I understand the problem with
different data load per topic, but the current scheme does not solve this
problem either, so we shouldn't be worse off if we consider all partitions
together.
Pablo,
That is a good suggestion. Ideally, the partitions across all topics should
be distributed evenly across consumer streams instead of being assigned on a
per-topic basis. There is no particular advantage to the current scheme of
per-topic rebalancing that I can think of. Would you mind filing a JIRA?
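The suggestion above can be sketched as a round-robin over the union of all partitions. The function and names are hypothetical, not the actual Kafka rebalancing code:

```python
from itertools import chain

def assign_globally(topic_partitions, consumers):
    """Round-robin every partition of every topic over the consumer streams."""
    all_parts = sorted(chain.from_iterable(
        [f"{t}-p{p}" for p in range(n)] for t, n in topic_partitions.items()))
    assignment = {c: [] for c in consumers}
    for i, part in enumerate(all_parts):
        assignment[consumers[i % len(consumers)]].append(part)
    return assignment

# 3 topics x 2 partitions over 4 consumers: loads are 2, 2, 1, 1 --
# no consumer is idle, unlike splitting each topic's 2 partitions over 4.
print(assign_globally({"t0": 2, "t1": 2, "t2": 2}, ["c0", "c1", "c2", "c3"]))
```

Because the round-robin runs over the pooled partition list, the load difference between any two consumers is at most one partition.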
Pablo,
Currently, partition is the smallest unit that we distribute data among
consumers (in the same consumer group). So, if the # of consumers is larger
than the total number of partitions in a Kafka cluster (across all
brokers), some consumers will never get any data. Such a decision is done
on a per-topic basis.
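That limit can be shown with a tiny sketch (hypothetical names; the real rebalancer of this era used a range split per topic, modelled here as a simple round-robin for brevity):

```python
def assign_one_topic(partitions, consumers):
    """Spread one topic's partitions over the group's consumers."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 2 partitions shared by 4 consumers: c2 and c3 never receive any data.
print(assign_one_topic(["t0-p0", "t0-p1"], ["c0", "c1", "c2", "c3"]))
```

A partition is consumed by at most one member of a group at a time, so once partitions run out, the remaining consumers sit idle.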
Hello
We are starting to use Kafka in production but we found an unexpected (at
least for me) behavior with the use of partitions. We have a bunch of
topics with a few partitions each. We try to consume all data from several
consumers (just one consumer group).
The problem is in the rebalance step.
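The skew described can be reproduced with a toy model. The numbers and the round-robin rule below are illustrative assumptions, not the actual deployment or the exact Kafka algorithm: when every topic has fewer partitions than there are consumers, the same leading consumers win every topic's partitions and the rest stay idle.

```python
def per_topic_assignment(topics, consumers):
    """Assign each topic's partitions independently, as a per-topic rebalance does."""
    assignment = {c: [] for c in consumers}
    for topic, n_partitions in topics.items():
        for p in range(n_partitions):
            assignment[consumers[p % len(consumers)]].append(f"{topic}-p{p}")
    return assignment

topics = {f"topic{i}": 2 for i in range(10)}   # 10 topics, 2 partitions each
consumers = ["c0", "c1", "c2", "c3", "c4"]
load = {c: len(ps) for c, ps in per_topic_assignment(topics, consumers).items()}
print(load)  # -> {'c0': 10, 'c1': 10, 'c2': 0, 'c3': 0, 'c4': 0}
```

There are 20 partitions for 5 consumers, so a global assignment would give each consumer 4; deciding topic by topic instead concentrates all 20 on the first two.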