Hello Richard,

Thank you for your answer.
Upon examining the `__consumer_offsets` topic, it seems that all commit
messages for a given consumer `group.id` go to the same partition. So there
is not much we can do if we have a single dominant consumer group reading
from all topics. The only solution would be to split it into multiple
consumer groups reading from different subsets of topics.
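For reference, this matches the broker's partitionFor(groupId) placement as
far as I can tell. Below is a minimal sketch of the arithmetic, assuming
offsets.topic.num.partitions is left at its default of 50 (the group id is
made up):

    import org.apache.kafka.common.utils.Utils;

    public class OffsetsPartitionFor {
        public static void main(String[] args) {
            // Assumption: offsets.topic.num.partitions is at its default of 50.
            int offsetsPartitions = 50;
            // Hypothetical group.id, substitute your own.
            String groupId = "my-dominant-group";
            // Same arithmetic the broker uses to place a group's commits:
            // Utils.abs(groupId.hashCode) % offsets.topic.num.partitions
            int partition = Utils.abs(groupId.hashCode()) % offsetsPartitions;
            System.out.println(groupId + " -> __consumer_offsets-" + partition);
        }
    }

Whichever partition this prints receives every commit from that group, no
matter how many topics and partitions it consumes.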
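Regarding your suggestion below about the offset commit settings: we will
try committing less often. In plain-client terms the knob looks roughly
like this (a sketch with illustrative values; our listeners actually use
Spring Kafka, where the container's AckMode/ackTime settings play the same
role):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class LessFrequentCommits {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-dominant-group");    // hypothetical
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true");
            // Default is 5000 ms; committing every 30 s sends far fewer messages
            // to __consumer_offsets, at the cost of more reprocessing on restart.
            props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "30000");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe() and poll() as usual
            }
        }
    }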
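And for topics where per-key ordering does not matter, we will look at
moving the producers off the key-hash partitioner, e.g. (again only a
sketch):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.RoundRobinPartitioner;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class EvenSpreadProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // Ignore the key hash and rotate over partitions instead; per-key
            // ordering is lost, so only for topics where ordering doesn't matter.
            props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, RoundRobinPartitioner.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // send() as usual
            }
        }
    }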
Best Regards,
Fares

On Mon, May 1, 2023 at 11:07 AM Richard Bosch <richard.bo...@axual.com> wrote:
> Hi Fares,
>
> You're right in your description of the contents of the __consumer_offsets
> topic and how they are stored.
> The most common reasons for an uneven load on the consumer offsets
> partitions are:
>
> 1. Configuration of offset commits in the client
> 2. Load on the topics being consumed
>
> If a topic has 10 partitions, and the producer produces records with a key
> and a partitioner based on the key hash, then it can happen that some
> partitions get more records than others, simply because some keys are used
> more often than others.
> The consumer then needs to read more records from those partitions.
> If the consumer commits offsets at a time interval, or after every N
> records consumed, this results in more offset commits for the topic
> partitions containing more records.
>
> You might want to check the load on the topic partitions being consumed to
> confirm this is the case.
> Unfortunately, I do not have an easy answer on how to remedy that problem.
>
> You can check what the offset commit settings are for your application and
> whether you can update the logic to reflect the higher load.
> If the trigger is based on time or on the number of records consumed, you
> can raise these values so that commits aren't triggered as often.
>
> You can also monitor the cluster and run preferred leader elections for
> your partitions using Cruise Control, to minimize the load on your
> brokers.
>
> I recommend using keys with the hash-based partitioner only if the order
> of messages for those keys MUST be guaranteed; otherwise you can use a
> different Partitioner.
> Then there will be a more uniform distribution of records on the topic,
> and a better distribution of offset commits.
>
> Do note that there is an issue with the UniformStickyPartitioner that is
> being worked on right now, see
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner
>
> I hope this helps.
>
> Kind regards,
>
> Richard Bosch
> Developer Advocate
> Axual BV
> E : richard.bo...@axual.com
> M : +31 6 11 850 846
> W : www.axual.com
>
> On Thu, Apr 27, 2023 at 4:19 PM Fares Oueslati <oueslati.fa...@gmail.com>
> wrote:
> >
> > Hello Kafka users,
> >
> > I’m facing an issue with a Kafka cluster, specifically with the
> > __consumer_offsets topic.
> > There seems to be an imbalance in the number of commit messages across
> > its partitions. Most of the commit messages are concentrated in a
> > single partition, which is causing high CPU usage on the broker
> > handling that partition.
> > I have already verified that the topic partitions’ leaders are
> > well-balanced across the six brokers.
> > However, a specific consumer group (the largest one, with many members
> > consuming from multiple topics, based on Spring Kafka) generates a
> > large number of commit messages, and they all end up in the same
> > partition #37.
> >
> > My understanding is that, by default, all commit messages sent by a
> > particular consumer group for a specific topic partition are directed
> > to a single partition of the __consumer_offsets topic, determined by
> > hashing the consumer group id and the topic partition. In our case,
> > this default partitioning strategy seems to be causing the imbalance,
> > even though I don’t understand why exactly.
> > Could you please help me understand why there’s such an imbalance in
> > the number of messages across the __consumer_offsets partitions, and
> > why the large number of commit messages from the large consumer group
> > is not spread well across the partitions of the __consumer_offsets
> > topic? Are there any recommendations or best practices to address this
> > issue?
> >
> > Any guidance would be greatly appreciated.
> >
> > Best Regards,
> > Fares