Hi Johnny,

As you already mentioned, the group.id determines which broker acts as
the group coordinator. Changing the group.id changes which
__consumer_offsets partition the group maps to, and therefore which
broker manages the group (the leader of that partition). You can check
which partition a group.id is assigned to using

    Utils.abs(groupId.hashCode()) % partitionCount

where partitionCount is the number of partitions of __consumer_offsets
(offsets.topic.num.partitions, 50 by default).

A consumer group is a way to distribute work across equivalent
consumers. I would assume it is a good idea, but it depends on your
architecture and use case.

Best regards,
Andras

On Sat, Mar 17, 2018 at 12:55 PM, Johnny Luo <john...@campaignmonitor.com> wrote:

> Hello,
>
> We are running a 16-node Kafka cluster on AWS. Each node is an
> m4.xlarge EC2 instance with a 2 TB EBS (st1) disk. The Kafka version
> is 0.10.1.0, and we have about 100 topics at the moment. Some busy
> topics have about 2 billion events every day; some low-volume topics
> have only thousands per day.
>
> Most of our topics use a UUID as the partition key when we produce
> the message, so the partitions are quite evenly distributed.
>
> We have quite a lot of consumers consuming from this cluster using
> consumer groups. Each consumer has a unique group id. Some consumer
> groups commit offsets every 500 ms; some commit offsets synchronously
> as soon as they finish processing a batch of messages.
>
> Recently we observed that some of the brokers are far busier than the
> others. With some digging, we found that quite a lot of traffic
> actually goes to "__consumer_offsets", so we created a tool to see the
> high watermark of each partition in "__consumer_offsets". It reveals
> that the partitions are very unevenly distributed.
>
> Based on this link "Consumer offset management in Kafka"
>
> it seems this is intended behaviour: each consumer group has only one
> leader, so all committed offsets need to go to this leader, and only
> "group.id" is used to decide the partition.
>
> Given that we have some consumers consuming from those very busy
> topics, the offset commits cause a lot of traffic to the
> "__consumer_offsets" topic on the broker that handles the consumer
> group.
>
> My questions are:
> 1. Is there a way we can make sure that the consumer groups that
>    consume from busy topics don't fall onto the same broker? We don't
>    want to create a hotspot.
> 2. For consumers that consume from busy topics (topics with billions
>    of messages per day), is it a good idea to use a consumer group?
>
> Thanks in advance
>
> Johnny Luo
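
To check in advance whether two group ids would land on the same
__consumer_offsets partition, the mapping from the reply can be
reproduced offline. This is only a minimal sketch: it assumes the broker
computes the partition from the Java String hash of the group.id (as far
as I can tell this is what the group coordinator code does, rather than
the murmur2 hash that the default producer partitioner applies to
message keys), so please verify it against the source for your version.
The class name, the group ids, and the partition count of 50 are
illustrative assumptions, not values from your cluster.

```java
// Sketch: map a group.id to its __consumer_offsets partition.
// Assumption: the broker uses abs(groupId.hashCode()) % partitionCount.
class CoordinatorPartition {

    // Math.abs(Integer.MIN_VALUE) is still negative, so treat that one
    // value as 0 to keep the result a valid partition index.
    static int abs(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    static int partitionFor(String groupId, int partitionCount) {
        return abs(groupId.hashCode()) % partitionCount;
    }

    public static void main(String[] args) {
        // Assumed value of offsets.topic.num.partitions (50 is the default).
        int partitionCount = 50;

        // Hypothetical group ids; replace with the ones you plan to use.
        String[] groupIds = {"billing-consumer", "audit-consumer"};

        for (String groupId : groupIds) {
            System.out.println(groupId + " -> partition "
                    + partitionFor(groupId, partitionCount));
        }
    }
}
```

The broker that currently leads the resulting partition is the one that
handles the group, so two busy groups that map to partitions led by the
same broker are candidates for renaming.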