Tom Coupland created KAFKA-4884:
-----------------------------------

             Summary: __consumer_offsets topic processing consuming all resources
                 Key: KAFKA-4884
                 URL: https://issues.apache.org/jira/browse/KAFKA-4884
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.10.1.0
         Environment: Mesos cluster, coreos
            Reporter: Tom Coupland

Since this morning, processing for the __consumer_offsets topic appears to be consuming all the resources in our test cluster. No messages are being dispatched through any other topics, yet the brokers are using all of their CPU and a lot of network bandwidth.

A clear sign that the problem lies with this special topic is that when I deleted some test topics (leaving three topics remaining, including __consumer_offsets), the network load decreased somewhat: not enough to fix the problem, but enough to point the finger firmly in this direction.

The rate at which offsets advance on the __consumer_offsets topic seems overly high. I'm summing the offsets across all 50 partitions, and the total grows on the order of 22000 every ten seconds, dropping to 17000 after I deleted the spare test topics. These are timestamps and total offsets, summed across all partitions, from before the test topic deletion:

Fri 10 Mar 18:57:38 GMT 2017 114700933
Fri 10 Mar 18:57:56 GMT 2017 114727290
Fri 10 Mar 18:58:12 GMT 2017 114750030
Fri 10 Mar 18:58:31 GMT 2017 114776560

There is nothing in the broker logs pointing to any errors; in fact, there is little to go on. Attempting to attach a consumer to the topic just results in a hanging process.

It feels like the topic is being looped back on itself, creating offset updates for its own updates or something like that.

I'm leaving the cluster up for the weekend so we can continue to diagnose, but it seems like there must be a bug at the root of this.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
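
For anyone wanting to repeat the measurement, here is a minimal sketch of how the summed-offset samples quoted in the report could be turned into a growth rate. The samples are not evenly spaced, so the sketch divides each delta by the actual elapsed seconds. The names `SAMPLES`, `parse_sample` and `growth_rates` are hypothetical and not part of any Kafka tooling, and the parsing assumes an English locale for the month and weekday names.

```python
from datetime import datetime

# The four samples quoted in the report: timestamp, then the offset total
# summed across all 50 partitions of __consumer_offsets.
SAMPLES = """\
Fri 10 Mar 18:57:38 GMT 2017 114700933
Fri 10 Mar 18:57:56 GMT 2017 114727290
Fri 10 Mar 18:58:12 GMT 2017 114750030
Fri 10 Mar 18:58:31 GMT 2017 114776560
"""


def parse_sample(line):
    """Split one line into (datetime, summed offset)."""
    *stamp, total = line.split()
    # "GMT" is matched as a literal; every sample uses the same zone,
    # so only the differences between timestamps matter here.
    when = datetime.strptime(" ".join(stamp), "%a %d %b %H:%M:%S GMT %Y")
    return when, int(total)


def growth_rates(samples):
    """Return offsets-per-second between each pair of consecutive samples."""
    rates = []
    for (t0, o0), (t1, o1) in zip(samples, samples[1:]):
        seconds = (t1 - t0).total_seconds()
        rates.append((o1 - o0) / seconds)
    return rates


if __name__ == "__main__":
    parsed = [parse_sample(line) for line in SAMPLES.splitlines()]
    for rate in growth_rates(parsed):
        print(f"{rate:.1f} offsets/s, {rate * 10:.0f} per ten seconds")
```

Feeding it samples taken after the test topics were deleted would give a like-for-like comparison of the two rates.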