Tom Coupland created KAFKA-4884:
-----------------------------------

             Summary: __consumer_offsets topic processing consuming all resources
                 Key: KAFKA-4884
                 URL: https://issues.apache.org/jira/browse/KAFKA-4884
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.10.1.0
         Environment: Mesos cluster, coreos
            Reporter: Tom Coupland


Since this morning, processing of the __consumer_offsets topic appears to be 
consuming all the resources in our test cluster. No messages are being 
dispatched through any other topic, yet the brokers are using all of their CPU 
and a lot of network bandwidth.

A clear sign that the problem lies with this special topic: when I deleted some 
test topics (leaving three topics, including __consumer_offsets), the network 
load decreased somewhat. Not enough to fix the problem, but enough to point the 
finger firmly in this direction.

The rate at which offsets advance on the __consumer_offsets topic seems overly 
high. I'm summing the offsets across all 50 partitions, and the total grows on 
the order of 22000 every ten seconds, dropping to 17000 after I deleted the 
spare test topics.
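
The report does not say how the per-partition offsets were collected. As a
minimal sketch, assuming they are available as `topic:partition:offset` lines
(the format printed by the kafka.tools.GetOffsetShell tool), the sum could be
computed as below. The function name and the sample numbers are made up for
illustration.

```python
def sum_offsets(output, topic="__consumer_offsets"):
    """Sum offsets from lines of the form 'topic:partition:offset'.

    Returns (number_of_partitions_seen, total_offset) for the given topic.
    """
    partitions = 0
    total = 0
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so a ':' inside a topic name cannot break parsing.
        name, _partition, offset = line.rsplit(":", 2)
        if name != topic:
            continue
        partitions += 1
        total += int(offset)
    return partitions, total


if __name__ == "__main__":
    # Hypothetical two-partition sample, not data from the cluster.
    sample = "__consumer_offsets:0:120\n__consumer_offsets:1:80\n"
    print(sum_offsets(sample))
```

Running this on a schedule and recording the total with a timestamp gives the
kind of samples listed further down.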

These are timestamps and the summed offsets for all partitions, taken before 
the test topics were deleted:

Fri 10 Mar 18:57:38 GMT 2017 
114700933 

Fri 10 Mar 18:57:56 GMT 2017 
114727290 

Fri 10 Mar 18:58:12 GMT 2017 
114750030 

Fri 10 Mar 18:58:31 GMT 2017 
114776560
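
As a rough cross-check, the four samples above can be turned into per-interval
rates. The sketch below is illustrative only: the helper names are made up, and
the timestamps are taken at face value as printed. The samples are 16 to 19
seconds apart, and the growth over them works out to roughly 1,400 offsets per
second.

```python
from datetime import datetime

# Samples copied from the report: (timestamp, offsets summed over all partitions).
SAMPLES = [
    ("Fri 10 Mar 18:57:38 GMT 2017", 114700933),
    ("Fri 10 Mar 18:57:56 GMT 2017", 114727290),
    ("Fri 10 Mar 18:58:12 GMT 2017", 114750030),
    ("Fri 10 Mar 18:58:31 GMT 2017", 114776560),
]

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_ts(text):
    """Parse 'Fri 10 Mar 18:57:38 GMT 2017' without relying on the locale."""
    _weekday, day, month, clock, _zone, year = text.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), MONTHS[month], int(day), hour, minute, second)


def rates(samples):
    """Return (seconds, offset_delta, offsets_per_second) for each adjacent pair."""
    result = []
    for (t0, o0), (t1, o1) in zip(samples, samples[1:]):
        seconds = (parse_ts(t1) - parse_ts(t0)).total_seconds()
        delta = o1 - o0
        result.append((seconds, delta, delta / seconds))
    return result


if __name__ == "__main__":
    for seconds, delta, per_second in rates(SAMPLES):
        print(f"{seconds:.0f}s  +{delta}  {per_second:.1f}/s")
```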

There is nothing in the broker logs pointing to any errors; in fact, there is 
little to go on. Attempting to attach a consumer to the topic just results in a 
hanging process.

It feels like the topic is being looped back on itself, creating offset updates 
for its own updates or something like that. I'm leaving the cluster up for the 
weekend so we can continue to diagnose, but it seems like there must be a bug 
at the root of this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
