A couple of questions: how many partitions are in the cluster, and what are your broker configs?
On Thu, Sep 21, 2017 at 1:58 PM, Elliot Crosby-McCullough <elliot.crosby-mccullo...@freeagent.com> wrote:
> Hello,
>
> We've been trying to debug an issue with our kafka cluster for several days
> now and we're close to out of options.
>
> We have 3 kafka brokers associated with 3 zookeeper nodes and 3 registry
> nodes, plus a few streams clients and a ruby producer.
>
> Two of the three brokers are pinning a core and have been for days, no
> amount of restarting, debugging, or clearing out of data seems to help.
>
> We've got the logs at DEBUG level which shows a constant flow much like
> this: https://gist.github.com/elliotcm/e66a1ca838558664bab0c91549acb251
>
> As best as we can tell the brokers are up to date on replication and the
> leaders are well-balanced. The cluster is receiving no traffic; no
> messages are being sent in and the consumers/streams are shut down.
>
> From our profiling of the JVM it looks like the CPU is mostly working in
> replication threads and SSL traffic (it's a secured cluster) but that
> shouldn't be treated as gospel.
>
> Any advice would be greatly appreciated.
>
> All the best,
> Elliot