One thing worth trying is hooking up to one or more of the brokers via JMX and 
examining the running threads. If that doesn't elucidate the cause, you could 
move on to sampling or profiling via JMX to see what's taking up all that CPU.
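
For what it's worth, the thread inspection can also be done programmatically through the platform MXBeans rather than clicking around in JConsole. Here's a minimal sketch that samples per-thread CPU time; it runs against the local JVM for illustration — against a broker you'd obtain the same ThreadMXBean remotely via JMXConnectorFactory and ManagementFactory.newPlatformMXBeanProxy, using whatever JMX port your brokers expose (that part is an assumption about your setup):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class ThreadCpuSampler {
    // Returns one "name state cpu=Nms" line per live thread.
    // Against a remote broker, replace ManagementFactory.getThreadMXBean()
    // with a proxy obtained over a JMXConnector to the broker's JMX URL.
    static List<String> sample() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            if (info == null) continue; // thread exited between calls
            long cpuMs = mx.isThreadCpuTimeSupported()
                    ? mx.getThreadCpuTime(id) / 1_000_000
                    : -1; // -1 when the JVM can't report per-thread CPU
            lines.add(String.format("%s %s cpu=%dms",
                    info.getThreadName(), info.getThreadState(), cpuMs));
        }
        return lines;
    }

    public static void main(String[] args) {
        sample().forEach(System.out::println);
    }
}
```

Sorting that output by the cpu column and taking a few samples a minute apart will usually point straight at the hot threads (replica fetchers, network threads, etc.).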

- Jordan Pilat

On 2017-09-21 07:58, Elliot Crosby-McCullough 
<elliot.crosby-mccullo...@freeagent.com> wrote: 
> Hello,
> 
> We've been trying to debug an issue with our kafka cluster for several days
> now and we're close to out of options.
> 
> We have 3 kafka brokers associated with 3 zookeeper nodes and 3 registry
> nodes, plus a few streams clients and a ruby producer.
> 
> Two of the three brokers are pinning a core and have been for days, no
> amount of restarting, debugging, or clearing out of data seems to help.
> 
> We've got the logs at DEBUG level which shows a constant flow much like
> this: https://gist.github.com/elliotcm/e66a1ca838558664bab0c91549acb251
> 
> As best as we can tell the brokers are up to date on replication and the
> leaders are well-balanced.  The cluster is receiving no traffic; no
> messages are being sent in and the consumers/streams are shut down.
> 
> From our profiling of the JVM it looks like the CPU is mostly working in
> replication threads and SSL traffic (it's a secured cluster) but that
> shouldn't be treated as gospel.
> 
> Any advice would be greatly appreciated.
> 
> All the best,
> Elliot
> 