I just upgraded one of our test clusters from 2.3.1 to 2.4.0, and the system
CPU usage increased very noticeably (from approximately 35% of a CPU to
approximately 50% of a CPU). The cluster is very lightly loaded (around 150
messages/sec in aggregate). This CPU time increase is visible symmetrically
on all three nodes in the cluster.

The CPU consumption did not return to normal after I did the second restart
to bump the log message format and inter-broker protocol versions to 2.4, so
I don't think it has anything to do with down-converting to the 2.3 protocols.

No settings were changed, nor was anything about the JVM changed. There is
nothing interesting being written to the logs. There's no sign of any
instability (partitions aren't being reassigned, etc.).

The best guess I have for the increased CPU usage is that the number of
garbage collections increased by approximately 30%, suggesting that
something is churning a lot more garbage inside Kafka. This is a small
cluster, so it's only got a 3GB heap allocated to Kafka on each node; we're
using G1GC with some light tuning and are on Java 8 if that helps.
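
As a rough sanity check on the numbers above, here is a minimal sketch that
compares the relative CPU increase with the GC increase. It uses only the
approximate figures quoted in this message; the function name
relative_increase is just illustrative.

```python
def relative_increase(before, after):
    """Return the fractional increase going from `before` to `after`."""
    return (after - before) / before


# Approximate CPU usage (percent of one CPU) before and after the upgrade.
cpu_increase = relative_increase(35.0, 50.0)

# Approximate increase in the number of garbage collections, as observed.
gc_increase = 0.30

print(f"CPU: +{cpu_increase:.0%}, GC count: +{gc_increase:.0%}")
# prints "CPU: +43%, GC count: +30%"
```

On these rough figures the relative CPU increase is somewhat larger than the
increase in GC count, so the extra garbage may not account for all of it.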

Is this regression expected? Should I expect the overhead to stay constant
or get worse when we upgrade a cluster with more load on it?
-- 
James Brown
Engineer
