I just want to bring up the idea of avoiding server-side de/recompression
again. Features like KAFKA-1499
<https://issues.apache.org/jira/browse/KAFKA-1499> seem to steer the
project in a different direction, and the fact that tickets like
KAFKA-845 <https://issues.apache.org/jira/browse/KAFKA-845> are not
getting much attention gives the same impression. This is something I
have been mulling over almost constantly lately.
The problem I see is that CPUs are not the cheapest part of a new
server. If you can spare a few gigahertz or a couple of cores just by
making sure your compression configs are the same across all producers,
I would always opt for that operational overhead instead of bigger
servers. I think this will usually decrease the TCO of Kafka
installations.
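To make the "same configs everywhere" point concrete, here is a minimal
sketch with the 0.8 Scala producer (broker list and topic name are made
up, so treat it purely as illustration):

    import java.util.Properties
    import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

    object UniformCompressionProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("metadata.broker.list", "broker1:9092,broker2:9092") // hypothetical brokers
        props.put("serializer.class", "kafka.serializer.StringEncoder")
        // Every producer in the fleet pins the same codec, so the broker
        // would never have to switch codecs on the data it receives.
        props.put("compression.codec", "snappy")

        val producer = new Producer[String, String](new ProducerConfig(props))
        producer.send(new KeyedMessage[String, String]("test-topic", "key", "value"))
        producer.close()
      }
    }

If every producer pins the same compression.codec like this, the broker
would in principle never need to change the codec of the data it receives.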
I am currently not familiar enough with the codebase to judge whether
server-side decompression happens before the acknowledgement is sent.
If it does, sparing the de/recompression would also shave a few
milliseconds off the response time.
Those are my thoughts on server-side de/recompression. It would be
great to get some responses and thoughts back.
Jan
On 07.11.2014 00:23, Jay Kreps wrote:
I suspect it is possible to save and reuse the CRCs though it might be a
bit of an invasive change. I suspect the first usage is when we are
checking the validity of the messages and the second is when we
rebuild the compressed message set (I'm assuming you guys are using
compression because I think we optimize this out otherwise). Technically I
think the CRCs stay the same.
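To sketch the idea (hypothetical shapes, not the actual kafka.message
classes): compute the CRC once during validation and carry it alongside
the payload so the rebuild step can reuse it instead of recomputing:

    import java.util.zip.CRC32

    // Hypothetical sketch: keep the CRC computed at validation time next
    // to the payload so later steps (e.g. rebuilding the compressed
    // message set) can write it out again without recomputing it.
    final case class CheckedMessage(payload: Array[Byte], crc: Long)

    object CheckedMessage {
      private def crcOf(bytes: Array[Byte]): Long = {
        val c = new CRC32()
        c.update(bytes, 0, bytes.length)
        c.getValue
      }

      def validate(payload: Array[Byte], expectedCrc: Long): Option[CheckedMessage] = {
        val computed = crcOf(payload)
        // The CRC is computed exactly once here; if it matches, carry it along.
        if (computed == expectedCrc) Some(CheckedMessage(payload, computed)) else None
      }
    }

    // When rebuilding the message set, the broker could write msg.crc
    // directly instead of running CRC32 over the payload a second time.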
An alternative approach, though, would be working to remove the need for
recompression entirely on the broker side by making the offsets in the
compressed message relative to the base offset of the message set. This is
a much more invasive change but potentially better as it would also remove
the recompression done on the broker which is also CPU heavy.
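Roughly, the relative-offset idea could look like the sketch below
(purely hypothetical types, not the current wire format): the broker
only stamps the wrapper's base offset and leaves the compressed inner
payload untouched, while consumers add the base offset back on the way out:

    // Hypothetical sketch of relative offsets inside a compressed message set.
    // Inner messages store offsets relative to the wrapper's base offset.
    final case class InnerMessage(relativeOffset: Int, payload: Array[Byte])
    final case class CompressedSet(baseOffset: Long, inner: Seq[InnerMessage])

    object CompressedSet {
      // The broker only needs to stamp the base offset on the wrapper;
      // the compressed inner payload is left untouched (no recompression).
      def assignBaseOffset(set: CompressedSet, nextOffset: Long): CompressedSet =
        set.copy(baseOffset = nextOffset)

      // Consumers reconstruct absolute offsets on the way out.
      def absoluteOffsets(set: CompressedSet): Seq[Long] =
        set.inner.map(m => set.baseOffset + m.relativeOffset)
    }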
-Jay
On Thu, Nov 6, 2014 at 2:36 PM, Allen Wang <aw...@netflix.com.invalid>
wrote:
Sure. Here is the link to a screenshot of JMC with the JFR file loaded:
http://picpaste.com/fligh-recorder-crc.png
On Thu, Nov 6, 2014 at 2:12 PM, Neha Narkhede <neha.narkh...@gmail.com>
wrote:
Allen,
Apache mailing lists don't allow attachments. Could you please link to a
pastebin or something?
Thanks,
Neha
On Thu, Nov 6, 2014 at 12:02 PM, Allen Wang <aw...@netflix.com.invalid>
wrote:
After digging more into the stack trace obtained from the flight
recorder (which is attached), it seems that Kafka (0.8.1.1) can
optimize its usage of Crc32. The stack trace shows that Crc32 is
invoked twice from Log.append(). The first invocation is from line 231:

val appendInfo = analyzeAndValidateMessageSet(messages)

The second is from line 252 in the same function:

validMessages = validMessages.assignOffsets(offset, appendInfo.codec)

If one of the Crc32 invocations can be eliminated, we are looking at
saving at least 7% of CPU usage.
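As a rough back-of-the-envelope illustration (assuming the cost is
dominated by the raw CRC32 computation; the numbers will obviously be
machine-specific), running the checksum twice per append versus once
can be compared with a toy sketch like this:

    import java.util.zip.CRC32

    // Quick sketch to illustrate why eliminating one of the two Crc32
    // invocations roughly halves the checksum cost per append.
    object CrcCostSketch {
      private def crc(bytes: Array[Byte]): Long = {
        val c = new CRC32()
        c.update(bytes, 0, bytes.length)
        c.getValue
      }

      def main(args: Array[String]): Unit = {
        val payload = Array.fill[Byte](1 << 20)(42.toByte) // 1 MB dummy message batch
        val iterations = 1000

        // Time `passes` CRC computations per simulated append.
        def time(passes: Int): Double = {
          val start = System.nanoTime()
          var i = 0
          while (i < iterations) {
            var p = 0
            while (p < passes) { crc(payload); p += 1 }
            i += 1
          }
          (System.nanoTime() - start) / 1e6
        }

        println(s"one CRC pass per append : ${time(1)} ms total")
        println(s"two CRC passes per append: ${time(2)} ms total")
      }
    }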
Thanks,
Allen
On Wed, Nov 5, 2014 at 6:32 PM, Allen Wang <aw...@netflix.com> wrote:
Hi,
Using flight recorder, we have observed high CPU usage of CRC32
(kafka.utils.Crc32.update()) on the Kafka broker. It uses as much as
25% of CPU on an instance. Tracking down the stack trace, this method
is invoked by ReplicaFetcherThread.
Is there any tuning we can do to reduce this?
Also on the topic of CPU utilization, we observed that overall CPU
utilization is proportional to the AllTopicsBytesInPerSec metric. Does
this metric include incoming replication traffic?
Thanks,
Allen