Fabian Lange created KAFKA-9387:
-----------------------------------

             Summary: LZ4 Compression creates significant unnecessary CPU usage
                 Key: KAFKA-9387
                 URL: https://issues.apache.org/jira/browse/KAFKA-9387
             Project: Kafka
          Issue Type: Bug
          Components: clients
    Affects Versions: 2.4.0
            Reporter: Fabian Lange
         Attachments: Screenshot 2020-01-08 at 16.52.38.png

KafkaLZ4BlockOutputStream and KafkaLZ4BlockInputStream perform checksumming on 3 bytes in the header. This is potentially quite unnecessary; rather than removing it, this ticket proposes a change that improves its performance 10x.

{{kafka-downstream-0 id=152 state=RUNNABLE
    at net.jpountz.xxhash.XXHashJNI.XXH32(Native Method)
    at net.jpountz.xxhash.XXHash32JNI.hash(XXHash32JNI.java:30)
    at org.apache.kafka.common.record.KafkaLZ4BlockOutputStream.writeHeader(KafkaLZ4BlockOutputStream.java:156)
    at org.apache.kafka.common.record.KafkaLZ4BlockOutputStream.<init>(KafkaLZ4BlockOutputStream.java:85)
    at org.apache.kafka.common.record.KafkaLZ4BlockOutputStream.<init>(KafkaLZ4BlockOutputStream.java:125)
    at org.apache.kafka.common.record.CompressionType$4.wrapForOutput(CompressionType.java:101)
    at org.apache.kafka.common.record.MemoryRecordsBuilder.<init>(MemoryRecordsBuilder.java:130)
    at org.apache.kafka.common.record.MemoryRecordsBuilder.<init>(MemoryRecordsBuilder.java:166)
    at org.apache.kafka.common.record.MemoryRecords.builder(MemoryRecords.java:534)
    at org.apache.kafka.common.record.MemoryRecords.builder(MemoryRecords.java:516)
    at org.apache.kafka.common.record.MemoryRecords.builder(MemoryRecords.java:464)
    at org.apache.kafka.clients.producer.internals.RecordAccumulator.recordsBuilder(RecordAccumulator.java:245)
    at org.apache.kafka.clients.producer.internals.RecordAccumulator.append(RecordAccumulator.java:222)
    at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:917)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:856)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:743)}}

By default Kafka doesn't checksum blocks (blockChecksum=false), but it does checksum the header. The header, however, is static, so it's checksumming the same 6 or 2 bytes over and over again. Currently it uses {{XXHashFactory.fastestInstance().hash32()}}, which will be the JNI implementation.
For 2 bytes, however, the JNI implementation is 10x slower than the pure Java one, so we should replace it with {{fastestJavaInstance}}.
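
To illustrate why crossing into native code is likely to dominate the cost here: for an input this small, XXH32 reduces to a handful of integer operations. The following is a minimal pure-Java sketch of XXH32 restricted to inputs shorter than 16 bytes. It is not the Kafka or lz4-java code; the class name, method names, and the two descriptor byte values are hypothetical and chosen only for illustration. The header check byte is taken as the second byte of the hash, as in the LZ4 frame format.

```java
public class HeaderChecksumSketch {
    // XXH32 constants from the xxHash specification.
    private static final int PRIME1 = 0x9E3779B1;
    private static final int PRIME2 = 0x85EBCA77;
    private static final int PRIME3 = 0xC2B2AE3D;
    private static final int PRIME4 = 0x27D4EB2F;
    private static final int PRIME5 = 0x165667B1;

    /** XXH32 for short inputs only (len < 16); longer inputs need the striped main loop. */
    public static int xxh32Small(byte[] buf, int off, int len, int seed) {
        if (len < 0 || len >= 16) {
            throw new IllegalArgumentException("sketch only covers 0 <= len < 16");
        }
        int h = seed + PRIME5 + len;
        int i = off;
        int end = off + len;
        // Consume remaining 4-byte little-endian words.
        while (end - i >= 4) {
            int word = (buf[i] & 0xFF)
                    | ((buf[i + 1] & 0xFF) << 8)
                    | ((buf[i + 2] & 0xFF) << 16)
                    | ((buf[i + 3] & 0xFF) << 24);
            h += word * PRIME3;
            h = Integer.rotateLeft(h, 17) * PRIME4;
            i += 4;
        }
        // Consume trailing bytes one at a time.
        while (i < end) {
            h += (buf[i] & 0xFF) * PRIME5;
            h = Integer.rotateLeft(h, 11) * PRIME1;
            i++;
        }
        // Final avalanche.
        h ^= h >>> 15;
        h *= PRIME2;
        h ^= h >>> 13;
        h *= PRIME3;
        h ^= h >>> 16;
        return h;
    }

    /** Header check byte: second byte of the XXH32 of the descriptor, seed 0. */
    public static byte headerCheckByte(byte[] descriptor) {
        return (byte) ((xxh32Small(descriptor, 0, descriptor.length, 0) >> 8) & 0xFF);
    }

    public static void main(String[] args) {
        // Hypothetical 2-byte frame descriptor (FLG, BD); values are illustrative only.
        byte[] descriptor = {(byte) 0x60, (byte) 0x40};
        System.out.printf("header check byte: 0x%02X%n", headerCheckByte(descriptor) & 0xFF);
    }
}
```

For two bytes, the whole computation is two multiply-rotate rounds plus the final mixing step, which is consistent with the pure Java hash outperforming a JNI call on this path.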