[ https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dana Powers updated KAFKA-3160:
-------------------------------
    Description:
KAFKA-1493 partially implements the LZ4 framing specification, but it incorrectly calculates the header checksum. This causes KafkaLZ4BlockInputStream to raise an error [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly framed LZ4 data, which means clients decoding LZ4 messages from kafka will always receive incorrectly framed data.

Specifically, the current implementation includes the 4-byte MagicNumber in the checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf lz4 framing find that brokers reject messages as having a corrupt checksum. So currently non-java clients must 'fixup' lz4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same problem.

  was:
KAFKA-1493 partially implements the LZ4 framing specification, but it incorrectly calculates the header checksum. This causes KafkaLZ4BlockInputStream to raise an error [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed LZ4 data. It also causes the kafka broker to always return incorrectly framed LZ4 data to clients.

Specifically, the current implementation includes the 4-byte MagicNumber in the checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf lz4 framing find that brokers reject messages as having a corrupt checksum. So currently non-java clients must 'fixup' lz4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same problem.

> Kafka LZ4 framing code miscalculates header checksum
> ----------------------------------------------------
>
>                 Key: KAFKA-3160
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3160
>             Project: Kafka
>          Issue Type: Bug
>          Components: compression
>    Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1
>            Reporter: Dana Powers
>            Assignee: Magnus Edenhill
>              Labels: compatibility, compression, lz4
>
> KAFKA-1493 partially implements the LZ4 framing specification, but it
> incorrectly calculates the header checksum. This causes
> KafkaLZ4BlockInputStream to raise an error
> [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed
> LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly
> framed LZ4 data, which means clients decoding LZ4 messages from kafka will
> always receive incorrectly framed data.
> Specifically, the current implementation includes the 4-byte MagicNumber in
> the checksum, which is incorrect.
> http://cyan4973.github.io/lz4/lz4_Frame_format.html
> Third-party clients that attempt to use off-the-shelf lz4 framing find that
> brokers reject messages as having a corrupt checksum. So currently non-java
> clients must 'fixup' lz4 packets to deal with the broken checksum.
> Magnus first identified this issue in librdkafka; kafka-python has the same
> problem.
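
For illustration, the checksum difference described above can be sketched in Python. This is a minimal sketch and not Kafka's code: the names xxh32, header_checksum_spec and header_checksum_kafka_legacy are hypothetical, the pure-Python XXH32 is written out only to keep the example self-contained, and the FLG/BD bytes are example values. Per the LZ4 frame format, the header checksum is the second byte of XXH32 (seed 0) over the frame descriptor alone; the behavior reported here amounts to hashing the 4-byte magic number as well.

```python
import struct

# XXH32 constants from the xxHash specification.
PRIME1, PRIME2, PRIME3 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D
PRIME4, PRIME5 = 0x27D4EB2F, 0x165667B1
MASK = 0xFFFFFFFF

# LZ4 frame magic number 0x184D2204, stored little-endian.
LZ4_MAGIC = struct.pack("<I", 0x184D2204)


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def _round(acc, lane):
    """One XXH32 accumulator round over a 32-bit input lane."""
    acc = (acc + lane * PRIME2) & MASK
    return (_rotl(acc, 13) * PRIME1) & MASK


def xxh32(data, seed=0):
    """Pure-Python XXH32 (illustrative, not optimized)."""
    n, i = len(data), 0
    if n >= 16:
        v1 = (seed + PRIME1 + PRIME2) & MASK
        v2 = (seed + PRIME2) & MASK
        v3 = seed & MASK
        v4 = (seed - PRIME1) & MASK
        while i + 16 <= n:
            a, b, c, d = struct.unpack_from("<4I", data, i)
            v1, v2 = _round(v1, a), _round(v2, b)
            v3, v4 = _round(v3, c), _round(v4, d)
            i += 16
        h = (_rotl(v1, 1) + _rotl(v2, 7) + _rotl(v3, 12) + _rotl(v4, 18)) & MASK
    else:
        h = (seed + PRIME5) & MASK
    h = (h + n) & MASK
    while i + 4 <= n:  # remaining 4-byte lanes
        (lane,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + lane * PRIME3) & MASK, 17) * PRIME4) & MASK
        i += 4
    while i < n:  # remaining single bytes
        h = (_rotl((h + data[i] * PRIME5) & MASK, 11) * PRIME1) & MASK
        i += 1
    # Final avalanche.
    h ^= h >> 15
    h = (h * PRIME2) & MASK
    h ^= h >> 13
    h = (h * PRIME3) & MASK
    h ^= h >> 16
    return h


def header_checksum_spec(descriptor):
    """HC per the LZ4 frame format: hash the frame descriptor only."""
    return (xxh32(descriptor, 0) >> 8) & 0xFF


def header_checksum_kafka_legacy(descriptor):
    """HC as this issue describes the affected code: magic number included."""
    return (xxh32(LZ4_MAGIC + descriptor, 0) >> 8) & 0xFF


if __name__ == "__main__":
    # Example descriptor (assumed values): FLG=0x60 (version 01, independent
    # blocks), BD=0x40 (64 KB max block size), no optional fields.
    descriptor = bytes([0x60, 0x40])
    print("spec HC:   0x%02x" % header_checksum_spec(descriptor))
    print("legacy HC: 0x%02x" % header_checksum_kafka_legacy(descriptor))
```

A client talking to an affected broker would write the legacy byte in place of the spec byte when producing and expect it when consuming, which is the 'fixup' mentioned in the description.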