[ https://issues.apache.org/jira/browse/KAFKA-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995673#comment-15995673 ]

ASF GitHub Bot commented on KAFKA-5150:
---------------------------------------

GitHub user xvrl opened a pull request:

    https://github.com/apache/kafka/pull/2967

    KAFKA-5150 reduce lz4 decompression overhead

    - reuse decompression buffers, keeping one per thread
    - switch lz4 input stream to operate directly on ByteBuffers
    - more tests with both compressible / incompressible data, multiple
      blocks, and various other combinations to increase code coverage
    - fixes bug that would cause EOFException instead of invalid block size
      for invalid incompressible blocks
    
    Overall this improves LZ4 decompression performance by up to 23x for small batches.
    Most improvements are seen for batches of size 1 with messages on the order of ~100B.
    At least 10x improvements for batch sizes of < 10 messages, with messages of < 10kB.
    
    See benchmark code and results here
    https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd
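
    As a rough illustration of the buffer reuse described above, the sketch below keeps
    one decompression buffer per thread instead of allocating a fresh 64kB block buffer
    for every input stream. The class and method names are hypothetical and are not the
    code in this pull request.

{code}
// Hypothetical sketch of per-thread decompression buffer reuse (not the actual patch).
import java.nio.ByteBuffer;

public class CachedDecompressionBuffer {
    // One cached buffer per thread avoids reallocating the LZ4 block buffer
    // every time an input stream is created for a small batch.
    private static final ThreadLocal<ByteBuffer> CACHED =
            ThreadLocal.withInitial(() -> ByteBuffer.allocate(0));

    public static ByteBuffer get(int minCapacity) {
        ByteBuffer buffer = CACHED.get();
        if (buffer.capacity() < minCapacity) {
            // Grow lazily; the larger buffer is retained for later streams on this thread.
            buffer = ByteBuffer.allocate(minCapacity);
            CACHED.set(buffer);
        }
        buffer.clear();
        return buffer;
    }
}
{code}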

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xvrl/kafka kafka-5150

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2967.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2967
    
----
commit 0efc6e7f15b6994a6665da5975e69c77426cf904
Author: Xavier Léauté <xav...@confluent.io>
Date:   2017-05-03T20:40:45Z

    KAFKA-5150 reduce lz4 decompression overhead
    
    - reuse decompression buffers, keeping one per thread
    - switch lz4 input stream to operate directly on ByteBuffers
    - more tests with both compressible / incompressible data, multiple
      blocks, and various other combinations to increase code coverage
    - fixes bug that would cause EOFException instead of invalid block size
      for invalid incompressible blocks

----


> LZ4 decompression is 4-5x slower than Snappy on small batches / messages
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-5150
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5150
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.8.2.2, 0.9.0.1, 0.11.0.0, 0.10.2.1
>            Reporter: Xavier Léauté
>            Assignee: Xavier Léauté
>
> I benchmarked RecordsIterator.DeepRecordsIterator instantiation on small batch
> sizes with small messages after observing some performance bottlenecks in the
> consumer.
> For batch sizes of 1 with messages of 100 bytes, LZ4 heavily underperforms
> compared to Snappy (see benchmark below). Most of our time is currently spent
> allocating memory blocks in KafkaLZ4BlockInputStream, because we default to the
> larger 64kB block size. Some quick testing shows we could improve performance by
> almost an order of magnitude for small batches and messages if we reused buffers
> between instantiations of the input stream.
> [Benchmark Code|https://github.com/xvrl/kafka/blob/small-batch-lz4-benchmark/clients/src/test/java/org/apache/kafka/common/record/DeepRecordsIteratorBenchmark.java#L86]
> {code}
> Benchmark                                          (compressionType)  (messageSize)   Mode  Cnt       Score       Error  Units
> DeepRecordsIteratorBenchmark.measureSingleMessage                LZ4            100  thrpt   20   84802.279 ±  1983.847  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage             SNAPPY            100  thrpt   20  407585.747 ±  9877.073  ops/s
> DeepRecordsIteratorBenchmark.measureSingleMessage               NONE            100  thrpt   20  579141.634 ± 18482.093  ops/s
> {code}
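
For context, throughput numbers like those above come from a JMH benchmark. The sketch
below shows the general shape of such a measurement; it is not the linked
DeepRecordsIteratorBenchmark, and the payload and setup are placeholders.

{code}
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Minimal JMH throughput sketch (not the actual DeepRecordsIteratorBenchmark).
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Thread)
public class SmallBatchDecompressionBenchmark {

    private byte[] compressedBatch;

    @Setup
    public void setup() {
        // Placeholder: the real benchmark builds a single ~100-byte message
        // compressed with the configured codec (LZ4, Snappy, or none).
        compressedBatch = new byte[128];
    }

    @Benchmark
    public int measureSingleMessage() {
        // Placeholder work: the real benchmark constructs the deep records
        // iterator and iterates the decompressed messages.
        return compressedBatch.length;
    }
}
{code}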


