[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249601#comment-15249601
 ] 

Ismael Juma edited comment on KAFKA-3565 at 4/20/16 10:19 AM:
--------------------------------------------------------------

I ran the tests a couple of times with various settings to check whether my
previous results are reproducible, and I included three linger_ms values (0ms,
10ms, 100ms).

I paste the results for one configuration below and will follow up with the full results.

Test name and base parameters: test_producer_throughput with 
replication_factor=3, message_size=100, num_producers=1, acks=1
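
For reference, these throughput numbers come from the producer performance tool driven by the ducktape benchmark. Below is a minimal stand-alone sketch of how one could approximate a single cell of this matrix by hand, assuming a local Kafka installation and a pre-created topic with replication factor 3; KAFKA_HOME, BOOTSTRAP, the topic name and the record count are placeholders, not the values used by the quoted runs.
{code}
# Hypothetical stand-alone approximation of the benchmark matrix above.
import itertools
import subprocess

KAFKA_HOME = "/opt/kafka"       # assumption: local Kafka installation
BOOTSTRAP = "worker1:9092"      # assumption: bootstrap server of the test cluster
TOPIC = "topic-replication-factor-three"  # assumption: topic created with replication_factor=3

for compression, linger_ms in itertools.product(["none", "snappy", "gzip"], [0, 10, 100]):
    cmd = [
        f"{KAFKA_HOME}/bin/kafka-producer-perf-test.sh",
        "--topic", TOPIC,
        "--num-records", "50000000",  # placeholder record count
        "--record-size", "100",       # message_size=100, as above
        "--throughput", "-1",         # unthrottled
        "--producer-props",
        f"bootstrap.servers={BOOTSTRAP}",
        "acks=1",
        f"compression.type={compression}",
        f"linger.ms={linger_ms}",
    ]
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
{code}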

Additional parameters: linger_ms=0

Run 1
{code}
no compression 0.9.0.1: {"records_per_sec": 315361.137218, "mb_per_sec": 30.08}
no compression trunk: {"records_per_sec": 297798.313734, "mb_per_sec": 28.4}
snappy 0.9.0.1: {"records_per_sec": 553246.908491, "mb_per_sec": 52.76}
snappy trunk: {"records_per_sec": 577280.430108, "mb_per_sec": 55.05}
gzip 0.9.0.1: {"records_per_sec": 77354.44643, "mb_per_sec": 7.38}
gzip trunk: {"records_per_sec": 62830.118903, "mb_per_sec": 5.99}
{code}

Run 2
{code}
no compression 0.9.0.1: {"records_per_sec": 315955.037665, "mb_per_sec": 30.13}
no compression trunk: {"records_per_sec": 300464.965301, "mb_per_sec": 28.65}
snappy 0.9.0.1: {"records_per_sec": 613146.185473, "mb_per_sec": 58.47}
snappy trunk: {"records_per_sec": 566080.556727, "mb_per_sec": 53.99}
gzip 0.9.0.1: {"records_per_sec": 79531.701825, "mb_per_sec": 7.58}
gzip trunk: {"records_per_sec": 64608.501011, "mb_per_sec": 6.16}
{code}

Additional parameters: linger_ms=10

Run 1
{code}
no compression 0.9.0.1: {"records_per_sec": 321710.690316, "mb_per_sec": 30.68}
no compression trunk: {"records_per_sec": 295894.400353, "mb_per_sec": 28.22}
snappy 0.9.0.1: {"records_per_sec": 626892.573564, "mb_per_sec": 59.79}
snappy trunk: {"records_per_sec": 583555.217391, "mb_per_sec": 55.65}
gzip 0.9.0.1: {"records_per_sec": 101564.66137, "mb_per_sec": 9.69}
gzip trunk: {"records_per_sec": 93290.957114, "mb_per_sec": 8.9}
{code}

Run 2
{code}
no compression 0.9.0.1: {"records_per_sec": 322871.541977, "mb_per_sec": 30.79}
no compression trunk: {"records_per_sec": 297139.03033, "mb_per_sec": 28.34}
snappy 0.9.0.1: {"records_per_sec": 655040.019522, "mb_per_sec": 62.47}
snappy trunk: {"records_per_sec": 584571.864111, "mb_per_sec": 55.75}
gzip 0.9.0.1: {"records_per_sec": 106699.817156, "mb_per_sec": 10.18}
gzip trunk: {"records_per_sec": 93577.145646, "mb_per_sec": 8.92}
{code}

Additional parameters: linger_ms=100

Run 1
{code}
no compression 0.9.0.1: {"records_per_sec": 318958.412548, "mb_per_sec": 30.42}
no compression trunk: {"records_per_sec": 289574.325782, "mb_per_sec": 27.62}
snappy 0.9.0.1: {"records_per_sec": 654401.267674, "mb_per_sec": 62.41}
snappy trunk: {"records_per_sec": 533244.735797, "mb_per_sec": 50.85}
gzip 0.9.0.1: {"records_per_sec": 108845.754602, "mb_per_sec": 10.38}
gzip trunk: {"records_per_sec": 95630.708942, "mb_per_sec": 9.12}
{code}

Run 2
{code}
no compression 0.9.0.1: {"records_per_sec": 322561.163182, "mb_per_sec": 30.76}
no compression trunk: {"records_per_sec": 291524.10947, "mb_per_sec": 27.8}
snappy 0.9.0.1: {"records_per_sec": 626599.906629, "mb_per_sec": 59.76}
snappy trunk: {"records_per_sec": 568719.067797, "mb_per_sec": 54.24}
gzip 0.9.0.1: {"records_per_sec": 108660.70272, "mb_per_sec": 10.36}
gzip trunk: {"records_per_sec": 94786.511299, "mb_per_sec": 9.04}
{code}
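
To make the comparison easier to eyeball, here is a small sketch that averages the records_per_sec figures across the two linger_ms=0 runs above and prints the trunk vs 0.9.0.1 delta per codec (the figures are copied verbatim from Run 1 and Run 2; the same can be done for the other linger_ms settings).
{code}
# Averages the linger_ms=0 records_per_sec figures pasted above and prints
# the relative change of trunk vs 0.9.0.1 for each compression codec.
linger0_runs = {
    ("no compression", "0.9.0.1"): [315361.137218, 315955.037665],
    ("no compression", "trunk"):   [297798.313734, 300464.965301],
    ("snappy", "0.9.0.1"):         [553246.908491, 613146.185473],
    ("snappy", "trunk"):           [577280.430108, 566080.556727],
    ("gzip", "0.9.0.1"):           [77354.44643, 79531.701825],
    ("gzip", "trunk"):             [62830.118903, 64608.501011],
}

avg = {key: sum(vals) / len(vals) for key, vals in linger0_runs.items()}
for codec in ["no compression", "snappy", "gzip"]:
    old, new = avg[(codec, "0.9.0.1")], avg[(codec, "trunk")]
    print("{}: trunk is {:+.1f}% vs 0.9.0.1".format(codec, 100.0 * (new - old) / old))
{code}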


> Producer's throughput lower with compressed data after KIP-31/32
> ----------------------------------------------------------------
>
>                 Key: KAFKA-3565
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3565
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Ismael Juma
>            Priority: Critical
>             Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:    
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status:     PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:    
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status:     PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:    
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status:     PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:    
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status:     PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed


