[ https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254550#comment-15254550 ]

Jiangjie Qin commented on KAFKA-3565:
-------------------------------------

[~jkreps] [~ijuma]

I ran the tests a few more times and updated the results in the previous Google sheet. The 8th run used a fixed timestamp of 0L for trunk.
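For reference, the exact benchmark change is not shown in this comment; as a hedged sketch, on trunk the ProducerRecord constructor added for KIP-32 accepts an explicit create timestamp, so pinning it to 0L looks roughly like the following (the topic name is a placeholder):

{code}
// Sketch only: the benchmark's actual modification is not shown in this comment.
// On trunk (post KIP-32) ProducerRecord takes an explicit create timestamp, so a
// constant 0L can be supplied instead of the wall-clock time.
import org.apache.kafka.clients.producer.ProducerRecord;

public class FixedTimestampSketch {
    public static ProducerRecord<byte[], byte[]> recordWithFixedTimestamp(byte[] value) {
        // topic, partition, timestamp, key, value; the timestamp is pinned to 0L
        return new ProducerRecord<byte[], byte[]>("test-topic", null, 0L, null, value);
    }
}
{code}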

A brief summary of the cases where 0.9 wins in the eight runs we have is as follows.
("2nd" means 0.9 wins; each line lists the configuration, followed by the trunk throughput and the 0.9 throughput in MB/sec, and then the throughput difference as a percentage.)
{noformat}
Run 1:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (2.04   <  2.38,   16%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=1000,  compression.type=gzip    (3.18   <  3.49,   9%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (2.43   <  2.46,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=1000,  compression.type=gzip    (3.16   <  3.49,   10%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=1000,  compression.type=gzip    (3.09   <  3.49,   12%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (2.52   <  2.55,   1%)

Run 2:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.82   <  2.00,   9%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=snappy  (19.32  <  20.37,  5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.80   <  1.86,   3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.87   <  2.01,   7%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=10,   messageSize=100,   compression.type=gzip    (2.01   <  2.14,   6%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (2.05   <  2.13,   3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=gzip    (2.25   <  2.28,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=snappy  (22.14  <  23.38,  5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=gzip    (2.23   <  2.29,   2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=snappy  (22.68  <  23.55,  3%)

Run 3:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.88   <  2.04,   8%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.86   <  2.02,   8%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=snappy  (20.23  <  20.44,  1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.97   <  2.01,   2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=snappy  (20.34  <  21.15,  3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=0,    messageSize=100,   compression.type=gzip    (2.07   <  2.13,   2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=10,   messageSize=100,   compression.type=gzip    (2.06   <  2.09,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (2.06   <  2.14,   3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=gzip    (2.20   <  2.30,   4%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=snappy  (21.04  <  23.68,  12%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=snappy  (22.91  <  23.39,  2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=1000,  compression.type=snappy  (38.99  <  39.80,  2%)

Run 4:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.82   <  2.03,   11%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=snappy  (16.74  <  20.99,  25%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.74   <  2.05,   17%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=snappy  (17.12  <  21.15,  23%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.72   <  2.04,   18%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=snappy  (18.42  <  20.64,  12%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=0,    messageSize=100,   compression.type=gzip    (1.95   <  2.08,   6%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=10,   messageSize=100,   compression.type=gzip    (1.90   <  2.13,   12%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (1.90   <  2.14,   12%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=gzip    (2.27   <  2.30,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=gzip    (2.26   <  2.29,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=100,  messageSize=100,   compression.type=gzip    (2.17   <  2.27,   4%)

Run 5:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.96   <  2.06,   5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=snappy  (21.01  <  21.33,  1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.94   <  2.03,   4%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.89   <  1.99,   5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (2.07   <  2.12,   2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=gzip    (2.21   <  2.26,   2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=gzip    (2.21   <  2.29,   3%)

Run 6:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (2.00   <  2.03,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=snappy  (20.99  <  21.05,  0%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.86   <  2.01,   8%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=10,   messageSize=100,   compression.type=gzip    (2.06   <  2.14,   3%)

Run 7:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.81   <  2.08,   14%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=snappy  (20.46  <  21.15,  3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.91   <  2.04,   6%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=snappy  (20.90  <  21.07,  0%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.92   <  2.10,   9%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=0,    messageSize=100,   compression.type=gzip    (2.28   <  2.29,   0%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=100,  messageSize=100,   compression.type=gzip    (2.23   <  2.29,   2%)

Run 8:
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=0,    messageSize=100,   compression.type=gzip    (1.95   <  2.04,   4%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=gzip    (1.96   <  1.99,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=10,   messageSize=100,   compression.type=snappy  (19.19  <  21.03,  9%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=gzip    (1.91   <  2.02,   5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=500,    linger.ms=100,  messageSize=100,   compression.type=snappy  (19.54  <  20.65,  5%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=0,    messageSize=100,   compression.type=gzip    (2.01   <  2.14,   6%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=10,   messageSize=100,   compression.type=gzip    (2.09   <  2.12,   1%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=5000,   linger.ms=100,  messageSize=100,   compression.type=gzip    (2.03   <  2.12,   4%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=gzip    (2.20   <  2.28,   3%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=10,   messageSize=100,   compression.type=snappy  (23.38  <  24.08,  2%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=100,  messageSize=100,   compression.type=gzip    (2.19   <  2.29,   4%)
2nd:  max.in.flight.requests.per.connection=5,  valueBound=50000,  linger.ms=100,  messageSize=100,   compression.type=snappy  (21.38  <  22.53,  5%)
{noformat}

In almost all of the cases where 0.9 wins, max.in.flight.requests.per.connection=5 and messageSize=100. In these cases the throughput depends more on the user thread, because the sender thread is pipelining.
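To make the pipelining point concrete, below is a minimal sketch of the kind of send loop these runs exercise. This is not the actual benchmark code; the broker address, topic name, and iteration count are placeholders. With max.in.flight.requests.per.connection=5 the background sender thread keeps several requests outstanding, so the per-record work done on the caller's thread inside send() (building the record and appending it into a compressed batch) is what limits throughput.

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SmallMessageSendLoopSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // placeholder broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("compression.type", "gzip");                    // gzip or snappy in the runs above
        props.put("linger.ms", "0");                              // 0, 10 or 100 in the runs above
        props.put("max.in.flight.requests.per.connection", "5");  // the setting common to the cases where 0.9 wins

        byte[] value = new byte[100];                              // messageSize=100
        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (long i = 0; i < 1_000_000; i++) {
                // This work happens on the user thread: creating the record and appending it
                // into a (compressed) batch inside send(). The sender thread only drains
                // finished batches and pipelines the requests to the broker.
                producer.send(new ProducerRecord<byte[], byte[]>("test-topic", value));
            }
        }
    }
}
{code}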

It seems we can say that if the producers are sending small messages and the user thread is the bottleneck, the trunk producer will have lower throughput than 0.9. The magnitude of the throughput reduction fluctuates.

I am not sure whether we should change the default settings in the producer. We may be able to, but it seems that for best performance users need to tune the producer anyway.
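To be concrete about what "tune the producer" means here, the sketch below lists the knobs involved. The values are purely illustrative and are not recommendations derived from these runs; batch.size in particular was not part of the sweep.

{code}
import java.util.Properties;

public class SmallMessageTuningSketch {
    // Illustrative values only: the right settings are workload dependent,
    // which is the point; users tune these rather than rely on the defaults.
    public static Properties producerOverrides() {
        Properties props = new Properties();
        props.put("compression.type", "snappy");                  // gzip and snappy were compared above
        props.put("linger.ms", "100");                            // 0, 10 and 100 were swept above
        props.put("batch.size", "65536");                         // an additional knob, not part of the sweep
        props.put("max.in.flight.requests.per.connection", "5");  // pipelining depth
        return props;
    }
}
{code}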

> Producer's throughput lower with compressed data after KIP-31/32
> ----------------------------------------------------------------
>
>                 Key: KAFKA-3565
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3565
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Ismael Juma
>            Priority: Critical
>             Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:    
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status:     PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:    
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status:     PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:    
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status:     PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:    
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status:     PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed


