Hello,

First, here's what I have:
- 5-node Kafka cluster (m4.2xlarge: 8 vCPU, 32G RAM)
- running version: 0.11.0
- each broker has 4 dedicated EBS storage volumes
- a single topic: 40 partitions, repl-factor=3

I'm using kafka-producer-perf-test to benchmark it.

---------------------
Test 1:
- record-size: 1000 bytes
- producer setting: batch.size=40000, linger.ms=40

Throughput: 53K records/sec (51MB/sec)

----------------------
Test 2:
- same as Test 1, but with compression.type=lz4 added

Throughput: 316K records/sec (300MB/sec)

----------------------

So far so good! And pretty impressive!

But the problem starts when I increase the record-size to 48K bytes. Of course, I realize the throughput in records/sec will decrease as the record size gets bigger. But what I'm seeing appears to be beyond the normal expected behavior. I appear to be hitting some resource or blocking issue, but I can't figure out what. Here's the benchmark result.

Test 3:
- record-size: 48000 bytes
- producer setting: batch.size=40000, linger.ms=40, compression.type=lz4

Throughput: 1160 records/sec (53MB/sec)

I've systematically tried many different combinations of the above parameters (i.e. increasing batch.size gradually from 40000 to 4000000, linger.ms from 40 to 4000, etc., and their combinations). For every test, the throughput remained more or less the same, between 1160 and 1350 records/sec.

Now, if I run 2 producers in parallel, each producer still manages a throughput of 1160 records/sec. Even with 3 producers in parallel, same thing. So obviously, it's not a Kafka cluster issue. There appears to be something blocking on the producer side, and it doesn't appear to be the CPU or the memory. Could the kafka-producer-perf-test script itself be the problem?

If anyone has a suggestion to help me investigate this further, I would very much appreciate it.

Thank you!

Btw, was the "--threads" option removed from the tool? Why?

regards,
Sunny
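
For anyone cross-checking the figures above, the MB/sec values can be roughly reproduced from records/sec and record size. This is only a minimal sketch: it assumes the tool counts 1 MB as 1024 * 1024 bytes, and the helper name mb_per_sec is hypothetical, not part of any Kafka tooling. The inputs are the rounded numbers quoted in the three tests, so small differences from the reported values are expected.

```python
# Sanity check of the reported throughput figures.
# Assumption: 1 MB = 1024 * 1024 bytes (not confirmed in the post).
BYTES_PER_MB = 1024 * 1024


def mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Convert a record rate and a fixed record size into MB/sec."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


# (label, records/sec, record size in bytes), taken from Tests 1-3 above.
tests = [
    ("Test 1", 53_000, 1000),
    ("Test 2", 316_000, 1000),
    ("Test 3", 1160, 48_000),
]

for label, rate, size in tests:
    print(f"{label}: {mb_per_sec(rate, size):.1f} MB/sec")
```

Under that assumption the three tests come out at about 50.5, 301.4 and 53.1 MB/sec, which is in line with the 51, 300 and 53 MB/sec reported.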