Hi all,

Since my company is considering adopting Kafka as our message bus, I
have been assigned the task of running some benchmark tests.  I
basically followed what Jay wrote in this article
<http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines>

The benchmarks were set up using 4 nodes with one node acting as both
producer and consumer while the rest function as Kafka brokers.

This is the baseline (50M messages of 100 bytes each, 64 MB buffer
memory, and a batch size of 8192):

bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
-1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
batch.size=8192

which on our setup yielded the result of

50000000 records sent, 265939.057406 records/sec (25.36 MB/sec)

However, after doubling the buffer.memory to 128M

bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
-1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
batch.size=8192

the throughput dropped significantly:

50000000 records sent, 93652.601295 records/sec (8.93 MB/sec)
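(For reference, the MB/sec figures in the tool's output are just
records/sec times the 100-byte record size, converted with
1 MB = 1024*1024 bytes.  A quick sanity check:)

```python
# Sanity check: MB/sec reported by ProducerPerformance
# = records/sec * record size in bytes / 2**20
def mb_per_sec(records_per_sec, record_bytes=100):
    return records_per_sec * record_bytes / (1024 * 1024)

print(round(mb_per_sec(265939.057406), 2))  # baseline run -> 25.36
print(round(mb_per_sec(93652.601295), 2))   # 128M-buffer run -> 8.93
```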

Can anyone explain why the throughput degraded so much?

Likewise, when performing benchmarks using 3 partitions across 3
nodes, the maximum throughput shown is roughly 33.2 MB/sec, whereas a
single partition (on a single node) yields 100 MB/sec.

My guess is that on a 3-node setup I need to multiply the 33.2
MB/sec reading by 3, since that reading only represents the
bandwidth available to a single node.
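(That guess can be sanity-checked in a couple of lines, assuming the
per-node readings are independent and simply additive:)

```python
# If each of the 3 nodes independently sustains ~33.2 MB/sec,
# the aggregate cluster throughput would be roughly:
per_node = 33.2
aggregate = per_node * 3
print(round(aggregate, 1))  # 99.6, close to the 100 MB/sec single-partition figure
```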

Again, is anyone out there willing to shed some light on how to
interpret these numbers correctly?

Cheers,

Paul
