The second command you give doesn't actually seem to double the memory (maybe just a typo?). I can't explain why doubling buffer memory would decrease throughput. Buffer memory only matters when you run out of it: at that point the producer blocks, which lowers throughput. So more memory should only be able to help (or have no effect). I wonder if something else was different between the tests?
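
One quick way to check whether something else differed is to recompute the reported MB/sec from the reported records/sec. This is only a sketch: it assumes the tool's "MB" means 2**20 bytes (inferred from the figures, not confirmed), and the function name is made up for illustration.

```python
# Sanity check: do the reported MB/sec figures match records/sec x record size?
# Assumption: "MB" in the tool's output means 2**20 bytes.
MIB = 1024 * 1024


def mb_per_sec(records_per_sec, record_bytes):
    """Convert a record rate into MB/sec for a fixed record size."""
    return records_per_sec * record_bytes / MIB


# Figures copied from the two reported results.
baseline = mb_per_sec(265939.057406, 100)
second_at_100 = mb_per_sec(93652.601295, 100)
second_at_10000 = mb_per_sec(93652.601295, 10000)

print(f"baseline, 100-byte records:    {baseline:.2f} MB/sec")
print(f"second run, 100-byte records:  {second_at_100:.2f} MB/sec")
print(f"second run, 10000-byte records: {second_at_10000:.2f} MB/sec")
```

Under that assumption, the reported 8.93 MB/sec lines up with 100-byte records rather than the 10000-byte records in the pasted command, which may mean the command and the output come from different runs.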

For the second test, is it possible that you are on 1 gigabit Ethernet? 1 gigabit/sec ~= 100 MB/sec once you account for the protocol overhead (TCP and Kafka's protocol).

-Jay

On Mon, Aug 31, 2015 at 3:14 AM, explorer <jind...@gmail.com> wrote:

> Hi all,
>
> Since my company is considering adopting Kafka as our message bus, I
> have been assigned the task to perform some benchmark tests. I
> basically followed what Jay wrote on this article
> <http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines>
>
> The benchmarks were set up using 4 nodes with one node acting as both
> producer and consumer while the rest function as Kafka brokers.
>
> This is the baseline (50M messages (100 bytes each), 64MB buffer
> memory, and 8192 batch size)
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> batch.size=8192
>
> which on our setup yielded the result of
>
> 50000000 records sent, 265939.057406 records/sec (25.36 MB/sec)
>
> However, by doubling the buffer.memory to 128M
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> batch.size=8192
>
> The throughput dropped significantly.
>
> 50000000 records sent, 93652.601295 records/sec (8.93 MB/sec)
>
> Anyone able to interpret why the throughput degraded so much?
>
> Likewise, when performing benchmarks using 3 partitions across 3
> nodes, the maximum throughput shown is roughly 33.2MB/sec, whereas a
> single partition (on a single node) yields 100MB/sec.
>
> My guess is that on a 3 nodes setup, I need to multiply the 33.2
> MB/sec reading by 3 since the 33.2MB/sec reading only represents
> the bandwidth available to one single node.
>
> Again, anyone out there willing to shed some light on how to
> interpret the numbers correctly?
>
> Cheers,
>
> Paul
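
On the gigabit question, here is a rough back-of-the-envelope sketch of the TCP payload ceiling on a 1 gigabit link. The framing numbers are assumptions not stated in the thread (standard 1500-byte MTU, IPv4, no TCP options, "MB" meaning 2**20 bytes), and Kafka's own protocol overhead is not modeled, so the real ceiling would be somewhat lower.

```python
# Rough ceiling for TCP payload on 1 gigabit Ethernet.
# Assumptions (not from the thread): standard 1500-byte MTU, IPv4,
# no TCP options, and "MB" meaning 2**20 bytes.
LINK_BITS_PER_SEC = 1_000_000_000
MTU = 1500
IP_TCP_HEADERS = 20 + 20              # IPv4 header + TCP header
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12   # header, FCS, preamble, inter-frame gap

payload = MTU - IP_TCP_HEADERS        # TCP payload bytes per full frame
on_wire = MTU + ETHERNET_OVERHEAD     # bytes of link time per full frame
efficiency = payload / on_wire

goodput_mb = LINK_BITS_PER_SEC / 8 * efficiency / (1024 * 1024)

# Sum of the three per-partition readings quoted in the original message.
aggregate_3_nodes = 3 * 33.2

print(f"TCP payload ceiling:   {goodput_mb:.1f} MB/sec")
print(f"3 partitions, summed:  {aggregate_3_nodes:.1f} MB/sec")
```

If those assumptions hold, both the single-partition figure (100 MB/sec) and the summed three-partition figure (about 99.6 MB/sec) sit just under the link ceiling, which would be consistent with the producer's network interface being the limit in both tests.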