In general, increasing the message size should increase bytes/sec throughput, since much of the work is done on a per-message basis. I think the question remains why raising the buffer size with a fixed message size would drop the throughput. It sounds like a bug if you can reproduce it consistently. Want to file a JIRA and see if others can reproduce the same thing?
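A minimal reproduction, reusing the exact parameters from your baseline run, would just be two back-to-back runs that differ only in buffer.memory (64MB vs 128MB):

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864 batch.size=8192

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728 batch.size=8192

If the second run consistently reports much lower MB/sec than the first with nothing else changed, that's exactly the kind of result to attach to the JIRA.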
For the multi-server test I may have misread your email. When you say you see 33MB/sec across 3 servers, does that mean an aggregate of ~100MB/sec? I was assuming yes, and that what you were seeing was that you were maxing out the client's bandwidth, so as you added servers each server got a smaller chunk of the ~100MB/sec client bandwidth. Maybe that's not what you're saying, though.

-Jay

On Mon, Aug 31, 2015 at 9:49 AM, explorer <jind...@gmail.com> wrote:

> Hi Jay,
>
> Thanks for the response.
>
> The second command was indeed a typo. It should have been
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
> batch.size=8192
>
> and the throughput would drop to ~9MB/sec.
>
> But if I increase the message size, say to 10,000 bytes per message,
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
> batch.size=8192
>
> the throughput bounces back to ~33MB/sec.
>
> I am playing with these numbers just to find a pattern as to what kind
> of combination would serve us best as far as message size goes. It would
> help if we could safely say that higher buffer memory gives better
> performance, but only to a certain extent.
>
> In our test context, though, I see lower throughput with a higher memory
> buffer, but once I increase the message size the throughput seems normal
> again. This is the confusing point.
>
> For the second part, I am indeed on 1 gigabit Ethernet. I am just
> confused why
>
> a single partition (on a single broker) yields 100MB/sec throughput,
>
> while
>
> 3 partitions on 3 brokers (all on different physical servers) gave me a
> reading of 33MB/sec,
>
> and, to make it clearer,
>
> 2 partitions on 2 brokers (also on different physical servers) gave me a
> reading of 25MB/sec.
>
> I just want to know how to interpret these numbers so I can draw a
> pattern, but so far this is not very consistent (more partitions = less
> throughput?).
>
> Cheers,
>
> Paul
>
>
> On Tue, Sep 1, 2015 at 12:09 AM, Jay Kreps <j...@confluent.io> wrote:
>
> > The second command you give actually doesn't seem to double the memory
> > (maybe just a typo?). I can't explain why doubling buffer memory would
> > decrease throughput. The only effect of adding memory would be if you
> > run out, and running out of memory would cause you to block and hence
> > lower throughput. So more memory should only be able to help (or have
> > no effect). I wonder if something else was different between the tests?
> >
> > For the second test, is it possible that you are on 1 gigabit Ethernet?
> > 1 gigabit ~= 100MB/sec once you account for the protocol overhead (TCP
> > and Kafka's protocol).
> >
> > -Jay
> >
> > On Mon, Aug 31, 2015 at 3:14 AM, explorer <jind...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Since my company is considering adopting Kafka as our message bus, I
> > > have been assigned the task of performing some benchmark tests. I
> > > basically followed what Jay wrote in this article:
> > > http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> > >
> > > The benchmarks were set up using 4 nodes, with one node acting as
> > > both producer and consumer while the rest function as Kafka brokers.
> > >
> > > This is the baseline (50M messages of 100 bytes each, 64MB buffer
> > > memory, and a batch size of 8192):
> > >
> > > bin/kafka-run-class.sh
> > > org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> > > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > > batch.size=8192
> > >
> > > which on our setup yielded
> > >
> > > 50000000 records sent, 265939.057406 records/sec (25.36 MB/sec)
> > >
> > > However, after doubling the buffer.memory to 128M
> > >
> > > bin/kafka-run-class.sh
> > > org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> > > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > > batch.size=8192
> > >
> > > the throughput dropped significantly:
> > >
> > > 50000000 records sent, 93652.601295 records/sec (8.93 MB/sec)
> > >
> > > Can anyone explain why the throughput degraded so much?
> > >
> > > Likewise, when performing benchmarks using 3 partitions across 3
> > > nodes, the maximum throughput shown is roughly 33.2MB/sec, whereas a
> > > single partition (on a single node) yields 100MB/sec.
> > >
> > > My guess is that on a 3-node setup I need to multiply the 33.2MB/sec
> > > reading by 3, since the 33.2MB/sec reading only represents the
> > > bandwidth available to one single node.
> > >
> > > Again, is anyone out there willing to shed some light on how to
> > > interpret these numbers correctly?
> > >
> > > Cheers,
> > >
> > > Paul
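
For reference, the rough back-of-envelope arithmetic behind the "1 gigabit ~= 100MB/sec" figure above (the exact usable number depends on framing and message size):

1 gigabit/sec = 1,000,000,000 bits/sec / 8 = 125 MB/sec of raw link capacity
125 MB/sec minus TCP/IP and Kafka protocol framing overhead ~= 100-115 MB/sec of usable payload

If the single producer machine is on a 1 gigabit link, that ceiling applies to its total outbound traffic, so under Jay's reading the per-broker numbers would need to be summed (3 x 33MB/sec ~= 100MB/sec aggregate) rather than compared individually to the single-broker 100MB/sec result.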