In general, increasing the message size should increase bytes/sec
throughput, since much of the work is done on a per-message basis. I
think the question remains why raising the buffer size with a fixed
message size would drop the throughput. It sounds like a bug if you can
reproduce it consistently. Want to file a JIRA and see if others can
reproduce the same thing?
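
A rough way to see it, as a back-of-envelope model rather than anything
measured here: assume a fixed per-record cost c on top of the raw byte
transfer, so that

  bytes/sec ~= message_size / (c + message_size / link_bandwidth)

The baseline run below (265,939 records/sec at 100 bytes) works out to
about 3.8 microseconds per record, almost all of it per-record overhead
rather than byte transfer, so bytes/sec should grow nearly linearly
with message size until the per-byte term takes over.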

For the multi-server test, I may have misread your email. When you say
you see 33MB/sec across 3 servers, does that mean an aggregate of
~100MB/sec? I was assuming yes, and that what you were seeing was the
client's bandwidth maxing out, so that as you added servers each server
got a smaller chunk of the ~100MB/sec client bandwidth. Maybe that's
not what you're saying, though.
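
(The arithmetic behind that hypothesis: 3 servers x ~33MB/sec ~=
100MB/sec, which is right at the usable payload rate of a single
1-gigabit client NIC.)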

-Jay

On Mon, Aug 31, 2015 at 9:49 AM, explorer <jind...@gmail.com> wrote:

> Hi Jay,
>
> Thanks for the response.
>
> The second command was indeed a typo.  It should have been
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
> batch.size=8192
>
> And the throughput would drop to ~9MB/sec.
>
> But if I increase the message size to, say, 10,000 bytes per message
>
> bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
> batch.size=8192
>
> The throughput would bounce back to ~33MB/sec.
>
> I am playing with these numbers to find a pattern for what combination
> would serve us best as far as message size goes.  It would help if we
> could safely say that higher buffer memory gives better performance,
> but only to a certain extent.
>
> But in our test context, I see lower throughput with a higher memory
> buffer, yet once I increase the message size the throughput seems
> normal again.  This is the confusing point.
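>
> Here is a minimal Java sketch of the settings in play, in case it
> helps pin down the variable.  The broker address and topic are the
> ones from the commands above; the note about batching is my
> understanding of the Java producer's behavior, not something I have
> verified on this cluster:
>
> import java.util.Properties;
>
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerRecord;
> import org.apache.kafka.common.serialization.ByteArraySerializer;
>
> public class BufferSizeTest {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "192.168.1.1:9092");
>         props.put("acks", "1");
>         props.put("buffer.memory", "134217728"); // 128MB accumulator pool
>         props.put("batch.size", "8192");         // 8KB per-partition batch cap
>         props.put("key.serializer", ByteArraySerializer.class.getName());
>         props.put("value.serializer", ByteArraySerializer.class.getName());
>
>         // With batch.size=8192, a 10,000-byte record no longer fits in
>         // a batch, so (as I understand it) each such record is sent on
>         // its own rather than being accumulated with others.
>         byte[] payload = new byte[100];
>         try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
>             for (long i = 0; i < 50_000_000L; i++) {
>                 producer.send(new ProducerRecord<>("test1", payload));
>             }
>         }
>     }
> }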
>
> For the second part, I am indeed on 1 gigabit Ethernet.  I am just
> confused about why
>
> A single partition (on a single broker) test yields 100MB/sec throughput
>
> while
>
> 3 partitions on 3 brokers (all on different physical servers) gave me
> a reading of 33MB/sec
>
> and, to make it clearer,
>
> 2 partitions on 2 brokers (also on different physical servers) gave me
> a reading of 25MB/sec
>
> I just want to know how to interpret these numbers so I can draw a
> pattern, but so far they are not very consistent (more partitions =
> less throughput?)
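>
> (If each of those figures is per broker, the aggregates would be
> ~100MB/sec, ~50MB/sec, and ~100MB/sec for 1, 2, and 3 brokers
> respectively, which still doesn't form a clean pattern.)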
>
> Cheers,
>
> Paul
>
>
>
> On Tue, Sep 1, 2015 at 12:09 AM, Jay Kreps <j...@confluent.io> wrote:
>
> > The second command you give actually doesn't seem to double the memory
> > (maybe just a typo?). I can't explain why doubling the buffer memory
> > would decrease throughput. The only effect of adding memory shows up if
> > you run out, and running out of memory causes the producer to block and
> > hence lowers throughput. So more memory should only be able to help (or
> > have no effect). I wonder if something else was different between the
> > tests?
> >
> > For the second test, is it possible that you are on 1 gigabit
> > Ethernet? 1 gigabit/sec ~= 100MB/sec once you account for the
> > protocol overhead (TCP and Kafka's protocol).
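> >
> > (Roughly: 1 gigabit/sec = 125MB/sec of raw bits; Ethernet, IP, TCP,
> > and Kafka request framing eat some of that, so usable payload lands
> > in the neighborhood of 100-115MB/sec.)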
> >
> > -Jay
> >
> > On Mon, Aug 31, 2015 at 3:14 AM, explorer <jind...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Since my company is considering adopting Kafka as our message bus, I
> > > have been assigned the task of performing some benchmark tests.  I
> > > basically followed what Jay wrote in this article
> > > <http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines>
> > >
> > > The benchmarks were set up using 4 nodes, with one node acting as both
> > > producer and consumer while the rest functioned as Kafka brokers.
> > >
> > > This is the baseline (50M messages of 100 bytes each, 64MB buffer
> > > memory, and a batch size of 8192):
> > >
> > > bin/kafka-run-class.sh
> > > org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> > > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > > batch.size=8192
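> > >
> > > (The positional arguments here, per the usual ProducerPerformance
> > > usage, are topic, record count, record size in bytes, and target
> > > throughput, where -1 means unthrottled; the key=value pairs after
> > > that are passed through as producer properties.)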
> > >
> > > which on our setup yielded the result of
> > >
> > > 50000000 records sent, 265939.057406 records/sec (25.36 MB/sec)
> > >
> > > However, by doubling the buffer.memory to 128M
> > >
> > > bin/kafka-run-class.sh
> > > org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> > > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > > batch.size=8192
> > >
> > > The throughput dropped significantly.
> > >
> > > 50000000 records sent, 93652.601295 records/sec (8.93 MB/sec)
> > >
> > > Anyone able to interpret why the throughput degraded so much?
> > >
> > > Likewise, when performing benchmarks using 3 partitions across 3
> > > nodes, the maximum throughput shown is roughly 33.2MB/sec, whereas a
> > > single partition (on a single node) yields 100MB/sec.
> > >
> > > My guess is that on a 3-node setup, I need to multiply the 33.2
> > > MB/sec reading by 3, since that reading only represents the
> > > bandwidth available to a single node.
> > >
> > > Again, anyone out there willing to shed some light on how to
> > > interpret the numbers correctly?
> > >
> > > Cheers,
> > >
> > > Paul
> > >
> >
>
