Hi Jay,

Thanks for the response.

The second command was indeed a typo. It should have been:

bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
-1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
batch.size=8192

With that command, the throughput drops to ~9MB/sec.

But if I increase the message size to, say, 10,000 bytes per message:

bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
-1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=134217728
batch.size=8192

The throughput bounces back to ~33MB/sec.
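
As a sanity check on the figures in this thread, here is a minimal Python
sketch of the arithmetic. It assumes the tool counts 1 MB as 1024 * 1024
bytes (which reproduces the 25.36 MB/sec baseline) and it ignores per-record
and per-batch overhead. The helper names are made up for illustration.

```python
def mb_per_sec(records_per_sec, record_size_bytes):
    """Convert a records/sec reading into MB/sec (1 MB = 1024 * 1024 bytes)."""
    return records_per_sec * record_size_bytes / (1024 * 1024)


def records_per_batch(batch_size_bytes, record_size_bytes):
    """Rough upper bound on records per batch, ignoring protocol overhead."""
    return batch_size_bytes // record_size_bytes


# Readings quoted in this thread for 100-byte messages.
baseline = mb_per_sec(265939.057406, 100)  # 64MB buffer.memory run
slow = mb_per_sec(93652.601295, 100)       # slower run

# How many whole records fit into batch.size=8192 for each message size.
small_fit = records_per_batch(8192, 100)
large_fit = records_per_batch(8192, 10000)

print(f"baseline: {baseline:.2f} MB/sec, slow run: {slow:.2f} MB/sec")
print(f"records per batch: {small_fit} (100 B), {large_fit} (10,000 B)")
```

With 100-byte messages about 81 records fit into one 8192-byte batch, while
a 10,000-byte message does not fit into batch.size at all. The sketch does
not model how the producer handles that case, so the two runs may not be
directly comparable.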

I am playing with these numbers to find a pattern showing which combination
of settings would serve us best for a given message size. It would help if
we could safely say that more buffer memory gives better performance, but
only up to a certain point.

But in our tests, I see lower throughput with a larger buffer.memory. Once
I increase the message size, the throughput looks normal again. This is the
confusing part.

For the second part, I am indeed on 1 gigabit Ethernet. I am just confused
about why

A single partition (on a single broker) test yields 100MB/sec throughput

while

3 partitions on 3 brokers (all on different physical servers) gave me a
reading of 33MB/sec

and, to make it clearer,

2 partitions on 2 brokers (also on different physical servers) gave me a
reading of 25MB/sec

I just want to know how to interpret these numbers so I can find a pattern,
but so far the results are not very consistent (more partitions = less
throughput?).
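
To put the three readings side by side, here is a small Python sketch that
compares each one with the raw line rate of 1 gigabit Ethernet. It assumes
1 MB = 1024 * 1024 bytes and takes no account of TCP or Kafka protocol
overhead, which is why the raw figure sits above Jay's ~100MB estimate. The
labels and names are made up for illustration.

```python
LINK_BITS_PER_SEC = 1_000_000_000  # 1 gigabit Ethernet, raw line rate


def link_mb_per_sec(bits_per_sec):
    """Convert a link speed in bits/sec into MB/sec (1 MB = 1024 * 1024 bytes)."""
    return bits_per_sec / 8 / (1024 * 1024)


raw = link_mb_per_sec(LINK_BITS_PER_SEC)  # roughly 119 MB/sec before overhead

# Producer-side readings quoted in this thread, in MB/sec.
readings = {
    "1 partition on 1 broker": 100.0,
    "2 partitions on 2 brokers": 25.0,
    "3 partitions on 3 brokers": 33.0,
}

# Fraction of the raw line rate that each reading represents.
utilization = {label: mb / raw for label, mb in readings.items()}

for label, share in utilization.items():
    print(f"{label}: {share:.0%} of the raw line rate")
```

On these assumptions only the single-partition test comes close to the
capacity of the producer's link. The multi-broker readings sit well below
it, which suggests (but does not prove) that the network is not what limits
those runs.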

Cheers,

Paul



On Tue, Sep 1, 2015 at 12:09 AM, Jay Kreps <j...@confluent.io> wrote:

> The second command you give actually doesn't seem to double the memory
> (maybe just a typo?). I can't explain why doubling buffer memory would
> decrease throughput. The only effect of adding memory would be if you run
> out, and then running out of memory would cause you to block and hence
> lower throughput. So more memory should only be able to help (or have no
> effect). I wonder if something else was different between the tests?
>
> For the second test is it possible that you are on 1 gigabit ethernet? 1
> gigabit ~= 100mb once you account for the protocol overhead (TCP and
> Kafka's protocol).
>
> -Jay
>
> On Mon, Aug 31, 2015 at 3:14 AM, explorer <jind...@gmail.com> wrote:
>
> > Hi all,
> >
> > Since my company is considering adopting Kafka as our message bus, I
> > have been assigned the task to perform some benchmark tests.  I
> > basically followed what Jay wrote in this article:
> > <http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines>
> >
> > The benchmarks were set up using 4 nodes with one node acting as both
> > producer and consumer while the rest function as Kafka brokers.
> >
> > This is the baseline (50M messages of 100 bytes each, 64MB buffer
> > memory, and 8192 batch size)
> >
> > bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance test1 50000000 100
> > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > batch.size=8192
> >
> > which on our setup yielded the result of
> >
> > 50000000 records sent, 265939.057406 records/sec (25.36 MB/sec)
> >
> > However, by doubling the buffer.memory to 128M
> >
> > bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance test1 500000 10000
> > -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864
> > batch.size=8192
> >
> > The throughput dropped significantly.
> >
> > 50000000 records sent, 93652.601295 records/sec (8.93 MB/sec)
> >
> > Anyone able to interpret why the throughput degraded so much?
> >
> > Likewise, when performing benchmarks using 3 partitions across 3
> > nodes, the maximum throughput shown is roughly 33.2MB/sec, whereas a
> > single partition (on a single node) yields 100MB/sec.
> >
> > My guess is that on a 3-node setup, I need to multiply the 33.2
> > MB/sec reading by 3, since the 33.2MB/sec reading only represents
> > the bandwidth available to a single node.
> >
> > Again, is anyone out there willing to shed some light on how to
> > interpret the numbers correctly?
> >
> > Cheers,
> >
> > Paul
> >
>
