Well... not to be contrarian, but latency depends much more on the latency
between the producer and the broker that is the leader for the partition
you are publishing to.  At least when your brokers are not saturated with
messages, and acks are set to 1.  If acks are set to ALL, latency on a
non-saturated Kafka cluster will be: round-trip latency from producer to
the leader for the partition + max(round-trip latency to each replica of
that partition), i.e. the slowest replica.  If a cluster is saturated with
messages, we have to assume that all partitions receive an equal
distribution of messages to avoid linear algebra and queueing theory
models.  I don't like linear algebra :P
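
To make the arithmetic concrete, here is a minimal sketch of that model.
The function name and parameters are made up for illustration, and it
assumes the replica round trips are measured from the leader and that
broker processing time is negligible (i.e. the cluster is not saturated):

```python
def expected_produce_latency_ms(leader_rtt_ms, replica_rtts_ms, acks):
    """Rough latency estimate for one produce request on an unsaturated cluster.

    leader_rtt_ms   -- round trip from the producer to the partition leader
    replica_rtts_ms -- round trips to each replica of that partition
                       (assumed here to be measured from the leader)
    acks            -- "1" or "all"
    """
    if acks == "1":
        # Only the leader has to acknowledge the write.
        return leader_rtt_ms
    if acks == "all":
        # The leader waits for the slowest replica before acknowledging.
        return leader_rtt_ms + max(replica_rtts_ms, default=0.0)
    raise ValueError("this sketch only models acks='1' and acks='all'")
```

For example, with a 2 ms round trip to the leader and replicas at 1 ms and
3.5 ms, acks=1 gives 2 ms and acks=all gives 2 + 3.5 = 5.5 ms.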

Since you are probably putting all your latencies into a single histogram
per producer, or worse, just an average, this pattern would have been
obscured.  Obligatory lecture about measuring latency by Gil Tene
(https://www.youtube.com/watch?v=9MKY4KypBzg).  To verify this hypothesis,
you should rewrite the benchmark to record the latency of each write into
a separate histogram per partition, per producer (HdrHistogram is pretty
good for that).  This would give you producers*partitions histograms,
which might be unwieldy for that many producers. But wait, there is hope!
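
A rough sketch of what that bookkeeping could look like.  The class and
method names are hypothetical, and plain lists of samples stand in for
HdrHistogram here, so treat it as an outline rather than the way the
benchmark has to be written:

```python
from collections import defaultdict


class PartitionLatencyRecorder:
    """Keeps one bucket of latency samples per (producer, partition) pair."""

    def __init__(self):
        # Key: (producer_id, partition) -> list of latencies in milliseconds.
        self.samples = defaultdict(list)

    def record(self, producer_id, partition, latency_ms):
        # Call this once per acknowledged write.
        self.samples[(producer_id, partition)].append(latency_ms)

    def for_producer(self, producer_id):
        # All partitions seen by a single producer: partition -> samples.
        return {
            partition: values
            for (pid, partition), values in self.samples.items()
            if pid == producer_id
        }
```

The benchmark would call record() from wherever it currently measures the
time between send and acknowledgement.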

To verify that this hypothesis holds, you just have to see that there is a
significant difference between different partitions on a SINGLE producing
client. So, pick one producing client at random and use the data from
that. The easy way to do that is to plot all the partition latency
histograms on top of each other in the same plot; that way you have a
pretty plot to show people.  If you don't want to set up plotting, you can
just compare the medians (50th percentile) of the partitions' histograms.
If there is a lot of variance, your latency anomaly is explained by
brokers 4-7 being slower than brokers 0-3!  If there isn't a lot of
variance at 50%, look at higher percentiles.  And if higher percentiles
for all the partitions look the same, this hypothesis is disproved.
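
If you go the no-plot route, the comparison could be sketched like this.
It uses a simple nearest-rank percentile over raw samples instead of
HdrHistogram, and the function names are only illustrative:

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_partitions(latencies_by_partition, percentiles=(50, 99)):
    """Return partition -> {percentile: latency} for one producing client."""
    return {
        partition: {p: percentile(samples, p) for p in percentiles}
        for partition, samples in sorted(latencies_by_partition.items())
    }


def spread(table, pct):
    """Gap between the slowest and fastest partition at one percentile."""
    values = [row[pct] for row in table.values()]
    return max(values) - min(values)
```

A large spread at the 50th percentile points at some partitions (and so
some leaders) being slower than others; if it is small, check 99 and up.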

If you want to make a general statement about latency of writing to Kafka,
you can merge all the histograms into a single histogram and plot that.
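
Merging means adding up the bucket counts, not averaging the percentiles
of the individual histograms.  A small sketch with fixed-width buckets (the
bucket width and helper names are assumptions for illustration; a real
histogram library has its own merge operation):

```python
from collections import Counter


def to_histogram(samples_ms, bucket_ms=1.0):
    """Count samples per fixed-width bucket; key is the bucket index."""
    return Counter(int(s // bucket_ms) for s in samples_ms)


def merge_histograms(histograms):
    """Add up the counts of several histograms bucket by bucket."""
    merged = Counter()
    for histogram in histograms:
        merged.update(histogram)
    return merged
```

All histograms being merged have to use the same bucket width, otherwise
the bucket indexes do not line up.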

To Yuheng's credit, more brokers always results in more throughput. But
throughput and latency are two different creatures.  It's worth noting that
Kafka is designed to be high throughput first and low latency second.  And
it does a really good job at both.

Disclaimer: I might not like linear algebra, but I do like statistics.
Let me know if there are topics that need more explanation above that
aren't covered by Gil's lecture.
-Erik

On 9/4/15, 9:03 AM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:

>When I am using 32 partitions, the 4 brokers latency becomes larger than
>the 8 brokers latency.
>
>So is it always true that using more brokers can give less latency when
>the number of partitions is at least the number of brokers?
>
>Thanks.
>
>On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du <yuheng.du.h...@gmail.com>
>wrote:
>
>> I am running a producer latency test. When using 92 producers on 92
>> physical nodes publishing to 4 brokers, the latency is slightly lower
>> than when using 8 brokers. I am using 8 partitions for the topic.
>>
>> I have rerun the test and it gives me the same result: the 4 brokers
>> scenario still has lower latency than the 8 brokers scenario.
>>
>> It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers,
>> 16 brokers and 32 brokers. For the rest of the cases the latency
>> decreases as the number of brokers increases.
>>
>> 4 brokers/8 brokers is the only pair that doesn't satisfy this rule.
>> What could be the cause?
>>
>> I am using a 200-byte message; the test lets each producer publish 500k
>> messages to a given topic. For every test run where I change the number
>> of brokers, I use a new topic.
>>
>> Thanks for any advice.
>>
