Yes, and that can really hurt average performance. All the partitions were nearly identical up to the 99%'ile, and performance at that level was very good, hovering around a few milliseconds. But when looking beyond the 99%'ile, there was a clear fork in the distribution where a set of 3 partitions surged upwards. This could be for a dozen different reasons: network blips, noisy networks, location in the network, resource contention on that broker, etc. But it affected that one broker more than the others. And the reasons for my cluster displaying this behavior could be very different from the reasons for any other cluster.
It's worth noting that this was mostly a latency test rather than a stress test. There was a single Kafka producer object, very small message sizes (100 bytes), and it was only pushing through around 5 MB/s worth of data. The client was configured to minimize the amount of data sitting in the internal queue/buffer waiting to be sent. The messages being sent were composed of 10-byte ASCII "words" selected randomly from a dictionary of 1000 words, which benefits compression while still producing messages that are likely unique. The test ran for only 6 minutes, and I did not do the work required to see whether a burst of slower messages caused this behavior or whether it was a consistent issue with that node.
-Erik

On 9/9/15, 2:24 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:

>So are you suggesting that the long delays in the top 1 percentile happen in the slower partitions that are further away? Thanks.
>
>On Wed, Sep 9, 2015 at 3:15 PM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
>
>> So, I did my own latency test on a cluster of 3 nodes, and there is a significant difference around the 99%'ile and higher for partitions when measuring the ack time when configured for a single ack. The graph that I wish I could attach or post clearly shows that around 1/3 of the partitions significantly diverge from the other two thirds. So, at least in my case, one of my brokers is further away than the others.
>> -Erik
>>
>> On 9/4/15, 1:06 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:
>>
>> >No problem. Thanks for your advice. I think it would be fun to explore. I only know how to program in Java though. Hope it will work.
>> >
>> >On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
>> >
>> >> I think the suggestion is to have partitions/brokers >= 1, so 32 should be enough.
>> >>
>> >> As for latency tests, there isn't a lot of code to do a latency test. If you just want to measure ack time it's around 100 lines. I will try to push out some good latency testing code to GitHub, but my company is scared of open sourcing code… so it might be a while…
>> >> -Erik
>> >>
>> >> On 9/4/15, 12:55 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:
>> >>
>> >> >Thanks for your reply Erik. I am running some more tests according to your suggestions now and I will share my results here. Is it necessary to use a fixed number of partitions (32 partitions, maybe) for my test?
>> >> >
>> >> >I am testing 2-, 4-, 8-, 16-, and 32-broker scenarios, all of them running on individual physical nodes. So I think using at least 32 partitions will make more sense? I have seen latencies increase as the number of partitions goes up in my experiments.
>> >> >
>> >> >To get the latency of each event recorded, are you suggesting that I write my own test program (in Java perhaps), or can I just modify the standard test program provided by Kafka (https://gist.github.com/jkreps/c7ddb4041ef62a900e6c)? I guess I need to rebuild the source if I modify the standard Java test program ProducerPerformance provided in Kafka, right? Right now this standard program only has average latencies and percentile latencies but no per-event latencies.
>> >> >
>> >> >Thanks.
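For reference, here is a minimal sketch of what an ack-time measurement along these lines can look like with the Java producer and HdrHistogram. It is not the code used for the test above; the broker address, topic name, message count, and class name are placeholders, and pacing the send rate (to something like 5 MB/s) is left out:

import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ThreadLocalRandom;

import org.HdrHistogram.ConcurrentHistogram;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AckLatencyTest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");            // placeholder
        props.put("acks", "1");                                     // single ack, as in the test above
        props.put("linger.ms", "0");                                // keep as little as possible queued client-side
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // A dictionary of 1000 distinct 10-byte ASCII "words": compresses well,
        // while random selection keeps individual messages likely unique.
        String[] dict = new String[1000];
        for (int i = 0; i < dict.length; i++) {
            dict[i] = String.format("word%06d", i);                 // exactly 10 ASCII characters
        }

        // One histogram per partition, recording ack time in microseconds.
        ConcurrentMap<Integer, ConcurrentHistogram> byPartition = new ConcurrentHashMap<>();

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1_000_000; i++) {                   // pacing to a fixed rate omitted for brevity
                StringBuilder sb = new StringBuilder(100);
                for (int w = 0; w < 10; w++) {                      // 10 words x 10 bytes = 100-byte value
                    sb.append(dict[ThreadLocalRandom.current().nextInt(dict.length)]);
                }
                final long start = System.nanoTime();
                producer.send(new ProducerRecord<>("latency-test", sb.toString()), (md, ex) -> {
                    if (ex != null) return;                         // failed sends would need separate accounting
                    long micros = (System.nanoTime() - start) / 1000;
                    byPartition.computeIfAbsent(md.partition(), p -> new ConcurrentHistogram(3))
                               .recordValue(micros);
                });
            }
        }                                                           // close() flushes and waits for outstanding acks

        byPartition.forEach((p, h) -> System.out.printf(
                "partition %d: p50=%dus p99=%dus p99.9=%dus max=%dus%n",
                p, h.getValueAtPercentile(50.0), h.getValueAtPercentile(99.0),
                h.getValueAtPercentile(99.9), h.getMaxValue()));
    }
}

With acks set to 1, the callback fires when the leader has acknowledged the write, so what gets recorded is the per-partition ack time rather than end-to-end delivery latency.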
>> >> >
>> >> >On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
>> >> >
>> >> >> That is an excellent question! There are a bunch of ways to monitor jitter and see when it is happening. Here are a few:
>> >> >>
>> >> >> - You could slice the histogram every few seconds, save it out with a timestamp, and then look at how the slices compare. This would be mostly manual, or you can graph line charts of the percentiles over time in Excel, where each percentile would be a series. If you are using HdrHistogram, you should look at how to use the Recorder class to do this, coupled with a ScheduledExecutorService.
>> >> >>
>> >> >> - You can just save the starting timestamp and the latency of each event. If you put it into a CSV, you can load it up into Excel and graph it as an XY chart. That way you can see every point during the run of your program and spot trends. You want to be careful about this one, especially about writing to a file in the callback that Kafka provides.
>> >> >>
>> >> >> Also, I have noticed that most of the very slow observations are at startup. But don't trust me, trust the data and share your findings. Also, a 99.9 percentile provides a pretty good standard for typical poor-case performance. Average is borderline useless; the 50%'ile is a better typical case because that's the number that says "half of events will be this slow or faster", and for high values like the 99.9%'ile, "0.1% of all events will be slower than this".
>> >> >> -Erik
>> >> >>
>> >> >> On 9/4/15, 12:05 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:
>> >> >>
>> >> >> >Thank you Erik! That is helpful!
>> >> >> >
>> >> >> >But I also see jitter in the maximum latencies when running the experiment.
>> >> >> >
>> >> >> >The average end-to-acknowledgement latency from producer to broker is around 5 ms when using 92 producers and 4 brokers, and the 99.9 percentile latency is 58 ms, but the maximum latency goes up to 1359 ms. How do I locate the source of this jitter?
>> >> >> >
>> >> >> >Thanks.
>> >> >> >
>> >> >> >On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik <erik.helle...@cmegroup.com> wrote:
>> >> >> >
>> >> >> >> Well… not to be contrarian, but latency depends much more on the latency between the producer and the broker that is the leader for the partition you are publishing to. At least when your brokers are not saturated with messages and acks are set to 1. If acks are set to ALL, latency on a non-saturated Kafka cluster will be: round-trip latency from the producer to the leader for the partition + max(round-trip latency to the replicas of that partition). If a cluster is saturated with messages, we have to assume that all partitions receive an equal distribution of messages, to avoid linear algebra and queueing theory models. I don't like linear algebra :P
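To make the Recorder-plus-ScheduledExecutorService slicing suggested above concrete, here is a minimal sketch. It assumes the org.HdrHistogram library; the 5-second interval, microsecond units, and CSV-style output are arbitrary choices, not anything prescribed in this thread:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.HdrHistogram.Histogram;
import org.HdrHistogram.Recorder;

public class IntervalLatencyRecorder {
    // Recorder is safe to record into from the producer callback thread while
    // another thread periodically swaps out an interval histogram.
    private final Recorder recorder = new Recorder(3);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // Every 5 seconds, grab the histogram for just that interval and log its
        // percentiles with a timestamp; graphing these slices over time shows
        // whether the slow observations are a startup burst or a constant issue.
        scheduler.scheduleAtFixedRate(() -> {
            Histogram interval = recorder.getIntervalHistogram();   // resets the recorder
            System.out.printf("%d,%d,%d,%d,%d%n",
                    System.currentTimeMillis(),
                    interval.getValueAtPercentile(50.0),
                    interval.getValueAtPercentile(99.0),
                    interval.getValueAtPercentile(99.9),
                    interval.getMaxValue());
        }, 5, 5, TimeUnit.SECONDS);
    }

    // Called from the producer's send callback with the measured ack time.
    public void record(long latencyMicros) {
        recorder.recordValue(latencyMicros);
    }

    public void stop() {
        scheduler.shutdown();
    }
}

Recorder is built for exactly this pattern: one thread records while another periodically reads the interval histogram, which makes it easy to see whether the slow observations cluster at startup or recur throughout the run.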
>> >> >> >>
>> >> >> >> Since you are probably putting all your latencies into a single histogram per producer, or worse, just an average, this pattern would have been obscured. Obligatory lecture about measuring latency by Gil Tene (https://www.youtube.com/watch?v=9MKY4KypBzg). To verify this hypothesis, you should re-write the benchmark to plot the latencies for each write to a partition, for each producer, into a histogram. (HdrHistogram is pretty good for that.) This would give you producers * partitions histograms, which might be unwieldy for that many producers. But wait, there is hope!
>> >> >> >>
>> >> >> >> To verify that this hypothesis holds, you just have to see that there is a significant difference between different partitions on a SINGLE producing client. So, pick one producing client at random and use the data from that. The easy way to do that is to plot all the partition latency histograms on top of each other in the same plot; that way you have a pretty plot to show people. If you don't want to set up plotting, you can just compare the medians (50th percentile) of the partitions' histograms. If there is a lot of variance, your latency anomaly is explained by brokers 4-7 being slower than nodes 0-3! If there isn't a lot of variance at 50%, look at higher percentiles. And if higher percentiles for all the partitions look the same, this hypothesis is disproved.
>> >> >> >>
>> >> >> >> If you want to make a general statement about the latency of writing to Kafka, you can merge all the histograms into a single histogram and plot that.
>> >> >> >>
>> >> >> >> To Yuheng's credit, more brokers always result in more throughput. But throughput and latency are two different creatures. It's worth noting that Kafka is designed to be high throughput first and low latency second. And it does a really good job at both.
>> >> >> >>
>> >> >> >> Disclaimer: I might not like linear algebra, but I do like statistics. Let me know if there are topics that need more explanation above that aren't covered by Gil's lecture.
>> >> >> >> -Erik
>> >> >> >>
>> >> >> >> On 9/4/15, 9:03 AM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote:
>> >> >> >>
>> >> >> >> >When I use 32 partitions, the 4-broker latency becomes larger than the 8-broker latency.
>> >> >> >> >
>> >> >> >> >So is it always true that using more brokers can give less latency when the number of partitions is at least the number of brokers?
>> >> >> >> >
>> >> >> >> >Thanks.
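And a minimal sketch of the comparison step described above, comparing the per-partition histograms from a single producing client and merging them for an overall view (again assuming HdrHistogram; the choice of percentiles to print is arbitrary):

import java.util.Map;

import org.HdrHistogram.Histogram;

public class PartitionComparison {
    // Given one ack-latency histogram per partition from a single producing
    // client, print a few percentiles side by side plus a merged view.
    static void compare(Map<Integer, Histogram> byPartition) {
        Histogram merged = new Histogram(3);   // auto-resizing, 3 significant digits
        System.out.println("partition,p50_us,p99_us,p99.9_us");
        byPartition.forEach((partition, h) -> {
            merged.add(h);                     // merged histogram = the "general statement" about the client
            System.out.printf("%d,%d,%d,%d%n", partition,
                    h.getValueAtPercentile(50.0),
                    h.getValueAtPercentile(99.0),
                    h.getValueAtPercentile(99.9));
        });
        System.out.printf("all,%d,%d,%d%n",
                merged.getValueAtPercentile(50.0),
                merged.getValueAtPercentile(99.0),
                merged.getValueAtPercentile(99.9));
    }
}

If the 50%'ile column looks uniform but the 99.9%'ile column splits into two groups, that is the same fork described at the top of this message.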
>> >> >> >> >
>> >> >> >> >On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du <yuheng.du.h...@gmail.com> wrote:
>> >> >> >> >
>> >> >> >> >> I am running a producer latency test. When using 92 producers on 92 physical nodes publishing to 4 brokers, the latency is slightly lower than when using 8 brokers. I am using 8 partitions for the topic.
>> >> >> >> >>
>> >> >> >> >> I have rerun the test and it gives me the same result: the 4-broker scenario still has lower latency than the 8-broker scenario.
>> >> >> >> >>
>> >> >> >> >> It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, 16 brokers and 32 brokers. For every other case the latency decreases as the number of brokers increases.
>> >> >> >> >>
>> >> >> >> >> 4 brokers/8 brokers is the only pair that doesn't satisfy this rule. What could be the cause?
>> >> >> >> >>
>> >> >> >> >> I am using 200-byte messages, and the test has each producer publish 500k messages to a given topic. Every test run where I change the number of brokers, I use a new topic.
>> >> >> >> >>
>> >> >> >> >> Thanks for any advice.