On Sunday, March 19, 2017 at 20:46:16 UTC+3, Jesper Louis Andersen wrote:
>
> On Sun, Mar 19, 2017 at 4:58 PM Alexander Petrovsky <askj...@gmail.com> wrote:
>
>>> * The 99th percentile ignores the 40 slowest queries. What do the 99.9th,
>>> 99.99th, ... and max percentiles look like?
>>
>> I have no answer to this question, and I don't know how it could help me.
>
> Usually, the maximum latency is a better indicator of trouble than the 99th
> percentile, in my experience. If you improve the worst case, then the other
> cases are likely to follow. However, there are situations where this will
> hurt the median (50th percentile) latency. Usually this trade-off is okay,
> but there are a few situations where it might not be.
Got it! I'll start graphing the higher percentiles and the max as well; a
small sketch of how I would pull them out of the raw samples is below, after
the quoted replies.

>> Yep, that's also the main question! I log and graph nginx $request_time,
>> and I log and graph the internal function time. What is in between I
>> can't log:
>> - the local network (TCP);
>> - work in kernel/user space;
>> - golang GC and other runtime work;
>> - the golang fasthttp machinery before my HTTP handler is called.
>
> The kernel and GC can be dynamically inspected. I'd seriously consider
> profiling as well in a laboratory environment. Your hypothesis is that none
> of these have a discrepancy, but they may have.

If I understood you correctly, I think the problem is somewhere there...

>>> * Caches have hit/miss rates that look about right.
>>
>> In my application these are not true caches; in reality it is a dictionary
>> loaded from the database and used in calculations.
>
> Perhaps the code in https://godoc.org/golang.org/x/text is of use for
> this? It tends to be faster than maps because it utilizes compact string
> representations and tries. Of course, it requires you show that the problem
> is with the caching sublayer first.

The dictionary is not a word dictionary; it is a dictionary in database
terms: some keys (ids) mapping to one or more values.

>>> * 15% CPU load means we are spending ample amounts of time waiting. What
>>> are we waiting on?
>>
>> Maybe, or maybe 32 cores can simply handle 4k rps without much effort.
>> How can I find out what my app is waiting on?
>
> blockprofile is my guess at what I would grab first. Perhaps the tracing
> functionality as well. You can also add metrics on each blocking point in
> order to get an idea of where the system is going off. Functionality like
> dtrace would be nice, but I'm not sure Go has it, unfortunately.

Thanks a lot, I will! Roughly what I plan to enable is also sketched below.

>>> * Are we measuring the right thing in the internal measurements? If the
>>> window between external/internal is narrow, then chances are we are doing
>>> the wrong thing on the internal side.
>>
>> Could you explain this?
>
> There may be a bug in the measurement code, so you should probably go over
> it again. One common fault of mine is to place the measurement around the
> wrong functions, so they detect more than I think they do. A single regular
> expression that is only hit in corner cases can be enough to mess with a
> performance profile. Another common mistake is to not have an appropriate
> decay parameter on your latency measurements, so older requests eventually
> get removed from the latency graph[0].
>
> In general, as the amount of work a system processes goes up, it gets more
> sensitive to fluctuations in latency. So even at a fairly low CPU load, you
> may still have some spiky behavior hidden by a smoothing of the CPU load
> measure, and this can contribute to added congestion.
>
> [0] A decaying Vitter's algorithm R implementation, or Tene's HdrHistogram
> is preferable. HdrHistogram is interesting in that it uses a floating-point
> representation for its counters: one array for exponents, one array for
> mantissas. It allows very fast accounting (nanoseconds) and provides precise
> measurements around 0 at the expense of precision at, say, 1 hour. It is
> usually okay, because if you waited 1 hour, you don't care if it was really
> 1 hour and 3 seconds. But at 1 us, you really care about being precise.

Could you please explain what you mean by "Another common mistake is to not
have an appropriate decay parameter on your latency measurements, so older
requests eventually get removed from the latency graph"? For reference, the
way I understand the internal measurement (wrapping the whole fasthttp
handler) is also sketched below.
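Here is the sketch for the higher percentiles: nothing clever, just reading
them straight off the sorted raw samples. It assumes the per-request
latencies are collected into a plain slice; in the real service they would
come from the same place where I log the internal function time.

    package main

    import (
        "fmt"
        "sort"
        "time"
    )

    // percentile reads the value at quantile q (0..1) from an already
    // sorted slice of request durations.
    func percentile(sorted []time.Duration, q float64) time.Duration {
        if len(sorted) == 0 {
            return 0
        }
        return sorted[int(q*float64(len(sorted)-1))]
    }

    func report(latencies []time.Duration) {
        sorted := append([]time.Duration(nil), latencies...)
        sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

        fmt.Println("p99    :", percentile(sorted, 0.99))
        fmt.Println("p99.9  :", percentile(sorted, 0.999))
        fmt.Println("p99.99 :", percentile(sorted, 0.9999))
        fmt.Println("max    :", sorted[len(sorted)-1])
    }

    func main() {
        // Made-up sample; in reality these are the per-request durations.
        sample := []time.Duration{
            1 * time.Millisecond, 2 * time.Millisecond, 2 * time.Millisecond,
            3 * time.Millisecond, 250 * time.Millisecond, // one slow outlier
        }
        report(sample)
    }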
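And this is roughly what I plan to enable for the block profile and the
trace: a minimal sketch that just exposes the standard net/http/pprof
endpoints on a separate internal port (the address 127.0.0.1:6060 is only an
example).

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers the /debug/pprof/* handlers
        "runtime"
    )

    func main() {
        // Record every blocking event so /debug/pprof/block has data.
        // A rate of 1 is expensive; a larger value can be used in production.
        runtime.SetBlockProfileRate(1)

        // Serve the pprof endpoints on an internal-only port.
        go func() {
            log.Println(http.ListenAndServe("127.0.0.1:6060", nil))
        }()

        // ... start the real fasthttp server here ...
        select {}
    }

Then I should be able to look at blocking with something like
"go tool pprof http://127.0.0.1:6060/debug/pprof/block", and grab an
execution trace from /debug/pprof/trace?seconds=5 to feed into
"go tool trace".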
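For completeness, this is how I understand the "measure around the right
function" point for my case: wrap the whole fasthttp handler, so the internal
number is as close as possible to what nginx sees. A simplified sketch, where
observe() is only a placeholder for the code that records into the histogram:

    package main

    import (
        "log"
        "time"

        "github.com/valyala/fasthttp"
    )

    // timed wraps a fasthttp handler so the measured window covers the whole
    // handler and nothing before or after it.
    func timed(observe func(time.Duration), h fasthttp.RequestHandler) fasthttp.RequestHandler {
        return func(ctx *fasthttp.RequestCtx) {
            start := time.Now()
            h(ctx)
            observe(time.Since(start))
        }
    }

    func handler(ctx *fasthttp.RequestCtx) {
        ctx.SetBodyString("ok")
    }

    func main() {
        observe := func(d time.Duration) {
            // Placeholder: record d into the latency histogram here.
        }
        log.Fatal(fasthttp.ListenAndServe(":8080", timed(observe, handler)))
    }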
Why should the older requests be removed from the latency graph? As far as I
know, HdrHistogram is a good fit for high-precision measurements and for
graphs spanning several orders of magnitude. How can it help me?
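If I understood the idea at all, the point is that the reported percentiles
should describe only a recent window of requests, not every request since
the process started, otherwise old samples keep dragging the graph around.
Is something like this what you mean? A rough sketch of my understanding; it
assumes the github.com/codahale/hdrhistogram Go port and its windowed API,
and the window of 5 one-minute buckets is just an example:

    package main

    import (
        "fmt"
        "sync"
        "time"

        "github.com/codahale/hdrhistogram"
    )

    func main() {
        // Track values in microseconds between 1us and 1h with 3 significant
        // figures, spread over 5 rotating sub-histograms.
        var mu sync.Mutex
        w := hdrhistogram.NewWindowed(5, 1, int64(time.Hour/time.Microsecond), 3)

        // Rotate once a minute, so requests older than ~5 minutes stop
        // influencing the reported percentiles. The histogram is not safe
        // for concurrent use, hence the mutex.
        go func() {
            for range time.Tick(time.Minute) {
                mu.Lock()
                w.Rotate()
                mu.Unlock()
            }
        }()

        // Per request (a made-up value here):
        mu.Lock()
        _ = w.Current().RecordValue(int64(3 * time.Millisecond / time.Microsecond))
        mu.Unlock()

        // When graphing, merge the window and read the quantiles.
        mu.Lock()
        h := w.Merge()
        mu.Unlock()
        fmt.Println("p99:", time.Duration(h.ValueAtQuantile(99))*time.Microsecond)
        fmt.Println("max:", time.Duration(h.Max())*time.Microsecond)
    }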