Jesper, 
thanks a lot for your email; your answer was a helping hand in the dark
forest of doubts.

I will start by trying the wrk2 load generator.

About "instrument, profile, observe", yes, I added the gops agent but until 
now I don't have any conclusion related to that information.

Regards.


On Sunday, October 25, 2020 at 8:07:58 AM UTC-3 jesper.lou...@gmail.com 
wrote:

> On Sat, Oct 24, 2020 at 7:30 PM JuanPablo AJ <jpab...@gmail.com> wrote:
>  
>
>> I have some doubts related to the HTTP client.
>>
>
> First, if you have unexplained efficiency concerns in a program, you 
> should profile and instrument. Make the system tell you what is happening 
> rather than making guesses as to why. With that said, I have some hunches 
> and experiments you might want to try out.
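>
> As a concrete starting point, here is a minimal sketch of exposing Go's
> built-in pprof profiler over HTTP (the ports and handler here are made
> up for illustration, not taken from your program):
>
> package main
>
> import (
>     "log"
>     "net/http"
>     _ "net/http/pprof" // registers /debug/pprof/* on the default mux
> )
>
> func main() {
>     // Serve the profiling endpoints on a side port so profiling
>     // traffic stays out of the load-tested path.
>     go func() {
>         log.Println(http.ListenAndServe("localhost:6060", nil))
>     }()
>
>     // The application itself gets its own mux on the main port.
>     mux := http.NewServeMux()
>     mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
>         w.Write([]byte("hello"))
>     })
>     log.Fatal(http.ListenAndServe(":8080", mux))
> }
>
> With that in place, `go tool pprof
> http://localhost:6060/debug/pprof/profile?seconds=30` grabs a CPU
> profile while the load test is running.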
>
> When you perform a load test, you have a SUT, or system-under-test. That 
> is the whole system, including the infrastructure around it. It can be
> a single program or a cluster of machines. You also have a load
> generator, which
> generates load on your SUT in order to test different aspects of the SUT: 
> bandwidth usage, latency in response, capacity limits, resource limits, 
> etc[1]. Your goal is to figure out if the data you are seeing are within an 
> acceptable range for your use case, or if you have to work more on the 
> system to make it fall within the acceptable window. 
>
> Your test is about RTT latency of requests. This will become important.
>
> One particular problem in your test is that the load generator and the
> SUT run in the same environment. If the test is simple and you are
> trying to
> stress the system maximally, chances are that the load generator impacts 
> the SUT. That means the latency will rise due to time sharing in the 
> operating system.
>
> Second, when measuring latency you should look out for the problem Gil 
> Tene coined "coordinated omission". In CO, the problem is that the load
> generator and the SUT cooperate to deliver the wrong latency
> counts. This is especially true if you just fire as many requests as 
> possible on 50 connections. Under an overload situation, the system will 
> suffer in latency since that is the only way the system can alleviate 
> pressure. The problem with CO is that a server can decide to park a couple 
> of requests and handle the other requests as fast as possible. This can
> lead to a high number of requests on the active connections, and the
> stalled connections become noise in the statistics. You can look up Tene's 
> `wrk2` project, but I think the ideas were later baked back into Will
> Glozer's wrk (memory eludes me).
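>
> If you want to see what correcting for CO looks like, here is a rough
> open-loop load-generator sketch in Go: requests are scheduled at a
> fixed rate, and latency is measured from the *intended* send time, so
> queueing behind a stalled server still shows up in the numbers. The URL
> and rate are made up:
>
> package main
>
> import (
>     "fmt"
>     "net/http"
>     "time"
> )
>
> func main() {
>     const n, rate = 1000, 100      // 1000 requests at 100 req/s (illustrative)
>     url := "http://localhost:8080/" // hypothetical SUT endpoint
>
>     results := make(chan time.Duration, n)
>     tick := time.NewTicker(time.Second / rate)
>     defer tick.Stop()
>
>     for i := 0; i < n; i++ {
>         intended := <-tick.C // when the request *should* go out
>         go func(start time.Time) {
>             resp, err := http.Get(url)
>             if err == nil {
>                 resp.Body.Close()
>             }
>             // Includes any queueing delay behind slow responses.
>             results <- time.Since(start)
>         }(intended)
>     }
>
>     var max time.Duration
>     for i := 0; i < n; i++ {
>         if d := <-results; d > max {
>             max = d
>         }
>     }
>     fmt.Println("max latency:", max)
> }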
>
> The third point is about the sensitivity of your tests: when you measure 
> things at the millisecond, microsecond or nanosecond range, your test 
> becomes far more susceptible to foreign impact. You can generally use 
> statistical bootstrapping to measure the impact this has on test
> variance, which I've done in the past (a sketch follows the list
> below). You start finding all kinds of interesting
> corner cases that perturb your benchmarks. Among the more surprising ones:
>
> * CPU Scaling governors
> * Turbo boosting: a single core can run at a higher clock frequency
> than a whole cluster of cores can. GC in Go is multicore, so even for a
> single-core program, this might have an effect
> * CPU heat. Laptop CPUs have miserable thermal cooling compared to a 
> server or desktop. They can run fast in small bursts, but not for longer 
> stretches
> * Someone using the computer while doing the benchmark
> * An open browser window which runs some JavaScript in the background
> * An open Electron app rendering a .gif or .webm file
> * Playing music while performing the benchmark, yielding CPU power to the 
> MP3, Vorbis or AAC decoder
> * Amount of incoming network traffic to process for a benchmark that has 
> nothing to do with the network
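>
> The bootstrap itself is simple to do by hand. A sketch, with made-up
> latency samples: resample with replacement many times and look at the
> spread of the resampled means; if the interval is wide, environmental
> noise is swamping whatever you are trying to measure:
>
> package main
>
> import (
>     "fmt"
>     "math/rand"
>     "sort"
> )
>
> func main() {
>     // Hypothetical measured latencies in milliseconds.
>     samples := []float64{1.2, 1.3, 1.1, 1.4, 9.7, 1.2, 1.3, 1.2, 1.5, 1.3}
>
>     const rounds = 10000
>     means := make([]float64, rounds)
>     for i := range means {
>         var sum float64
>         for range samples {
>             sum += samples[rand.Intn(len(samples))]
>         }
>         means[i] = sum / float64(len(samples))
>     }
>
>     sort.Float64s(means)
>     // A rough 95% bootstrap confidence interval for the mean.
>     fmt.Printf("mean in [%.2f, %.2f] ms\n",
>         means[int(0.025*rounds)], means[int(0.975*rounds)])
> }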
>
> Finally, asynchronous goroutines are still work the program needs to
> execute; they aren't free. So as the system is stressed with a higher
> load, you run up against the capacity limit, incurring slower response
> times. In the case where you perform requests in the background to another 
> HTTP server, you are taking a slice of the available resources. You are 
> also generating as much work internally as is coming in externally. In
> a real-world server, this is usually a bad idea and you must put a
> resource
> limit in place. Otherwise an aggressive client can overwhelm your server. 
> The trick is to slow the caller down by *not* responding right away if you 
> are overloaded internally.
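>
> A minimal way to "slow the caller down" in Go is a semaphore in front
> of the handler. The limit of 64 here is made up; it should come from
> measuring your own capacity:
>
> package main
>
> import (
>     "log"
>     "net/http"
> )
>
> // limit admits at most maxInFlight concurrent requests; excess callers
> // queue up, which is exactly the backpressure we want.
> func limit(maxInFlight int, next http.Handler) http.Handler {
>     sem := make(chan struct{}, maxInFlight)
>     return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
>         select {
>         case sem <- struct{}{}:
>             defer func() { <-sem }()
>             next.ServeHTTP(w, r)
>         case <-r.Context().Done():
>             return // the client gave up while waiting in line
>         }
>     })
> }
>
> func main() {
>     h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
>         w.Write([]byte("ok"))
>     })
>     log.Fatal(http.ListenAndServe(":8080", limit(64, h)))
> }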
>
> You should check your kernel. When you perform a large number of
> requests on the same machine, you can run into limits on the number of
> TCP source
> ports if they are rotated too fast. It is a common problem when the load 
> generator and SUT are on the same host.
>
> You should check your HTTP client configuration as well. One way to avoid 
> the above problem is to maximize connection reuse, but then you risk 
> head-of-line blocking on the connections, even (or perhaps even more so) in 
> the HTTP/2 case.
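>
> For the Go client, connection reuse is mostly a matter of the Transport
> settings and of draining response bodies. A sketch with illustrative
> numbers (they are not recommendations):
>
> package main
>
> import (
>     "io"
>     "io/ioutil"
>     "net/http"
>     "time"
> )
>
> func newClient() *http.Client {
>     tr := &http.Transport{
>         MaxIdleConns:        100,
>         MaxIdleConnsPerHost: 100, // the default of 2 forces churn against one host
>         IdleConnTimeout:     90 * time.Second,
>     }
>     return &http.Client{Transport: tr, Timeout: 10 * time.Second}
> }
>
> func main() {
>     c := newClient()
>     resp, err := c.Get("http://localhost:8080/") // hypothetical endpoint
>     if err != nil {
>         return
>     }
>     // Drain and close the body, or the connection cannot be reused.
>     io.Copy(ioutil.Discard, resp.Body)
>     resp.Body.Close()
> }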
>
> But above all: instrument, profile, observe. Nothing beats data and plots.
>
> [1] SLIs, SLOs, etc. A good starting point is
> https://landing.google.com/sre/sre-book/chapters/service-level-objectives/
> but the whole book is worth a full read.
> https://landing.google.com/sre/books/ too!
