This is probably due to improvements in preemption.

Garbage collectors often need some linearizable checkpoint (or an atomic
commit point) where every CPU core agrees on a state, for instance when
enabling a write barrier on the heap.

Back in the day, a goroutine only reached this checkpoint when it
communicated via channels, did network communication, made a system call,
and so on. In particular, a goroutine doing pure computation could defer the
checkpoint for as long as it kept computing. That meant every other CPU core
would sit waiting for it and do no productive work.

A later version of Go improved this. Every function call needs to check
whether the stack needs extension. By manipulating the limit that check
compares against, the GC could signal that a checkpoint was needed: the
stack extension check fails, and the goroutine enters the stack extension
routine. That routine first checks whether it was entered because of a GC
signal. If so, the goroutine enters the checkpoint.
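
The trick can be modelled in a few lines. This is only a single-threaded
simulation with invented names (goroutine, guard, preemptMark), not the
runtime's actual code; in the real runtime the request comes from another
thread and the check is emitted by the compiler in each function prologue.

```go
package main

import (
	"fmt"
	"math"
)

// preemptMark is a hypothetical poison value, larger than any stack
// pointer, so the prologue check always fails once it is installed.
const preemptMark uint64 = math.MaxUint64

// goroutine is a toy model of the per-goroutine stack bookkeeping.
type goroutine struct {
	sp          uint64 // simulated stack pointer (the stack grows downward)
	stackLimit  uint64 // real lower bound of the stack
	guard       uint64 // value the prologue compares sp against
	checkpoints int    // how often the checkpoint was entered
	grows       int    // how often the stack was really extended
}

// requestCheckpoint is what the GC would do: overwrite the guard so that
// the next stack check fails.
func (g *goroutine) requestCheckpoint() { g.guard = preemptMark }

// enterFunction models the check done at every function entry.
func (g *goroutine) enterFunction() {
	if g.sp <= g.guard {
		g.slowPath()
	}
}

// slowPath models the stack extension routine: it first asks whether it
// was entered because of a GC request, and only otherwise grows the stack.
func (g *goroutine) slowPath() {
	if g.guard == preemptMark {
		g.checkpoints++
		g.guard = g.stackLimit // restore the real limit
		return
	}
	g.grows++
	g.sp += 4096 // pretend the frames were moved to a larger stack
}

func main() {
	g := &goroutine{sp: 8192, stackLimit: 1024, guard: 1024}
	g.enterFunction()     // fast path, nothing happens
	g.requestCheckpoint() // the GC asks for a checkpoint
	g.enterFunction()     // check fails, checkpoint is entered
	g.enterFunction()     // fast path again
	fmt.Println("checkpoints:", g.checkpoints, "grows:", g.grows)
}
```

The appeal of this design is that the fast path costs nothing extra: the
comparison is there anyway, and only the value it compares against changes.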

With Go 1.14, preemption has been further improved to use OS signals. This
means even loops with no function calls (like the one you have gathering
logins) can now be preempted.

Your example is the worst possible outcome. But there are other situations
which are almost equally bad in production systems. You can have sudden
productivity halts where the program isn't able to continue for several
hundred milliseconds. These look like GC pauses, but it is a bit of a
philosophical discussion whether they really are, since they live in the
limbo between GC and preemption.

As an interesting observation: functional languages use recursion for
loops, so in theory they don't need to preempt loops separately, since every
loop iteration contains a function call. However, many functional languages
also compile tail calls into loops for efficiency reasons, so the picture
becomes blurrier. The usual way to get good preemption is then to check on
memory allocation, which is frequent in functional languages; frequent
allocation is also the major reason they tend to use generational GCs with a
two-space copying allocator in the young generation. However, the memory
allocation check is also becoming less reliable, as escape/liveness analysis
can often move many of these allocations onto the stack.
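
As a rough illustration of checking on allocation, here is a toy bump
allocator whose limit check doubles as a preemption poll. It is written in
Go only to stay in one language; the type nursery and its fields are
invented for this sketch, and a real young generation would copy live
objects out rather than just reset the pointer.

```go
package main

import "fmt"

// nursery is a toy young generation: allocation bumps next until it would
// cross limit. Requests larger than the whole space are not handled.
type nursery struct {
	next        int // offset of the next free byte
	limit       int // value the fast path compares against
	end         int // real size of the space
	safepoints  int // how often a safepoint request was honoured
	collections int // how often the space was really full
}

// requestSafepoint is what the runtime would do to stop this thread:
// lower the limit so that the next allocation takes the slow path.
func (n *nursery) requestSafepoint() { n.limit = 0 }

// alloc returns the offset of a fresh block of the given size.
func (n *nursery) alloc(size int) int {
	if n.next+size > n.limit {
		n.slowPath(size)
	}
	addr := n.next
	n.next += size
	return addr
}

// slowPath separates "someone asked us to stop" from "we are out of space".
func (n *nursery) slowPath(size int) {
	if n.limit != n.end {
		n.safepoints++
		n.limit = n.end // restore the real limit
	}
	if n.next+size > n.end {
		n.collections++
		n.next = 0 // stand-in for a minor collection
	}
}

func main() {
	n := &nursery{limit: 64, end: 64}
	n.alloc(16)
	n.requestSafepoint()
	n.alloc(16) // takes the slow path because of the request
	fmt.Println("safepoints:", n.safepoints, "collections:", n.collections)
}
```

The weakness mentioned above is visible here too: code that stops calling
alloc, because its values were moved to the stack, never reaches the check.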


On Sat, Jul 25, 2020 at 9:38 AM Groups Discussion <drakkan1...@gmail.com>
wrote:

> Hi all,
>
> writing a stress test case for one of my apps I noticed a very strange
> thing: my test case works well on go 1.14 but it doesn't work on go 1.13.
>
> I wrote a minimal reproducer
>
> https://play.golang.org/p/uHkKMINncUB
>
> to make it work on go 1.13 I have to add the sleep at line 41. In the Go
> playground it times out without sleeping, even on 1.14.6. On real hw I
> tried it 1000 times without sleeping, with no issues on go 1.14, while it
> fails every time on go 1.13 (tested 1.13.12 and 1.13.14)
>
> I'm just curious to understand if there is something wrong with my code,
> even if it is not idiomatic, and if such a busy synchronization wait is
> expected to work in 1.14 only
>
> thanks
>


-- 
J.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAGrdgiUuhSiFO_cZ%3Dh5_UsqVzogC9oLyLdvsKnW77mhtHrh%3DEA%40mail.gmail.com.
