2018-03-18 9:34 GMT-07:00 Michael Jones <michael.jo...@gmail.com>:
> Subtle observation--it is not fundamentally the memory model theorized in Go
> or C++ specifications that causes this--rather it is the least strong
> guarantee of the hardware on which these languages promise that standard
> programs will work as specified, and, following that, the decision of
> language designers how efficiently they want to use the fastest of that
> hardware.
If I might reflect this differently: it's a combination of what language
designers want to target based on the hardware platforms they intend to
support. Is that an accurate understanding of your point?

> If one is designing a language that is more demanding than the execution
> hardware promises, that's fine. It just means that the generated/evaluated
> code must virtualize what appear to be simple things. For example, if we
> want multiple threads to be able to do "i++" against the same variable, we
> can generate an implicit mutex around every access to every "i" that could
> ever be visible to a concurrent thread. There is nothing wrong with this,
> but the resulting code will not generally be as fast as the code of other
> languages where the offer is less general, say, requiring developers to put
> locks around critical sections or use atomic increments when they know
> memory could be accessed concurrently.

Some (maybe less academic) examples of where Go does this:

* There is no way to force at least 16-byte alignment of data per the
language spec, so there is no way to implement DCAS (notably missing from
sync/atomic) on amd64 (cmpxchg16b requires 16-byte alignment of its target
operand). DCAS is useful in lock-free protocols to avoid ABA problems. These
issues can be avoided in Go by copying, but this is usually inefficient (and
may end up taking locks anyway if you need to allocate to copy).

* The sync/atomic interface implements full memory barriers around all
operations. This is designed to obviate the need for understanding memory
barriers, but it is inefficient if all you need is (for example) a load fence
to guarantee loads are observed in some order. (I get that the applications
for such things are "clever" and folks are encouraged not to be "clever" in
this way. But if one has a relatively good understanding of this stuff, it's
rather unfortunate.)

> As it happens, the "auto-magic" mode can be just as fast in some cases.
> On IBM POWER9+ there are special memory ranges where the synchronization is
> hardware facilitated. On the CPUs most people have, though, the code would
> be much slower and this would likely be seen as a flaw.

This is an excellent point. I wish people would more often talk about the
applications for these things (like lock-freedom) in terms of system
predictability (and therefore scalability) rather than performance (and
therefore speed). Scalability doesn't automatically imply performance, but it
sometimes results in improved performance, and it is sometimes preferable
even when it doesn't (predictability is hugely important for correct capacity
modeling).

[1] provides another example of this kind of thing. Table 4 shows uncontended
latency for push/pop operations on a stack on x86-64 vs. POWER7. Spinlocks
are cheaper to acquire on x86-64 than on POWER7, so if writing for that
platform, you might protect mostly-uncontended access with spinlocks instead
of using a lock-free stack. You'd come to the opposite conclusion on POWER7.
So targeting either platform, one might see the less performant approach as a
flaw. But at the end of the day, Figure 6 (just below it) does a great job of
illustrating what you actually get from the progress guarantees of
lock-freedom once you introduce contention. (I'd really like the ability to
do this in Go...)

--dho

[1] https://queue.acm.org/detail.cfm?id=2492433 "Nonblocking Algorithms and
Scalable Multicore Programming", Samy Al Bahra, 2013

> On Sun, Mar 18, 2018 at 2:30 AM, Devon H. O'Dell <devon.od...@gmail.com>
> wrote:
>>
>> 2018-03-18 1:42 GMT-07:00 David Anderson <d...@natulte.net>:
>> > There's a difference though. The program that uses sync/atomic will
>> > behave non-deterministically, but within a small set of possible outcomes.
>> > The one
>>
>> While a fair point, I'd also note that the set of possible outcomes is not
>> small, and my point is more that the term "race" doesn't just mean
>> "access that violates a particular language's memory model."
>>
>> > without could do anything. In addition to producing an incorrect numeric
>> > result, it could deadlock, segfault, jump execution to some random place
>> > in the program code, ...
>>
>> "Incorrect" is subjective with respect to the counters themselves; there
>> is no concurrent access to those. I agree that, per the Go specification,
>> the program is not guaranteed to halt without the use of sync/atomic;
>> I should've worded that better. But the non-determinism in the
>> behavior of the program is precisely because of racing. (The term
>> itself comes from electrical engineering, where, for example, differing
>> wire lengths can cause signals to race.)
>>
>> > I highly recommend
>> > https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong.
>> > It's an excellent article (by the author of the Go race detector) that
>> > explains why "safe," "benign" data races are never safe or benign. The
>> > examples are in C++, but the same applies to Go - the compiler makes
>> > assumptions about code execution in the absence of synchronization
>> > points, and even simple optimizations can have hilarious consequences
>> > on code that violates those assumptions.
>>
>> It's a good article; I've read it several times before and should've
>> qualified things better. Some points related to the article:
>>
>> * Prior to C11, C had no concurrency model, and as such, any
>> multithreaded program exhibits undefined behavior when compared to the
>> specification.
>> * POSIX avoids a formal specification of a memory model entirely and
>> instead encourages what can effectively be summed up as "best
>> practices" for concurrent access, so it's no help either.
>> * Both C and C++ have adopted memory models that are provably
>> inconsistent (though I seem to have lost links to the papers).
>>
>> So while it's easy for this sort of thing to be "mis"-compiled, it's
>> still very much true that we're already relying on compiler behavior
>> (as opposed to some formal specification) in many languages.
>>
>> While the languages themselves may leave this behavior undefined, it's
>> easy enough in most of them to link against things written in other
>> languages. So we can take my invalid program, link it with some
>> racy_stop.s that does (for example) MOVB $1, (DI), relying on whatever
>> ordering and visibility semantics the machine provides, and implement
>> a similar interface for reading that memory. At that point, the
>> language specification is moot, and my crappy version is guaranteed to
>> halt again.
>>
>> Anyway, I had hoped not to get into the weeds here. Ignoring the
>> version that is not guaranteed to halt when specified in pure Go, the
>> non-determinism in both other examples occurs due to racing on data.
>> "Any race is a bug" is not true, and that was really the only point I
>> was trying to make.
>>
>> > - Dave
>>
>> --dho
>>
>> > On Sun, Mar 18, 2018 at 1:19 AM, Devon H.
O'Dell <devon.od...@gmail.com>
>> > wrote:
>> >>
>> >> Perhaps better illustrated:
>> >>
>> >> package main
>> >>
>> >> import (
>> >> 	"fmt"
>> >> 	"time"
>> >> )
>> >>
>> >> type info struct {
>> >> 	count int
>> >> 	id    int
>> >> }
>> >>
>> >> var stop int32
>> >>
>> >> func runner(id int, c chan info) {
>> >> 	info := info{id: id}
>> >> 	for stop == 0 {
>> >> 		info.count++
>> >> 		time.Sleep(1 * time.Microsecond)
>> >> 	}
>> >> 	c <- info
>> >> }
>> >>
>> >> func main() {
>> >> 	c := make(chan info)
>> >> 	for i := 0; i < 10; i++ {
>> >> 		go runner(i, c)
>> >> 	}
>> >>
>> >> 	time.Sleep(10 * time.Millisecond)
>> >> 	stop = 1
>> >>
>> >> 	for i := 0; i < 10; i++ {
>> >> 		fmt.Printf("Returned %+v\n", <-c)
>> >> 	}
>> >> }
>> >>
>> >> This program is guaranteed to halt, regardless of any GOMAXPROCS
>> >> setting. If you use the race detector on this program, it will
>> >> complain. You can "fix" "the race" by changing the loop condition to
>> >> "atomic.LoadInt32(&stop) == 0" and using atomic.StoreInt32(&stop, 1)
>> >> as your signal. Is the program still racy? Yep. Is either version of
>> >> this program buggy? That depends. It's not too hard to think of cases
>> >> where this behavior might be desirable; even lossy counters are a
>> >> thing.
>> >>
>> >> --dho
>> >>
>> >> 2018-03-18 0:43 GMT-07:00 Devon H. O'Dell <devon.od...@gmail.com>:
>> >> > 2018-03-16 2:16 GMT-07:00 Jérôme Champion <champ...@gmail.com>:
>> >> >> Any race is a bug. When there is a race, the compiler is free to
>> >> >> do whatever it wants.
>> >> >
>> >> > This is a false statement; whether or not a race constitutes a bug
>> >> > depends on the encapsulating protocol.
Consider the following:
>> >> >
>> >> > package main
>> >> >
>> >> > import (
>> >> > 	"fmt"
>> >> > 	"sync"
>> >> > 	"sync/atomic"
>> >> > )
>> >> >
>> >> > var wg sync.WaitGroup
>> >> >
>> >> > type Spinlock struct {
>> >> > 	state uint32
>> >> > }
>> >> >
>> >> > func (s *Spinlock) Lock() {
>> >> > 	for atomic.SwapUint32(&s.state, 1) == 1 {
>> >> > 	}
>> >> > }
>> >> >
>> >> > func (s *Spinlock) Unlock() {
>> >> > 	atomic.StoreUint32(&s.state, 0)
>> >> > }
>> >> >
>> >> > func locker(s *Spinlock, i int) {
>> >> > 	defer wg.Done()
>> >> > 	s.Lock()
>> >> > 	fmt.Printf("Locker %d acquired lock\n", i)
>> >> > 	s.Unlock()
>> >> > }
>> >> >
>> >> > func main() {
>> >> > 	s := &Spinlock{}
>> >> > 	for i := 0; i < 10; i++ {
>> >> > 		wg.Add(1)
>> >> > 		go locker(s, i)
>> >> > 	}
>> >> > 	wg.Wait()
>> >> > }
>> >> >
>> >> > This program does not have deterministic output because each
>> >> > goroutine races to acquire the lock, and which gets it first relies
>> >> > on scheduling, which is effectively stochastic. (Playground caches
>> >> > output, so there's no use illustrating it there.) Despite this, the
>> >> > program compiles just fine and behaves correctly for any reasonable
>> >> > definition of correct.
>> >> >
>> >> > I recognize this is not the issue mentioned in the OP, but
>> >> > hopefully it's clear that this is not just a semantics argument
>> >> > (race doesn't just mean "only things the race detector can find")
>> >> > and that the statement "any race is a bug" is false. This race only
>> >> > constitutes a bug when the program's requirements demand
>> >> > deterministic output.
>> >> >
>> >> > --dho
>> >> >
>> >> >> What do you want to do? Do you want to println before or after
>> >> >> cancelContext? In your example, println could even be executed
>> >> >> during cancelContext!
>> >> >> Your example is not enough. Such a program:
>> >> >> https://play.golang.org/p/KJsYwu-ivjb would fix your problem, but
>> >> >> I don't think that's what you asked for :)
>> >> >>
>> >> >> On Friday, March 16, 2018 at 03:11:49 UTC+1, Anmol Sethi wrote:
>> >> >>>
>> >> >>> https://play.golang.org/p/82Um1jSntBo
>> >> >>>
>> >> >>> Please run with the race detector to see the race.
>> >> >>>
>> >> >>> This race condition arises because private variables are read via
>> >> >>> reflection by the fmt package. Is there any elegant way to avoid
>> >> >>> such a race?
>> >> >>>
>> >> >>> --
>> >> >>> Best,
>> >> >>> Anmol
>> >> >>
>> >> >> --
>> >> >> You received this message because you are subscribed to the Google
>> >> >> Groups "golang-nuts" group.
>> >> >> To unsubscribe from this group and stop receiving emails from it,
>> >> >> send an email to golang-nuts+unsubscr...@googlegroups.com.
>> >> >> For more options, visit https://groups.google.com/d/optout.
>
> --
> Michael T. Jones
> michael.jo...@gmail.com