Re: [go-nuts] implementation of sync.atomic primitives

'Keith Randall' via golang-nuts Mon, 19 Mar 2018 16:55:01 -0700


On Monday, March 19, 2018 at 9:30:39 AM UTC-7, thepud...@gmail.com wrote:
>
> Hi Ian,
>
> I know you were not giving any type of definitive treatise on how go 
> treats atomics across different processors...
>
> but is a related aspect restricting instruction reordering by the compiler 
> itself?
>
>
Yes, the compiler needs to treat atomic loads differently from normal loads 
with respect to any instruction reordering it does.  So although *p and 
atomic.LoadUint32(p) both compile to a single MOVL on amd64, internally the 
compiler represents those two operations differently.
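To make the distinction concrete, here is a minimal sketch of the two kinds of load side by side. The function names plainLoad and atomicLoad are made up for illustration and are not anything in the runtime. On amd64 you would expect both bodies to come down to a single MOVL, which you can check with something like "go build -gcflags=-S", though the exact output depends on the compiler version.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// plainLoad (hypothetical name) reads *p with an ordinary load.
// The compiler may reorder or combine this load with nearby memory
// operations when the function is inlined into a caller.
func plainLoad(p *uint32) uint32 {
	return *p
}

// atomicLoad (hypothetical name) reads *p through sync/atomic.
// The compiler must keep this load ordered with respect to the
// surrounding memory operations, even if the emitted instruction
// on amd64 looks the same as the plain load.
func atomicLoad(p *uint32) uint32 {
	return atomic.LoadUint32(p)
}

func main() {
	var x uint32 = 42
	fmt.Println(plainLoad(&x), atomicLoad(&x))
}
```

In a single goroutine the two return the same value; the difference only matters for what the compiler is allowed to do around the load when other goroutines are writing.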


> I don't know what the modern go compiler does at this point, but I think 
> at least circa go 1.5 there was a nop function that seemed to be used to 
> help prevent the compiler from inlining and then doing instruction 
> re-ordering (first snippet below), and I think I've seen you make related 
> comments more recently (e.g., FreeBSD atomics discussion snippet I included 
> at the end of this post)?
>
> I haven't followed the more recent atomics related changes (including it 
> seems in 1.10 there might have been some work around intrinsics such as CL 
> 28076: "cmd/compile: intrinsify sync/atomic for amd64"?)...
>
> And yes, on the one hand the answer is "respect the memory model and get a 
> clean report from the race detector, etc., etc."... but of course sometimes 
> the performance aspect of the current compiler does matter beyond just mere 
> natural curiosity about how the go compiler does what it does (where 
> performance was the context I had looked at this more closely in the past).
>
> Two related snippets:
>
> ====================================================
> from go 1.5 
> https://github.com/golang/go/blob/release-branch.go1.5/src/runtime/atomic_amd64x.go#L11
> ====================================================
> // The calls to nop are to keep these functions from being inlined.
> // If they are inlined we have no guarantee that later rewrites of the
> // code by optimizers will preserve the relative order of memory accesses.
>
> //go:nosplit
> func atomicload(ptr *uint32) uint32 {
> 	nop()
> 	return *ptr
> }
> ====================================================
>
> ====================================================
> Ian Lance Taylor response to question on FreeBSD atomics discussion on 
> golang-dev: https://groups.google.com/forum/#!topic/golang-dev/f3PS8hp4Jfs
> ====================================================
>
> > The second issue I have is translating FreeBSD atomic operations to 
> > runtime atomic ops. 
> > If I understand it correctly then atomic_load_acq_32 has weaker 
> > requirements compared to runtime/internal/atomic.Load. 
> > On x86 the FreeBSD variant is just a compiler barrier to prevent it 
> > reordering instructions. 
>
> The Go compiler does reorder instructions.  But it doesn't reorder 
> instructions across a non-inlined function call.  On x86 a simple 
> memory load suffices for atomic.Load because x86 has a fairly strict 
> memory order in any case.  Most other processors are more lenient, and 
> require more work in the atomic operation. 
>
> ====================================================
>
> --thepudds
>
> On Monday, March 19, 2018 at 1:55:07 AM UTC-4, Ian Lance Taylor wrote:
>>
>> On Sun, Mar 18, 2018 at 9:47 PM, shivaram via golang-nuts 
>> <golan...@googlegroups.com> wrote: 
>> > 
>> > I noticed that internally, the language implementation seems to rely on 
>> > the atomicity of reads to single-word values: 
>> > 
>> > 
>> https://github.com/golang/go/blob/bd859439e72a0c48c64259f7de9f175aae3b9c37/src/runtime/chan.go#L160
>>  
>>
>> At the machine level, words like "atomicity" are overloaded with 
>> different meanings.  I think what you are saying is that the runtime 
>> package is assuming that a load of a machine word will never read an 
>> interleaving of two different stores of a machine word.  It will always 
>> read the value written by a single store, though exactly which store 
>> it sees is unknown.  This is true on all the processors that Go 
>> supports. 
>>
>>
>> > As I understand it, this atomicity is provided by the cache coherence 
>> > algorithms of modern architectures. Accordingly, the implementations in 
>> > sync/atomic of word-sized loads (e.g., LoadUint32 on 386 and LoadUint64 
>> > on amd64) use ordinary MOV instructions: 
>> > 
>> > 
>> https://github.com/golang/go/blob/bd859439e72a0c48c64259f7de9f175aae3b9c37/src/sync/atomic/asm_386.s#L146
>>  
>> > 
>> > 
>> https://github.com/golang/go/blob/bd859439e72a0c48c64259f7de9f175aae3b9c37/src/sync/atomic/asm_amd64.s#L103
>>  
>> > 
>> > However, word-sized stores on these architectures use special 
>> > instructions: 
>> > 
>> > 
>> https://github.com/golang/go/blob/bd859439e72a0c48c64259f7de9f175aae3b9c37/src/sync/atomic/asm_amd64.s#L133
>>  
>> > 
>> > Given that the APIs being implemented don't provide any global ordering 
>> > guarantees, what's the reason they can't be implemented solely with 
>> > MOV? 
>>
>> You are not giving the correct reason for why atomic.LoadUint32 and 
>> LoadUint64 can use ordinary MOV instructions on x86 processors.  The 
>> LoadUint32, etc., functions guarantee much more than that they read a 
>> value that is not an interleaving of multiple writes.  They are also 
>> load-acquire operations, meaning that when the function completes, the 
>> caller will see not only the value that was loaded but also all other 
>> values that some other processor core wrote before writing to the 
>> address being loaded (assuming the write was done using StoreUint32, 
>> etc.).  It happens that on x86 you can implement load-acquire using a 
>> simple MOV instruction.  Most other multicore processors use a more 
>> complex memory model, and their sync/atomic implementations are 
>> accordingly more complex. 
>>
>> Ian 
>>
>
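
The load-acquire behavior Ian describes in the quoted text above can be sketched as a small message-passing example. This is only an illustration under stated assumptions: the mailbox type and its publish and consume methods are hypothetical names, not part of sync/atomic or the runtime. The writer stores ordinary data and then sets a flag with atomic.StoreUint32; a reader that observes the flag with atomic.LoadUint32 is then also guaranteed to see the data written before it.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// mailbox is a hypothetical type used only for this sketch.
type mailbox struct {
	data  int    // ordinary field, published via ready
	ready uint32 // flag, accessed only through sync/atomic
}

// publish writes data with a plain store, then sets ready with an
// atomic store. The atomic store is what makes the earlier plain
// write visible to a reader that observes ready == 1.
func (m *mailbox) publish(v int) {
	m.data = v
	atomic.StoreUint32(&m.ready, 1)
}

// consume spins until the atomic load observes ready == 1, then
// reads data with a plain load. Because the atomic load acts as a
// load-acquire, the plain read sees the value written in publish.
func (m *mailbox) consume() int {
	for atomic.LoadUint32(&m.ready) == 0 {
		runtime.Gosched() // yield so the publisher can run
	}
	return m.data
}

func main() {
	m := &mailbox{}
	go m.publish(42)
	fmt.Println(m.consume()) // prints 42
}
```

If the flag were read and written with plain loads and stores instead, the race detector would report a data race, and neither the compiler nor a weakly ordered processor would be obliged to preserve the order of the two writes.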
