On 01/15/2015 12:13 AM, Jakub Jelinek wrote:
> On Thu, Jan 15, 2015 at 07:46:18AM +0100, Richard Biener wrote:
>>>> That last line means the compiler is free to delete a non-volatile
>>>> asm with a memory clobber if that asm is not needed for dataflow.  Or
>>>> that is how I read it; it is trying to indicate that if you want to
>>>> prevent the memory clobber from being deleted (together with the rest
>>>> of the asm), you need to make the asm volatile.
>>>>
>>>> So as far as I can see the compiler can CSE two identical
>>> non-volatile
>>>> asms with memory clobber just fine.  Older GCC (I tried 4.7.2) does
>>> do
>>>> this; current mainline doesn't.  I think it should.
>>> No, it should not CSE those two cases.  That's simply wrong and if an 
>>> older version did that optimization, that's a bug.
>>
>> I think segher has a point here.  If the asm with memory clobber would store 
>> to random memory and the point would be to preserve that then the whole 
>> distinction with volatile doesn't make much sense (after all without 
>> volatile we happily DCE such asm if the regular outputs are not needed).
>>
>> This doesn't mean 'memory' is a well-designed thing, of course. Just its
>> effects are effectively limited to reads without volatile(?)
> 
> Segher's mails talk about "memory" being a write but not read.
> If we even can't agree on what non-volatile "memory" means, I think
> we should treat it more conservatively, because every user (and there are
> lots of people using non-volatile asm with "memory" in the wild) treats it
> differently.  Just trying to grep for a few:
> glibc:
> ./sysdeps/alpha/bits/atomic.h:# define atomic_full_barrier()  __asm ("mb" : : 
> : "memory");
> ./sysdeps/alpha/bits/atomic.h:# define atomic_read_barrier()  __asm ("mb" : : 
> : "memory");
> ./sysdeps/alpha/bits/atomic.h:# define atomic_write_barrier() __asm ("wmb" : 
> : : "memory");
> ./sysdeps/sparc/sparc32/bits/atomic.h:# define atomic_full_barrier() __asm 
> ("" ::: "memory")
> ./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()      
> __asm ("lwsync" ::: "memory")
> ./sysdeps/powerpc/powerpc32/bits/atomic.h:# define atomic_read_barrier()      
> __asm ("sync" ::: "memory")
> ./sysdeps/powerpc/powerpc64/bits/atomic.h:#define atomic_read_barrier()       
> __asm ("lwsync" ::: "memory")
> ./sysdeps/powerpc/bits/atomic.h:#define atomic_full_barrier() __asm ("sync" 
> ::: "memory")
> ./sysdeps/powerpc/bits/atomic.h:#define atomic_write_barrier()        __asm 
> ("eieio" ::: "memory")
> ./sysdeps/generic/malloc-machine.h:# define atomic_full_barrier() __asm ("" 
> ::: "memory")

I think that it's uses like these -- which may well have been written
by folks that also work on gcc -- that are proof that we have at least
intended to support a memory clobber to be a full read+write barrier,
and thus we must consider a memory clobber to be both a read and write.

(The fact that all of these are automatically volatile and would never be CSEd
is beside the point.  If the semantics of a memory clobber differ based on the
volatile flag on the asm, I think that would be too ill-defined to actually
support.)

In the interest of progressing wrt the current regression, I think that
Jakub's patch should go in as-is for now, and then we iterate on how we
think the memory cse ought (or ought not) to occur.

As for my own thoughts on whether two non-volatile asms with memory clobbers
should be CSE'd, in absence of other stores to memory in between, are
complicated and probably not well-formed.  I'll think about it some more.


r~

Reply via email to