> On Sep 22, 2020, at 5:37 PM, Segher Boessenkool <seg...@kernel.crashing.org> 
> wrote:
> 
> Hi!
> 
> On Tue, Sep 22, 2020 at 06:06:30PM +0100, Richard Sandiford wrote:
>> Qing Zhao <qing.z...@oracle.com> writes:
>>> Okay, thanks for the info. 
>>> then, what’s the current definition of UNSPEC_VOLATILE? 
>> 
>> I'm not sure it's written down anywhere TBH.  rtl.texi just says:
>> 
>>  @code{unspec_volatile} is used for volatile operations and operations
>>  that may trap; @code{unspec} is used for other operations.
>> 
>> which seems like a cyclic definition: volatile expressions are defined
>> to be expressions that are volatile.
> 
> volatile_insn_p returns true for unspec_volatile (and all other volatile
> things).  Unfortunately the comment on this function is just as confused
> as pretty much everything else :-/
> 
>> But IMO the semantics are that unspec_volatile patterns with a given
>> set of inputs and outputs act for dataflow purposes like volatile asms
>> with the same inputs and outputs.  The semantics of asm volatile are
>> at least slightly more well-defined (if only by example); see extend.texi
>> for details.  In particular:
>> 
>>  Note that the compiler can move even @code{volatile asm} instructions relative
>>  to other code, including across jump instructions. For example, on many 
>>  targets there is a system register that controls the rounding mode of 
>>  floating-point operations. Setting it with a @code{volatile asm} statement,
>>  as in the following PowerPC example, does not work reliably.
>> 
>>  @example
>>  asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>>  sum = x + y;
>>  @end example
>> 
>>  The compiler may move the addition back before the @code{volatile asm}
>>  statement. To make it work as expected, add an artificial dependency to
>>  the @code{asm} by referencing a variable in the subsequent code, for
>>  example:
>> 
>>  @example
>>  asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>>  sum = x + y;
>>  @end example
>> 
>> which is very similar to the unspec_volatile case we're talking about.
> 
> So just like volatile memory accesses, they have an (unknown) side
> effect, which means they have to execute on the real machine as on the
> abstract machine (wrt sequence points).  All side effects have to happen
> exactly as often as prescribed, and in the same order.  Just like
> volatile asm, too.
I don’t quite understand the above. What do you mean by “they have to 
execute on the real machine as on the abstract machine”?

> 
> And there is no magic to it, there are no other effects.
> 
>> To take an x86 example:
>> 
>>  void
>>  f (char *x)
>>  {
>>    asm volatile ("");
>>    x[0] = 0;
>>    asm volatile ("");
>>    x[1] = 0;
>>    asm volatile ("");
>>  }
>> 
>> gets optimised to:
>> 
>>        xorl    %eax, %eax
>>        movw    %ax, (%rdi)
> 
> (If you use "#" or "#smth" you can see those in the generated asm --
> completely empty asm is helpfully (uh...) not printed.)

Can you explain this in more detail?

> 
>> with the two stores being merged.  The same thing is IMO valid for
>> unspec_volatile.  In both cases, you would need some kind of memory
>> clobber to prevent the move and merge from happening.
> 
> Even then, x[] could be optimised away completely (with whole program
> optimisation, or something).  The only way to really prevent the
> compiler from optimising memory accesses is to make it not see the
> details (with an asm or an unspec, for example).
You mean with an asm volatile ("" : : : "memory")?

> 
>> The above is conservatively correct.  But not all passes do it.
>> E.g. combine does have a similar approach:
>> 
>>  /* If INSN contains volatile references (specifically volatile MEMs),
>>     we cannot combine across any other volatile references.
> 
> And this is correct, and the *minimum* to do even (this could change the
> order of the side effects, depending how combine places the resulting
> insns in I2 and I3).

Please clarify what “I2 and I3” are.
> 
>>     Even if INSN doesn't contain volatile references, any intervening
>>     volatile insn might affect machine state.  */
> 
> Confusingly stated, but essentially correct (it is possible we place
> the volatile at I2, and everything would still be sequenced correctly,
> but combine does not guarantee that).

Thanks.

Qing
> 
>>  is_volatile_p = volatile_refs_p (PATTERN (insn))
>>    ? volatile_refs_p
>>    : volatile_insn_p;
> 
> Too much subtlety in there, heh.
> 
> 
> Segher
