> On Sep 22, 2020, at 5:37 PM, Segher Boessenkool <seg...@kernel.crashing.org>
> wrote:
>
> Hi!
>
> On Tue, Sep 22, 2020 at 06:06:30PM +0100, Richard Sandiford wrote:
>> Qing Zhao <qing.z...@oracle.com> writes:
>>> Okay, thanks for the info.
>>> Then, what’s the current definition of UNSPEC_VOLATILE?
>>
>> I'm not sure it's written down anywhere TBH. rtl.texi just says:
>>
>> @code{unspec_volatile} is used for volatile operations and operations
>> that may trap; @code{unspec} is used for other operations.
>>
>> which seems like a cyclic definition: volatile expressions are defined
>> to be expressions that are volatile.
>
> volatile_insn_p returns true for unspec_volatile (and all other volatile
> things). Unfortunately the comment on this function is just as confused
> as pretty much everything else :-/
>
>> But IMO the semantics are that unspec_volatile patterns with a given
>> set of inputs and outputs act for dataflow purposes like volatile asms
>> with the same inputs and outputs. The semantics of asm volatile are
>> at least slightly more well-defined (if only by example); see extend.texi
>> for details. In particular:
>>
>> Note that the compiler can move even @code{volatile asm} instructions
>> relative to other code, including across jump instructions. For example,
>> on many
>> targets there is a system register that controls the rounding mode of
>> floating-point operations. Setting it with a @code{volatile asm} statement,
>> as in the following PowerPC example, does not work reliably.
>>
>> @example
>> asm volatile("mtfsf 255, %0" : : "f" (fpenv));
>> sum = x + y;
>> @end example
>>
>> The compiler may move the addition back before the @code{volatile asm}
>> statement. To make it work as expected, add an artificial dependency to
>> the @code{asm} by referencing a variable in the subsequent code, for
>> example:
>>
>> @example
>> asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
>> sum = x + y;
>> @end example
>>
>> which is very similar to the unspec_volatile case we're talking about.
>
> So just like volatile memory accesses, they have an (unknown) side
> effect, which means they have to execute on the real machine as on the
> abstract machine (wrt sequence points). All side effects have to happen
> exactly as often as prescribed, and in the same order. Just like
> volatile asm, too.
I don’t quite understand the above. What do you mean by “they have to
execute on the real machine as on the abstract machine”?
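To check my understanding, here is a minimal sketch of how I read it for
plain volatile memory accesses. The names status_reg, set_status and
read_status_twice are made up for illustration. My assumption is that the
compiler must emit both loads, in source order, and may not fold them
into one:

```c
/* Hypothetical stand-in for a device register; not from GCC itself.  */
static volatile int status_reg;

/* Helper so that callers can give the "register" a known value.  */
void
set_status (int v)
{
  status_reg = v;
}

int
read_status_twice (void)
{
  int first = status_reg;   /* volatile read #1 */
  int second = status_reg;  /* volatile read #2; assumed not mergeable with #1 */
  return first + second;
}
```

Without the volatile qualifier I would expect a single load followed by
an add of the value to itself.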
>
> And there is no magic to it, there are no other effects.
>
>> To take an x86 example:
>>
>> void
>> f (char *x)
>> {
>>   asm volatile ("");
>>   x[0] = 0;
>>   asm volatile ("");
>>   x[1] = 0;
>>   asm volatile ("");
>> }
>>
>> gets optimised to:
>>
>> xorl %eax, %eax
>> movw %ax, (%rdi)
>
> (If you use "#" or "#smth" you can see those in the generated asm --
> completely empty asm is helpfully (uh...) not printed.)
Can you explain this in more detail?
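My guess at what is meant, as a sketch: give each asm a body that is only
an assembler comment, so that it shows up in the -S output and one can
see where each asm ended up. This assumes a target whose assembler treats
"#" as a comment character (x86 GNU as does). The function name g and the
marker text are made up, and I spell the keyword __asm__ only so that it
also builds under strict -std=c99:

```c
/* Same shape as f () in the quoted example, but each asm carries a
   comment so that it is visible in the generated assembly.  */
void
g (char *x)
{
  __asm__ volatile ("# marker 1");
  x[0] = 0;
  __asm__ volatile ("# marker 2");
  x[1] = 0;
  __asm__ volatile ("# marker 3");
}
```

Compiling with "gcc -O2 -S" should then show where the three markers sit
relative to the store(s).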
>
>> with the two stores being merged. The same thing is IMO valid for
>> unspec_volatile. In both cases, you would need some kind of memory
>> clobber to prevent the move and merge from happening.
>
> Even then, x[] could be optimised away completely (with whole program
> optimisation, or something). The only way to really prevent the
> compiler from optimising memory accesses is to make it not see the
> details (with an asm or an unspec, for example).
You mean with an asm volatile ("" : : : "memory")?
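That is, something like the following sketch. The name f_with_clobber is
made up, and __asm__ is used only so that it also builds under strict
-std=c99. My assumption is that the "memory" clobber makes the compiler
treat each asm as possibly reading or writing x[], so the two byte stores
can no longer be moved across it and merged:

```c
/* Variant of f () from the quoted example with a "memory" clobber on
   every asm.  */
void
f_with_clobber (char *x)
{
  __asm__ volatile ("" : : : "memory");
  x[0] = 0;
  __asm__ volatile ("" : : : "memory");  /* assumed to keep the stores apart */
  x[1] = 0;
  __asm__ volatile ("" : : : "memory");
}
```

If I understand your point correctly, even this only constrains what the
compiler does around the asm; it does not by itself guarantee that x[]
survives whole-program optimisation.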
>
>> The above is conservatively correct. But not all passes do it.
>> E.g. combine does have a similar approach:
>>
>> /* If INSN contains volatile references (specifically volatile MEMs),
>> we cannot combine across any other volatile references.
>
> And this is correct, and the *minimum* to do even (this could change the
> order of the side effects, depending how combine places the resulting
> insns in I2 and I3).
Please clarify what “I2 and I3” are?
>
>> Even if INSN doesn't contain volatile references, any intervening
>> volatile insn might affect machine state. */
>
> Confusingly stated, but essentially correct (it is possible we place
> the volatile at I2, and everything would still be sequenced correctly,
> but combine does not guarantee that).
Thanks.
Qing
>
>> is_volatile_p = volatile_refs_p (PATTERN (insn))
>> ? volatile_refs_p
>> : volatile_insn_p;
>
> Too much subtlety in there, heh.
>
>
> Segher