Re: [RFC] Do not consider volatile asms as optimization barriers #1

Richard Sandiford Mon, 03 Mar 2014 00:02:24 -0800

Eric Botcazou <ebotca...@adacore.com> writes:
>> Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
>> But the implication seems to be that unspec_volatile and volatile asms
>> are volatile in different ways.  IMO they're volatile in the same way
>> and the problems for volatile asms apply to unspec_volatile too.
>
> I disagree, we need a simple way for the RTL middle-end as well as the back-
> ends to block most optimizations across a specific point (e.g. a non-local 
> label as in HP's fix) and UNSPEC_VOLATILE is the best candidate, at least in 
> the short term.


I don't agree it's the best candidate if...

>> E.g. although cse.c will flush the table for unspec_volatile,
>> it isn't the case that unspec_volatile forces a containing function
>> to save all call-saved registers.  That would be excessive for a plain
>> blockage instruction.  So again we seem to be assuming one thing in places
>> like cse.c and another in the register allocator.  Code that uses the DF
>> framework will also assume that registers are not implicitly clobbered
>> by an unspec_volatile:
>> [...]
>> Also, ira-lives.c (which tracks the liveness of both pseudo and hard
>> registers) doesn't mention "volatile" at all.
>
> Yes, the definition of a blockage instruction is somewhat vague and I agree 
> that it shoudn't cause registers to be spilled.  But it needs to block most, 
> if not all, optimizations.

...it's so loosely defined.  If non-local labels are the specific problem,
I think it'd be better to limit the flush to that.

I'm back to throwing examples around, sorry, but take the MIPS testcase:

    volatile int x = 1;

    void foo (void)
    {
      x = 1;
      __builtin_mips_set_fcsr (0);
      x = 2;
    }

where __builtin_mips_set_fcsr is a handy way of getting unspec_volatile.
(I'm not interested in what the function does here.)  Even at -O2,
the cse.c code successfully prevents %hi(x) from being shared,
as you'd expect:

        li      $3,1                    # 0x1
        lui     $2,%hi(x)
        sw      $3,%lo(x)($2)
        move    $2,$0
        ctc1    $2,$31
        li      $3,2                    # 0x2
        lui     $2,%hi(x)
        sw      $3,%lo(x)($2)
        j       $31
        nop

But put it in a loop:

    void frob (void)
    {
      for (;;)
        {
          x = 1;
          __builtin_mips_set_fcsr (0);
          x = 2;
        }
    }

and we get the rather bizarre code:

        lui     $2,%hi(x)
        li      $6,1                    # 0x1
        move    $5,$0
        move    $4,$2
        li      $3,2                    # 0x2
        .align  3
.L3:
        sw      $6,%lo(x)($2)
        ctc1    $5,$31
        sw      $3,%lo(x)($4)
        j       .L3
        lui     $2,%hi(x)

Here the _second_ %hi(x), the 1 and the 2 have been hoisted but the first
%hi(x) is reloaded each time.  So what's the correct behaviour here?
Should the hoisting of the second %hi(x) have been disabled because the
loop contains an unspec_volatile?  What about the 1 (from the first store)
and the 2?

If instead it was:

   for (i = 0; i < 100; i++)

would converting to a hardware do-loop be acceptable?

>> So most passes assume that no pseudos or hard registers will be
>> implicitly clobbered by unspec_volatile, just like for a volatile asm.
>> And IMO that's right.  I think the rule should be the same for volatile
>> asms and unspec_volatiles, and the same for registers as it already is for
>> memory: if the instruction clobbers something, it should say so explicitly.
>
> IMO that would buy us nothing and, on the contrary, would add complexity 
> where 
> there currently isn't.  We really need a simple blockage instruction.
>
>> Volatile itself should:
>> 
>> (a) prevent deletion or duplication of the operation
>> (b) prevent reordering wrt other volatiles
>> (c) prevent the operation from being considered equivalent to any other
>>     operation (even if it's structurally identical and has the same inputs)
>> 
>> but nothing beyond that.
>
> Maybe UNSPEC_VOLATILE is a misnomer then and we should allow volatile UNSPECs 
> along the above lines.

That'd be fine with me, especially since with the patch it sounds like
using volatile asm would produce better code than a built-in function
that expands to an unspec_volatile, whereas IMO the opposite should
always be true.

But I still think we need a similar list for what unspec_volatile
means now, if we decide to keep something with the current meaning.

Thanks,
Richard

Re: [RFC] Do not consider volatile asms as optimization barriers #1

Reply via email to