Eric Botcazou <ebotca...@adacore.com> writes:
>> One of the big grey areas is what should happen for floating-point ops
>> that depend on the current rounding mode.  That isn't really modelled
>> properly yet though.  Again, it affects calls as well as volatile asms.
>
> There is an explicit comment about this in the scheduler:
>
>     case ASM_OPERANDS:
>     case ASM_INPUT:
>       {
>       /* Traditional and volatile asm instructions must be considered to use
>          and clobber all hard registers, all pseudo-registers and all of
>          memory.  So must TRAP_IF and UNSPEC_VOLATILE operations.
>
>          Consider for instance a volatile asm that changes the fpu rounding
>          mode.  An insn should not be moved across this even if it only uses
>          pseudo-regs because it might give an incorrectly rounded result.  */
>       if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
>         reg_pending_barrier = TRUE_BARRIER;

But here too the point is that we don't assume the same thing at the
tree level or during register allocation.  It seems a bit silly for
the scheduler to assume that all hard registers are clobbered when the
register allocator itself doesn't assume that.  And most rtl passes
assume that changes to pseudo registers are explicitly modelled via SETs
or CLOBBERs.

E.g. to slightly adjust my previous example:

  void foo (float *dest, float x, float y)
  {
    dest[0] = x + y;
    asm volatile ("# foo");
    dest[1] = x + y;
  }

gives:

        addss   %xmm1, %xmm0
        movss   %xmm0, (%rdi)
#APP
# 4 "/tmp/foo.c" 1
        # foo
# 0 "" 2
#NO_APP
        movss   %xmm0, 4(%rdi)
        ret

And we certainly don't consider volatile asms to clobber memory in other
places.  E.g.:

  void foo (float *dest, float x, float y)
  {
    dest[0] = x + y;
    asm volatile ("# foo");
    dest[1] = dest[0];
  }

gives the same code as above.  In contrast:

  void foo (float *dest, float x, float y)
  {
    dest[0] = x + y;
    asm volatile ("# foo" ::: "memory");
    dest[1] = dest[0];
  }

_does_ force dest[0] to be reloaded.
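
A minimal sketch of wrapping that clobber in a reusable barrier, assuming
GNU C inline asm.  The names COMPILER_BARRIER and copy_after_barrier are
hypothetical and used only for illustration:

```c
/* Hypothetical helper: an empty asm whose only declared effect is the
   "memory" clobber, so the compiler must assume memory may have changed.  */
#define COMPILER_BARRIER() __asm__ volatile ("" ::: "memory")

/* Same shape as the example above, with the clobber spelled as a macro.  */
void copy_after_barrier (float *dest, float x, float y)
{
  dest[0] = x + y;
  COMPILER_BARRIER ();
  dest[1] = dest[0];   /* dest[0] is read again after the barrier */
}
```

The barrier only constrains the compiler's view of memory; it says nothing
about registers or the rounding mode.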

As for it being a scheduling barrier, consider something like:

  void foo (float *dest, float x, float y)
  {
    int i;
    for (i = 0; i < 100; i++)
      {
        asm volatile ("# a");
        dest[i] = x + y;
        asm volatile ("# b");
      }
  }

At -O2 the addition is hoisted out of the loop, so the loop optimizers
don't treat the volatile asms as barriers either.
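
If the addition really must stay between the two asms, the dependency has
to be modelled explicitly.  One possible sketch, assuming GNU C inline asm,
ties x to the asm as a read-write memory operand.  The name foo_tied and
the extra parameter n are hypothetical, and the asm templates are left
empty so the example does not depend on the target's comment syntax:

```c
/* Hypothetical variant of foo: the first asm takes x as a read-write
   operand, so the compiler has to assume x may change on every
   iteration and should not be able to hoist x + y out of the loop.  */
void foo_tied (float *dest, float x, float y, int n)
{
  int i;
  for (i = 0; i < n; i++)
    {
      /* "+m" (x) declares that the asm may both read and write x.  */
      __asm__ volatile ("" : "+m" (x));
      dest[i] = x + y;
      __asm__ volatile ("");
    }
}
```

This expresses the constraint through operands, which all passes honour,
instead of relying on how conservative one particular pass happens to be.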

Obviously the scheduler is free to be extra conservative, but I don't
think the comment describes the actual semantics of volatile asms.

Thanks,
Richard
