Eric Botcazou <ebotca...@adacore.com> writes:
>> One of the big grey areas is what should happen for floating-point ops
>> that depend on the current rounding mode.  That isn't really modelled
>> properly yet though.  Again, it affects calls as well as volatile asms.
>
> There is an explicit comment about this in the scheduler:
>
>     case ASM_OPERANDS:
>     case ASM_INPUT:
>       {
>         /* Traditional and volatile asm instructions must be considered to use
>            and clobber all hard registers, all pseudo-registers and all of
>            memory.  So must TRAP_IF and UNSPEC_VOLATILE operations.
>
>            Consider for instance a volatile asm that changes the fpu rounding
>            mode.  An insn should not be moved across this even if it only uses
>            pseudo-regs because it might give an incorrectly rounded result.  */
>         if (code != ASM_OPERANDS || MEM_VOLATILE_P (x))
>           reg_pending_barrier = TRUE_BARRIER;

But here too the point is that we don't assume the same thing at the tree
level or during register allocation.  It seems a bit silly for the
scheduler to assume that all hard registers are clobbered when the
register allocator itself doesn't assume that.  And most rtl passes
assume that changes to pseudo registers are explicitly modelled via SETs
or CLOBBERs.  E.g. to slightly adjust my previous example:

    void
    foo (float *dest, float x, float y)
    {
      dest[0] = x + y;
      asm volatile ("# foo");
      dest[1] = x + y;
    }

gives:

            addss   %xmm1, %xmm0
            movss   %xmm0, (%rdi)
    #APP
    # 4 "/tmp/foo.c" 1
            # foo
    # 0 "" 2
    #NO_APP
            movss   %xmm0, 4(%rdi)
            ret

And we certainly don't consider volatile asms to clobber memory in
other places.  E.g.:

    void
    foo (float *dest, float x, float y)
    {
      dest[0] = x + y;
      asm volatile ("# foo");
      dest[1] = dest[0];
    }

gives the same code as above.  In contrast:

    void
    foo (float *dest, float x, float y)
    {
      dest[0] = x + y;
      asm volatile ("# foo" ::: "memory");
      dest[1] = dest[0];
    }

_does_ force dest[0] to be reloaded.

As far as it being a scheduling barrier, consider something like:

    void
    foo (float *dest, float x, float y)
    {
      int i;
      for (i = 0; i < 100; i++)
        {
          asm volatile ("# a");
          dest[i] = x + y;
          asm volatile ("# b");
        }
    }

At -O2 the addition is hoisted out of the loop.

Obviously the scheduler is free to be extra conservative but I don't
think the comment describes the semantics of volatile asms.

Thanks,
Richard