Eric Botcazou <ebotca...@adacore.com> writes:
> There seems to be a sufficiently large consensus that volatile asms should 
> not 
> be treated as full optimization barriers by the compiler.  This patch is a 
> first step towards implementing this conclusion and aims only at addressing 
> the code quality regressions introduced by
>   http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=193802
> on the 4.8 branch and mainline for volatile asms.
>
> It introduces a new, temporary predicate really_volatile_insn_p and invokes 
> it 
> from the 3 places in cse.c, cselib.c and dse.c which were touched above.  But 
> this comes with a side effect: the "blockage" standard pattern needs to be 
> adjusted to always use an UNSPEC_VOLATILE.  That's already the case for all 
> the architectures that define it, but 21 architectures (aarch64 arc avr bfin 
> cr16 cris epiphany h8300 m32c mcore mmix mn10300 moxie msp430 nds32 picochip 
> rl78 rx score v850 xtensa) don't define it and therefore need to be adjusted.
>
> Tested on x86_64-suse-linux and by building cc1 and compiling a relevant 
> testcase for the 21 aforementioned architectures.

Thanks for doing this.  FWIW I agree it's probably the best stop-gap fix.
But the implication seems to be that unspec_volatile and volatile asms
are volatile in different ways.  IMO they're volatile in the same way
and the problems for volatile asms apply to unspec_volatile too.

E.g. although cse.c will flush the table for unspec_volatile,
it isn't the case that unspec_volatile forces a containing function
to save all call-saved registers.  That would be excessive for a plain
blockage instruction.  So again we seem to be assuming one thing in places
like cse.c and another in the register allocator.  Code that uses the DF
framework will also assume that registers are not implicitly clobbered
by an unspec_volatile:

    case ASM_OPERANDS:
    case UNSPEC_VOLATILE:
    case TRAP_IF:
    case ASM_INPUT:
      {
        /* Traditional and volatile asm instructions must be
           considered to use and clobber all hard registers, all
           pseudo-registers and all of memory.  So must TRAP_IF and
           UNSPEC_VOLATILE operations.

           Consider for instance a volatile asm that changes the fpu
           rounding mode.  An insn should not be moved across this
           even if it only uses pseudo-regs because it might give an
           incorrectly rounded result.

           However, flow.c's liveness computation did *not* do this,
           giving the reasoning as " ?!? Unfortunately, marking all
           hard registers as live causes massive problems for the
           register allocator and marking all pseudos as live creates
           mountains of uninitialized variable warnings."

           In order to maintain the status quo with regard to liveness
           and uses, we do what flow.c did and just mark any regs we
           can find in ASM_OPERANDS as used.  In global asm insns are
           scanned and regs_asm_clobbered is filled out.

           For all ASM_OPERANDS, we must traverse the vector of input
           operands.  We can not just fall through here since then we
           would be confused by the ASM_INPUT rtx inside ASM_OPERANDS,
           which do not indicate traditional asms unlike their normal
           usage.  */
        if (code == ASM_OPERANDS)
          {
            int j;

            for (j = 0; j < ASM_OPERANDS_INPUT_LENGTH (x); j++)
              df_uses_record (collection_rec, &ASM_OPERANDS_INPUT (x, j),
                              DF_REF_REG_USE, bb, insn_info, flags);
            return;
          }
        break;
      }

Also, ira-lives.c (which tracks the liveness of both pseudo and hard
registers) doesn't mention "volatile" at all.

So most passes assume that no pseudos or hard registers will be
implicitly clobbered by unspec_volatile, just like for a volatile asm.
And IMO that's right.  I think the rule should be the same for volatile
asms and unspec_volatiles, and the same for registers as it already is for
memory: if the instruction clobbers something, it should say so explicitly.
Volatile itself should:

(a) prevent deletion or duplication of the operation
(b) prevent reordering wrt other volatiles
(c) prevent the operation from being considered equivalent to any other
    operation (even if it's structurally identical and has the same inputs)

but nothing beyond that.

So in terms of where we go from here, I'd hope we'd add something like
fake hard registers to track any processor mode of interest, such as FP
rounding mode once that is modelled properly.  Then anything that relies
on a specific mode would use that register and anything that changes the
mode would set the register, meaning we have a proper dependency chain.
I think we should do the same for whatever unspec_volatiles caused the
RTL CSE & co. to be so conservative.  Trying to leave it implicit seems
too error-prone, because then you have to make sure that every rtl pass
agrees on what the implicit behaviour is.

Thanks,
Richard


Reply via email to