Robert Dewar wrote:
> Dave Korn wrote:
>> Robert Dewar wrote:
>> 
> 
>>   Isn't it pretty much implied by point 1, "Not more than one volatile
>> memory ref appears in the instructions being considered"?
> 
> No, that allows a volatile reference to be combined with something else.

  Ah, I misunderstood you... when you said "you never combine volatile
references" I thought you meant "with each other"!

> I think this is a mistake, because people often think of volatile as
> guaranteeing what Ada would call an atomic access, one instruction
> accessing just the variable and nothing else.

  Well, yes, they do.  So do I, more-or-less!

  Remember, however, that in the particular case I was concerned with, we can
show that the second insn is actually a no-op (because we have LOAD_EXTEND_OP
telling us so), and the only reason I want them combined is so that the second
insn can be entirely eliminated.  So I think it's reasonable to fold a
volatile mem-ref into what amounts to a nop-move.

  The thing is that the load-byte insn on my target clears the upper 24 bits
of the register, and gcc takes advantage of that when (e.g.) you load bytes
into registers prior to doing a comparison.  But if those bytes are loaded
through volatile pointers, it needlessly adds an AND with 0xFF immediate after
the load.  This is a common enough situation in our application to be worth
fixing in the compiler.

  So, I think I'm going to go with my current approach, which is:

- If the insn isn't recognized, but allowing volatile_ok causes it to be
recognized
- and the insns are consecutive SETs
- and the SET_SRC for i2 is a mem ref and the SET_DEST is a reg
- and the SET_SRC for i3 is a zero_extend or sign_extend
- and the operand of the zero or sign extend in i3 matches either the SET_DEST
of i2 or the SET_SRC of i2 [*] 
- and LOAD_EXTEND_OP for the mem ref's mode matches the code of the SET_SRC in
i3
- and the reg from the SET_DEST of i2 has a REG_DEAD note on i3

then it's OK to combine.  Unless of course someone points out some major flaw
in my reasoning.

  Note the slightly funny test at [*].  That's because i3 has already been
partially combined/simplified at the point in try_combine
(validate_replacement:) where I'll be doing this test.  I want to discriminate
from the case where these are actually two entirely separate references to the
same memory location, one of which just happens to be inside a zero/sign
extend.

  This should be specific enough to catch my volatile byte loads without
borking any other construct, I hope.  At any rate, if there's a loophole in
that set of rules I can't see it.  I had hoped to make a more general
improvement to combine, though.  I guess that any time I have a volatile load
followed by an ALU-style op that uses the dest of that load and leaves the
load's dest reg_dead, I could in theory combine it validly - but there's no
point doing so except in the case where LOAD_EXTEND_OP (or any other reason)
means that part of the resulting construction can be eliminated, rather than
just splitting the thing back to separate insns sometime in final code
generation.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....

Reply via email to