Robert Dewar wrote: > Dave Korn wrote: >> Robert Dewar wrote: >> > >> Isn't it pretty much implied by point 1, "Not more than one volatile >> memory ref appears in the instructions being considered"? > > No, that allows a volatile reference to be combined with something else.
Ah, I misunderstood you... when you said "you never combine volatile references" I thought you meant "with each other"! > I think this is a mistake, because people often think of volatile as > guaranteeing what Ada would call an atomic access, one instruction > accessing just the variable and nothing else. Well, yes, they do. So do I, more-or-less! Remember, however, that in the particular case I was concerned with, we can show that the second insn is actually a no-op (because we have LOAD_EXTEND_OP telling us so), and the only reason I want them combined is so that the second insn can be entirely eliminated. So I think it's reasonable to fold a volatile mem-ref into what amounts to a nop-move. The thing is that the load-byte insn on my target clears the upper 24 bits of the register, and gcc takes advantage of that when (e.g.) you load bytes into registers prior to doing a comparison. But if those bytes are loaded through volatile pointers, it needlessly adds an AND with 0xFF immediate after the load. This is a common enough situation in our application to be worth fixing in the compiler. So, I think I'm going to go with my current approach, which is: - If the insn isn't recognized, but allowing volatile_ok causes it to be recognized - and the insns are consecutive SETs - and the SET_SRC for i2 is a mem ref and the SET_DEST is a reg - and the SET_SRC for i3 is a zero_extend or sign_extend - and the operand of the zero or sign extend in i3 matches either the SET_DEST of i2 or the SET_SRC of i2 [*] - and LOAD_EXTEND_OP for the mem ref's mode matches the code of the SET_SRC in i3 - and the reg from the SET_DEST of i2 has a REG_DEAD note on i3 then it's OK to combine. Unless of course someone points out some major flaw in my reasoning. Note the slightly funny test at [*]. That's because i3 has already been partially combined/simplified at the point in try_combine (validate_replacement:) where I'll be doing this test. I want to discriminate from the case where these are actually two entirely separate references to the same memory location, one of which just happens to be inside a zero/sign extend. This should be specific enough to catch my volatile byte loads without borking any other construct, I hope. At any rate, if there's a loophole in that set of rules I can't see it. I had hoped to make a more general improvement to combine, though. I guess that any time I have a volatile load followed by an ALU-style op that uses the dest of that load and leaves the load's dest reg_dead, I could in theory combine it validly - but there's no point doing so except in the case where LOAD_EXTEND_OP (or any other reason) means that part of the resulting construction can be eliminated, rather than just splitting the thing back to separate insns sometime in final code generation. cheers, DaveK -- Can't think of a witty .sigline today....