Ian Lance Taylor wrote:
Georg-Johann Lay writes:
If general_operand can be perceived as
(define_predicate "general_operand"
(ior (match_operand 0 "memory_operand")
(match_operand 0 "register_operand")
(match_operand 0 "immediate_operand")))
how can low_io_mem ever match?
Oh, I see, I did misunderstand your question. Sorry about that.
In general combine is just going to honor what your backend is asking
for. If your backend says that it is OK to combine a volatile memory
operand, then that is what combine is going to do.
Supplying an insn is the backend's way of telling that it is capable of
performing a specific operation.
In the present case, the backend tells that it has a "Conditional Jump
depending on a bit in the I/O area".
It does *not* say "It is ok to infringe volatile correctness" and move
one volatile memory access across an other one or similar memory
annotation like memory clobber by means of inline assembly or built-in
barrier.
It's certainly OK in
general to combine across a volatile memory operand, as is happening
here. I guess you are asking whether combine should have another check:
if some operand in the insn is a volatile MEM, and it will cross a
volatile MEM not mentioned in the combine, should combine reject that
combination.
It's never correct to exchange volatile accesses.
It's a case that will never ordinarily arise. It only
arises for you because you are specifically trying to combine a volatile
MEM. I don't know if it makes sense for the general purpose combine to
try to handle such an unusual special case.
It's not unusual because:
* It's not unusual to write down SFRs as violatile memory like
(*((volatile unsigned int*) 0x1234)) or as a cast to a composite
that reflects the SFRs bit(field)s.
* It's not unusual that microcontrollers can access specific parts
of memory -- in particular I/O -- with special instructions.
And then: combine does not operate in empty space ;-) Just like other
general purpose parts of the compiler that have to care for exceptions
for this hardware here and that hardware there like in reload, in
scheduler, and many other places.
BTW, I see this bug only with 4.6. 4.7 does not combine across the
volatile access. But it also fails to combine the jump-on-I/O-bit if the
load from I/O immediately preceedes the conditional jump-on-bit.
Didn't yet look into that.
Have you tried using a peephole instead of a combine pattern?
Ian
I don't like RTL/text peepholes. They are just a last resort to fix
stuff that wasn't optimized properly in other passes and to clean up
remains from reload.
In this case, a jump-on-I/O-bit does not require a register whereas
loading and then jump-on-bit-in-register will allocate a register even
though it's optimized out after relaod. And if reload has fun and spills
for some reason, peephole will never get a chance to fix it.
For setting a bit in I/O it's even worse. This can be accomplisheb by
one instruction and is an atomic operation.
A load-bitop-write operation, however, is no more atomic and requires a
register of a special class to perform bitop.
In order to get it atomic again, you'd have to turn IRQs off which makes
it even more costly up to factor of 6 because it is slower, more code,
needs a scratch to store the interrupt-flag and increases IRQ respond times.
What about that predicate?
(define_predicate "low_io_mem"
(and (match_code "mem")
(match_test "low_io_address_operand (XEXP (op, 0))")))
Guess it operates the same because the drag-volatile-over-volatile
happens not in recog(_for_combine) but already when combine synthesizes
new patterns. As combine juggles with data flow, it must not exchange
volatiles. That's all. The backend will know if it's ok to place a
specific operand into a specific insn or not.
And skipping optimizations like these altogether because they are
uncomfortable to handle is the wrong direction, IMHO.
Johann