On 09/08/14 11:21, Wilco Dijkstra wrote:
Thanks! Jakub noticed a potential problem in this area a while back,
but I never came up with any code to trigger and have kept that issue on
my todo list ever since.
Rather than ensuring the inserted copy write a single register, it seems
to me we're better off ensuring that the number of hard registers hasn't
changed. Obviously that's not much of an issue for things like aarch64,
x86_64, but for the embedded 32 bit parts it's likely much more important.
So I think the test is something like
x = get_extended_src_reg (SET_SRC (PATTERN (cand->insn)))
if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
!= HARD_REGNO_NREGS (REGNO (new_reg), GET_MODE (SET_DEST (*orig_set))
return false;
I'm not 100% sure the rest of the code will correctly deal with multiple
registers in all possible combinations. I think adding more asserts would help
avoid similar silent codegeneration failures. For example, should widening
from 1 to 2 registers:
(set (reg:TI x2) (zero_extend:TI (reg:DI x2)))
be regarded as a "copy_needed"?
No in this case.
And what about TI x2 = zero_extend DI x3?
yes in this case.
Both cases have overlaps, but copy_needed is set differently...
Right. copy_needed isn't a property of whether or not we have an
overlap. Any time the source and destination refer to different
registers (regardless of overlap), then need_copy should be set.
I also noticed there are more places where get_extended_src_reg is not used
(eg. in the code that emits the final copy).
It's certainly possible I missed additional cases to use that helper
when it was added. Thanks for taking care of it.
Feel free to pass along asserts you think are appropriate.
Anyway here is the modified check:
Thanks. Just needs an updated ChangeLog and it's good for the trunk.
jeff