https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88714
Matthew Malcomson <matmal01 at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |matmal01 at gcc dot gnu.org --- Comment #29 from Matthew Malcomson <matmal01 at gcc dot gnu.org> --- Hi Jakub, I've been working on a patch that does very similar to the draft patch posted above, and I notice a few things I've tried to avoid in it. I doubt there are any actual bugs, since I don't know if the patterns that trigger actual faults can occur at the moment. -------- Using the `address_operand` predicate and 'p' constraint to ensure the address is a valid address would use the mode SImode of the operand rather than checking it's valid for the DImode of the emitted ldrd. If this happens we generate an ICE in the `adjust_address` call just before `output_move_double`. I don't know if such a pattern can actually be generated, but we could use `arm_legitimate_address_p (DImode, XEXP (operands[1], 0), true)` in the condition to avoid it just in case. -------- There's a similar problem to the `address_operand` one above with using the `arm_count_output_move_double_insns` function. It's called on the original operands, which means it eventually calls `output_move_double` with the first two operands (which are in SImode). This function has some calls to `reg_overlap_mentioned_p`, which depends on the number of hard registers for a given registers mode. I've only found cases where the `arm_count_output_move_double_insns` function returns something other than what it should in cases that only match because of the `address_operand` problem above. This could be replaced by a wrapper that generates DImode registers specifically for checking this. ------- I think generation of patterns of the form (plus:SI (plus:SI (reg) (const_int)) (const_int)) which can happen with these peepholes isn't very nice. I can't find any constraint against these patterns in the canonicalization rules (maybe there should be?) so I can't say this is an actual problem. As an example: the following int __RTL (startwith ("peephole2")) foo_x4 (int *a) { (function "foo_x4" (insn-chain (cnote 1 NOTE_INSN_DELETED) (block 2 (edge-from entry (flags "FALLTHRU")) (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK) (cinsn 101 (set (reg:SI r2) (mem/c:SI (plus:SI (reg:SI r0) (const_int 8)) [0 S4 A64])) "/home/matmal01/test.c":18) (cinsn 102 (set (reg:SI r3) (mem/c:SI (plus:SI (reg:SI r0) (const_int 12)) [0 S4 A32])) "/home/matmal01/test.c":18) (cinsn 103 (set (reg:SI r0) (plus:SI (reg:SI r2) (reg:SI r3))) "/home/matmal01/test.c":18) (edge-to exit (flags "FALLTHRU")) ) ;; block 2 ) ;; insn-chain (crtl (return_rtx (reg/i:SI r0) ) ;; return_rtx ) ;; crtl ) ;; function "main" } Produces (insn 104 3 103 2 (parallel [ (set (reg:SI 2 r2) (mem/c:SI (plus:SI (reg:SI 0 r0) (const_int 8 [0x8])) [0 S4 S4 A64])) (set (reg:SI 3 r3) (mem/c:SI (plus:SI (plus:SI (reg:SI 0 r0) (const_int 8 [0x8])) (const_int 4 [0x4])) [0 S4 S4 A32])) ]) -1 (nil)) Maybe we could use the existing operands, and match with `rtx_equal_p (..., plus_constant (...))` so that the plus_constant can take care of adding the constants together. This is what we do in the load_pair patterns for aarch64. -------- There are a few other tidy-up points around the define_insn patterns, but overall I believe they can be merged into one pattern. The difference between the 'q' and 'r' constraints are using either CORE_REGS or GENERAL_REGS, where CORE_REGS allows r13 and GENERAL_REGS doesn't. I guess this is from a line in infocenter that mentions r12 is strongly recommended to not be used as the first register for ldrdb, as this is stopped by requiring both the first and second register to not be r13. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihjffga.html