https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115883
Bug ID: 115883
Summary: [15 Regression] late-combine exposing LRA problems
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hp at gcc dot gnu.org
Target Milestone: ---

Since the introduction of the late-combine optimization in r15-1579-g792f97b44ffc5e, test results for CRIS have shown regressions for the following tests, all getting an ICE:

gcc.sum gcc.target/cris/rld-legit1.c
gcc.sum gcc.target/cris/rld-legit2.c
gcc.sum gcc.target/cris/torture/pr24750-2.c

Those are all basically the same code tested from different angles. In the code, an address using two registers is generated as desirable and valid. While it's valid for hard registers, there's pressure from an asm clobbering all hard registers that can be part of an address, so spills are introduced. Typically there's one spill to a special register, the other to stack.

This code used to be a coverage test for the CRIS implementation of the reload macro LEGITIMIZE_RELOAD_ADDRESS. To wit, the RTX for the spilled memory expression looked like this:

(mem:QI (plus:SI (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
        (reg/v/f:SI 33 [ y ])) [0 *_3+0 S1 A8])

For brevity, though, let's use the "compact" representation: [sign_extend([r32:SI])+r33:SI] (sadly the inner mode, HImode, is missing from the compact representation; probably just a bug).

The original reload problem is similar to what happens now (cf. PR24750): reload (now LRA) splits the address at the mem instead of at the sign_extend, and [sign_extend(r32:SI)+r33:SI] is *not* valid, whereas [r32:SI+r33:SI] is, in addition to the original.

The ICE now is that the former (invalid) address is generated by LRA and later not recognized:

/src/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c: In function 'f':
/src/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c:21:1: error: insn does not satisfy its constraints:
`````
(insn 14 21 15 2 (parallel [
            (set (reg/i:SI 10 r10)
                (sign_extend:SI (mem:QI (plus:SI (sign_extend:SI (reg:HI 9 r9 [39]))
                            (reg:SI 13 r13 [37])) [0 *_3+0 S1 A8])))
            (clobber (reg:CC 19 ccr))
        ]) "/x/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c":21:1 52 {extendqisi2}
     (nil))
during RTL pass: reload
`````
(The astute reader is aware that LRA identifies as reload in dump files.)

With LRA, there's no mechanism corresponding to LEGITIMIZE_RELOAD_ADDRESS: LRA seems to split up the spilled parts of the memory address. These tests are instead now coverage for peephole2 patterns that cobble together the split-up parts to emit the same assembly as for reload, for the intended addressing mode. This transformation is not performance-critical; the peephole2 patterns were added because failing to do so would have constituted a regression for the reload-to-LRA transition. These patterns are naturally brittle, matching one of the possible split-up sequences from LRA. They have remained surprisingly stable since the LRA transition, up to late-combine. Thankfully, these tests seem to be the only thing functionally failing with the late-combine introduction.

This PR is about the presumed LRA bug, not about the peephole2 patterns incidentally no longer matching, or about performance for CRIS with late-combine.

I've analyzed as far as the difference being due to a lack of REG_POINTER for the "base" register in the address, which seems to be due to an oversight in combine.cc, only forward-propagated by the first of the late-combine passes.

It seems this is just exposing a flaw in LRA, though: the invalid address should not have been generated, as REG_POINTER is not a stable attribute (IIUC); its presence exposes optimization opportunities, but its absence should not cause invalid code or an ICE.
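
To make the three address forms discussed above concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions: `struct expr`, `valid_address`, `form_is_valid` and `self_check` are hypothetical names invented for illustration, not GCC's rtx types or the CRIS back-end's legitimate-address code, and only the index+base forms mentioned in this report are modelled. The `reg_pointer` field stands in for REG_POINTER and is deliberately never consulted, reflecting the expectation that a hint must not change validity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree. Hypothetical types: this is NOT GCC's rtx. */
enum kind { REG, MEM, SIGN_EXTEND, PLUS };

struct expr {
    enum kind kind;
    const struct expr *op0;  /* sole operand, or first operand of PLUS */
    const struct expr *op1;  /* second operand of PLUS, else NULL */
    bool reg_pointer;        /* hint only (cf. REG_POINTER); never read below */
};

static bool is_reg(const struct expr *e)
{
    return e != NULL && e->kind == REG;
}

/* Index term: a register, or sign_extend of memory addressed by a register. */
static bool valid_index(const struct expr *e)
{
    if (is_reg(e))
        return true;
    return e != NULL && e->kind == SIGN_EXTEND
           && e->op0 != NULL && e->op0->kind == MEM && is_reg(e->op0->op0);
}

/* Only the index+base forms discussed in the report are modelled. */
bool valid_address(const struct expr *addr)
{
    return addr != NULL && addr->kind == PLUS
           && valid_index(addr->op0) && is_reg(addr->op1);
}

enum form { ORIGINAL, SPLIT_AT_SIGN_EXTEND, SPLIT_AT_MEM };

/* Build one of the three address forms and report whether it is valid. */
bool form_is_valid(enum form f, bool base_has_pointer_hint)
{
    struct expr r32 = { REG, NULL, NULL, false };
    struct expr r33 = { REG, NULL, NULL, base_has_pointer_hint };
    struct expr mem_r32 = { MEM, &r32, NULL, false };
    struct expr sx_mem = { SIGN_EXTEND, &mem_r32, NULL, false };  /* sign_extend([r32]) */
    struct expr sx_reg = { SIGN_EXTEND, &r32, NULL, false };      /* sign_extend(r32) */
    const struct expr *index =
        f == ORIGINAL ? &sx_mem : f == SPLIT_AT_SIGN_EXTEND ? &r32 : &sx_reg;
    struct expr addr = { PLUS, index, &r33, false };
    return valid_address(&addr);
}

/* The expectations from the report, with and without the pointer hint. */
void self_check(void)
{
    for (int hint = 0; hint <= 1; hint++) {
        assert(form_is_valid(ORIGINAL, hint));              /* [sign_extend([r32])+r33] */
        assert(form_is_valid(SPLIT_AT_SIGN_EXTEND, hint));  /* [r32+r33] */
        assert(!form_is_valid(SPLIT_AT_MEM, hint));         /* [sign_extend(r32)+r33] */
    }
}
```

Calling `self_check()` from a `main` exercises all three forms. In this model the split at the mem is rejected whether or not the base register carries the hint, which is the behaviour the report argues LRA should have.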