[Bug rtl-optimization/115883] New: [15 Regression] late-combine exposing LRA problems

hp at gcc dot gnu.org via Gcc-bugs Thu, 11 Jul 2024 15:22:15 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115883


            Bug ID: 115883
           Summary: [15 Regression] late-combine exposing LRA problems
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hp at gcc dot gnu.org
  Target Milestone: ---

Since the introduction of the late-combine optimization in
r15-1579-g792f97b44ffc5e, test-results for CRIS have shown regressions for the
following tests, all getting an ICE:
gcc.sum gcc.target/cris/rld-legit1.c
gcc.sum gcc.target/cris/rld-legit2.c
gcc.sum gcc.target/cris/torture/pr24750-2.c

Those are all basically the same code tested from different angles.  In the
code, an address, using two registers, is generated as desirable and valid. 
While it's valid for hard-registers, there's pressure from an asm clobbering
all hard-regs that can be part of an address, so spills are introduced. 
Typically there's one spill to a special register, the other to stack.  This
code used to be a coverage test for the CRIS implementation of the reload macro
LEGITIMIZE_RELOAD_ADDRESS. To wit, the RTX for the spilled memory expression
looked like this:
 (mem:QI (plus:SI (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
                  (reg/v/f:SI 33 [ y ])) [0 *_3+0 S1 A8]).
Though, for brevity, let's use the "compact" representation:
[sign_extend([r32:SI])+r33:SI] (sadly the inner mode, HImode, is missing in the
compact representation; probably just a bug).

The original reload problem is similar to what happens now (cf. PR24750):
reload (now LRA) splits the address at the mem instead of at the sign_extend,
which yields [sign_extend(r32:SI)+r33:SI].  That address is *not* valid,
whereas [r32:SI+r33:SI] is, in addition to the original.  The ICE now is that
the former address is generated by LRA and later not recognized:
/src/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c: In function 'f':
/src/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c:21:1: error: insn does not satisfy its constraints:
`````
(insn 14 21 15 2 (parallel [
            (set (reg/i:SI 10 r10)
                (sign_extend:SI (mem:QI (plus:SI (sign_extend:SI (reg:HI 9 r9 [39]))
                            (reg:SI 13 r13 [37])) [0 *_3+0 S1 A8])))
            (clobber (reg:CC 19 ccr))
        ]) "/x/gcc/gcc/testsuite/gcc.target/cris/rld-legit1.c":21:1 52 {extendqisi2}
     (nil))
during RTL pass: reload
'''''
(The astute reader is aware that LRA identifies as reload in dump files.)
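
To make the validity distinction above concrete, here is a minimal toy model in
C.  It is not GCC code: the enum names and the toy_* functions are made up for
illustration, and the validity rule only encodes the three address shapes
mentioned in this report, not the full CRIS addressing rules.  It just shows
that splitting the original address at the mem gives the invalid shape, while
splitting at the sign_extend gives a valid one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT GCC code.  Shape of the first operand of a two-operand
   address; the second operand is always a plain register here.  */
enum index_shape
{
  IDX_REG,           /* [r32:SI+r33:SI] */
  IDX_SEXT_MEM_REG,  /* [sign_extend([r32:SI])+r33:SI], the original */
  IDX_SEXT_REG       /* [sign_extend(r32:SI)+r33:SI] */
};

/* Where the spilled part is split out into a fresh register.  */
enum split_point
{
  SPLIT_AT_MEM,          /* replace only the inner mem by a register */
  SPLIT_AT_SIGN_EXTEND   /* replace the whole sign_extend by a register */
};

/* Validity as stated in the report: the original shape and reg+reg are
   valid, sign_extend of a bare register is not.  */
static bool
toy_address_valid_p (enum index_shape s)
{
  return s == IDX_REG || s == IDX_SEXT_MEM_REG;
}

/* Result of splitting the original address at the given point.  Shapes
   other than the original are returned unchanged.  */
static enum index_shape
toy_split (enum index_shape s, enum split_point p)
{
  if (s != IDX_SEXT_MEM_REG)
    return s;
  return p == SPLIT_AT_MEM ? IDX_SEXT_REG : IDX_REG;
}

/* Small self-check of the three cases discussed above.  */
static void
toy_demo (void)
{
  assert (toy_address_valid_p (IDX_SEXT_MEM_REG));
  assert (!toy_address_valid_p (toy_split (IDX_SEXT_MEM_REG, SPLIT_AT_MEM)));
  assert (toy_address_valid_p (toy_split (IDX_SEXT_MEM_REG,
                                          SPLIT_AT_SIGN_EXTEND)));
}
```

In this model, the insn in the dump above corresponds to IDX_SEXT_REG, i.e. the
result of SPLIT_AT_MEM.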

With LRA, there's no mechanism corresponding to LEGITIMIZE_RELOAD_ADDRESS: LRA
seems to split up spilled parts of the memory address.  These tests are instead
now coverage for peephole2 patterns that cobble together the split-up parts to
emit the same assembly as for reload, for the intended addressing mode.  This
transformation is not performance-critical; the peephole2 patterns were added
because omitting them would have constituted a regression for the
reload-to-LRA transition.  These patterns are naturally brittle, matching one
of the possible split-up sequences from LRA.  They have remained surprisingly
stable since the LRA transition, up to late-combine.  Thankfully, these tests
seem to be the only thing functionally failing with the late-combine
introduction.

This PR is about the presumed LRA bug, not about the peephole2 patterns
incidentally no longer matching, or about performance for CRIS with late-combine.

I've analyzed as far as the difference being due to a lack of REG_POINTER for
the "base" register in the address, which seems to be due to an oversight in
combine.cc, only forward-propagated by the first of the late-combine passes. 
It seems this is just exposing a flaw in LRA though; the invalid address should
not have been generated, as REG_POINTER is not a stable attribute (IIUC); its
presence exposes optimization opportunities but its absence should not cause
invalid code or ICE.
