On Thu, May 29, 2025 at 05:59:44PM +0100, Richard Sandiford wrote: > Sorry for the slow reply. > > Dimitar Dimitrov <dimi...@dinux.eu> writes: > > On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote: > >> Dimitar Dimitrov <dimi...@dinux.eu> writes: ... > >> It might still be worth trying to use simplify_subreg_regno and > >> seeing what breaks. Any fallaout would at least let us expand > >> the comments to explain the constraints. > > > > I tried simplify_subreg_regno, and some tests regressed for > > x86_64-pc-linux-gnu: > > > > check-gcc-c trunk patched change > > -------------------------------+-------------------+------------+----- > > # of expected passes 216431 216358 -73 > > # of unexpected failures 117 197 +80 > > # of unexpected successes 25 25 0 > > # of expected failures 1552 1552 0 > > # of unresolved testcases 0 33 +33 > > # of unsupported tests 3888 3888 0 > > > > For reference, here are a few of the newly failing tests: > > FAIL: gcc.dg/asan/pr82517.c -O0 (internal compiler error: in > > gen_reg_rtx, at emit-rtl.cc:1188) > > FAIL: c-c++-common/cpp/embed-28.c -Wc++-compat (internal compiler error: > > in gen_reg_rtx, at emit-rtl.cc:1188) > > FAIL: gcc.dg/torture/stackalign/pr16660-3.c -Os -mforce-drap -fpic > > (internal compiler error: in gen_reg_rtx, at emit-rtl.cc:1188) > > > > The stack pointer register is explicitly rejected by simplify_subreg_regno. > > So gen_lowpart fails to return a regno for 'sp' in > > config/i386/i386.md:28561: > > > > ;; Attempt to always use XOR for zeroing registers (including FP modes). > > (define_peephole2 > > [(set (match_operand 0 "general_reg_operand") > > (match_operand 1 "const0_operand"))] > > "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD > > && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ()) > > && peep2_regno_dead_p (0, FLAGS_REG)" > > [(parallel [(set (match_dup 0) (const_int 0)) > > (clobber (reg:CC FLAGS_REG))])] > > "operands[0] = gen_lowpart (word_mode, operands[0]);") > > > > peephole2 tries to generate the following, but gen_lowpart cannot create the > > subreg: > > > > (insn 29 8 12 2 (parallel [ > > (set (subreg:DI (reg/v:SI 7 sp [ a ]) 0) > > (const_int 0 [0])) > > (clobber (reg:CC 17 flags)) > > ]) > > > > ICE backtrace: > > .../gcc/gcc/testsuite/gcc.dg/asan/pr82517.c:34:1: internal compiler > > error: in gen_reg_rtx, at emit-rtl.cc:1188 > > 0x22d3d1f internal_error(char const*, ...) > > > > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic-global-context.cc:517 > > 0x669157 fancy_abort(char const*, int, char const*) > > > > /home/dinux/projects/pru/local-workspace/gcc/gcc/diagnostic.cc:1815 > > 0x46df3d gen_reg_rtx(machine_mode) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:1188 > > 0x946c80 copy_to_reg(rtx_def*) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/explow.cc:622 > > 0xd5c64f gen_lowpart_general(machine_mode, rtx_def*) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/rtlhooks.cc:56 > > 0x1998ca9 gen_split_318(rtx_insn*, rtx_def**) > > > > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:12877 > > 0x1d8a2b9 split_insns(rtx_def*, rtx_insn*) > > > > /home/dinux/projects/pru/local-workspace/gcc/gcc/config/i386/i386.md:20885 > > 0x930f8b try_split(rtx_def*, rtx_insn*, int) > > /home/dinux/projects/pru/local-workspace/gcc/gcc/emit-rtl.cc:3952 > > 0xd18ab5 split_insn > > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3479 > > 0xd1e877 split_all_insns() > > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:3583 > > 0xd1e928 execute > > /home/dinux/projects/pru/local-workspace/gcc/gcc/recog.cc:4507 > > As an experiment, it would be interesting to see whether subregs of > the stack, frame, and argument pointers are the only exceptions, > in which case we could add a special case.
Yes, with added exceptions for those registers, x86_64-pc-linux-gnu has no regressions (see attached patch). Sorry, I cannot provide test results for other primary targets. I'm struggling to setup a cross-linux toolchain build and qemu dejagnu testing. > > But if the PR has already been fixed, then I supose there's not much > incentive to rework such a sensitive piece of code. :) I admit, CI going back to green for pru-unknown-elf certainly removed the pressure :) Andrew Pinski commented in PR119966 that the rootcause was probably not fixed. Issue instead became latent again. But right now I don't seem to have the expertise needed to rework validate_subreg. Thanks, Dimitar > > Thanks, > Richard
>From ea23fe8e509131c86149593464cfc7a8e69db7cb Mon Sep 17 00:00:00 2001 From: Dimitar Dimitrov <dimi...@dinux.eu> Date: Mon, 2 Jun 2025 08:10:16 +0300 Subject: [RFC PATCH] emit-rtl: Use simplify_subreg_regno to validate hardware subregs [PR119966] To: gcc-patches@gcc.gnu.org The simplify_subreg_regno performs more validity checks than the simple info.representable_p, most importantly with the calls to targetm.hard_regno_mode_ok. The checks for stack-related registers is bypassed because the i386 backend generates them, in a seemingly valid peephole optimization. This is just an experiment to investigate validate_subreg. No regressions were detected for C and C++ on x86_64-pc-linux-gnu. Signed-off-by: Dimitar Dimitrov <dimi...@dinux.eu> --- gcc/emit-rtl.cc | 2 +- gcc/rtl.h | 3 ++- gcc/rtlanal.cc | 30 +++++++++++++++++------------- 3 files changed, 20 insertions(+), 15 deletions(-) diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc index 3f453cda67e..acffa5aacc1 100644 --- a/gcc/emit-rtl.cc +++ b/gcc/emit-rtl.cc @@ -987,7 +987,7 @@ validate_subreg (machine_mode omode, machine_mode imode, else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode)) return false; - return subreg_offset_representable_p (regno, imode, offset, omode); + return simplify_subreg_regno (regno, imode, offset, omode, true) >= 0; } /* Do not allow normal SUBREG with stricter alignment than the inner MEM. diff --git a/gcc/rtl.h b/gcc/rtl.h index 7049ed775e8..803fd9db49d 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -2498,7 +2498,8 @@ extern bool subreg_offset_representable_p (unsigned int, machine_mode, poly_uint64, machine_mode); extern unsigned int subreg_regno (const_rtx); extern int simplify_subreg_regno (unsigned int, machine_mode, - poly_uint64, machine_mode); + poly_uint64, machine_mode, + bool allow_stack_regs = false); extern int lowpart_subreg_regno (unsigned int, machine_mode, machine_mode); extern unsigned int subreg_nregs (const_rtx); diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc index 86a5e473308..e7ff415cc41 100644 --- a/gcc/rtlanal.cc +++ b/gcc/rtlanal.cc @@ -4264,7 +4264,8 @@ subreg_offset_representable_p (unsigned int xregno, machine_mode xmode, int simplify_subreg_regno (unsigned int xregno, machine_mode xmode, - poly_uint64 offset, machine_mode ymode) + poly_uint64 offset, machine_mode ymode, + bool allow_stack_regs) { struct subreg_info info; unsigned int yregno; @@ -4275,20 +4276,23 @@ simplify_subreg_regno (unsigned int xregno, machine_mode xmode, && !REG_CAN_CHANGE_MODE_P (xregno, xmode, ymode)) return -1; - /* We shouldn't simplify stack-related registers. */ - if ((!reload_completed || frame_pointer_needed) - && xregno == FRAME_POINTER_REGNUM) - return -1; + if (!allow_stack_regs) + { + /* We shouldn't simplify stack-related registers. */ + if ((!reload_completed || frame_pointer_needed) + && xregno == FRAME_POINTER_REGNUM) + return -1; - if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM - && xregno == ARG_POINTER_REGNUM) - return -1; + if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM + && xregno == ARG_POINTER_REGNUM) + return -1; - if (xregno == STACK_POINTER_REGNUM - /* We should convert hard stack register in LRA if it is - possible. */ - && ! lra_in_progress) - return -1; + if (xregno == STACK_POINTER_REGNUM + /* We should convert hard stack register in LRA if it is + possible. */ + && ! lra_in_progress) + return -1; + } /* Try to get the register offset. */ subreg_get_info (xregno, xmode, offset, ymode, &info); -- 2.49.0