Hi Christophe, I'm testing the attached patch, but without an aarch64, it'll take a while to figure out the toolchain to reproduce the failure. Neither of the platforms I tested were affected, but I can see it's unsafe to reuse the subreg_promoted_reg idiom from just a few lines earlier. Any help testing the attached patch on an affected target would be much appreciated.
Sorry for the inconvenience. Roger -- -----Original Message----- From: Christophe LYON <christophe.l...@foss.st.com> Sent: 31 August 2021 13:32 To: Roger Sayle <ro...@nextmovesoftware.com>; 'GCC Patches' <gcc-patches@gcc.gnu.org> Subject: Re: [PATCH] Preserve SUBREG_PROMOTED_VAR_P on (extend:HI (subreg/s:QI (reg:SI))) On 29/08/2021 09:46, Roger Sayle wrote: > SUBREG_PROMOTED_VAR_P is a mechanism for tracking that a partial > subreg is correctly zero-extended or sign-extended in the parent > register. For example, the RTL (subreg/s/v:QI (reg/v:SI 23 [ x ]) 0) > indicates that the byte x is zero extended in reg:SI 23, which is useful for > optimization. > An example is that zero extending the above QImode value to HImode can > simply use a wider subreg, i.e. (subreg:HI (reg/v:SI 23 [ x ]) 0). > > This patch addresses the oversight/missed optimization opportunity > that the new HImode subreg above should retain its > SUBREG_PROMOTED_VAR_P annotation as its value is guaranteed to be > correctly extended in the SImode parent. The code below to preserve > SUBREG_PROMOTED_VAR_P is already present in the middle-end (e.g. > simplify-rtx.c:7232-7242) but missing from one or two (precisely three) > places that (accidentally) strip it. > > Whilst there I also added another optimization. If we need to extend > the above QImode value beyond the SImode register holding it, say to > DImode, we can eliminate the SUBREG and simply extend from the SImode > register to DImode. > > This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" > and "make -k check" with no new failures, and on a cross-compiler to > nvptx-none, where the function "long foo(char x) { return x; }" now > requires one less instruction. > > OK for mainline? Hi, This patch causes an ICE when building an aarch64 toolchain: during RTL pass: expand In file included from /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/soft-fp.h:318, from /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c:32: /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c: In function '__floatditf': /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/op-2.h:249:37: internal compiler error: in subreg_promoted_mode, at rtl.h:3132 249 | _FP_PACK_RAW_2_flo.bits.exp = X##_e; \ /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/quad.h:229:33: note: in expansion of macro '_FP_PACK_RAW_2' 229 | # define FP_PACK_RAW_Q(val, X) _FP_PACK_RAW_2 (Q, (val), X) | ^~~~~~~~~~~~~~ /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/soft-fp/floatditf.c:42:3: note: in expansion of macro 'FP_PACK_RAW_Q' 42 | FP_PACK_RAW_Q (a, A); | ^~~~~~~~~~~~~ 0xa0b53a subreg_promoted_mode /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/rtl.h:3132 0xa0b53a convert_modes(machine_mode, machine_mode, rtx_def*, int) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:699 0xa003bc expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:9091 0xa0765c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10497 0x9fcef6 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:9798 0xa0765c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10497 0xa1099e expand_expr /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.h:301 0xa1099e expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**, rtx_def**, expand_modifier) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:8308 0x9fcdff expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) /tmp/5987050_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10288 Can you check? Thanks, Christophe > > 2021-08-29 Roger Sayle <ro...@nextmovesoftware.com> > > gcc/ChangeLog > * expr.c (convert_modes): Preserve SUBREG_PROMOTED_VAR_P when > creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P > subreg. > * simplify-rtx.c (simplify_unary_operation_1) [SIGN_EXTEND]: > Likewise, preserve SUBREG_PROMOTED_VAR_P when creating a (wider) > partial subreg from a SUBREG_PROMOTED_VAR_P subreg. Generate > SIGN_EXTEND of the SUBREG_REG when a subreg would be paradoxical. > [ZERO_EXTEND]: Likewise, preserve SUBREG_PROMOTED_VAR_P when > creating a (wider) partial subreg from a SUBREG_PROMOTED_VAR_P > subreg. Generate ZERO_EXTEND of the SUBREG_REG when a subreg > would be paradoxical. > > Roger > -- >
diff --git a/gcc/expr.c b/gcc/expr.c index 5dd98a9..17f2c2f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -690,17 +690,20 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp) && SUBREG_CHECK_PROMOTED_SIGN (x, unsignedp)) { scalar_int_mode int_orig_mode; + scalar_int_mode int_inner_mode; machine_mode orig_mode = GET_MODE (x); x = gen_lowpart (int_mode, SUBREG_REG (x)); /* Preserve SUBREG_PROMOTED_VAR_P if the new mode is wider than the original mode, but narrower than the inner mode. */ if (GET_CODE (x) == SUBREG - && GET_MODE_PRECISION (subreg_promoted_mode (x)) - > GET_MODE_PRECISION (int_mode) && is_a <scalar_int_mode> (orig_mode, &int_orig_mode) && GET_MODE_PRECISION (int_mode) - > GET_MODE_PRECISION (int_orig_mode)) + > GET_MODE_PRECISION (int_orig_mode) + && is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), + &int_inner_mode) + && GET_MODE_PRECISION (int_inner_mode) + > GET_MODE_PRECISION (int_mode)) { SUBREG_PROMOTED_VAR_P (x) = 1; SUBREG_PROMOTED_SET (x, unsignedp);