Hi Jennifer, > On 25 Aug 2025, at 12:56, Jennifer Schmitz <jschm...@nvidia.com> wrote: > > When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero > vector and predication is _z, an ICE in vregs occurs, e.g. for > > svuint8_t foo (svbool_t pg, svuint8_t op1) > { > return svsqadd_u8_z (pg, op1, svdup_s8 (0)); > } > > The insn failed to match the pattern (aarch64-sve2.md): > > ;; Predicated binary operations with no reverse form, merging with zero. > ;; At present we don't generate these patterns via a cond_* optab, > ;; so there's no correctness requirement to handle merging with an > ;; independent value. > (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z" > [(set (match_operand:SVE_FULL_I 0 "register_operand") > (unspec:SVE_FULL_I > [(match_operand:<VPRED> 1 "register_operand") > (unspec:SVE_FULL_I > [(match_operand 5) > (unspec:SVE_FULL_I > [(match_operand:SVE_FULL_I 2 "register_operand") > (match_operand:SVE_FULL_I 3 "register_operand")] > SVE2_COND_INT_BINARY_NOREV)] > UNSPEC_PRED_X) > (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")] > UNSPEC_SEL))] > "TARGET_SVE2" > {@ [ cons: =0 , 1 , 2 , 3 ] > [ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z, > %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype> > [ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z, > %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype> > } > "&& !CONSTANT_P (operands[5])" > { > operands[5] = CONSTM1_RTX (<VPRED>mode); > } > [(set_attr "movprfx" "yes")] > ) > > because operands[3] and operands[4] were both expanded into the same register > operand containing a zero vector by > > ;; Predicated binary arithmetic with merging. > (define_expand "@cond_<sve_int_op><mode>" > [(set (match_operand:SVE_FULL_I 0 "register_operand") > (unspec:SVE_FULL_I > [(match_operand:<VPRED> 1 "register_operand") > (unspec:SVE_FULL_I > [(match_dup 5) > (unspec:SVE_FULL_I > [(match_operand:SVE_FULL_I 2 "register_operand") > (match_operand:SVE_FULL_I 3 "register_operand")] > SVE2_COND_INT_BINARY)] > UNSPEC_PRED_X) > (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")] > UNSPEC_SEL))] > "TARGET_SVE2" > { > operands[5] = CONSTM1_RTX (<MODE>mode); > } > ) > > This patch fixes the ICE by changing the predicate of operands[3] > in both patterns to aarch64_simd_reg_or_zero. It also adds tests > adjusted from the report in PR121599. > > The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. > OK for trunk? > > Alex Coplan pointed out in the bugzilla ticket that this ICE goes back > to GCC 10. Shall we backport? > > Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> > > gcc/ > PR target/121599 > * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>): > Set predicate for operand 3 to aarch64_simd_reg_or_zero. > (*cond_<sve_int_op><mode>_z): Likewise. > > gcc/testsuite/ > PR target/121599 > * gcc.target/aarch64/sve2/pr121599.c: New test. > --- > gcc/config/aarch64/aarch64-sve2.md | 4 +-- > .../gcc.target/aarch64/sve2/pr121599.c | 31 +++++++++++++++++++ > 2 files changed, 33 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c > > diff --git a/gcc/config/aarch64/aarch64-sve2.md > b/gcc/config/aarch64/aarch64-sve2.md > index a3cbbce8b31..c6a84fa7f3d 100644 > --- a/gcc/config/aarch64/aarch64-sve2.md > +++ b/gcc/config/aarch64/aarch64-sve2.md > @@ -1021,7 +1021,7 @@ > [(match_dup 5) > (unspec:SVE_FULL_I > [(match_operand:SVE_FULL_I 2 "register_operand") > - (match_operand:SVE_FULL_I 3 "register_operand")] > + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")] > SVE2_COND_INT_BINARY)] > UNSPEC_PRED_X) > (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")] > @@ -1136,7 +1136,7 @@ > [(match_operand 5) > (unspec:SVE_FULL_I > [(match_operand:SVE_FULL_I 2 "register_operand") > - (match_operand:SVE_FULL_I 3 "register_operand")] > + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")] > SVE2_COND_INT_BINARY_NOREV)] > UNSPEC_PRED_X) > (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
I think the ICE happens because operands[4] ends up being a register which the aarch64_simd_imm_zero can’t handle. The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling register operands, it only expresses a merge with zero. Allowing a zero in operands[3] here is problematic as the output code wouldn’t know how to print it if the midend used a zero vector for it. I think the right solution is to adjust the @cond_<sve_int_op><mode> expander to only allow aarch64_simd_imm_zero for operands[4] for the SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY. Thanks, Kyrill > diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c > b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c > new file mode 100644 > index 00000000000..cd80ef9115c > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c > @@ -0,0 +1,31 @@ > +/* PR target/121599. */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-final { check-function-bodies "**" "" "" } } */ > + > +#include <arm_sve.h> > + > +/* > +** foo: > +** movi d([0-9]+), #0 > +** movprfx z0\.b, p0/z, z0\.b > +** usqadd z0\.b, p0/m, z0\.b, z\1\.b > +** ret > +*/ > +svuint8_t foo (svbool_t pg, svuint8_t op1) > +{ > + return svsqadd_u8_z (pg, op1, svdup_s8 (0)); > +} > + > +/* > +** bar: > +** movi d([0-9]+), #0 > +** movprfx z0\.b, p0/z, z0\.b > +** suqadd z0\.b, p0/m, z0\.b, z\1\.b > +** ret > +*/ > +svint8_t bar (svbool_t pg, svint8_t op1) > +{ > + return svuqadd_n_s8_z (pg, op1, 0); > +} > + > -- > 2.34.1 >