The function gimple_folder::fold_pfalse that was introduced in g5289540ed58e42ae66255e31f22afe4ca0a6e15e inappropriately folds 1) svbrka/b with _m if the inactive operand is pfalse 2) svpmov_lane with _m if the non-governing predicate operand is pfalse. To fix this, we added extra checks in the folding logic excluding certain function shapes.
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression. OK for trunk? Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> gcc/ PR target/121604 * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::fold_pfalse): Add checks for function shape. gcc/testsuite/ PR target/121604 * gcc.target/aarch64/sve/pr121604_brk.c: New test. * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise. --- gcc/config/aarch64/aarch64-sve-builtins.cc | 6 +++-- .../gcc.target/aarch64/sve/pr121604_brk.c | 25 +++++++++++++++++++ .../gcc.target/aarch64/sve2/pr121604_pmov.c | 16 ++++++++++++ 3 files changed, 45 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 1764cf8f7e8..41b328d96fc 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse () inactive vector (arg0), while other function shapes are folded to op1 (arg1). */ tree arg1 = gimple_call_arg (call, 1); - if (is_pfalse (arg1)) + if (is_pfalse (arg1) + && shape != aarch64_sve::shapes::pmov_to_vector_lane) return fold_call_to (arg0); - if (is_pfalse (arg0)) + if (is_pfalse (arg0) + && shape != aarch64_sve::shapes::unary) return fold_call_to (arg1); return nullptr; } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c new file mode 100644 index 00000000000..a474a20554d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include <arm_sve.h> + +/* +** foo: +** ptrue p0\.b, all +** brkb p0\.b, p0/z, p0\.b +** ret +*/ +svbool_t foo () { + return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ()); +} + +/* +** bar: +** ptrue p0\.b, all +** brka p0\.b, p0/z, p0\.b +** ret +*/ +svbool_t bar () { + return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ()); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c new file mode 100644 index 00000000000..16844ee4add --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include <arm_sve.h> + +/* +** f: +** pfalse p([0-7]+)\.b +** mov z0\.b, #-1 +** pmov z0\[1\], p\1\.d +** ret +*/ +svuint64_t f () { + return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1); +} -- 2.34.1