On 26/08/2025 13:39, Jennifer Schmitz wrote:
> The function gimple_folder::fold_pfalse that was introduced in
> g5289540ed58e42ae66255e31f22afe4ca0a6e15e
> inappropriately folds
> 1) svbrka/b with _m if the inactive operand is pfalse
> 2) svpmov_lane with _m if the non-governing predicate operand is pfalse.
> To fix this, we added extra checks in the folding logic excluding certain
> function shapes.
> 
> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
> OK for trunk?
> 
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
> 
> gcc/
>       PR target/121604
>       * config/aarch64/aarch64-sve-builtins.cc
>       (gimple_folder::fold_pfalse): Add checks for function shape.
> 
> gcc/testsuite/
>       PR target/121604
>       * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>       * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
> ---
>  gcc/config/aarch64/aarch64-sve-builtins.cc    |  6 +++--
>  .../gcc.target/aarch64/sve/pr121604_brk.c     | 25 +++++++++++++++++++
>  .../gcc.target/aarch64/sve2/pr121604_pmov.c   | 16 ++++++++++++
>  3 files changed, 45 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 1764cf8f7e8..41b328d96fc 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>        inactive vector (arg0), while other function shapes are folded
>        to op1 (arg1).  */
>        tree arg1 = gimple_call_arg (call, 1);
> -      if (is_pfalse (arg1))
> +      if (is_pfalse (arg1)
> +       && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>       return fold_call_to (arg0);
> -      if (is_pfalse (arg0))
> +      if (is_pfalse (arg0)
> +       && shape != aarch64_sve::shapes::unary)
>       return fold_call_to (arg1);
>        return nullptr;
>      }

This fix feels very fragile, like we could easily have the same bug introduced
again for other intrinsics that have non-governing predicate operands; have you
verified there are no other such SVE intrinsics?  Either way, it's
possible that more such intrinsics will get introduced in the future,
and we'll have the exact same bug for those.

Relying on is_pfalse () to infer the position of the GP is just too
fragile IMO.

As I mentioned in the PR, I think it would be better if we explicitly
determined:
(1) the position of the governing predicate, if any, and
(2) the position of the inactive argument (to fold to on pfalse), if any,
and used that information to do the folding.

Either we can do that by dispatching on the shapes (i.e. positively
classify the shapes into groups for the two cases that we know we can
safely optimize), or perhaps that information can be encoded in some
other way.

Thanks,
Alex

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
> new file mode 100644
> index 00000000000..a474a20554d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_sve.h>
> +
> +/*
> +** foo:
> +**   ptrue   p0\.b, all
> +**   brkb    p0\.b, p0/z, p0\.b
> +**   ret
> +*/
> +svbool_t foo () {
> +  return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
> +}
> +
> +/*
> +** bar:
> +**   ptrue   p0\.b, all
> +**   brka    p0\.b, p0/z, p0\.b
> +**   ret
> +*/
> +svbool_t bar () {
> +  return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
> new file mode 100644
> index 00000000000..16844ee4add
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_sve.h>
> +
> +/*
> +** f:
> +**   pfalse  p([0-7]+)\.b
> +**   mov     z0\.b, #-1
> +**   pmov    z0\[1\], p\1\.d
> +**   ret
> +*/
> +svuint64_t f () {
> +    return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
> +}
> -- 
> 2.34.1

Reply via email to