On Thu, Mar 7, 2024 at 1:25 PM Robin Dapp <rdapp....@gmail.com> wrote:
>
> Attached v2 combines the checks.
>
> Bootstrapped and regtested on x86 and power10, aarch64 still running.
> Regtested on riscv64.

LGTM.

> Regards
>  Robin
>
>
> Subject: [PATCH v2] vect: Do not peel epilogue for partial vectors.
>
> r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early
> break but PR114196 shows that we also run into the problem without early
> break.  Therefore merge the condition into the topmost vectorization
> guard.
>
> gcc/ChangeLog:
>
>         PR middle-end/114196
>
>         * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p): Merge
>         vectorization guards.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/pr114196.c: New test.
>         * gcc.target/riscv/rvv/autovec/pr114196.c: New test.
> ---
>  gcc/testsuite/gcc.target/aarch64/pr114196.c   | 19 ++++++++++++
>  .../gcc.target/riscv/rvv/autovec/pr114196.c   | 19 ++++++++++++
>  gcc/tree-vect-loop-manip.cc                   | 30 +++++--------------
>  3 files changed, 45 insertions(+), 23 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr114196.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114196.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr114196.c b/gcc/testsuite/gcc.target/aarch64/pr114196.c
> new file mode 100644
> index 00000000000..15e4b0e31b8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr114196.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options { -O3 -fno-vect-cost-model -march=armv9-a -msve-vector-bits=256 } } */
> +
> +unsigned a;
> +int b;
> +long *c;
> +
> +int
> +main ()
> +{
> +  for (int d = 0; d < 22; d += 4) {
> +      b = ({
> +           int e = c[d];
> +           e;
> +           })
> +      ? 0 : -c[d];
> +      a *= 3;
> +  }
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114196.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114196.c
> new file mode 100644
> index 00000000000..7ba9cbbed70
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114196.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options { -O3 -fno-vect-cost-model -march=rv64gcv_zvl256b -mabi=lp64d -mrvv-vector-bits=zvl } } */
> +
> +unsigned a;
> +int b;
> +long *c;
> +
> +int
> +main ()
> +{
> +  for (int d = 0; d < 22; d += 4) {
> +      b = ({
> +           int e = c[d];
> +           e;
> +           })
> +      ? 0 : -c[d];
> +      a *= 3;
> +  }
> +}
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index f72da915103..56a6d8e4a8d 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2129,16 +2129,19 @@ vect_can_peel_nonlinear_iv_p (loop_vec_info loop_vinfo,
>       For mult, don't known how to generate
>       init_expr * pow (step, niters) for variable niters.
>       For neg, it should be ok, since niters of vectorized main loop
> -     will always be multiple of 2.  */
> -  if ((!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> -       || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ())
> +     will always be multiple of 2.
> +     See also PR113163 and PR114196.  */
> +  if ((!LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()
> +       || LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> +       || !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
>        && induction_type != vect_step_op_neg)
>      {
>        if (dump_enabled_p ())
>         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                          "Peeling for epilogue is not supported"
>                          " for nonlinear induction except neg"
> -                        " when iteration count is unknown.\n");
> +                        " when iteration count is unknown or"
> +                        " when using partial vectorization.\n");
>        return false;
>      }
>
> @@ -2178,25 +2181,6 @@ vect_can_peel_nonlinear_iv_p (loop_vec_info loop_vinfo,
>        return false;
>      }
>
> -  /* We can't support partial vectors and early breaks with an induction
> -     type other than add or neg since we require the epilog and can't
> -     perform the peeling.  The below condition mirrors that of
> -     vect_gen_vector_loop_niters  where niters_vector_mult_vf_var then sets
> -     step_vector to VF rather than 1.  This is what creates the nonlinear
> -     IV.  PR113163.  */
> -  if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> -      && LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()
> -      && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> -      && induction_type != vect_step_op_neg)
> -    {
> -      if (dump_enabled_p ())
> -       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -                        "Peeling for epilogue is not supported"
> -                        " for nonlinear induction except neg"
> -                        " when VF is known and early breaks.\n");
> -      return false;
> -    }
> -
>    return true;
>  }
>
> --
> 2.43.2
>

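For readers following the thread, the merged guard can be modelled outside
of GCC roughly as below.  This is only a sketch: struct loop_model,
enum induction_kind, can_peel_nonlinear_iv and mult_iv_after are
hypothetical names invented for illustration, not GCC's real types or
functions, and only the first guard of vect_can_peel_nonlinear_iv_p is
modelled.  The second helper shows why a mult induction such as the
test's "a *= 3" needs init * pow (step, niters) to compute the value an
epilogue would start from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vectorizer's induction kinds.  */
enum induction_kind { STEP_OP_ADD, STEP_OP_NEG, STEP_OP_MUL };

/* Hypothetical summary of the loop_vec_info fields the guard reads.  */
struct loop_model
{
  bool vf_is_constant;         /* cf. LOOP_VINFO_VECT_FACTOR (..).is_constant ()  */
  bool using_partial_vectors;  /* cf. LOOP_VINFO_USING_PARTIAL_VECTORS_P  */
  bool niters_known;           /* cf. LOOP_VINFO_NITERS_KNOWN_P  */
};

/* Model of the merged guard from the patch: peeling is rejected when the
   VF is not constant, partial vectors are in use, or the iteration count
   is unknown, unless the induction is a negation.  */
static bool
can_peel_nonlinear_iv (const struct loop_model *loop,
                       enum induction_kind kind)
{
  assert (loop != NULL);
  if ((!loop->vf_is_constant
       || loop->using_partial_vectors
       || !loop->niters_known)
      && kind != STEP_OP_NEG)
    return false;
  return true;
}

/* Value of a mult induction after NITERS iterations, i.e.
   init * pow (step, niters) in wrapping unsigned arithmetic.  */
static unsigned
mult_iv_after (unsigned init, unsigned step, unsigned niters)
{
  unsigned value = init;
  for (unsigned i = 0; i < niters; i++)
    value *= step;
  return value;
}
```

The loop in the new tests runs for d = 0, 4, 8, 12, 16, 20, so six
iterations; with this model a mult induction under partial vectors is
rejected even though the iteration count is known, which is the case the
v2 patch adds on top of the early-break-only check.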