Jennifer Schmitz <jschm...@nvidia.com> writes:
> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
> partial_subreg_p in the function copy_value during the RTL pass
> regcprop, failing the assertion in
>
> inline bool
> partial_subreg_p (machine_mode outermode, machine_mode innermode)
> {
>   /* Modes involved in a subreg must be ordered.  In particular, we must
>      always know at compile time whether the subreg is paradoxical.  */
>   poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
>   poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
>   gcc_checking_assert (ordered_p (outer_prec, inner_prec));
>   return maybe_lt (outer_prec, inner_prec);
> }
>
> Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves
> the ICE and passes bootstrap and testing without regression.
> OK for mainline?
>
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>
> gcc/
>       PR middle-end/120276
>       * regcprop.cc (copy_value): Replace call to partial_subreg_p by
>       ordered_p && maybe_lt.
>
> gcc/testsuite/
>       PR middle-end/120276
>       * gcc.target/aarch64/sve/pr120276.c: New test.
> ---
>  gcc/regcprop.cc                               |  5 ++++-
>  .../gcc.target/aarch64/sve/pr120276.c         | 20 +++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>
> diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
> index 4fa1305526c..7f4a25a4718 100644
> --- a/gcc/regcprop.cc
> +++ b/gcc/regcprop.cc
> @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>        || (sr > dr && sr < dr + dn))
>      return;
>  
> +  poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode);
> +  poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src));
> +
>    /* If SRC had no assigned mode (i.e. we didn't know it was live)
>       assign it now and assume the value came from an input argument
>       or somesuch.  */
> @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>       (set (reg:DI dx) (reg:DI bp))
>       The last set is not redundant, while the low 8 bits of dx are already
>       equal to low 8 bits of bp, the other bits are undefined.  */
> -  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
> +  else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec))
>      {
>        if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode)
>         || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest)))

I think we should instead add:

   else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode),
                        GET_MODE_PRECISION (GET_MODE (src))))
     return;

after:

  if (vd->e[sr].mode == VOIDmode)
    set_value_regno (sr, vd->e[dr].mode, vd);

The ICE in paradoxical_subreg is just the canary.  The register
replacement isn't valid if we don't know at compile time how the bytes
are distributed between the two modes.

Thanks,
Richard

> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
> new file mode 100644
> index 00000000000..1b29c90f69b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +int a;
> +char b[1];
> +int c[18];
> +void d(char *);
> +void e() {
> +  int f;
> +  char *g;
> +  a = 0;
> +  for (; a < 18; a++) {
> +    int h = f = 0;
> +    for (; f < 4; f++) {
> +      g[a * 4 + f] = c[a] >> h;
> +      h += 8;
> +    }
> +  }
> +  d(b);
> +}
> \ No newline at end of file

Reply via email to