Jennifer Schmitz <jschm...@nvidia.com> writes: > The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using > partial_subreg_p in the function copy_value during the RTL pass > regcprop, failing the assertion in > > inline bool > partial_subreg_p (machine_mode outermode, machine_mode innermode) > { > /* Modes involved in a subreg must be ordered. In particular, we must > always know at compile time whether the subreg is paradoxical. */ > poly_int64 outer_prec = GET_MODE_PRECISION (outermode); > poly_int64 inner_prec = GET_MODE_PRECISION (innermode); > gcc_checking_assert (ordered_p (outer_prec, inner_prec)); > return maybe_lt (outer_prec, inner_prec); > } > > Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves > the ICE and passes bootstrap and testing without regression. > OK for mainline? > > Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> > > gcc/ > PR middle-end/120276 > * regcprop.cc (copy_value): Replace call to partial_subreg_p by > ordered_p && maybe_lt. > > gcc/testsuite/ > PR middle-end/120276 > * gcc.target/aarch64/sve/pr120276.c: New test. > --- > gcc/regcprop.cc | 5 ++++- > .../gcc.target/aarch64/sve/pr120276.c | 20 +++++++++++++++++++ > 2 files changed, 24 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c > > diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc > index 4fa1305526c..7f4a25a4718 100644 > --- a/gcc/regcprop.cc > +++ b/gcc/regcprop.cc > @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd) > || (sr > dr && sr < dr + dn)) > return; > > + poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode); > + poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src)); > + > /* If SRC had no assigned mode (i.e. we didn't know it was live) > assign it now and assume the value came from an input argument > or somesuch. */ > @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd) > (set (reg:DI dx) (reg:DI bp)) > The last set is not redundant, while the low 8 bits of dx are already > equal to low 8 bits of bp, the other bits are undefined. */ > - else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))) > + else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec)) > { > if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode) > || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest)))
I think we should instead add: else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode), GET_MODE_PRECISION (GET_MODE (src)))) return; after: if (vd->e[sr].mode == VOIDmode) set_value_regno (sr, vd->e[dr].mode, vd); The ICE in paradoxical_subreg is just the canary. The register replacement isn't valid if we don't know at compile time how the bytes are distributed between the two modes. Thanks, Richard > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c > b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c > new file mode 100644 > index 00000000000..1b29c90f69b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3" } */ > + > +int a; > +char b[1]; > +int c[18]; > +void d(char *); > +void e() { > + int f; > + char *g; > + a = 0; > + for (; a < 18; a++) { > + int h = f = 0; > + for (; f < 4; f++) { > + g[a * 4 + f] = c[a] >> h; > + h += 8; > + } > + } > + d(b); > +} > \ No newline at end of file