In the comment trail for PR119966, I'd said that the validate_subreg
condition:

  /* The outer size must be ordered wrt the register size, otherwise
     we wouldn't know at compile time how many registers the outer
     mode occupies.  */
  if (!ordered_p (osize, regsize))
    return false;

"is also potentially relevant" for paradoxical subregs.  But I'd
forgotten an important caveat.  If the inner size is smaller than
a register, we know that the inner value will only occupy a single
register.  Although the paradoxical subreg might extend that single
register to multiple registers by padding with undefined bits, the
register size that matters for the extension is:

  REGMODE_NATURAL_SIZE (omode)

rather than regsize's:

  REGMODE_NATURAL_SIZE (imode)

The ordered check is still relevant if the inner value spans
multiple registers.

Enabling the check above for paradoxical subregs led to an ICE in
the testcase, where we tried to generate a VNx4QI paradoxical subreg
of a QI scalar.  This was previously allowed, and AFAIK worked
correctly.

The patch doesn't have the effect of relaxing the condition for
non-paradoxical subregs, since:

  known_le (osize, isize) && known_le (isize, regsize)
  => known_le (osize, regsize)
  => ordered_p (osize, regsize)

So even before the patch for PR119966, the condition only existed for
the maybe_gt (isize, regsize) case.

The term "block" used in the comment is taken from the rtl.texi
documentation of subregs.

Tested on aarch64-linux-gnu.  OK to install?

Richard


gcc/
	PR rtl-optimization/120447
	* emit-rtl.cc (validate_subreg): Restrict ordered_p test
	between osize and regsize to cases where the inner value
	occupies multiple blocks.

gcc/testsuite/
	PR rtl-optimization/120447
	* gcc.dg/pr120447.c: New test.
---
 gcc/emit-rtl.cc                 |  9 +++++----
 gcc/testsuite/gcc.dg/pr120447.c | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr120447.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3f453cda67e..50e3bfcb777 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -998,10 +998,11 @@ validate_subreg (machine_mode omode, machine_mode imode,
       && known_le (osize, isize))
     return false;
 
-  /* The outer size must be ordered wrt the register size, otherwise
-     we wouldn't know at compile time how many registers the outer
-     mode occupies.  */
-  if (!ordered_p (osize, regsize))
+  /* If ISIZE is greater than REGSIZE, the inner value is split into blocks
+     of size REGSIZE.  The outer size must then be ordered wrt REGSIZE,
+     otherwise we wouldn't know at compile time how many blocks the
+     outer mode occupies.  */
+  if (maybe_gt (isize, regsize) && !ordered_p (osize, regsize))
     return false;
 
   /* For normal pseudo registers, we want most of the same checks.  Namely:
diff --git a/gcc/testsuite/gcc.dg/pr120447.c b/gcc/testsuite/gcc.dg/pr120447.c
new file mode 100644
index 00000000000..bd51f9b174d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120447.c
@@ -0,0 +1,24 @@
+/* { dg-options "-Ofast" } */
+/* { dg-additional-options "-mcpu=neoverse-v2" { target aarch64*-*-* } } */
+
+char g;
+long h;
+typedef struct {
+  void *data;
+} i;
+i* a;
+void b(i *j, char *p2);
+void c(char *d) {
+  d = d ? " and " : " or ";
+  b(a, d);
+}
+void b(i *j, char *p2) {
+  h = __builtin_strlen(p2);
+  while (g)
+    ;
+  int *k = j->data;
+  char *l = p2, *m = p2 + h;
+  l += 4;
+  while (l < m)
+    *k++ = *l++;
+}
-- 
2.43.0
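
For illustration only (not part of the patch): the ordering argument in
the cover text can be checked with a small stand-alone model.  The
sketch below does not use GCC's poly-int.h.  poly_size, old_check_ok
and new_check_ok are hypothetical names, and the byte sizes are
assumptions: 1 for QI, 4 + 4x for VNx4QI, and 8 for
REGMODE_NATURAL_SIZE (QImode) on aarch64.  The predicates follow the
usual two-coefficient definitions, where x >= 0 is known only at run
time.

```cpp
#include <cassert>

// Toy model of a poly_int size: c0 + c1 * x bytes, x >= 0 unknown at
// compile time.  Illustrative stand-in, not GCC's poly_int.
struct poly_size { long c0; long c1; };

// a <= b for every possible x.
constexpr bool known_le (poly_size a, poly_size b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// a > b for at least one possible x.
constexpr bool maybe_gt (poly_size a, poly_size b)
{
  return !known_le (a, b);
}

// The relative order of a and b is known at compile time.
constexpr bool ordered_p (poly_size a, poly_size b)
{
  return known_le (a, b) || known_le (b, a);
}

// Condition before the patch (hypothetical helper name).
constexpr bool old_check_ok (poly_size osize, poly_size regsize)
{
  return ordered_p (osize, regsize);
}

// Condition after the patch (hypothetical helper name).
constexpr bool new_check_ok (poly_size osize, poly_size isize,
			     poly_size regsize)
{
  return !(maybe_gt (isize, regsize) && !ordered_p (osize, regsize));
}

// Assumed sizes in bytes; see the note above.
constexpr poly_size qi {1, 0};
constexpr poly_size vnx4qi {4, 4};
constexpr poly_size regsize {8, 0};

// VNx4QI paradoxical subreg of a QI scalar: rejected by the old
// condition, accepted by the new one because QI fits in one register.
static_assert (!old_check_ok (vnx4qi, regsize), "old check rejects");
static_assert (new_check_ok (vnx4qi, qi, regsize), "new check accepts");

// A hypothetical 16-byte inner value spans two blocks, so the ordered
// check still applies and still rejects.
constexpr poly_size wide {16, 0};
static_assert (!new_check_ok (vnx4qi, wide, regsize), "still rejected");

// Non-paradoxical example: osize <= isize <= regsize implies ordered.
constexpr poly_size hi {2, 0};
constexpr poly_size si {4, 0};
static_assert (known_le (hi, si) && known_le (si, regsize)
	       && ordered_p (hi, regsize), "implication holds");
```

Compiling the file is the check: each static_assert corresponds to one
claim in the cover text.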