In the comment trail for PR119966, I'd said that the validate_subreg
condition:

  /* The outer size must be ordered wrt the register size, otherwise
     we wouldn't know at compile time how many registers the outer
     mode occupies.  */
  if (!ordered_p (osize, regsize))
    return false;

"is also potentially relevant" for paradoxical subregs.  But I'd
forgotten an important caveat.  If the inner size is smaller than
a register, we know that the inner value will only occupy a single
register.  Although the paradoxical subreg might extend that single
register to multiple registers by padding with undefined bits,
the register size that matters for the extension is:

   REGMODE_NATURAL_SIZE (omode)

rather than regsize's:

   REGMODE_NATURAL_SIZE (imode)

The ordered check is still relevant if the inner value spans
multiple registers.

Enabling the check above for paradoxical subregs led to an ICE in the
testcase, where we tried to generate a VNx4QI paradoxical subreg of a
QI scalar.  This was previously allowed, and AFAIK worked correctly.

The patch doesn't have the effect of relaxing the condition for
non-paradoxical subregs, since:

  known_le (osize, isize) && known_le (isize, regsize)
    => known_le (osize, regsize)
    => ordered_p (osize, regsize)

So even before the patch for PR119966, the condition only existed for
the maybe_gt (isize, regsize) case.

The term "block" used in the comment is taken from the rtl.texi
documentation of subregs.

Tested on aarch64-linux-gnu.  OK to install?

Richard


gcc/
        PR rtl-optimization/120447
        * emit-rtl.cc (validate_subreg): Restrict ordered_p test
        between osize and regsize to cases where the inner value
        occupies multiple blocks.

gcc/testsuite/
        PR rtl-optimization/120447
        * gcc.dg/pr120447.c: New test.
---
 gcc/emit-rtl.cc                 |  9 +++++----
 gcc/testsuite/gcc.dg/pr120447.c | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr120447.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3f453cda67e..50e3bfcb777 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -998,10 +998,11 @@ validate_subreg (machine_mode omode, machine_mode imode,
           && known_le (osize, isize))
     return false;
 
-  /* The outer size must be ordered wrt the register size, otherwise
-     we wouldn't know at compile time how many registers the outer
-     mode occupies.  */
-  if (!ordered_p (osize, regsize))
+  /* If ISIZE is greater than REGSIZE, the inner value is split into blocks
+     of size REGSIZE.  The outer size must then be ordered wrt REGSIZE,
+     otherwise we wouldn't know at compile time how many blocks the
+     outer mode occupies.  */
+  if (maybe_gt (isize, regsize) && !ordered_p (osize, regsize))
     return false;
 
   /* For normal pseudo registers, we want most of the same checks.  Namely:
diff --git a/gcc/testsuite/gcc.dg/pr120447.c b/gcc/testsuite/gcc.dg/pr120447.c
new file mode 100644
index 00000000000..bd51f9b174d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120447.c
@@ -0,0 +1,24 @@
+/* { dg-options "-Ofast" } */
+/* { dg-additional-options "-mcpu=neoverse-v2" { target aarch64*-*-* } } */
+
+char g;
+long h;
+typedef struct {
+  void *data;
+} i;
+i* a;
+void b(i *j, char *p2);
+void c(char *d) {
+  d = d ? " and " : " or ";
+  b(a, d);
+}
+void b(i *j, char *p2) {
+  h = __builtin_strlen(p2);
+  while (g)
+    ;
+  int *k = j->data;
+  char *l = p2, *m = p2 + h;
+  l += 4;
+  while (l < m)
+    *k++ = *l++;
+}
-- 
2.43.0

Reply via email to