On Tue, 5 Oct 2021, Richard Sandiford wrote:

> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Mon, 4 Oct 2021, Qing Zhao wrote:
> >
> >> 
> >> 
> >> > On Oct 4, 2021, at 12:19 PM, Richard Biener <rguent...@suse.de> wrote:
> >> > 
> >> > On October 4, 2021 7:00:10 PM GMT+02:00, Qing Zhao 
> >> > <qing.z...@oracle.com> wrote:
> >> >> I have several questions on this fix:
> >> >> 
> >> >> 1. This fix avoided expanding ?.DEFERRED_INIT? when !tree_fits_uhwi_p 
> >> >> (TYPE_SIZE_UNIT (var_type)).
> >> >>   As a result, this call to .DEFERRED_INIT will NOT be expanded at all.
> >> > 
> >> > Yes. 
> >> 
> >> Then, should we exclude such auto init during gimplification phase?
> >
> > No, we do want to and can handle such variables just fine.
> >
> >> > 
> >> >>   Then not expanding .DEFERRED_INIT in RTL expanding phase will trigger 
> >> >> more issues in later RTL phases, this looks not correct to me. 
> >> >> (Actually, with is the patch, this testing case still failed in a later 
> >> >> RTL stage). 
> >> >> 
> >> >>   So, If we really want to avoid auto-init for VLA vectors, we should 
> >> >> not add call to .DEFERRED_INIT in gimplification phase at all. 
> >> 
> >> 
> >> >> 
> >> >> 
> >> >> 2. For the added .DEFERRED_INIT:
> >> >> 
> >> >> __SVFloat64_t f64;
> >> >> 
> >> >> f64 = .DEFERRED_INIT (POLY_INT_CST [16, 16], 2, 0);
> >> >> 
> >> >> What does ?POLY_INT_CST[16,16]? mean? Is this a constant size? If YES, 
> >> >> what?s the value of it? If Not, can we use ?memset? to expand it?
> >> > 
> >> > When the target is a register memset doesn't work. I'm not sure the 
> >> > memset expansion path will work as-is either for aggregates with vla 
> >> > parts -
> >> 
> >> Stupid question here:  what does POLY_INT_CST[16,16] mean?   It?s not a 
> >> constant? 
> >
> > It's 16 * <vector-factor> where the factor is determined by the hardware
> > implementation but fixed throughout the programs lifetime.  You could
> > think of the POLY_INT_CST expanding to a multiplication of 16 by a special
> > hardware register.
> >
> > For vector types the zero-init could be done using build_zero_cst and
> > the expand_assignment path.  Also the memset path should just work
> > as well.
> >
> > It's the pattern init that's a bit more complicated but I'm sure
> > Richard will sort that out.
> >
> > Note TYPE_SIZE_UNIT will honor tree_fits_poly_uint64_p but for the
> > pattern init we'd have to repeat the constant and maybe there's
> > a clever way to do this repeating just the single pattern byte.
> >
> > But as said...
> >
> >> > but I'll leave that to Richard S. to sort out. 
> >
> > ^^^
> 
> Yeah, I'm hoping to get to this in stage 3 :-)
> 
> The PR is still open until then and I agree the bypass is a good idea in
> the meantime.

Btw, I've just completed testing the following which restores init
on aarch64 (when you specify -march=armv8.3-a+sve, otherwise we
ICE on SVE register uses) and also restores the init of the VLA
case that was lost.  The only caveat is that we use zero-init
for the VLA vectors even with pattern init - that's something to
improve.  Also initializing from build_zero_cst might explode
later for poly-int sized things I cannot imagine right now ;)

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

>From bd73fdacf72563ce27edbcdfc0d06d5378339f85 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Tue, 5 Oct 2021 09:28:20 +0200
Subject: [PATCH] More .DEFERRED_INIT expansion rework
To: gcc-patches@gcc.gnu.org

This avoids looking at the type size and instead uses the size
as passed to .DEFERRED_INIT to determine the size of the non-MEM
to be initialized.  It also arranges for possibly poly-int
inits to always use zero-initialization rather than not initializing
and when we need to pun puns the LHS instead of the constant value.

That correctly initializes the variable-size typed array in the
testcase for PR102285 and the SVE vector in PR102587 where for
the testcase I needed to add a SVE capable -march as to not
ICE later.

2021-10-05  Richard Biener  <rguent...@suse.de>

        PR middle-end/102587
        PR middle-end/102285
        * internal-fn.c (expand_DEFERRED_INIT): Fall back to
        zero-initialization as last resort, use the constant
        size as given by the DEFERRED_INIT argument to build
        the initializer.

        * gcc.target/aarch64/sve/pr102587-1.c: Add -march=armv8.3-a+sve.
        * gcc.target/aarch64/sve/pr102587-2.c: Likewise.
---
 gcc/internal-fn.c                             | 27 ++++++++++---------
 .../gcc.target/aarch64/sve/pr102587-1.c       |  2 +-
 .../gcc.target/aarch64/sve/pr102587-2.c       |  2 +-
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 110145218b9..78db25bbac4 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -3038,19 +3038,18 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
       /* Expand this memset call.  */
       expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type));
     }
-  /* ???  Deal with poly-int sized registers.  */
-  else if (tree_fits_uhwi_p (TYPE_SIZE_UNIT (var_type)))
+  else
     {
-      /* If this variable is in a register, use expand_assignment might
-        generate better code.  */
-      tree init = build_zero_cst (var_type);
-      unsigned HOST_WIDE_INT total_bytes
-       = tree_to_uhwi (TYPE_SIZE_UNIT (var_type));
-
-      if (init_type == AUTO_INIT_PATTERN)
+      /* If this variable is in a register use expand_assignment.  */
+      tree init;
+      if (tree_fits_uhwi_p (var_size)
+         && (init_type == AUTO_INIT_PATTERN
+             || !is_gimple_reg_type (var_type)))
        {
+         unsigned HOST_WIDE_INT total_bytes = tree_to_uhwi (var_size);
          unsigned char *buf = (unsigned char *) xmalloc (total_bytes);
-         memset (buf, INIT_PATTERN_VALUE, total_bytes);
+         memset (buf, (init_type == AUTO_INIT_PATTERN
+                       ? INIT_PATTERN_VALUE : 0), total_bytes);
          if (can_native_interpret_type_p (var_type))
            init = native_interpret_expr (var_type, buf, total_bytes);
          else
@@ -3058,10 +3057,14 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
              tree itype = build_nonstandard_integer_type
                             (total_bytes * BITS_PER_UNIT, 1);
              wide_int w = wi::from_buffer (buf, total_bytes);
-             init = build1 (VIEW_CONVERT_EXPR, var_type,
-                            wide_int_to_tree (itype, w));
+             init = wide_int_to_tree (itype, w);
+             /* Pun the LHS to make sure its type has constant size.  */
+             lhs = build1 (VIEW_CONVERT_EXPR, itype, lhs);
            }
        }
+      else
+       /* Use zero-init also for variable-length sizes.  */
+       init = build_zero_cst (var_type);
 
       expand_assignment (lhs, init, false);
     }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr102587-1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr102587-1.c
index 2b9a68b0b59..af2ae59e5d4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pr102587-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr102587-1.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero" } */
+/* { dg-options "-march=armv8.3-a+sve -ftrivial-auto-var-init=zero" } */
 
 void foo() { __SVFloat64_t f64; }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr102587-2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr102587-2.c
index 4cdb9056002..8c9d9908bac 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pr102587-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr102587-2.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=pattern" } */
+/* { dg-options "-march=armv8.3-a+sve -ftrivial-auto-var-init=pattern" } */
 
 void foo() { __SVFloat64_t f64; }
-- 
2.31.1

Reply via email to