> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Thursday, July 2, 2020 10:46 PM
> To: xiezhiheng <xiezhih...@huawei.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
> 
> On Thu, Jul 2, 2020 at 3:22 PM xiezhiheng <xiezhih...@huawei.com> wrote:
> >
> > Hi,
> >
> > This is a fix for pr94442.
> > I modify get_inner_reference to handle the case for MEM[ptr, off].
> > I extract the "off" and add it to the recorded offset, then I build a
> > MEM[ptr, 0] and return it later.
> >
> > diff --git a/gcc/expr.c b/gcc/expr.c
> > index 3c68b0d754c..8cc18449a0c 100644
> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -7362,7 +7362,8 @@ tree
> >  get_inner_reference (tree exp, poly_int64_pod *pbitsize,
> >                      poly_int64_pod *pbitpos, tree *poffset,
> >                      machine_mode *pmode, int *punsignedp,
> > -                    int *preversep, int *pvolatilep)
> > +                    int *preversep, int *pvolatilep,
> > +                    bool include_memref_p)
> >  {
> >    tree size_tree = 0;
> >    machine_mode mode = VOIDmode;
> > @@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod
> *pbitsize,
> >                 }
> >               exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
> >             }
> > +         else if (include_memref_p
> > +                  && TREE_CODE (TREE_OPERAND (exp, 0)) ==
> SSA_NAME)
> > +           {
> > +             tree off = TREE_OPERAND (exp, 1);
> > +             if (!integer_zerop (off))
> > +               {
> > +                 poly_offset_int boff = mem_ref_offset (exp);
> > +                 boff <<= LOG2_BITS_PER_UNIT;
> > +                 bit_offset += boff;
> > +
> > +                 exp = build2 (MEM_REF, TREE_TYPE (exp),
> > +                               TREE_OPERAND (exp, 0),
> > +                               build_int_cst (TREE_TYPE (off), 0));
> > +               }
> > +           }
> >           goto done;
> >
> >         default:
> > @@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target,
> machine_mode tmode,
> >         int reversep, volatilep = 0, must_force_mem;
> >         tree tem
> >           = get_inner_reference (exp, &bitsize, &bitpos, &offset,
> &mode1,
> > -                                &unsignedp, &reversep, &volatilep);
> > +                                &unsignedp, &reversep, &volatilep,
> true);
> >         rtx orig_op0, memloc;
> >         bool clear_mem_expr = false;
> >
> > diff --git a/gcc/tree.h b/gcc/tree.h
> > index a74872f5f3e..7df0d15f7f9 100644
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p
> (const_tree, HOST_WIDE_INT, const_tree);
> >     look for the ultimate containing object, which is returned and specify
> >     the access position and size.  */
> >  extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod
> *,
> > -                                tree *, machine_mode *, int *, int *,
> int *);
> > +                                tree *, machine_mode *, int *, int *,
> int *,
> > +                                bool = false);
> >
> >  extern tree build_personality_function (const char *);
> >
> >
> > I add an argument "include_memref_p" to control whether to go into
> MEM_REF,
> > because without it will cause the test case "Warray-bounds-46.c" to fail in
> regression.
> >
> > It because function set_base_and_offset in gimple-ssa-warn-restrict.c
> >   base = get_inner_reference (expr, &bitsize, &bitpos, &var_off,
> >                               &mode, &sign, &reverse, &vol);
> >   ...
> >   ...
> >   if (TREE_CODE (base) == MEM_REF)
> >     {
> >       tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND
> (base, 1));
> >       extend_offset_range (memrefoff);
> >       base = TREE_OPERAND (base, 0);
> >
> >       if (refoff != HOST_WIDE_INT_MIN
> >           && TREE_CODE (expr) == COMPONENT_REF)
> >         {
> >           /* Bump up the offset of the referenced subobject to reflect
> >              the offset to the enclosing object.  For example, so that
> >              in
> >                struct S { char a, b[3]; } s[2];
> >                strcpy (s[1].b, "1234");
> >              REFOFF is set to s[1].b - (char*)s.  */
> >           offset_int off = tree_to_shwi (memrefoff);
> >           refoff += off;
> >         }
> >
> >       if (!integer_zerop (memrefoff))       <=================
> >         /* A non-zero offset into an array of struct with flexible array
> >            members implies that the array is empty because there is no
> >            way to initialize such a member when it belongs to an array.
> >            This must be some sort of a bug.  */
> >         refsize = 0;
> >     }
> >
> > needs MEM_REF offset to judge whether refsize should be set to zero.
> > But I fold the offset into bitpos and the offset will always be zero.
> >
> > Suggestion?
> 
> The thing you want to fix is not get_inner_reference but the aarch64 backend
> to not make __builtin_aarch64_sqaddv16qi clobber global memory.  That
> way
> CSE can happen on GIMPLE which can handle the difference in the IL just
> fine.
> 
> Richard.

Yes, __builtin_aarch64_sqaddv16qi is not set any attributes to describe that
it would not clobber global memory.  But I find it strange that when building
SIMD built-in FUNCTION_DECLs they are not set any attributes in the backend.

void
aarch64_init_simd_builtins (void)
{
...
      ftype = build_function_type (return_type, args);

      gcc_assert (ftype != NULL);

      if (print_type_signature_p)
        snprintf (namebuf, sizeof (namebuf), "__builtin_aarch64_%s_%s",
                  d->name, type_signature);
      else
        snprintf (namebuf, sizeof (namebuf), "__builtin_aarch64_%s",
                  d->name);

      fndecl = aarch64_general_add_builtin (namebuf, ftype, fcode);
      aarch64_builtin_decls[fcode] = fndecl;
...
}
static tree
aarch64_general_add_builtin (const char *name, tree type, unsigned int code)
{
  code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_GENERAL;
  return add_builtin_function (name, type, code, BUILT_IN_MD,
                               NULL, NULL_TREE);
}

The loop in aarch64_init_simd_builtins creates FUNCTION_DECL node for each
build-in function and put the node in array.  But it does not set any 
attributes.
And I did not find interface for each build-in function to control the 
attributes.

Did I miss anything?

Reply via email to