> -----Original Message----- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Thursday, July 2, 2020 10:46 PM > To: xiezhiheng <xiezhih...@huawei.com> > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions > emitted at -O3 > > On Thu, Jul 2, 2020 at 3:22 PM xiezhiheng <xiezhih...@huawei.com> wrote: > > > > Hi, > > > > This is a fix for pr94442. > > I modify get_inner_reference to handle the case for MEM[ptr, off]. > > I extract the "off" and add it to the recorded offset, then I build a > > MEM[ptr, 0] and return it later. > > > > diff --git a/gcc/expr.c b/gcc/expr.c > > index 3c68b0d754c..8cc18449a0c 100644 > > --- a/gcc/expr.c > > +++ b/gcc/expr.c > > @@ -7362,7 +7362,8 @@ tree > > get_inner_reference (tree exp, poly_int64_pod *pbitsize, > > poly_int64_pod *pbitpos, tree *poffset, > > machine_mode *pmode, int *punsignedp, > > - int *preversep, int *pvolatilep) > > + int *preversep, int *pvolatilep, > > + bool include_memref_p) > > { > > tree size_tree = 0; > > machine_mode mode = VOIDmode; > > @@ -7509,6 +7510,21 @@ get_inner_reference (tree exp, poly_int64_pod > *pbitsize, > > } > > exp = TREE_OPERAND (TREE_OPERAND (exp, 0), 0); > > } > > + else if (include_memref_p > > + && TREE_CODE (TREE_OPERAND (exp, 0)) == > SSA_NAME) > > + { > > + tree off = TREE_OPERAND (exp, 1); > > + if (!integer_zerop (off)) > > + { > > + poly_offset_int boff = mem_ref_offset (exp); > > + boff <<= LOG2_BITS_PER_UNIT; > > + bit_offset += boff; > > + > > + exp = build2 (MEM_REF, TREE_TYPE (exp), > > + TREE_OPERAND (exp, 0), > > + build_int_cst (TREE_TYPE (off), 0)); > > + } > > + } > > goto done; > > > > default: > > @@ -10786,7 +10802,7 @@ expand_expr_real_1 (tree exp, rtx target, > machine_mode tmode, > > int reversep, volatilep = 0, must_force_mem; > > tree tem > > = get_inner_reference (exp, &bitsize, &bitpos, &offset, > &mode1, > > - &unsignedp, &reversep, &volatilep); > > + &unsignedp, &reversep, &volatilep, > true); > > rtx orig_op0, memloc; > > bool clear_mem_expr = false; > > > > diff --git a/gcc/tree.h b/gcc/tree.h > > index a74872f5f3e..7df0d15f7f9 100644 > > --- a/gcc/tree.h > > +++ b/gcc/tree.h > > @@ -6139,7 +6139,8 @@ extern bool complete_ctor_at_level_p > (const_tree, HOST_WIDE_INT, const_tree); > > look for the ultimate containing object, which is returned and specify > > the access position and size. */ > > extern tree get_inner_reference (tree, poly_int64_pod *, poly_int64_pod > *, > > - tree *, machine_mode *, int *, int *, > int *); > > + tree *, machine_mode *, int *, int *, > int *, > > + bool = false); > > > > extern tree build_personality_function (const char *); > > > > > > I add an argument "include_memref_p" to control whether to go into > MEM_REF, > > because without it will cause the test case "Warray-bounds-46.c" to fail in > regression. > > > > It because function set_base_and_offset in gimple-ssa-warn-restrict.c > > base = get_inner_reference (expr, &bitsize, &bitpos, &var_off, > > &mode, &sign, &reverse, &vol); > > ... > > ... > > if (TREE_CODE (base) == MEM_REF) > > { > > tree memrefoff = fold_convert (ptrdiff_type_node, TREE_OPERAND > (base, 1)); > > extend_offset_range (memrefoff); > > base = TREE_OPERAND (base, 0); > > > > if (refoff != HOST_WIDE_INT_MIN > > && TREE_CODE (expr) == COMPONENT_REF) > > { > > /* Bump up the offset of the referenced subobject to reflect > > the offset to the enclosing object. For example, so that > > in > > struct S { char a, b[3]; } s[2]; > > strcpy (s[1].b, "1234"); > > REFOFF is set to s[1].b - (char*)s. */ > > offset_int off = tree_to_shwi (memrefoff); > > refoff += off; > > } > > > > if (!integer_zerop (memrefoff)) <================= > > /* A non-zero offset into an array of struct with flexible array > > members implies that the array is empty because there is no > > way to initialize such a member when it belongs to an array. > > This must be some sort of a bug. */ > > refsize = 0; > > } > > > > needs MEM_REF offset to judge whether refsize should be set to zero. > > But I fold the offset into bitpos and the offset will always be zero. > > > > Suggestion? > > The thing you want to fix is not get_inner_reference but the aarch64 backend > to not make __builtin_aarch64_sqaddv16qi clobber global memory. That > way > CSE can happen on GIMPLE which can handle the difference in the IL just > fine. > > Richard.
Yes, __builtin_aarch64_sqaddv16qi is not set any attributes to describe that it would not clobber global memory. But I find it strange that when building SIMD built-in FUNCTION_DECLs they are not set any attributes in the backend. void aarch64_init_simd_builtins (void) { ... ftype = build_function_type (return_type, args); gcc_assert (ftype != NULL); if (print_type_signature_p) snprintf (namebuf, sizeof (namebuf), "__builtin_aarch64_%s_%s", d->name, type_signature); else snprintf (namebuf, sizeof (namebuf), "__builtin_aarch64_%s", d->name); fndecl = aarch64_general_add_builtin (namebuf, ftype, fcode); aarch64_builtin_decls[fcode] = fndecl; ... } static tree aarch64_general_add_builtin (const char *name, tree type, unsigned int code) { code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_GENERAL; return add_builtin_function (name, type, code, BUILT_IN_MD, NULL, NULL_TREE); } The loop in aarch64_init_simd_builtins creates FUNCTION_DECL node for each build-in function and put the node in array. But it does not set any attributes. And I did not find interface for each build-in function to control the attributes. Did I miss anything?