On Sat, 2020-08-15 at 16:19 +0200, Christophe Lyon wrote: > Hi Martin, > > > On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > On 8/13/20 11:44 AM, Martin Sebor wrote: > > > On 8/13/20 10:21 AM, Jeff Law wrote: > > > > On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote: > > > > > The folders for these functions (and some others) call c_getsr > > > > > which relies on string_constant to return the representation of > > > > > constant strings. Because the function doesn't handle constants > > > > > of other types, including aggregates, memcmp or memchr calls > > > > > involving those are not folded when they could be. > > > > > > > > > > The attached patch extends the algorithm used by string_constant > > > > > to also handle constant aggregates involving elements or members > > > > > of the same types as native_encode_expr. (The change restores > > > > > the empty initializer optimization inadvertently disabled in > > > > > the fix for pr96058.) > > > > > > > > > > To avoid accidentally misusing either string_constant or c_getstr > > > > > with non-strings I have introduced a pair of new functions to get > > > > > the representation of those: byte_representation and getbyterep. > > > > > > > > > > Tested on x86_64-linux. > > > > > > > > > > Martin > > > > > PR tree-optimization/78257 - missing memcmp optimization with > > > > > constant arrays > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR middle-end/78257 > > > > > * builtins.c (expand_builtin_memory_copy_args): Rename called > > > > > function. > > > > > (expand_builtin_stpcpy_1): Remove argument from call. > > > > > (expand_builtin_memcmp): Rename called function. > > > > > (inline_expand_builtin_bytecmp): Same. > > > > > * expr.c (convert_to_bytes): New function. > > > > > (constant_byte_string): New function (formerly string_constant). > > > > > (string_constant): Call constant_byte_string. > > > > > (byte_representation): New function. > > > > > * expr.h (byte_representation): Declare. > > > > > * fold-const-call.c (fold_const_call): Rename called function. > > > > > * fold-const.c (c_getstr): Remove an argument. > > > > > (getbyterep): Define a new function. > > > > > * fold-const.h (c_getstr): Remove an argument. > > > > > (getbyterep): Declare a new function. > > > > > * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee. > > > > > (gimple_fold_builtin_string_compare): Same. > > > > > (gimple_fold_builtin_memchr): Same. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > PR middle-end/78257 > > > > > * gcc.dg/memchr.c: New test. > > > > > * gcc.dg/memcmp-2.c: New test. > > > > > * gcc.dg/memcmp-3.c: New test. > > > > > * gcc.dg/memcmp-4.c: New test. > > > > > > > > > > diff --git a/gcc/expr.c b/gcc/expr.c > > > > > index a150fa0d3b5..a124df54655 100644 > > > > > --- a/gcc/expr.c > > > > > +++ b/gcc/expr.c > > > > > @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, > > > > > const_tree exp) > > > > > /* This must now be the address of EXP. */ > > > > > return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, > > > > > 0) == exp; > > > > > } > > > > > - > > > > > -/* Return the tree node if an ARG corresponds to a string constant > > > > > or zero > > > > > - if it doesn't. If we return nonzero, set *PTR_OFFSET to the > > > > > (possibly > > > > > - non-constant) offset in bytes within the string that ARG is > > > > > accessing. > > > > > - If MEM_SIZE is non-zero the storage size of the memory is > > > > > returned. > > > > > - If DECL is non-zero the constant declaration is returned if > > > > > available. */ > > > > > -tree > > > > > -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree > > > > > *decl) > > > > > +/* If EXPR is a constant initializer (either an expression or > > > > > CONSTRUCTOR), > > > > > + attempt to obtain its native representation as an array of > > > > > nonzero BYTES. > > > > > + Return true on success and false on failure (the latter without > > > > > modifying > > > > > + BYTES). */ > > > > > + > > > > > +static bool > > > > > +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes) > > > > > +{ > > > > > + if (TREE_CODE (expr) == CONSTRUCTOR) > > > > > + { > > > > > + /* Set to the size of the CONSTRUCTOR elements. */ > > > > > + unsigned HOST_WIDE_INT ctor_size = bytes->length (); > > > > > + > > > > > + if (TREE_CODE (type) == ARRAY_TYPE) > > > > > + { > > > > > + tree val, idx; > > > > > + tree eltype = TREE_TYPE (type); > > > > > + unsigned HOST_WIDE_INT elsize = > > > > > + tree_to_uhwi (TYPE_SIZE_UNIT (eltype)); > > > > > + unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U; > > > > > + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val) > > > > > + { > > > > > + /* Append zeros for elements with no initializers. */ > > > > > + if (!tree_fits_uhwi_p (idx)) > > > > > + return false; > > > > > + unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx); > > > > > + if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1)) > > > > > + { > > > > > + size = size * elsize + bytes->length (); > > > > > + bytes->safe_grow_cleared (size); > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > > > > + } > > > > > + > > > > > + if (!convert_to_bytes (eltype, val, bytes)) > > > > > + return false; > > > > > + > > > > > + last_idx = cur_idx; > > > > > + } > > > > > + } > > > > > + else if (TREE_CODE (type) == RECORD_TYPE) > > > > > + { > > > > > + tree val, fld; > > > > > + unsigned HOST_WIDE_INT i; > > > > > + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val) > > > > > + { > > > > > + /* Append zeros for members with no initializers and > > > > > + any padding. */ > > > > > + unsigned HOST_WIDE_INT cur_off = int_byte_position (fld); > > > > > + if (bytes->length () < cur_off) > > > > > + bytes->safe_grow_cleared (cur_off); > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > > + > > > > > + if (!convert_to_bytes (TREE_TYPE (val), val, bytes)) > > > > > + return false; > > > > > + } > > > > > + } > > > > > + else > > > > > + return false; > > > > > + > > > > > + /* Compute the size of the COSNTRUCTOR elements. */ > > > > > + ctor_size = bytes->length () - ctor_size; > > > > > + > > > > > + /* Append zeros to the byte vector to the full size of the > > > > > type. > > > > > + The type size can be less than the size of the CONSTRUCTOR > > > > > + if the latter contains initializers for a flexible array > > > > > + member. */ > > > > > + tree size = TYPE_SIZE_UNIT (type); > > > > > + unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size); > > > > > + if (ctor_size < type_size) > > > > > + if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size) > > > > > + bytes->safe_grow_cleared (bytes->length () + size_grow); > > > > > + > > > > > + return true; > > > > > + } > > > > So I think you need to be more careful with CONSTRUCTOR nodes here. > > > > Not all > > > > elements of an object need to appear in the CONSTRUCTOR. Elements > > > > which do not > > > > appear in the CONSTRUCTOR node are considered zero-initialized, unless > > > > CONSTRUCTOR_NO_CLEARING is set. > > > > > > > > I don't see anything in the code above which deals with those oddities > > > > of > > > > CONSTRUCTOR nodes. Did I miss it? > > > > > > Just capturing for reference what we just discussed off list: > > > > > > The underlined code above zeroes out the bytes of elements with > > > no initializers as well as any padding between fields. It doesn't > > > consider CONSTRUCTOR_NO_CLEARING. I didn't know about that bit so > > > I looked it up. According to the internals manual: > > > > > > Unrepresented fields will be cleared (zeroed), unless the > > > CONSTRUCTOR_NO_CLEARING flag is set, in which case their value > > > becomes undefined. > > > > > > So assuming they're zero should be fine, as would doing nothing. > > > We agreed on the former so I will go ahead with the patch as is. > > > > I had missed a few Ada failures. Apparently, Ada can specify > > arbitrary array bounds (not just upper but also lower) but > > the code assumed the lower bound would always be zero. I've > > adjusted it to avoid making that assumption and committed > > the updated revision in r11-2709. > > > > This commit is causing a regression on arm: > FAIL: gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0 > FAIL: gcc.dg/strlenopt-55.c scan-tree-dump-times optimized > "call_in_true_branch_not_eliminated" 0 I'm seeing this on a variety of targets as well.
jeff