On Fri, 16 Dec 2016, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 01:50:54PM +0100, Richard Biener wrote: > > > +static void > > > +optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree > > > len) > > > +{ > > > + gimple *stmt = gsi_stmt (*gsip); > > > + if (gimple_has_volatile_ops (stmt) > > > + || TREE_THIS_VOLATILE (dest) > > > + || TREE_THIS_VOLATILE (src)) > > > > The last two should be redundant and/or not necessary. > > I've been doing this if stmt is __builtin_memcpy and the actual > objects are TREE_THIS_VOLATILE, then I wanted to punt. > But I guess if we never transform __builtin_memcpy into anything > other than __builtin_memset, then it isn't a problem.
Yeah, and a memcpy of a volatile object isn't a volatile access anyway. > > > + return; > > > + > > > + tree vuse = gimple_vuse (stmt); > > > + if (vuse == NULL) > > > + return; > > > + > > > + gimple *defstmt = SSA_NAME_DEF_STMT (vuse); > > > + tree src2 = NULL_TREE, len2 = NULL_TREE; > > > + HOST_WIDE_INT offset, offset2; > > > + tree val = integer_zero_node; > > > + if (gimple_store_p (defstmt) > > > + && gimple_assign_single_p (defstmt) > > > + && TREE_CODE (gimple_assign_rhs1 (defstmt)) == CONSTRUCTOR > > > + && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (defstmt)) == 0 > > > > Should be always true for stores from constructors. > > Ok, can I just gcc_checking_assert verify it or is that not worth it? Not worth it -- tree-cfg.c gimple verification contains that already (ok, not exactly - it allows vector CONSTRUCTORs as RHS of !DECL_GIMPLE_REG_P vectors -- see a bit above (MEM_REF:) for a missed check. > > > > + && !gimple_clobber_p (defstmt)) > > > + src2 = gimple_assign_lhs (defstmt); > > > + else if (gimple_call_builtin_p (defstmt, BUILT_IN_MEMSET) > > > + && TREE_CODE (gimple_call_arg (defstmt, 0)) == ADDR_EXPR > > > + && TREE_CODE (gimple_call_arg (defstmt, 1)) == INTEGER_CST) > > > + { > > > + src2 = TREE_OPERAND (gimple_call_arg (defstmt, 0), 0); > > > + len2 = gimple_call_arg (defstmt, 2); > > > + val = gimple_call_arg (defstmt, 1); > > > + if (!integer_zerop (val) && is_gimple_assign (stmt)) > > > > Please add a comment here. I think below ... (*) > > > > > + src2 = NULL_TREE; > > > + } > > > + > > > + if (src2 == NULL_TREE) > > > + return; > > > + > > > + if (len == NULL_TREE) > > > + len = (TREE_CODE (src) == COMPONENT_REF > > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src, 1)) > > > + : TYPE_SIZE_UNIT (TREE_TYPE (src))); > > > + if (len2 == NULL_TREE) > > > + len2 = (TREE_CODE (src2) == COMPONENT_REF > > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src2, 1)) > > > + : TYPE_SIZE_UNIT (TREE_TYPE (src2))); > > > + if (len == NULL_TREE > > > + || TREE_CODE (len) != INTEGER_CST > > > + || len2 == NULL_TREE > > > + || TREE_CODE (len2) != INTEGER_CST) > > > + return; > > > + > > > + src = get_addr_base_and_unit_offset (src, &offset); > > > + src2 = get_addr_base_and_unit_offset (src2, &offset2); > > > + if (src == NULL_TREE > > > + || src2 == NULL_TREE > > > + || offset < offset2) > > > + return; > > > + > > > + if (!operand_equal_p (src, src2, 0)) > > > + return; > > > + > > > + /* [ src + offset2, src + offset2 + len2 - 1 ] is set to val. > > > + Make sure that > > > + [ src + offset, src + offset + len - 1 ] is a subset of that. */ > > > + if (wi::to_widest (len) + (offset - offset2) > wi::to_widest (len2)) > > > > I think you can use ::to_offset which is more efficient. > > That is reasonable. > > > > + return; > > > + > > > + if (dump_file && (dump_flags & TDF_DETAILS)) > > > + { > > > + fprintf (dump_file, "Simplified\n "); > > > + print_gimple_stmt (dump_file, stmt, 0, dump_flags); > > > + fprintf (dump_file, "after previous\n "); > > > + print_gimple_stmt (dump_file, defstmt, 0, dump_flags); > > > + } > > > + > > > + if (is_gimple_assign (stmt)) > > > > (*) a better check would be if val is zero because even a memcpy > > could be optimized to an assignment from {}. I suppose you're > > doing it this way because of simplicity and because we're > > late in the compilation and thus it doesn't matter in practice > > for optimization? (the 2nd destination could become > > non-TREE_ADDRESSABLE, enabling better alias disambiguation) > > Is memset and = {} equivalent even for aggregates with paddings? When lowering memset to = {} you should use MEM<char[]> (&decl, off-of-original-alias-type) = {}-of-type-char[]; similar for lowering of memcpy (I think what gimple-fold.c does for example is slightly unsafe). But in practice I think nothing in GCC assumes you can lower accesses and ignore padding if that pass (like SRA) does not see all accesses to prove that is safe. > If val is not zero, then I can only handle it if stmt is memcpy, > (or in theory if stmt is assignment, and dest is addressable it > could be transformed into memset). If val is zero, and dest isn't > volatile and the size matches the size of dest, then perhaps it > could be transformed into = {} (if there are no paddings/bitfields > or if it doesn't matter), I haven't done that for simplicity indeed. Yes, please add a comment so a curious bystander doesn't have to figure that out himself. Richard.