On Wed, Jun 26, 2019 at 6:17 AM Jeff Law <l...@redhat.com> wrote: > > So based on the conversation in the BZ I cobbled together a patch to > extend tree-ssa-dse.c to also detect redundant stores. > > To be clear, given two stores, the first store is dead if the later > store overwrites all the live bytes set by the first store. In this > case we delete the first store. If the first store is partially dead we > may trim it. > > Given two stores, the second store is redundant if it stores _the same > value_ into locations set by the first store. In this case we delete > the second store. > > > We prefer to remove redundant stores over removing dead or trimming > partially dead stores. > > First, if we detect a redundant store, we can always remove it. We may > not always be able to trim a partially dead store. So removing the > redundant store wins in this case. > > But even if the redundant store occurs at the head or tail of the prior > store, removing the redundant store is better than trimming the > partially dead store because we end up with fewer calls to memset with > the same number of total bytes written. > > We only look for redundant stores in a few cases. The first store must > be a memset, empty constructor or calloc call -- ie things which > initialize multiple memory locations to zero. Subsequent stores can > occur via memset, empty constructors or simple memory assignments. > > The chagne to tree-ssa-alias.c deserves a quick note. > > When we're trying to determine if we have a redundant store, we create > an AO_REF for the *second* store, then ask the alias system if the first > store would kill the AO_REF. > > So while normally a calloc wouldn't ever kill anything in normal > execution order, we're not asking about things in execution order. We > really just want to know if the calloc is going to write into the > entirety of the AO_REF of the subsequent store. So we compute the size > of the allocation and we know the destination from the LHS of the calloc > call and everything "just works".
I see how stmt_kills_ref_p is convenient here and it's the only ref-must-include-other-ref kind of query the oracle supports right now. Note it is not optimized for your particular case querying the same stmt for multiple refs. It's not refs_must_alias_p that is missing but something stronger ('kills' is also wrong since both refs might be reads), ref_covered_by_ref_p or so. That said, factoring stmt_kills_ref_p might not be so straight-forward for calls since we lack a general ao_ref_init for calls (ao_ref_init_stores_from_call, ao_ref_init_loads_from_call?). So I think the tree-ssa-alias.c change is fine but please put a comment before + if (DECL_FUNCTION_CODE (callee) == BUILT_IN_CALLOC) + { + tree arg0 = gimple_call_arg (stmt, 0); explaining this is used by DSE to detect redundant stores. > This patch also includes a hunk I apparently left out from yesterday's > submission which just adds _CHK cases to all the existing BUILT_IN_MEM* > cases. That's what I get for writing this part first, then adding the > _CHK stuff afterwards, then reversing the order of submission. > > This includes a slightly reduced testcase from the BZ in g++.dg -- it's > actually a good way to capture when one empty constructor causes another > empty constructor to be redundant. The gcc.dg cases capture other > scenarios. > > This has been bootstrapped and regression tested on x86-64, i686, ppc64, > ppc64le, sparc64 & aarch64. It's also bootstrapped on various arm > targets, alpha, m68k, mips, riscv64, sh4. It's been built and tested on > a variety of *-elf targets as well as various *-linux-gnu targets as > crosses. ANd just for giggles it was tested before the changes to add > the _CHK support, so it works with and without that as well. > > OK for the trunk? I also notice we wouldn't handle memset(p, 1, 64); memset(p, 1, 32); (non-zero-initializer) or x = {}; y = {}; x.a = {}; (intermediate non-aliasing store) Did you see if / how often this triggers on trunk? OK. Thanks, Richard. > > Jeff