Hello,this patch lets gcc know that if a pointer existed before the call to malloc, the result of malloc cannot alias it. This is a bit ad hoc and fragile. A small improvement would be, when the 2 statements are in the same bb but in the wrong order, to check if there is any statement in between that might prevent from reordering them. But that's more complicated, and the patch as it is already does help.
I expect people may not like the call to a function from tree-ssa-loop-niter too much, but it is convenient. And if someone improves it, they will likely have to rewrite something not quite equivalent to stmt_dominates_stmt_p.
The testcase uses libstdc++ quite a bit. I thought about putting it in the libstdc++ testsuite, but it does not know scan-tree-dump, and in the assembly it may be too late, memcpy may get expanded to something target-specific. So it is in g++.dg. I could write a more artificial testcase, but the behavior of std::vector is really what I want to test...
Bootstrap+regtest on x86_64-pc-linux-gnu. 2019-05-13 Marc Glisse <marc.gli...@inria.fr> gcc/ * tree-ssa-loop-niter.c (stmt_dominates_stmt_p): Handle NULL basic block. * tree-ssa-alias.c: Include tree-ssa-loop-niter.h. (ptr_derefs_may_alias_p): Detect malloc and an older pointer. gcc/testsuite/ * g++.dg/tree-ssa/ldist-2.C: New file. -- Marc Glisse
Index: gcc/testsuite/g++.dg/tree-ssa/ldist-2.C =================================================================== --- gcc/testsuite/g++.dg/tree-ssa/ldist-2.C (nonexistent) +++ gcc/testsuite/g++.dg/tree-ssa/ldist-2.C (working copy) @@ -0,0 +1,14 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-O3 -fdump-tree-optimized" } + +#include <vector> +#include <memory> +#include <new> +// Remove those 2 inlines once gcc knows that new/delete are special +inline void* operator new(std::size_t n){return malloc(n);} +inline void operator delete(void*p){free(p);} +typedef std::unique_ptr<int> T; +typedef std::vector<T> V; +void f(V&v){v.reserve(1024);} + +/* { dg-final { scan-tree-dump "memcpy" "optimized" } } */ Index: gcc/tree-ssa-alias.c =================================================================== --- gcc/tree-ssa-alias.c (revision 271097) +++ gcc/tree-ssa-alias.c (working copy) @@ -31,20 +31,21 @@ along with GCC; see the file COPYING3. #include "cgraph.h" #include "tree-pretty-print.h" #include "alias.h" #include "fold-const.h" #include "langhooks.h" #include "dumpfile.h" #include "tree-eh.h" #include "tree-dfa.h" #include "ipa-reference.h" #include "varasm.h" +#include "tree-ssa-loop-niter.h" /* Broad overview of how alias analysis on gimple works: Statements clobbering or using memory are linked through the virtual operand factored use-def chain. The virtual operand is unique per function, its symbol is accessible via gimple_vop (cfun). Virtual operands are used for efficiently walking memory statements in the gimple IL and are useful for things like value-numbering as a generation count for memory references. @@ -284,20 +285,40 @@ ptr_derefs_may_alias_p (tree ptr1, tree || !POINTER_TYPE_P (TREE_TYPE (ptr1)) || !POINTER_TYPE_P (TREE_TYPE (ptr2))) return true; /* We may end up with two empty points-to solutions for two same pointers. In this case we still want to say both pointers alias, so shortcut that here. */ if (ptr1 == ptr2) return true; + /* Memory returned by malloc cannot alias with a pre-existing pointer. + This is extremely crude, the order between the statements may be quite + arbitrary. */ + if (in_gimple_form && dom_info_available_p (cfun, CDI_DOMINATORS)) + { + gimple *def1 = SSA_NAME_DEF_STMT (ptr1); + gimple *def2 = SSA_NAME_DEF_STMT (ptr2); + tree decl; + if (stmt_dominates_stmt_p (def1, def2) + && is_gimple_call (def2) + && (decl = gimple_call_fndecl (def2)) + && DECL_IS_MALLOC (decl)) + return false; + else if (stmt_dominates_stmt_p (def2, def1) + && is_gimple_call (def1) + && (decl = gimple_call_fndecl (def1)) + && DECL_IS_MALLOC (decl)) + return false; + } + /* If we do not have useful points-to information for either pointer we cannot disambiguate anything else. */ pi1 = SSA_NAME_PTR_INFO (ptr1); pi2 = SSA_NAME_PTR_INFO (ptr2); if (!pi1 || !pi2) return true; /* ??? This does not use TBAA to prune decls from the intersection that not both pointers may access. */ return pt_solutions_intersect (&pi1->pt, &pi2->pt); Index: gcc/tree-ssa-loop-niter.c =================================================================== --- gcc/tree-ssa-loop-niter.c (revision 271097) +++ gcc/tree-ssa-loop-niter.c (working copy) @@ -4433,20 +4433,23 @@ stmt_dominates_stmt_p (gimple *s1, gimpl if (gimple_code (s1) == GIMPLE_PHI) return true; for (bsi = gsi_start_bb (bb1); gsi_stmt (bsi) != s2; gsi_next (&bsi)) if (gsi_stmt (bsi) == s1) return true; return false; } + if (!bb2) + return false; + return dominated_by_p (CDI_DOMINATORS, bb2, bb1); } /* Returns true when we can prove that the number of executions of STMT in the loop is at most NITER, according to the bound on the number of executions of the statement NITER_BOUND->stmt recorded in NITER_BOUND and fact that NITER_BOUND->stmt dominate STMT. ??? This code can become quite a CPU hog - we can have many bounds, and large basic block forcing stmt_dominates_stmt_p to be queried