On Sun, Jun 18, 2023 at 5:55 PM Jan Hubicka via Gcc-patches
<[email protected]> wrote:
>
> Hi,
> we currently produce very bad code on loops using std::vector as a stack,
> since
> we fail to inline push_back which in turn prevents SRA and we fail to optimize
> out some store-to-load pairs (PR109849).
>
> I looked into why this function is not inlined and it is inlined by clang. We
> currently estimate it to 66 instructions and inline limits are 15 at -O2 and
> 30
> at -O3. Clang has similar estimate, but still decides to inline at -O2.
>
> I looked into reason why the body is so large and one problem I spotted is the
> way std::max is implemented by taking and returning reference to the values.
>
> const T& max( const T& a, const T& b );
>
> This makes it necessary to store the values to memory and load them later
> (Max is used by code computing new size of vector on resize.)
> Two stores, conditional and load accounts as 8 instructions, while
> MAX_EXPR as 1 and has a lot better chance to fold with the surrounding
> code.
>
> We optimize this to MAX_EXPR, but only during late optimizations. I think
> this
> is a common enough coding pattern and we ought to make this transparent to
> early opts and IPA. The following is easist fix that simply adds phiprop pass
> that turns the PHI of address values into PHI of values so later FRE can
> propagate values across memory, phiopt discover the MAX_EXPR pattern and DSE
> remove the memory stores.
>
> Bootstrapped/regtested x86_64-linux, does this look resonable thing to do?
OK.
Thanks,
Richard.
> Looking into how expensive the pass is, I think it is very cheap, except
> that it computes postdominator and updates ssa even if no patterns
> are matched. I will send patch to avoid that.
>
> gcc/ChangeLog:
>
> PR tree-optimization/109811
> PR tree-optimization/109849
> * passes.def: Add phiprop to early optimization passes.
> * tree-ssa-phiprop.cc: Allow clonning.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/109811
> PR tree-optimization/109849
> * gcc.dg/tree-ssa/phiprop-1.c: New test.
>
> diff --git a/gcc/passes.def b/gcc/passes.def
> index c9a8f19747b..faa5208b26b 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -88,6 +88,8 @@ along with GCC; see the file COPYING3. If not see
> /* pass_build_ealias is a dummy pass that ensures that we
> execute TODO_rebuild_alias at this point. */
> NEXT_PASS (pass_build_ealias);
> + /* Do phiprop before FRE so we optimize std::min and std::max well.
> */
> + NEXT_PASS (pass_phiprop);
> NEXT_PASS (pass_fre, true /* may_iterate */);
> NEXT_PASS (pass_early_vrp);
> NEXT_PASS (pass_merge_phi);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c
> new file mode 100644
> index 00000000000..9f52c2a7298
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-phiprop1-details -fdump-tree-release_ssa" }
> */
> +int max(int a, int b)
> +{
> + int *ptr;
> + if (a > b)
> + ptr = &a;
> + else
> + ptr = &b;
> + return *ptr;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Inserting PHI for result of load" 1
> "phiprop1"} } */
> +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "release_ssa"} } */
> diff --git a/gcc/tree-ssa-phiprop.cc b/gcc/tree-ssa-phiprop.cc
> index 3cb4900b6be..5dc505df420 100644
> --- a/gcc/tree-ssa-phiprop.cc
> +++ b/gcc/tree-ssa-phiprop.cc
> @@ -476,6 +476,7 @@ public:
> {}
>
> /* opt_pass methods: */
> + opt_pass * clone () final override { return new pass_phiprop (m_ctxt); }
> bool gate (function *) final override { return flag_tree_phiprop; }
> unsigned int execute (function *) final override;
>