On Tue, 18 Feb 2020, Xionghu Luo wrote:

> Store-merging pass should run twice, the reason is pass fre/pre will do
> some kind of optimizations to instructions by:
>   1. Converting the load from address to load from function arguments
>   (store_merging_30.c:foo1).
>   2. Converting the byte access to BIT_FIELD_REF(store_merging_30.c:foo2).
>   3. Other bitfield combinations or potential interference optimizations etc.
> These optimizations will break the store chain, store-merging pass fails
> to catch such kind of pattern so stores are not merged in middle end,
> then consecutive stb/sth instructions(should be merged to stw) are emitted
> finally.
> 
> And why not directly move store-merging pass(numbered 194) just before
> fre1(numbered 35) is for case store_merging_14.c, 5 merges are done by
> store_merging1, and 4 merges are done fore store_merge2. So, keep the
> original store_merge as store_merge2 as store merge may be still available
> after other pass optimizations.  Most of the 30 store_merging_N.c test
> case dg-final pass name would be updated from store-merging to
> store-merging1 once this RFC patch idea got confirmed.
> Any comments?  Thanks.

Generally trying to solve a pass ordering issue by re-ordering/duplicating
passes might help a single testcase but will surely pessimize others.
So that's a no-go.

What the testcase shows is that store-merging needs to operate
similar to bswap when trying to find a "common" source for a combined
store.  That is, it should appropriately follow defs.  For foo2 I expect
it to be not too difficult, for foo1 I'd say we miss a FRE opportunity
here (but Jakub is working on that one IIRC).

Richard.

> PS:
> Before this patch, store_merging_30.c.035t.fre1:
> 
> ... foo1:
> Inserted _13 = (short unsigned int) counters_new_5(D);
> Replaced tmp.D.2912.D.2911.D.2910.D.2909.inuse with _13 in all uses of
> _1 = tmp.D.2912.D.2911.D.2910.D.2909.inuse;
> Removing dead stmt _1 = tmp.D.2912.D.2911.D.2910.D.2909.inuse;
> ... foo2:
> Inserted _17 = BIT_FIELD_REF <_1, 8, 16>;
> Replaced tmp.D.2926.D.2925.D.2924.D.2923.objects with _17 in all uses of
> _3 = tmp.D.2926.D.2925.D.2924.D.2923.objects;
> Removing dead stmt _3 = tmp.D.2926.D.2925.D.2924.D.2923.objects;
> 
> foo1 asm:
> rldicl 9,4,48,48
> sth 4,0(3)
> sth 9,2(3)
> blr
> 
> With this patch(similar for foo2):
> 
> stw r4,0(r3)
> blr
> 
> gcc/ChangeLog
> 
> 2020-02-18  Xiong Hu Luo  <luo...@linux.ibm.com>
> 
>       Part of PR middle-end/71509
>       gimple-ssa-store-merging.c (clone): New.
>       passes.def (pass_store_merging): New.
> 
> gcc/testsuite/ChangeLog
> 
> 2020-02-18  Xiong Hu Luo  <luo...@linux.ibm.com>
> 
>       Part of PR middle-end/71509
>       testsuite/gcc.dg/store_merging_14.c: Update.
>       testsuite/gcc.dg/store_merging_30.c: New.
> ---
>  gcc/gimple-ssa-store-merging.c          |  2 +
>  gcc/passes.def                          |  1 +
>  gcc/testsuite/gcc.dg/store_merging_14.c |  3 +-
>  gcc/testsuite/gcc.dg/store_merging_30.c | 86 +++++++++++++++++++++++++
>  4 files changed, 91 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/store_merging_30.c
> 
> diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
> index 8371323ef4a..9a5bd49fc3a 100644
> --- a/gcc/gimple-ssa-store-merging.c
> +++ b/gcc/gimple-ssa-store-merging.c
> @@ -2156,6 +2156,8 @@ public:
>    {
>    }
>  
> +  opt_pass * clone () { return new pass_store_merging (m_ctxt); }
> +
>    /* Pass not supported for PDP-endian, nor for insane hosts or
>       target character sizes where native_{encode,interpret}_expr
>       doesn't work properly.  */
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 2bf2cb78fc5..e531531cb14 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -85,6 +85,7 @@ along with GCC; see the file COPYING3.  If not see
>         /* pass_build_ealias is a dummy pass that ensures that we
>            execute TODO_rebuild_alias at this point.  */
>         NEXT_PASS (pass_build_ealias);
> +       NEXT_PASS (pass_store_merging);
>         NEXT_PASS (pass_fre, true /* may_iterate */);
>         NEXT_PASS (pass_early_vrp);
>         NEXT_PASS (pass_merge_phi);
> diff --git a/gcc/testsuite/gcc.dg/store_merging_14.c 
> b/gcc/testsuite/gcc.dg/store_merging_14.c
> index 9310aaf3489..bd120d18ac6 100644
> --- a/gcc/testsuite/gcc.dg/store_merging_14.c
> +++ b/gcc/testsuite/gcc.dg/store_merging_14.c
> @@ -214,4 +214,5 @@ main ()
>    return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "Merging successful" 9 "store-merging" 
> } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 5 "store-merging1" 
> } } */
> +/* { dg-final { scan-tree-dump-times "Merging successful" 4 "store-merging2" 
> } } */
> diff --git a/gcc/testsuite/gcc.dg/store_merging_30.c 
> b/gcc/testsuite/gcc.dg/store_merging_30.c
> new file mode 100644
> index 00000000000..71369c3b196
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/store_merging_30.c
> @@ -0,0 +1,86 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target store_merge } */
> +/* { dg-options "-O2 -fdump-tree-store-merging" } */
> +
> +typedef unsigned int atomic_t;
> +
> +struct page
> +{
> +  union
> +  {
> +    unsigned long counters;
> +    struct
> +    {
> +      union
> +      {
> +     struct
> +     {
> +       unsigned inuse : 16;
> +       unsigned objects : 15;
> +       unsigned frozen : 1;
> +     };
> +      };
> +    };
> +  };
> +};
> +
> +struct page2
> +{
> +  union
> +  {
> +    unsigned counters;
> +    struct
> +    {
> +      union
> +      {
> +     struct
> +     {
> +       unsigned inuse : 16;
> +       unsigned objects : 8;
> +       unsigned frozen : 8;
> +     };
> +      };
> +    };
> +  };
> +};
> +
> +__attribute__((noipa)) void
> +foo1 (struct page *page, unsigned long counters_new)
> +{
> +        struct page tmp;
> +        tmp.counters = counters_new;
> +        page->inuse   = tmp.inuse;
> +        page->objects = tmp.objects;
> +        page->frozen  = tmp.frozen;
> +}
> +
> +__attribute__((noipa)) void
> +foo2 (struct page2 *page2, unsigned long counters_new)
> +{
> +        struct page2 tmp;
> +        tmp.counters = counters_new;
> +        page2->inuse   = tmp.inuse;
> +        page2->objects = tmp.objects;
> +        page2->frozen  = tmp.frozen;
> +}
> +
> +int main ()
> +{
> +  struct page page;
> +  foo1 (&page, 0x12345678ABCDEFFEUL);
> +
> +  asm volatile ("" : : : "memory");
> +  if (page.frozen != 1 || page.objects != 0x2bcd || page.inuse != 0xeffe)
> +    __builtin_abort ();
> +
> +  struct page2 page2;
> +  foo2 (&page2, 0x12345678ABCDEFFEUL);
> +
> +  asm volatile ("" : : : "memory");
> +
> +  if (page2.frozen != 0xab || page2.objects != 0xcd || page2.inuse != 0xeffe)
> +    __builtin_abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging1" 
> } } */
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Reply via email to