Bootstrapped and tested on aarch64-linux-gnu & x86_64-linux-gnu.  Could you 
please help install this patch?

Kaipeng Zhou

> -----Original Message-----
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Tuesday, June 16, 2020 5:49 PM
> To: zhoukaipeng (A) <zhoukaipe...@huawei.com>
> Cc: Richard Sandiford <richard.sandif...@arm.com>; gcc-
> patc...@gcc.gnu.org; am...@gcc.gnu.org; rgue...@gcc.gnu.org
> Subject: Re: [PATCH PR95199 v2] vect: CSE for bump and offset in strided
> load/store operations
> 
> On Tue, 16 Jun 2020, zhoukaipeng (A) wrote:
> 
> > Hi,
> >
> > I try to eliminate the common stmts for *vec_offset also.  But I am
> > not sure it is a good way.
> >
> > New patch attached.  Bootstraped and testsuites are being tested.
> 
> Looks good to me and certainly worth if it makes a difference for IV
> calculation (that was the main motivation of the machinery).
> 
> Richard.
> 
> > Kaipeng Zhou
> >
> > > -----Original Message-----
> > > From: Richard Biener [mailto:rguent...@suse.de]
> > > Sent: Monday, June 15, 2020 2:21 PM
> > > To: zhoukaipeng (A) <zhoukaipe...@huawei.com>
> > > Cc: Richard Sandiford <richard.sandif...@arm.com>; gcc-
> > > patc...@gcc.gnu.org; am...@gcc.gnu.org; rgue...@gcc.gnu.org
> > > Subject: RE: [PATCH PR95199] vect: Remove extra variable created for
> > > memory reference
> > >
> > > On Sun, 14 Jun 2020, zhoukaipeng (A) wrote:
> > >
> > > > Hi,
> > > >
> > > > I modified the issue you mentioned.  Bootstrap and tested on
> > > > aarch64
> > > Linux platform again.  No new regression witnessed.
> > > >
> > > > For "*vec_offset", it is indeed a point optimizable.  However, it
> > > > is not able to eliminate it by the same way as "*dataref_bump".
> > > > The cse_and_gimplify_to_preheader will return the cached operand
> > > > if operand_compare::operand_equal_p for the cached one and the
> new
> > > > one returns true.  But it did not work well for "*vec_offset".
> > >
> > > You have to see what is actually in *vec_offset, if there's partly
> > > gimplified but not CSEd stmts in there then that's the issue.
> > >
> > > Richard.
> > >
> > > > Do you have any suggestions for the problem?
> > > >
> > > > Thanks,
> > > > Kaipeng Zhou
> > > >
> > > > > -----Original Message-----
> > > > > From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> > > > > Sent: Friday, June 12, 2020 3:48 PM
> > > > > To: zhoukaipeng (A) <zhoukaipe...@huawei.com>
> > > > > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de;
> > > > > am...@gcc.gnu.org; rgue...@gcc.gnu.org
> > > > > Subject: Re: [PATCH PR95199] vect: Remove extra variable created
> > > > > for memory reference
> > > > >
> > > > > "zhoukaipeng (A)" <zhoukaipe...@huawei.com> writes:
> > > > > > Hi,
> > > > > >
> > > > > > This is a fix for pr95199.
> > > > > > In vectorization, vinfo->ivexpr_map is supposed to catch those
> > > > > > "IV base
> > > > > and/or step expressions".  Just call
> > > > > cse_and_gimplify_to_preheader to handle gathering/scattering to
> avoid the extra variable.
> > > > > >
> > > > > > Bootstrap and tested on aarch64/x86_64 Linux platform. No new
> > > > > regression witnessed.
> > > > > >
> > > > > > Is it ok?
> > > > > >
> > > > > > Thanks,
> > > > > > Kaipeng Zhou
> > > > > >
> > > > > > From 12cf9b362576735e4c584c48cd6db3d2b0f2e14b Mon Sep 17
> > > 00:00:00
> > > > > 2001
> > > > > > From: Kaipeng Zhou <zhoukaipe...@huawei.com>
> > > > > > Date: Fri, 12 Jun 2020 00:54:15 +0800
> > > > > > Subject: [PATCH] vect: Remove extra variable created for
> > > > > > memory reference in  loop vectorization.
> > > > > >
> > > > > > Vectorization create two copies of the same IV and IVOPTs does
> > > > > > not perform CSE.  But vinfo->ivexpr_map is supposed to catch
> > > > > > those "IV base and/or step expressions".  Just call
> > > > > > cse_and_gimplify_to_preheader to handle gathering/scattering
> > > > > > to avoid
> > > > > the extra variable.
> > > > > >
> > > > > > 2020-06-12  Bin Cheng <bin.ch...@linux.alibaba.com>
> > > > > >         Kaipeng Zhou  <zhoukaipe...@huawei.com>
> > > > > >
> > > > > >     PR tree-optimization/95199
> > > > > >     * tree-vect-stmts.c: Use CSE map to catch the IV step and
> eliminate
> > > > > >     extra variable.
> > > > > >
> > > > > > 2020-06-12  Bin Cheng <bin.ch...@linux.alibaba.com>
> > > > > >         Kaipeng Zhou  <zhoukaipe...@huawei.com>
> > > > > >
> > > > > >     PR tree-optimization/95199
> > > > > >     * gcc.dg/vect/pr95199.c: New test.
> > > > > > ---
> > > > > >  gcc/ChangeLog                       |  7 +++++++
> > > > > >  gcc/testsuite/ChangeLog             |  6 ++++++
> > > > > >  gcc/testsuite/gcc.dg/vect/pr95199.c | 17 +++++++++++++++++
> > > > > >  gcc/tree-vect-stmts.c               |  1 +
> > > > > >  4 files changed, 31 insertions(+)  create mode 100644
> > > > > > gcc/testsuite/gcc.dg/vect/pr95199.c
> > > > > >
> > > > > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog index
> > > > > > c92582df7fe..753d70fc94b 100644
> > > > > > --- a/gcc/ChangeLog
> > > > > > +++ b/gcc/ChangeLog
> > > > > > @@ -1,3 +1,10 @@
> > > > > > +2020-06-12  Bin Cheng <bin.ch...@linux.alibaba.com>
> > > > > > +       Kaipeng Zhou  <zhoukaipe...@huawei.com>
> > > > > > +
> > > > > > +   PR tree-optimization/95199
> > > > > > +   * tree-vect-stmts.c: Use CSE map to catch the IV step and
> eliminate
> > > > > > +   extra variable.
> > > > > > +
> > > > > >  2020-06-08  Tobias Burnus  <tob...@codesourcery.com>
> > > > > >
> > > > > >     PR lto/94848
> > > > > > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > > > > > index
> > > > > > 60d9ecca3ed..a27dd3fa072 100644
> > > > > > --- a/gcc/testsuite/ChangeLog
> > > > > > +++ b/gcc/testsuite/ChangeLog
> > > > > > @@ -1,3 +1,9 @@
> > > > > > +2020-06-12  Bin Cheng <bin.ch...@linux.alibaba.com>
> > > > > > +       Kaipeng Zhou  <zhoukaipe...@huawei.com>
> > > > > > +
> > > > > > +   PR tree-optimization/95199
> > > > > > +   * gcc.dg/vect/pr95199.c: New test.
> > > > > > +
> > > > > >  2020-06-08  Harald Anlauf  <anl...@gmx.de>
> > > > > >
> > > > > >     PR fortran/95195
> > > > > > diff --git a/gcc/testsuite/gcc.dg/vect/pr95199.c
> > > > > > b/gcc/testsuite/gcc.dg/vect/pr95199.c
> > > > > > new file mode 100644
> > > > > > index 00000000000..ec201feaec8
> > > > > > --- /dev/null
> > > > > > +++ b/gcc/testsuite/gcc.dg/vect/pr95199.c
> > > > > > @@ -0,0 +1,17 @@
> > > > > > +/* { dg-do compile { target aarch64-*-* } } */
> > > > > > +/* { dg-options "-O3 -march=armv8.2-a+fp+sve" } */
> > > > > > +
> > > > > > +void
> > > > > > +foo (double *a, double *b, double m, int inc_x, int inc_y) {
> > > > > > +  int ix = 0, iy = 0;
> > > > > > +  for (int i = 0; i < 1000; ++i)
> > > > > > +    {
> > > > > > +      a[ix] += m * b[iy];
> > > > > > +      ix += inc_x;
> > > > > > +      iy += inc_y;
> > > > > > +    }
> > > > > > +  return ;
> > > > > > +}
> > > > > > +
> > > > > > +/* { dg-final { scan-assembler-times "add" 12 } } */
> > > > >
> > > > > For asm tests, it's probably better to put this in aarch64/sve 
> > > > > instead.
> > > > > The scan-assembler should be a bit more precise though, since just
> "add"
> > > > > could catch filenames like /homd/ladd/gcc.  E.g. maybe it would
> > > > > be clearer to make the regexp match the register operands too.
> > > > > (In that case it's better to quote the regexp with {…} rather
> > > > > than "…", to reduce the number of
> > > > > backslashes.)
> > > > >
> > > > > > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> > > > > > index
> > > > > > b24b0fe4304..26a2b143b01 100644
> > > > > > --- a/gcc/tree-vect-stmts.c
> > > > > > +++ b/gcc/tree-vect-stmts.c
> > > > > > @@ -2969,6 +2969,7 @@ vect_get_strided_load_store_ops
> > > > > (stmt_vec_info stmt_info,
> > > > > >    tree bump = size_binop (MULT_EXPR,
> > > > > >                       fold_convert (sizetype, unshare_expr
> (DR_STEP
> > > > > (dr))),
> > > > > >                       size_int (TYPE_VECTOR_SUBPARTS
> (vectype)));
> > > > > > +  bump = cse_and_gimplify_to_preheader (loop_vinfo, bump);
> > > > > >    *dataref_bump = force_gimple_operand (bump, &stmts, true,
> > > > > NULL_TREE);
> > > > > >    if (stmts)
> > > > > >      gsi_insert_seq_on_edge_immediate (loop_preheader_edge
> > > > > > (loop), stmts);
> > > > >
> > > > > Agree it's good to do this, but using
> > > > > cse_and_gimplify_to_preheader makes the force_gimple_operand
> and
> > > gsi_insert_seq_on_edge_immediate
> > > > > redundant.
> > > > >
> > > > > I guess we might as well do the same thing for “*vec_offset” too.
> > > > >
> > > > > Thanks,
> > > > > Richard
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> > > Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Reply via email to