On Tue, Mar 15, 2022 at 8:25 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Richard and Marc,
> Many thanks for both your feedback on my patch for PR 101895.
> Here's version 2 of this patch, incorporating all of the suggested 
> improvements.
> The one minor complication is that the :s qualifier doesn't automatically
> recognize that a capture already has two (or N) uses in a pattern,
> so I have to manually confirm that there are no other uses of the mult
> using num_imm_uses.
>
> This revision has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?

OK.

Thanks,
Richard.

> 2022-03-15  Roger Sayle  <ro...@nextmovesoftware.com>
>             Marc Glisse  <marc.gli...@inria.fr>
>             Richard Biener  <rguent...@suse.de>
>
> gcc/ChangeLog
>         PR tree-optimization/101895
>         * match.pd (vec_same_elem_p): Handle CONSTRUCTOR_EXPR def.
>         (plus (vec_perm (mult ...) ...) ...): New reordering simplification.
>
> gcc/testsuite/ChangeLog
>         PR tree-optimization/101895
>         * gcc.target/i386/pr101895.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>
> > -----Original Message-----
> > From: Richard Biener <richard.guent...@gmail.com>
> > Sent: 14 March 2022 07:38
> > To: GCC Patches <gcc-patches@gcc.gnu.org>
> > Cc: Roger Sayle <ro...@nextmovesoftware.com>; Marc Glisse
> > <marc.gli...@inria.fr>
> > Subject: Re: [PATCH] PR tree-optimization/101895: Fold VEC_PERM to help
> > recognize FMA.
> >
> > On Sun, Mar 13, 2022 at 12:39 AM Marc Glisse via Gcc-patches <gcc-
> > patc...@gcc.gnu.org> wrote:
> > >
> > > On Fri, 11 Mar 2022, Roger Sayle wrote:
> > >
> > > +(match vec_same_elem_p
> > > +  CONSTRUCTOR@0
> > > +  (if (uniform_vector_p (TREE_CODE (@0) == SSA_NAME
> > > +                        ? gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0))
> > > +: @0))))
> > >
> > > Ah, I didn't remember we needed that, we don't seem to be very
> > > consistent about it. Probably for this reason, the transformation
> > > "Prefer vector1 << scalar to vector1 << vector2" does not match
> > >
> > > > typedef int vec __attribute__((vector_size(16)));
> > > > vec f(vec a, int b){
> > >    vec bb = { b, b, b, b };
> > >    return a << bb;
> > > }
> > >
> > > which is only optimized at vector lowering time.
> >
> > Few more comments - since match.pd is matching in match.pd order the
> >
> > (match vec_same_elem_p
> >   @0
> >   (...))
> >
> > should come last.  Please use
> >
> > +(match vec_same_elem_p
> > +  CONSTRUCTOR@0
> >     (if (TREE_CODE (@0) == SSA_NAME
> >          && uniform_vector_p (...
> >
> > since otherwise we'll try uniform_vector_p twice on all CTORs (that are not
> > uniform).
> >
> > > > +/* Push VEC_PERM earlier if that may help FMA perception (PR101895).  */
> > > > +(for plusminus (plus minus)
> > > +  (simplify
> > > +    (plusminus (vec_perm (mult@0 @1 vec_same_elem_p@2) @0 @3) @4)
> > > +    (plusminus (mult (vec_perm @1 @1 @3) @2) @4)))
> > >
> > > Don't you want :s on mult and vec_perm?
> >
> > Yes.  Also for plus you want :c on it, likewise you want :c on the mult.
> > The :c on the plus will require splitting the plus and minus case :/
> >
> > Otherwise looks reasonable.
> >
> > Richard.
> >
> > >
> > > --
> > > Marc Glisse
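
For readers following the thread, the identity behind the proposed simplification can be sketched in plain C. This is only an illustration under stated assumptions, not code from the patch: the helper names (perm_same, before_fold, after_fold, check_fold) and the fixed LANES value are hypothetical, the permute is modelled with scalar loops rather than GCC's VEC_PERM_EXPR, and integer lanes are used so the comparison is exact. Because both permute inputs are the same product and the multiplier is uniform across lanes, shuffling the product gives the same result as shuffling the non-uniform operand and multiplying afterwards, which leaves the multiply feeding the add directly.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* hypothetical lane count, chosen for the illustration */

/* Model of a permute whose two inputs are the same vector:
   lane i of the result is x[sel[i] % LANES].  */
static void perm_same(const int *x, const unsigned *sel, int *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = x[sel[i] % LANES];
}

/* Shape before the fold: (plus (vec_perm (mult a s) (mult a s) sel) c),
   where s stands for the uniform (same-element) vector operand.  */
static void before_fold(const int *a, int s, const unsigned *sel,
                        const int *c, int *out)
{
    int prod[LANES], shuffled[LANES];
    for (size_t i = 0; i < LANES; i++)
        prod[i] = a[i] * s;
    perm_same(prod, sel, shuffled);
    for (size_t i = 0; i < LANES; i++)
        out[i] = shuffled[i] + c[i];
}

/* Shape after the fold: (plus (mult (vec_perm a a sel) s) c).
   The multiply now feeds the add directly.  */
static void after_fold(const int *a, int s, const unsigned *sel,
                       const int *c, int *out)
{
    int shuffled[LANES];
    perm_same(a, sel, shuffled);
    for (size_t i = 0; i < LANES; i++)
        out[i] = shuffled[i] * s + c[i];
}

/* Assert that both shapes agree lane by lane for the given inputs.  */
static void check_fold(const int *a, int s, const unsigned *sel, const int *c)
{
    int x[LANES], y[LANES];
    before_fold(a, s, sel, c, x);
    after_fold(a, s, sel, c, y);
    for (size_t i = 0; i < LANES; i++)
        assert(x[i] == y[i]);
}
```

Calling check_fold with any selector and small integer inputs should pass. The sketch says nothing about the single-use (:s) or commutativity (:c) concerns raised above, which affect when the rewrite is profitable and which operand orders match, not whether it is correct.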
