Hi Richi,

> -----Original Message-----
> From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On
> Behalf Of Richard Biener
> Sent: Monday, September 28, 2020 2:22 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz
> Subject: Re: [PATCH v2 5/16]middle-end: Add shared machinery for matching
> patterns involving complex numbers.
> 
> On Fri, 25 Sep 2020, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch adds shared machinery for detecting patterns having to do
> > with complex number operations.  The class ComplexPattern provides
> > helpers for matching and ultimately undoing the permutation in the
> > tree by rebuilding the graph.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> 
> I think you want to change all this to not look at individual
> stmts:
> 
> +    vect_match_expression_p (slp_tree node, tree_code code, int base,
> + int
> idx,
> +                            stmt_vec_info *op1, stmt_vec_info *op2)
> +    {
> +
> +      vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node);
> +
> 
> _all_ lanes are supposed to match the operation in
> SLP_TREE_REPRESENTATIVE there's no need to do any operation-matching
> on lane granularity.
> 

That's fair, that flexibility seems like it indeed won't work since the 
statements are vectorized based on
SLP_TREE_REPRESENTATIVE anyway. So I'll simplify it.

> The only thing making a difference here is VEC_PERM_EXPR nodes where
> again - there's no need to look at (eventually non-existant!)
> SLP_TREE_SCALAR_STMTS but its SLP_TREE_REPRESENTATIVE.
> 
> +      vec<slp_tree> children = SLP_TREE_CHILDREN (node);
> +
> +      /* If it's a VEC_PERM_EXPR we need to look one deeper.
> VEC_PERM_EXPR
> +        only have one entry.  So pick on.  */
> +      if (node->code == VEC_PERM_EXPR)
> +       children = SLP_TREE_CHILDREN (children.last ());
> 
> that's too simplistic ;)
> 
> +         *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n];
> 
> please make op1,op2 slp_tree, not stmt_vec_info.
> 
> Where do you look at SLP_TREE_LANE_PERMUTATION at all?  I think the
> result of
> 

Here I have to admit that I have/am a bit confused as to the relation between 
the different permute fields.
LOAD permute is quite straight forward, LANE permute I think are 
shuffles/explicit permutes. 

But then I am lost as to the purpose of a VEC_PERM_EXPR nodes. Is it just a 
marker to indicate that some node below
has a load permute somewhere?

> +    vect_detect_pair_op (int base, slp_tree node1, int offset1,
> + slp_tree
> node2,
> +                        int offset2, vec<stmt_vec_info> *ops)
> 
> could be simply the SLP_TREE_LANE_PERMUTATION? plus its two child
> nodes?

Right, if I understood correctly, on the two_operands case the lane permute
would tell me whether it's + or -, and in the case of the non- two_operands 
cases
I probably want to check that it's vNULL since any permute in the order changes
how the instruction accepts the inputs?

> 
> In the ComplexAddPattern patch I see
> 
> +      /* Correct the arguments after matching.  */
> +      std::swap (this->m_vects[1], this->m_vects[3]);
> 
> how's that necessary?  The replacement SLP node should end up with just a
> SLP_TREE_REPRESENTATIVE (the IFN function call).
> That is, the only thing necessary is verification / matching of the 
> appropriate
> "interleaving" scheme imposed by SLP_TREE_LANE_PERMUTATION.

But the order or the statements are important as those decide the 
LOAD_PERMUTATION
that build_slp_tree creates. 

So in this case the original statement is

       stmt 0 _39 = _37 + _12;
       stmt 1 _6 = _38 - _36;

{_12, _36} result in a LOAD_PERMUTATION of {1, 0} because of how the addition 
is done.
So to undo the LOAD_PERMUTE it has to build the new children using

       stmt 0 _39 = {_37, _36};
       stmt 1 _6 = {_38, _12};

So the scalar statements are purely a re-ordering to get it to rebuild the 
children correctly.

Now ADD is simplistic, the more complicated cases like MUL and FMA use this 
property to
Also rebuild the tree removing unneeded statements.  This method returns these 
and stores
them in the PatternMatch class, so I don't have to ask for them again later 
when replacing the
nodes.  Even for SLP_TREE_REPRESENTATIVE don't the arguments in the IFN call 
need to be
in such way that they both in the same place in the load chain?

> I guess things would go wrong if instead of effectively blending two vectors
> we'd interleave two smaller vector type vectors?  Say a 4-way add and a 4-
> way sub interleaved into a 8-way addsub, using V4SF and V8SF vectors?
> 

At this stage yes, most likely, but it should be rejected at the validate level.

What is also currently rejected is when some of the definitions are external,
which I think I should be able to handle. I'll fix that up with the rest of the 
changes.


Thanks for the feedback!

Regards,
Tamar

> Going to stop looking at the series at this point, one further note is that 
> all of
> the Complex*Patterns seem to share the initial match and thus redundantly
> do a vect_detect_pair_op repeatedly on each node for each pattern?  I
> wonder if it could be commoned into a single pattern, they all seem to share
> the initial mixed plus/minus, then we have the multiplication on one or both
> operand cases.
> 
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     * tree-vect-slp-patterns.c (complex_operation_t,class
> ComplexPattern):
> >     New.
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
> Nuernberg, Germany; GF: Felix Imend

Reply via email to