Hi Richi, > -----Original Message----- > From: rguent...@c653.arch.suse.de <rguent...@c653.arch.suse.de> On > Behalf Of Richard Biener > Sent: Monday, September 28, 2020 2:22 PM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; o...@ucw.cz > Subject: Re: [PATCH v2 5/16]middle-end: Add shared machinery for matching > patterns involving complex numbers. > > On Fri, 25 Sep 2020, Tamar Christina wrote: > > > Hi All, > > > > This patch adds shared machinery for detecting patterns having to do > > with complex number operations. The class ComplexPattern provides > > helpers for matching and ultimately undoing the permutation in the > > tree by rebuilding the graph. > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > Ok for master? > > I think you want to change all this to not look at individual > stmts: > > + vect_match_expression_p (slp_tree node, tree_code code, int base, > + int > idx, > + stmt_vec_info *op1, stmt_vec_info *op2) > + { > + > + vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node); > + > > _all_ lanes are supposed to match the operation in > SLP_TREE_REPRESENTATIVE there's no need to do any operation-matching > on lane granularity. >
That's fair, that flexibility seems like it indeed won't work since the statements are vectorized based on SLP_TREE_REPRESENTATIVE anyway. So I'll simplify it. > The only thing making a difference here is VEC_PERM_EXPR nodes where > again - there's no need to look at (eventually non-existant!) > SLP_TREE_SCALAR_STMTS but its SLP_TREE_REPRESENTATIVE. > > + vec<slp_tree> children = SLP_TREE_CHILDREN (node); > + > + /* If it's a VEC_PERM_EXPR we need to look one deeper. > VEC_PERM_EXPR > + only have one entry. So pick on. */ > + if (node->code == VEC_PERM_EXPR) > + children = SLP_TREE_CHILDREN (children.last ()); > > that's too simplistic ;) > > + *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n]; > > please make op1,op2 slp_tree, not stmt_vec_info. > > Where do you look at SLP_TREE_LANE_PERMUTATION at all? I think the > result of > Here I have to admit that I have/am a bit confused as to the relation between the different permute fields. LOAD permute is quite straight forward, LANE permute I think are shuffles/explicit permutes. But then I am lost as to the purpose of a VEC_PERM_EXPR nodes. Is it just a marker to indicate that some node below has a load permute somewhere? > + vect_detect_pair_op (int base, slp_tree node1, int offset1, > + slp_tree > node2, > + int offset2, vec<stmt_vec_info> *ops) > > could be simply the SLP_TREE_LANE_PERMUTATION? plus its two child > nodes? Right, if I understood correctly, on the two_operands case the lane permute would tell me whether it's + or -, and in the case of the non- two_operands cases I probably want to check that it's vNULL since any permute in the order changes how the instruction accepts the inputs? > > In the ComplexAddPattern patch I see > > + /* Correct the arguments after matching. */ > + std::swap (this->m_vects[1], this->m_vects[3]); > > how's that necessary? The replacement SLP node should end up with just a > SLP_TREE_REPRESENTATIVE (the IFN function call). > That is, the only thing necessary is verification / matching of the > appropriate > "interleaving" scheme imposed by SLP_TREE_LANE_PERMUTATION. But the order or the statements are important as those decide the LOAD_PERMUTATION that build_slp_tree creates. So in this case the original statement is stmt 0 _39 = _37 + _12; stmt 1 _6 = _38 - _36; {_12, _36} result in a LOAD_PERMUTATION of {1, 0} because of how the addition is done. So to undo the LOAD_PERMUTE it has to build the new children using stmt 0 _39 = {_37, _36}; stmt 1 _6 = {_38, _12}; So the scalar statements are purely a re-ordering to get it to rebuild the children correctly. Now ADD is simplistic, the more complicated cases like MUL and FMA use this property to Also rebuild the tree removing unneeded statements. This method returns these and stores them in the PatternMatch class, so I don't have to ask for them again later when replacing the nodes. Even for SLP_TREE_REPRESENTATIVE don't the arguments in the IFN call need to be in such way that they both in the same place in the load chain? > I guess things would go wrong if instead of effectively blending two vectors > we'd interleave two smaller vector type vectors? Say a 4-way add and a 4- > way sub interleaved into a 8-way addsub, using V4SF and V8SF vectors? > At this stage yes, most likely, but it should be rejected at the validate level. What is also currently rejected is when some of the definitions are external, which I think I should be able to handle. I'll fix that up with the rest of the changes. Thanks for the feedback! Regards, Tamar > Going to stop looking at the series at this point, one further note is that > all of > the Complex*Patterns seem to share the initial match and thus redundantly > do a vect_detect_pair_op repeatedly on each node for each pattern? I > wonder if it could be commoned into a single pattern, they all seem to share > the initial mixed plus/minus, then we have the multiplication on one or both > operand cases. > > Richard. > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > * tree-vect-slp-patterns.c (complex_operation_t,class > ComplexPattern): > > New. > > > > > > -- > Richard Biener <rguent...@suse.de> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 > Nuernberg, Germany; GF: Felix Imend