On 6/11/14, Richard Biener <richard.guent...@gmail.com> wrote: > On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni >>>> <bilbotheelffri...@gmail.com> wrote: >>>>> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener >>>>> <richard.guent...@gmail.com> wrote: >>>>>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener >>>>>> <richard.guent...@gmail.com> wrote: >>>>>>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni >>>>>>> <bilbotheelffri...@gmail.com> wrote: >>>>>>>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener >>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni >>>>>>>>> <bilbotheelffri...@gmail.com> wrote: >>>>>>>>>> I have few questions regarding genmatch: >>>>>>>>>> >>>>>>>>>> a) Why is 4 hard-coded here: ? >>>>>>>>>> in write_nary_simplifiers: >>>>>>>>>> fprintf (f, " tree captures[4] = {};\n"); >>>>>>>>> >>>>>>>>> Magic number (this must be big enough for all cases ...). >>>>>>>>> Honestly >>>>>>>>> this should be improved (but requires another scan over the matcher >>>>>>>>> IL >>>>>>>>> to figure out the max N used in @N). >>>>>>>>> >>>>>>>>>> b) Should we add syntax for a symbol to denote multiple operators >>>>>>>>>> ? >>>>>>>>>> For exampleim in simplify_rotate: >>>>>>>>>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^ (CNT1 + CNT2 == >>>>>>>>>> bitsize of type of X). >>>>>>>>> >>>>>>>>> Something to enhance the IL with, yes. I'd say we support >>>>>>>>> >>>>>>>>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR) >>>>>>>>> >>>>>>>>> thus, >>>>>>>>> >>>>>>>>> (define_op <operator-name> op...) >>>>>>>>> >>>>>>>>>> c) Remove for parsing capture in parse_expr since we reject >>>>>>>>>> outermost >>>>>>>>>> captured expressions ? >>>>>>>>> >>>>>>>>> but parse_expr is also used for inner expressions, no? >>>>>>>>> >>>>>>>>> (plus (minus@2 @0 @1) @3) >>>>>>>>> >>>>>>>>> should still work >>>>>>>>> >>>>>>>>>> d) I am not able to follow this comment in match.pd: >>>>>>>>>> /* Patterns required to avoid SCCVN testsuite regressions. */ >>>>>>>>>> >>>>>>>>>> /* (x >> 31) & 1 -> (x >> 31). Folding in fold-const is more >>>>>>>>>> complicated here, it does >>>>>>>>>> Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1)) >>>>>>>>>> (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1)) >>>>>>>>>> if the new mask might be further optimized. */ >>>>>>>>>> (match_and_simplify >>>>>>>>>> (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep) >>>>>>>>>> if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) >>>>>>>>>> == 0) >>>>>>>>>> @0) >>>>>>>>> >>>>>>>>> The comment is literally copied from the case I extracted the >>>>>>>>> (simplified) variant from fold-const.c. See lines 11961-12056 in >>>>>>>>> fold-const.c. >>>>>>>>> It'll be a challenge to implement the equivalent in a pattern ;) >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Decision Tree. >>>>>>>>>> I have tried to come up with a prototype for decision tree >>>>>>>>>> (patch attached). >>>>>>>>>> For simplicity, it handles patterns involving only unary operators >>>>>>>>>> and >>>>>>>>>> no predicates >>>>>>>>>> and returns false when the pattern fails to match (no goto to >>>>>>>>>> match >>>>>>>>>> another pattern). >>>>>>>>>> I meant to post it only for illustration, and I have not really >>>>>>>>>> paid >>>>>>>>>> attention to code quality (bad formatting, memory leaks, etc.). >>>>>>>>>> >>>>>>>>>> * Basic Idea >>>>>>>>>> A pattern consists of following parts: match, ifexpr and result. >>>>>>>>>> Let's call <ifexpr, result> as "simplification" operand. >>>>>>>>>> The common prefix between different match operands would be >>>>>>>>>> represented >>>>>>>>>> by same nodes in the decision tree. >>>>>>>>>> >>>>>>>>>> Example: >>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>> S1 >>>>>>>>>> >>>>>>>>>> (negate (negate @0)) >>>>>>>>>> S2 >>>>>>>>>> >>>>>>>>>> S1, S2 denote simplifications for the above patterns >>>>>>>>>> respectively. >>>>>>>>>> >>>>>>>>>> The decision tree would look something like >>>>>>>>>> (that's the way it gets constructed with the patch): >>>>>>>>>> >>>>>>>>>> dummy/root >>>>>>>>>> | >>>>>>>>>> NEGATE_EXPR >>>>>>>>>> / \ >>>>>>>>>> BIT_NOT NEGATE_EXPR >>>>>>>>>> | | >>>>>>>>>> @0 @0 >>>>>>>>>> | | >>>>>>>>>> S1 S2 >>>>>>>>>> >>>>>>>>>> a) The children of an internal node are number of decisions that >>>>>>>>>> can be taken at that node. In the above case it's 2 for outer >>>>>>>>>> NEGATE_EXPR. >>>>>>>>>> b) Simplification operand represents leaves of the decision tree >>>>>>>>>> c) Instead of having list of heads, I have added one dummy node, >>>>>>>>>> and heads become children of these dummy root node. >>>>>>>>>> d) Code-gen for non-simplification operands involves generating, >>>>>>>>>> "matching" code and for simplification operands involves >>>>>>>>>> generating >>>>>>>>>> "transform" code >>>>>>>>>> >>>>>>>>>> * Overall Flow: >>>>>>>>>> I guess we would build the decision tree from the AST. >>>>>>>>>> So the flow would be like: >>>>>>>>>> source -> struct simplify (match, ifexpr, result) -> decision tree >>>>>>>>>> -> c code. >>>>>>>>>> >>>>>>>>>> Something like (in main): >>>>>>>>>> decision_tree dt; >>>>>>>>>> while (there is another pattern) >>>>>>>>>> { >>>>>>>>>> simplify *s = parse_match_and_simplify (); >>>>>>>>>> insert s into decision tree; >>>>>>>>>> }; >>>>>>>>>> So parsing routines are concerned with parsing and building the >>>>>>>>>> AST (operand), >>>>>>>>>> and not with the decision tree. Is that fine ? >>>>>>>>> >>>>>>>>> Yes, that's good. >>>>>>>>> >>>>>>>>>> * Representation of decision tree. >>>>>>>>>> A decision tree would need a way to represent language constructs >>>>>>>>>> like capture, predicate, etc. so in some ways it would be similar >>>>>>>>>> to AST. >>>>>>>>>> It >>>>>>>>>> In the patch, I have created the following heirarchy: >>>>>>>>>> dt_operand: represents a general base "operand" of in decision >>>>>>>>>> tree >>>>>>>>>> dt_expr: for representing expression. Expression contains >>>>>>>>>> operation >>>>>>>>>> to be performed (e_operation). >>>>>>>>>> dt_capture: analogous to capture. >>>>>>>>>> dt_simplify: for representing "simplification" operand. >>>>>>>>>> simplification consists of ifexpr and result >>>>>>>>>> dt_head: to represent "dummy" root. Maybe a separate class is not >>>>>>>>>> needed. >>>>>>>>>> >>>>>>>>>> * Constructing decision tree from AST >>>>>>>>>> The algorithm i have used is similar to inserting string in a >>>>>>>>>> trie >>>>>>>>>> outlined here: http://en.wikipedia.org/wiki/Trie >>>>>>>>>> The difference shall be to traverse AST depth-first rather than >>>>>>>>>> traversing the string. >>>>>>>>>> Apart from that I guess it would be the same (naturally find and >>>>>>>>>> compare operations would be different). >>>>>>>>>> I haven't given much thought about this. Currently, I construct >>>>>>>>>> decision tree for only patterns with unary operators in an >>>>>>>>>> ugly way by "flattening" the AST by walking it in pre-order and >>>>>>>>>> storing the nodes in vector in the order they are visited >>>>>>>>>> >>>>>>>>>> So to construct decision tree from the following AST: >>>>>>>>>> negate >>>>>>>>>> | >>>>>>>>>> bit_not >>>>>>>>>> | >>>>>>>>>> @0 >>>>>>>>>> >>>>>>>>>> AST is flattened by walking in preorder (by walk_operand_preorder) >>>>>>>>>> and stored >>>>>>>>>> as: [ expr, expr, capture ] >>>>>>>>>> and this vector is used to construct decision tree. >>>>>>>>>> I did it as a "quick and dirty" way, it has to be changed. >>>>>>>>>> And it won't work for patterns with n-ary operators. >>>>>>>>>> >>>>>>>>>> We should visit each node of AST in preorder, and add that node >>>>>>>>>> during traversal to decision tree. I am not yet clear on way of >>>>>>>>>> doing that. >>>>>>>>>> >>>>>>>>>> * Comparing operands >>>>>>>>>> How do we compare operands ? We shall need to do this while >>>>>>>>>> inserting >>>>>>>>>> in decision tree, since if the operand is already inserted we do >>>>>>>>>> not create >>>>>>>>>> a new node. >>>>>>>>>> In the patch (cmp_operand), I have used following rules: >>>>>>>>>> a) if types not equal, they are clearly not equal. >>>>>>>>>> b) if type is expr, compare operation->op->id >>>>>>>>>> c) if type is capture, compare the number. >>>>>>>>>> d) for predicate, compare on ident ? >>>>>>>>>> >>>>>>>>>> * Representing patterns with n-ary operators. >>>>>>>>>> Consider following two match operands: >>>>>>>>>> (plus (minus @0 @1) @1) >>>>>>>>>> (plus (negate @0) @1) >>>>>>>>>> >>>>>>>>>> In decision tree, it would be represented as: >>>>>>>>>> >>>>>>>>>> *********plus********* >>>>>>>>>> / \ / \ >>>>>>>>>> minus @1 negate @1 >>>>>>>>>> / \ | >>>>>>>>>> @0 @1 @0 >>>>>>>>>> >>>>>>>>>> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1. >>>>>>>>>> However (minus @0 @1) and @1 are not separate "decisions", but >>>>>>>>>> children of plus within the same expression. >>>>>>>>>> >>>>>>>>>> For calculating number of decisions at an expression node: >>>>>>>>>> number of decisions = number of children / arity of operator. >>>>>>>>>> in the above case, number of decision = 4 / 2 = 2 >>>>>>>>>> >>>>>>>>>> For other cases, for instance capture node (the children would be >>>>>>>>>> simplification operand), >>>>>>>>>> number of decisions = number of children. >>>>>>>>> >>>>>>>>> Hmm. I think it should be only two children from the plus, >>>>>>>>> as pre-order for the first pattern is for example >>>>>>>>> [plus minus @0 @1 @1] so you'd have >>>>>>>>> >>>>>>>>> plus >>>>>>>>> / \ >>>>>>>>> minus negate >>>>>>>>> | | >>>>>>>>> @0 @0 >>>>>>>>> | | >>>>>>>>> @1 @1 >>>>>>>>> | >>>>>>>>> @1 >>>>>>>>> >>>>>>>>> instead? >>>>>>>>> >>>>>>>>> Note that you probably have to deal with non-matching capture >>>>>>>>> IDs by re-writing them on-the-fly. That is, the two >>>>>>>>> pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...] >>>>>>>>> should commonize with renaming the non-matching capture >>>>>>>>> somehow. >>>>>>>> Um, could you please elaborate on that (non-matching capture ?), >>>>>>>> I am not sure if I understood it fully, thanks. >>>>>>> >>>>>>> I'm worried about >>>>>>> >>>>>>> (plus (minus@1 @2 @3) @2) >>>>>>> >>>>>>> vs. >>>>>>> >>>>>>> (plus (minus @1 @2) @1) >>>>>>> >>>>>>> (just made-up examples, of course). Here we'd like to >>>>>>> commonize the path checking for (plus (minus ...) ... but >>>>>>> of course we can't easily process the captures at that point >>>>>>> as the first has two captures (it captures the minus itself while >>>>>>> the 2nd doesn't) and the captures that are at the same position >>>>>>> use different indexes. To finally match the minus we have to >>>>>>> verify that in one case @2 == @2 and in the other case that >>>>>>> @1 == @1. >>>>>>> >>>>>>> Re-numbering capture indexes in pre-order would improve things >>>>>>> for the case where capture positions are in the same places but >>>>>>> the indexes do not match. It won't handle the above case >>>>>>> of course. >>>>> Thanks, I understand it now -:) >>>> >>>> I suppose that it would be best (in the end) to not insert the >>>> captures into the decision tree separately but leave them >>>> "combined" and then only after the decision tree is fully built >>>> decide on re-numbering. And treat those that are required for >>>> the matching process differently (those that assert that two >>>> ops are equal). >>>> >>>> I think we can defer the issue to later. >>>> >>>>>>> >>>>>>>> I have attached patch for patterns with binary operators (in >>>>>>>> principle >>>>>>>> should work for n-ary operators, >>>>>>>> but have only tested upto 2), that creates decision tree for >>>>>>>> patterns (excluding predicates, and multiple matching). I don't >>>>>>>> intend >>>>>>>> to commit this (contains many hacks!). >>>>>>>> >>>>>>>> * Constructing decision tree from AST >>>>>>>> We can define our decision tree to store prefixes of preorder >>>>>>>> traversals of diffferent AST. >>>>>>>> Insertion happens as follows (decision_tree::insert) >>>>>>>> a) Get preorder traversal of AST into vector. >>>>>>>> b) Insert AST nodes from vector into decision tree. >>>>>>>> Currently, I obtain preorder traversal into a vector. We can later >>>>>>>> change it to compute >>>>>>>> next preorder successor lazily. >>>>>>> >>>>>>> Just make the code easy to understand, optimizing it should be >>>>>>> secondary. >>>>>>> >>>>>>>> * Code-gen >>>>>>>> For n-ary operators, the patch generates code as follows: >>>>>>>> if (code == <expr code>) >>>>>>>> { >>>>>>>> match operand 0 >>>>>>>> match operand 1 >>>>>>>> .... >>>>>>>> match operand n-1 >>>>>>>> transform >>>>>>>> return true; >>>>>>>> } >>>>>>>> >>>>>>>> * Adding parent, level, pos fields to AST (operand). >>>>>>>> One change (hack), I have made to code-gen, is naming of operands >>>>>>>> that are >>>>>>>> assigned to gimple_assign_rhs in code-gen of expressions. >>>>>>>> I am not sure if this is really needed. >>>>>>>> >>>>>>>> in AST, we have following representation for expressions: >>>>>>>> expr- >>>>>>>> / \ >>>>>>>> left right >>>>>>>> operand operand >>>>>>>> >>>>>>>> code-gen off AST (expr::gen_gimple_match) produces code like: >>>>>>>> { >>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>> // code-gen for left operand >>>>>>>> } >>>>>>>> { >>>>>>>> tree op = gimple_assign_rhs2 (def_stmt); >>>>>>>> // code-gen for right operand >>>>>>>> } >>>>>>>> >>>>>>>> We can do this since, in expr::gen_gimple_match, we call >>>>>>>> left_operand->gen_gimple_match, >>>>>>>> come back to expr::gen_gimple_match (after >>>>>>>> left_operand->gen_gimple_match() returns), and >>>>>>>> then call right_operand->gen_gimple_match(). >>>>>>>> >>>>>>>> However in decision tree, the expression gets represented as >>>>>>>> follows: >>>>>>>> expr >>>>>>>> | >>>>>>>> left-operand >>>>>>>> | >>>>>>>> right-operand >>>>>>>> >>>>>>>> dt_expr::gen_gimple calls left_operand->gen_gimple, which calls >>>>>>>> right_operand->gen_gimple, >>>>>>>> so we return to dt_expr::gen_gimple, after code is generated for >>>>>>>> the >>>>>>>> whole subtree of expr. >>>>>>>> >>>>>>>> So I assign operands at the start before generating code for the >>>>>>>> subtree: >>>>>>>> tree op00 = gimple_assign_rhs1 (def_stmt); >>>>>>>> tree op10 = gimple_assign_rhs2 (def_stmt); >>>>>>>> <code for left-operand> >>>>>>>> <code for right-operand> >>>>>>>> each of the operands know their positions, so they know which >>>>>>>> operand >>>>>>>> to use (get_op_name()). >>>>>>>> the names are generated as: >>>>>>>> op<position><level>, position = index of operand in it's parent's >>>>>>>> expr's vector (vec<operand *> ops). >>>>>>>> level: level of AST, at which the operand is stored. >>>>>>>> >>>>>>>> For this I added three fields to operand (unsigned pos, and >>>>>>>> unsigned >>>>>>>> level, operand *parent), and accordingly >>>>>>>> made changes to expr::append_op. In a way this is abusing the AST. >>>>>>>> This probably works (works with test-cases i tried), but I don't >>>>>>>> like it much. >>>>>>> >>>>>>> The naming is ok, but indeed it doesn't feel very clean. Somehow >>>>>>> this >>>>>>> belongs to the decision tree instead... (but then you need to lookup >>>>>>> the decision tree operand info from code-gen which happens off the >>>>>>> AST...). >>>>>>> >>>>>>>> * Order of matching. >>>>>>>> Currently the order of matching operands is strictly 0, 1, .. n - 1 >>>>>>>> for n-ary operator. >>>>>>>> However this might not be the best choice. >>>>>>>> >>>>>>>> For example, consider following patterns: >>>>>>>> (operator op1 op) S1 >>>>>>>> (operator op2 op) S2 >>>>>>>> >>>>>>>> Both have common operator, and 1st common operand, they differ in >>>>>>>> 0th operand. >>>>>>>> With patch, the code generated is: >>>>>>>> if (code == <expr-code>) >>>>>>>> { >>>>>>>> match op1 >>>>>>>> match op >>>>>>>> S1 >>>>>>>> >>>>>>>> match op2 >>>>>>>> match op >>>>>>>> S2 >>>>>>>> } >>>>>>>> >>>>>>>> A better ordering would be to match the 1st operand and then >>>>>>>> respective 0th operands of both patterns: >>>>>>>> >>>>>>>> if (code == <expr-code> >>>>>>>> { >>>>>>>> match op >>>>>>>> match op1 >>>>>>>> S1 >>>>>>>> match op0 >>>>>>>> S2 >>>>>>>> } >>>>>>>> >>>>>>>> This complicates insertion into decision tree. >>>>>>> >>>>>>> I'd say we leave "optimizing" the decision tree to a later stage - >>>>>>> it's a global >>>>>>> optimization problem after all (I believe you can't do optimally with >>>>>>> just >>>>>>> looking at the present decision tree and the AST you want to >>>>>>> insert). >>>>>>> >>>>>>>> * hack: only one generated gimple_match_and_simplify function >>>>>>>> In the patch, I generate code for gimple_match_and_simplify with 3 >>>>>>>> operands version (op0, op1, op2) >>>>>>>> and the other two (unary and binary versions), call it with >>>>>>>> NULL_TREE >>>>>>>> for extra operands. >>>>>>>> I will soon change that. >>>>>>>> >>>>>>>> * Testing patterns. >>>>>>>> I think I have written most of the test-cases wrongly in match-2.c >>>>>>>> For example this test-case doesn't match >>>>>>>> (i haven't checked in "fix testsuite fallout" patch, so this might >>>>>>>> not be true >>>>>>>> anymore) >>>>>>>> >>>>>>>> (match_and_simplify >>>>>>>> plus (minus @0 @1) @1) >>>>>>>> @0) >>>>>>>> >>>>>>>> for this test-case: >>>>>>>> /* (x - y) + y -> x */ >>>>>>>> int f6(int x, int y) >>>>>>>> { >>>>>>>> int t1 = x - y; >>>>>>>> return t1 + y; >>>>>>>> } >>>>>>>> /* { dg-final { scan-tree-dump "gimple_match_and_simplified to >>>>>>>> \[^\n\r\]*= x_\\d\+\\(D\\)" "forwprop1" } } */ >>>>>>>> >>>>>>>> I get following output (tree-forwprop-details): >>>>>>>> >>>>>>>> ;; Function f6 (f, funcdef_no=0, decl_uid=1744, symbol_order=0) >>>>>>>> >>>>>>>> gimple_match_and_simplified to _4 = y_2(D) + t1_3; >>>>>>>> f (int x, int y) >>>>>>>> { >>>>>>>> int t1; >>>>>>>> int _4; >>>>>>>> >>>>>>>> <bb 2>: >>>>>>>> t1_3 = x_1(D) - y_2(D); >>>>>>>> _4 = x_1(D); >>>>>>>> return _4; >>>>>>>> >>>>>>>> } >>>>>>>> >>>>>>>> Shouldn't it show: gimple_match_and_simplified to _4 = x_1(D) >>>>>>>> instead ? >>>>>>> >>>>>>> Yeah. I suppose you hit the issue that we don't commutate patterns, >>>>>>> so try with a commutated pattern inserted. >>>>>>> >>>>>>>> The issue is this test-case fails in isolation, however it passes >>>>>>>> when >>>>>>>> it is placed with other >>>>>>>> test cases that PASS and have same regex in scan-tree-dump as this >>>>>>>> test-case: >>>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>>>>> x_\\d\+\\(D\\)" >>>>>>>> >>>>>>>> Summary: >>>>>>>> I think we need to (at-least) deal with following problems for >>>>>>>> decision tree: >>>>>>> >>>>>>> I have to leave for a while now, I'll come back later and have a >>>>>>> detailed >>>>>>> look at the patch to try answer the questions below. >>>>>> >>>>>> Now for the remainder. >>>>>> >>>>>>>> a) Representation and construction of decision tree from AST - >>>>>>>> We do this by storing prefixes of preorder traversals of AST in >>>>>>>> decision tree. Do we finalize on that ? >>>>>> >>>>>> Yes. >>>>>> >>>>>>>> b) Order of operand matching >>>>>> >>>>>> I think that's the same as a) - the visiting order of the AST >>>>>> determines >>>>>> the order of operand matching. >>>>>> >>>>>>>> c) Generating patterns for built-in functions (I guess this will be >>>>>>>> similar to expr-gen). >>>>>> >>>>>> It should really be transparent. >>>>>> >>>>>>>> d) Multiple matching patterns >>>>>> >>>>>> The decision tree phase leaves us with a sequence of >>>>>> >>>>>> if (if-expr of pattern1) >>>>>> transform-pattern1 >>>>>> else (if-expr of pattern2) >>>>>> transform-pattern2 >>>>>> ... >>>>>> >>>>>> etc. thus that part should be basically unchanged (only the match >>>>>> part is driven by a decision tree). >>>>>> >>>>>>>> e) Commutative ops >>>>>> >>>>>> I'd do these as a pass over the AST of each pattern, duplicating >>>>>> patterns. >>>>>> >>>>>> Btw, I see you mirror the AST structure with >>>>>> dt_{node,operand,expr,...}. >>>>>> It seems to me that you can simply stick an AST operand * into >>>>>> dt_node and be done with just having dt_node? That is, the AST >>>>>> operand refered to contains all necessary information already. >>>>>> >>>>>>>> >>>>>>>> Pattern Enhancements: >>>>>>>> Some of these were already discussed, I am jotting them here: >>>>>>>> a) Symbol for multiple operators >>>>>>>> b) grouping patterns by ifexpr >>>>>>>> c) should we have MATCH_FAILED or FAIL to explicitly denote failure >>>>>>>> in >>>>>>>> manual transform (c_expr) ? >>>>>>>> d) making operators to be commutative. >>>>>> >>>>>> I'd say d) would be really nice to have. Either with a grammar >>>>>> extension to say which expression should have commutation applied >>>>>> to (for example: (plus! (minus @1 @2) @2) would say that the >>>>>> operands of plus should be commutated), or with a simple heuristic. >>>>>> I'd say explicitly specifying is better. >>>>>> >>>>>> a) will probably be needed to avoid pattern explosion. But similar to >>>>>> d) >>>>>> I'd do it as a pre-pass on the AST, duplicating the patterns. It's >>>>>> also a good question how to define its semantic. The mode-iterators >>>>>> that the machine description supports would suggest we do instead >>>>>> sth like >>>>>> >>>>>> (for OP in (plus minus mult) >>>>>> (match_and_simplify >>>>>> (OP @1 integer_zerop) >>>>>> ....)) >>>>>> >>>>>> c) should be easy, I'm going to try sth like that - also setting >>>>>> temporaries >>>>>> from the ifexpr and re-using them in the result like >>>>>> >>>>>> (match_and_simplify >>>>>> (plus @1 @2) >>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@2)) >>>>>> && find_better_ops (&@3, &@4, @1, @2)) >>>>>> (plus @3 @4)) >>>>>> >>>>>> which also requires us to scan the AST for the highest capture index >>>>>> (a good thing anyway, given my hack to statically set the size of >>>>>> the captures array ;)) >>>>>> >>>>>> b) I'm not sure it's worth the trouble (apart from doing less typing >>>>>> in match.pd). We're probably going to have GENERIC defined >>>>>> when creating the generic routines and GIMPLE when creating >>>>>> the gimple routines so we can easily disable patterns for one or >>>>>> the other by doing >>>>>> >>>>>> #ifdef GIMPLE >>>>>> (match_and_simplify >>>>>> ...) >>>>>> #endif >>>>>> >>>>>>>> Once the decision tree is done, I can try working on these. >>>>>>>> Till Sunday, I will attempt to have another prototype, that covers >>>>>>>> most of the patterns in match.pd >>>>>>>> (except for cond_expr, and those requiring GENERIC support), and >>>>>>>> accompanying correct test-case >>>>>>>> for each pattern. And next-week start working on a fair patch. >>>>>> >>>>>> Sounds good. >>>>> I have attached a patch, that fires most of the patterns from match.pd >>>>> except for those that include >>>>> non-matching capture or require GENERIC support, and added a test-case >>>>> match-decision-tree.c >>>>> (fires around 33 patterns). >>>>> >>>>> * Representation of decision tree: >>>>> I clubbed dt_capture, dt_predicate, dt_expr into dt_operand, and added >>>>> new gen_gimple_match() function in AST >>>>> to generates matching code. dt_operand contains operand *op as member, >>>>> and merely calls op->gen_gimple_match. >>>>> (that's the intent anyway, in the patch, i needed to put some hacks to >>>>> get it working). >>>>> In the patch, i have kept: dt_node, dt_operand (dt_node), dt_simplify >>>>> (dt_node), to separate matching operand >>>>> from simplification (ifexpr, result). >>>>> >>>>> * Constructing decision tree from AST >>>>> Same as previous patch, stores prefixes of preorder traversals of AST. >>>>> decision_tree::print() pretty-prints the decision tree. >>>>> >>>>> I hope we won't need to change much up-to this point in the patch. >>>>> >>>>> * Code-gen >>>>> code gen of (operator op0 op1 ..opN) simplify takes the following >>>>> form: >>>>> if (code == <expr code>) >>>>> { >>>>> match op0 >>>>> match op1 >>>>> ... >>>>> match opN >>>>> simplify >>>>> return true; >>>>> } >>>>> >>>>> a) Scope of temporaries (the ones that are assigned to >>>>> gimple_assign_rhsX): >>>>> for code-gen of expressions off the AST, we create a temporary: >>>>> >>>>> { >>>>> tree op = gimple_assign_rhsX (); >>>>> generate code for X'th operand >>>>> } >>>>> >>>>> In the decision tree, an expression can only directly access it's 1st >>>>> operand (since it's preorder traversal). >>>>> So I decided to first create all temporaries and then generate code >>>>> for operands: >>>>> >>>>> { >>>>> tree op0 = gimple_assign_rhs1 (); >>>>> tree op1 = gimple_assign_rhs2 (); >>>>> tree opX-1 = gimple_assign_rhsX (); >>>>> >>>>> code for matching 0th operand >>>>> code for matching 1st operand >>>>> ... >>>>> code for matching Nth operand >>>>> } >>>>> >>>>> Unfortunately, this creates a problem if the operand itself is an >>>>> expression, >>>>> since it's temporaries would also be named op0, op1 ... they would >>>>> shadow the outer op0, op1 .. temporaries. >>>>> So we need distinct names for each temporary (for all temps within one >>>>> pattern). >>>>> The temporary name is created by: >>>>> op<position of child operand><level of tree>; >>>>> and each child of AST knows it's position and level, so it can >>>>> correctly access it's temp. >>>>> This requires adding members operand *parent, unsigned pos, unsigned >>>>> level to struct operand. >>>>> I have done it this way in the patch, however it doesn't feel clean, I >>>>> will try to see if we can change this. >>>>> >>>>> b) Short-ranged gotos >>>>> consider: (negate @0) S1 >>>>> The code is generated as (with the patch): >>>>> >>>>> if (code == PLUS) >>>>> { >>>>> if (!captures [0]) >>>>> { >>>>> captures[0] = op0; >>>>> goto L0; >>>>> } >>>>> else if (captures[0] == op0) >>>>> { >>>>> L0: >>>>> // simplification code S1 >>>>> return true; >>>>> } >>>>> } >>>>> >>>>> Instead of: >>>>> if (!captures[0]) >>>>> captures[0] = op0; >>>>> else if (captures[0] != op0) >>>>> goto L0; >>>>> >>>>> // Simplification code S1 >>>>> return true; >>>>> >>>>> L0: // match next pattern >>>>> >>>>> Generating code the former way is more easier, since location of label >>>>> immediately follows >>>>> capture code (if (!captures[0]) ...). In the latter case, I probably >>>>> need to store label name in the sibling's node in >>>>> the decision tree and then retrieve that. >>>>> Similarly for valueize (expr::gen_valueize). >>>> >>>> But then why not generate >>>> >>>> if (!captures[0] >>>> || captures[0] == op0) >>>> { >>>> captures[0] = op0; >>>> // simplification code S1 >>>> return true; >>>> } >>>> >>>> and go without label and goto? Or did I miss something? >>>> >>>>> c) Shared captures between all patterns: >>>>> In the patch all patterns share the capture array tree captures[4] = >>>>> {} >>>>> (since all match patterns get interleaved in generated code off >>>>> decision tree). >>>>> This requires extra code to be generated to make sure captures are >>>>> correctly restored for another pattern. >>> >>> Btw, I see that you back-track the decision tree. That is, for >>> >>> root >>> |--operand: MINUS_EXPR >>> |----operand: PLUS_EXPR >>> |------operand: @0 >>> |--------operand: @1 >>> |----------operand: @0 >>> |------------simplify >>> |----------operand: @1 >>> |------------simplify >>> >>> you do >>> >>> if (code == MINUS_EXPR) >>> { >>> tree captures[4] = {}; >>> if (TREE_CODE (op0) == SSA_NAME) >>> { >>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>> if (is_gimple_assign (def_stmt) && gimple_assign_rhs_code >>> (def_stmt) == PLUS_EXPR) >>> { >>> tree op01 = gimple_assign_rhs1 (def_stmt); >>> if (valueize && TREE_CODE (op01) == SSA_NAME) >>> op01 = valueize (op01); >>> else >>> goto L0; >>> if (op01) >>> { >>> L0: >>> tree op11 = gimple_assign_rhs2 (def_stmt); >>> if (valueize && TREE_CODE (op11) == SSA_NAME) >>> op11 = valueize (op11); >>> else >>> goto L1; >>> if (op11) >>> { >>> L1: >>> tree captures_0 = captures[0]; >>> if (!captures[0]) >>> { >>> captures[0] = op01; >>> goto L2; >>> } >>> else if (captures[0] == op01) >>> { >>> L2: >>> tree captures_1 = captures[1]; >>> if (!captures[1]) >>> { >>> captures[1] = op11; >>> goto L3; >>> } >>> else if (captures[1] == op11) >>> { >>> L3: >>> tree captures_2 = captures[0]; >>> if (!captures[0]) >>> { >>> captures[0] = op1; >>> goto L4; >>> } >>> else if (captures[0] == op1) >>> { >>> L4: >>> res_ops[0] = captures[1]; >>> *res_code = TREE_CODE (res_ops[0]); >>> return true; >>> } >>> captures [0] = captures_2; >>> tree captures_3 = captures[1]; >>> if (!captures[1]) >>> { >>> captures[1] = op1; >>> goto L5; >>> } >>> else if (captures[1] == op1) >>> { >>> L5: >>> res_ops[0] = captures[0]; >>> *res_code = TREE_CODE (res_ops[0]); >>> return true; >>> } >>> captures [1] = captures_3; >>> } >>> captures [1] = captures_1; >>> } >>> captures [0] = captures_0; >>> } >>> } >>> } >>> } >>> >>> but if you processed all children of a decision tree node and didn't >>> hit one of the simplifies (which return true) then you know nothing >>> else will match and you can just return false. That early out seems >>> to be missing completely and we fall through processing all siblings >>> of the parents. >>> >>> But maybe I am missing something? I see that if I make capture >>> IDs not matching that we'd miss to detect things then (as a >>> "first" capture always "matches"). >>> >>> root >>> |--operand: MINUS_EXPR >>> |----operand: PLUS_EXPR >>> |------operand: @0 >>> |--------operand: @1 >>> |----------operand: @0 >>> |------------simplify >>> |------operand: @1 >>> |--------operand: @0 >>> |----------operand: @0 >>> |------------simplify >>> >>> So I suppose we really have to distinguish "first" captures from >>> "matching" captures, basically not treating the "first" ones as >>> nodes of the decision tree and only populating the capture >>> array once we hit the code generation part of a pattern (well, >>> or the if-expr). >> >> It's probably a good idea to pre-parse the AST to classify >> captures and matches differently. Given for example >> >> (plus @0 (minus@0 @1 @2)) >> >> this would say (well, if we support this), that the plus has two >> "same" operands (on GIMPLE the "same" valueized SSA name) >> and that this operand is defined by a (minus @1 @2). But with >> the current operator ordering in the decision tree we'd meet >> the no-op capture first and the match in an odd position. >> >> In the RTL machine description the matches are more explicit >> like >> >> (plus (match @0) (minus@0 @1 @2)) >> >> so maybe we want to auto-detect those and rewrite the AST. >> Just distinguish between capture-def and capture-ref, where >> expression captures are always -def but leafs may be -refs as well. >> >> In the decision tree builder we'd "ignore" -defs and only make >> -refs explicit (and the -refs would need to "materialize" the -defs >> just like the ifexpr or transform stage). >> >> So I'd say add a flag to struct capture telling whether it's a ref >> and compute that via an AST traversal (that should then error >> for invalid refs). >> >> In the decision tree I'd keep a vector of operand * and record >> a name of a temporary the operand can be accessed via >> (in case any of the AST operands represented by the decision tree >> node has a capture associated). >> >> You'd still have "empty" leaf captures that would then only record >> the value into a temporary. >> >> Eventually you need a back-ref to the decision tree node for >> each capture when you insert a new AST traversal so you >> can decide whether a match (a capture-ref) refers to the same >> decision tree node or not. >> >> Richard. > > Ok, let me re-phrase after some more thinking and talking with > a collegue. > > When you do the AST traversal computing the array of AST operands * > you'd initialize book-keeping like the following > > for operand in AST do > if (operand is a capture) > { > if (this capture index was not yet seen) > { > record for this pattern (in its simplify node created at the > decision tree leaf) that the capture index refers to the > operand with the current index > if (capture has a sub-expression) > recurse > else > insert "true" op (matches everything) > } > else > { > generate a decision tree node that matches the decision > tree index of the capture > recurse > } > } > else > record op and recurse > > So when you generate code you implicitely capture all operands > in automatic variables named like 'captureN' with 'N' being the > current level of the decision tree (the index of the operand array > from the AST walk). Once you hit a 'match-X' node you > simply match against the operand from the refered to decision > tree index. > > Once you reach the ifexpr point you populate the operands[] array > from the side-information you initialized in the AST traversal and > the automatic variables initialized from the decision tree walk. > > So > > (plus (minus@2 @1 @3) @2) > > would have the decision tree > > plus - minus - 'true' - 'true' - match(1) - simplify > > and the generated code would look like (simplified) > > [can't capture outermost expr] > if (code == plus) > { > op1 = gimple_assign_rhs1 (plus-stmt); > if (TREE_CODE (op1) == minus) > { > op2 = gimple_assign_rhs1 (minus-stmt); > if (true) > { > op3 = gimple_assign_rhs2 (minus-stmt); > if (true) > { > op4 = gimple_assign_rhs2 (plus-stmt); > if (op4 == op1) > { > operands[0] = op2; > operands[1] = op1; > operands[2] = op3; Did you mean captures ? captures[0] = op2; captures[1] = op1; captures[2] = op3 > <simplify> > > where the 'true' cases would be matched last (as "else") in case there > are multiplie choices here if you'd have also > > (plus @1 (minus @2 @3)) > > the decision tree would become > > plus - minus - 'true' - 'true' - match(1) - simplify > \ > 'true' - minus - 'true' - 'true' - simplify > Thanks, I will get started on this. Today I added a patch that handles "matching-captures" but does not commonize, I am not not posting since it's of no use now. The above algorithm will also get rid of generating temporary as op<pos><level>.
suppose these are patterns: (plus @0 @1) S1 (plus @1 @0) S2 then how would be decision tree represented as ? plus - true - true - S1 \ S2 I suppose pattern-1 would always match (even with earlier code gen. > Richard. > >>> There are also opportunities to optimize the generated code, but >>> basic correctness first I guess. >>> >>> I suppose we have to work a bit on the capture stuff. >>> >>> Btw, you can easily play with the code generator by doing inside >>> the gcc build directory >>> >>> obj/gcc> build/genmatch test.pd > test.c >>> >>> with a small test.pd. I used the following for the above examples: >>> >>> (match_and_simplify >>> (MINUS_EXPR (PLUS_EXPR @0 @1) @0) >>> @1) >>> (match_and_simplify >>> (MINUS_EXPR (PLUS_EXPR @1 @0) @0) >>> @1) > >>> Richard. >>> >>>>> I will change this to have capture per pattern >>>>> tree captures1[4] = {}; // for pattern-1 >>>>> tree captures2[4] = {}; >>>> >>>> Hmm, is this the matching captures issue I mentioned? Btw, I see >>>> you do >>>> >>>> +void >>>> +walk_operand_preorder(vec<operand *>& operands, operand *op) >>>> +{ >>>> + if (op->type == operand::OP_CAPTURE || op->type == >>>> operand::OP_PREDICATE || op->type == operand::OP_C_EXPR) >>>> + { >>>> + operands.safe_push (op); >>>> + return; >>>> + } >>>> >>>> but that leaves captured expressions as a single operand? >>>> >>>> (plus (minus@1 @2 @3) @2) >>>> >>>> would have a decision tree >>>> >>>> plus -> minus -> @2 >>>> >>>> correct? >>>> >>>>> d) Matching multiple patterns. >>>>> Code for patterns with same match, but different transforms is >>>>> generated as follows: >>>>> code for match operand. >>>>> if (if-expr of pattern-1) >>>>> { >>>>> code for result of pattern-1 >>>>> return true; >>>>> } >>>>> if (if-expr of pattern-2) >>>>> { >>>>> code for result of pattern-2 >>>>> return true; >>>>> } >>>> >>>> good. >>>> >>>>> ... >>>>> Should we emit a warning for patterns that have same match operand but >>>>> no if-expr and no manual transform ? >>>>> >>>>> for eg: >>>>> (match_and_simplify >>>>> (plus (minus @0 @1) @1) >>>>> @0 >>>>> >>>>> (match_and_simplify >>>>> (plus (minus @0 @1) @1) >>>>> @1 // just for illustration >>>>> >>>>> Since the matching is ambiguous (same match, no if-expr, no manual >>>>> transform). >>>> >>>> Yeah, I suppose we should. >>>> >>>>> we are left to choose between result of pattern-1 and result of >>>>> pattern-2. >>>>> We can emit warning and choose result of pattern-1 (first-match rule >>>>> as in flex). >>>>> >>>>> e) Non-matching captures: >>>>> Haven't thought about this yet. >>>>> >>>>> * Should we add "negative" predicates that match only if the predicate >>>>> fails ? >>>>> for eg: >>>>> (match_and_simplify >>>>> trunc_mod integer_zerop@0 !integer_zerop) >>>>> @0) >>>> >>>> well, there could simply be a integer_not_zerop predicate. >>>> >>>>> * Testing >>>>> Sorry to bring this up again, but I am still not clear what regex to >>>>> write in scan-tree-dump. >>>>> >>>>> Suppose we have these two patterns in match.pd: >>>>> /* (x + y) - y -> x */ >>>>> (match_and_simplify >>>>> (minus (plus @0 @1) @1) >>>>> @0) >>>>> >>>>> /* (x - y) + y -> x */ >>>>> (match_and_simplify >>>>> (plus (minus @0 @1) @1) >>>>> @0) >>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>> x_\\d\+\\(D\\)" >>>>> >>>>> I have following test-cases: >>>>> int f1(int x, int y) >>>>> { >>>>> int t1 = x + y; >>>>> return t1 - y; >>>>> } >>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>> x_\\d\+\\(D\\)" >>>>> >>>>> int f2(int x, int y) >>>>> { >>>>> int t1 = x - y; >>>>> return t1 + y; >>>>> } >>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= >>>>> x_\\d\+\\(D\\)" >>>>> >>>>> both the test-cases have same regex. >>>>> Tested in isolation f1 passes (first pattern if fired) and f2 fails >>>>> (second pattern doesn't fire, it does after >>>>> adding it's commutative variant, but that's irrelevant for this case). >>>>> >>>>> However when both test-cases are put in one file both the test cases >>>>> PASS. >>>>> I think that's because both of them have same regex: \[^\n\r\]*= >>>>> x_\\d\+\\(D\\) >>>>> so in f1 and f2's regex both match the dump for f1 function in >>>>> forwprop dump file: >>>>> "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\) >>>>> >>>>> As a quick hack i rewrote f1 and f2 as: >>>>> int f1(int x, int y) >>>>> { >>>>> int t1 = x + y; >>>>> int f1_val = t1 - y; >>>>> return f1_val; >>>>> } >>>>> scan-tree-dump "gimple_match_and_simplified to f1_val_\\d\+ = >>>>> x_\\d\+\\(D\\)" >>>>> >>>>> int f2(int x, int y) >>>>> { >>>>> int t1 = x - y; >>>>> int f2_val = t1 + y; >>>>> return f2_val; >>>>> } >>>>> scan-tree-dump "gimple_match_and_simplified to f2_val_\\d\+ = >>>>> x_\\d\+\\(D\\)" >>>>> so both f1 and f2's scan-tree-dump have different regexes. >>>>> and f2's regex does not match dump of f1's function. >>>>> This matches all patterns in match-decision-tree.c however this is not >>>>> ideal, >>>>> since it does not check for matching dump across newlines. >>>>> Could you suggest a better way ? >>>> >>>> There isn't a good better way (the others explicitely do _not_ match >>>> against >>>> a newline - see the ^ in the \[\] group). Well, apart from splitting >>>> the testcase >>>> into multiple files of course. >>>> >>>> Richard. >>>> >>>>> Thanks and Regards, >>>>> Prathamesh >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>>>> Thanks and Regards, >>>>>>>> Prathamesh >>>>>>>>> >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>>> * Code generation. >>>>>>>>>> Code shall be generated by walking the decision tree. >>>>>>>>>> The way it's constructed, there's no difference between code >>>>>>>>>> generation >>>>>>>>>> for "matching" and code generation for "transform". For >>>>>>>>>> non-simplificaton >>>>>>>>>> operands, "matching" code is generated, and for "simplification" >>>>>>>>>> operands, >>>>>>>>>> "transform" code is generated. The tree shall be walked twice, >>>>>>>>>> once to generate GIMPLE code and second time for GENERIC. >>>>>>>>>> For simplicity, I currently return false whenever there's a fail >>>>>>>>>> in match, >>>>>>>>>> instead of trying to match the next pattern. >>>>>>>>>> >>>>>>>>>> Code-gen for capture - same as capture::gen_gimple_match. >>>>>>>>>> >>>>>>>>>> Code-gen for predicate - I haven't added support for predicate >>>>>>>>>> in >>>>>>>>>> decision tree yet, but I guess that would be the same as >>>>>>>>>> predicate::gen_gimple_match >>>>>>>>>> >>>>>>>>>> Code-gen for expr. >>>>>>>>>> There are two types of code-gen for expr. >>>>>>>>>> The patch generates following code: >>>>>>>>>> Type 1 - expr is child of root node. >>>>>>>>>> the only code that gets generated is (in >>>>>>>>>> decision_tree::gen_gimple): >>>>>>>>>> if (code == <expr code>) >>>>>>>>>> { >>>>>>>>>> tree captures[4] = {} >>>>>>>>>> <generated code for children> >>>>>>>>>> } >>>>>>>>>> This is similar to generating matching code in >>>>>>>>>> write_nary_simplifiers. >>>>>>>>>> >>>>>>>>>> Type 2 - expr_1 is a child of another expr_0 node. >>>>>>>>>> The code gets generated as follows (dt_expr::gen_gimple): >>>>>>>>>> { >>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op); >>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == <expr_1-code>) >>>>>>>>>> { >>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME) >>>>>>>>>> { >>>>>>>>>> op = valueize (op); >>>>>>>>>> if (!op) return false; >>>>>>>>>> } >>>>>>>>>> <code-gen for children of expr_1 node> >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Example: >>>>>>>>>> (negate (negate @0)) >>>>>>>>>> S1 >>>>>>>>>> >>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>> S2 >>>>>>>>>> >>>>>>>>>> decision tree: >>>>>>>>>> >>>>>>>>>> dummy/root >>>>>>>>>> | >>>>>>>>>> NEGATE_EXPR >>>>>>>>>> / \ >>>>>>>>>> BIT_NOT NEGATE_EXPR >>>>>>>>>> | | >>>>>>>>>> @0 @0 >>>>>>>>>> | | >>>>>>>>>> S1 S2 >>>>>>>>>> >>>>>>>>>> // code-gen for NEGATE_EXPR (child of root): >>>>>>>>>> if (code == NEGATE_EXPR) >>>>>>>>>> { >>>>>>>>>> tree captures[4] = {}; >>>>>>>>>> // code gen for BIT_NOT_EXPR >>>>>>>>>> { >>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) >>>>>>>>>> { >>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt); >>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME) >>>>>>>>>> { >>>>>>>>>> op = valueize (op); >>>>>>>>>> if (!op) return false; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> // code-gen for @0, child of BIT_NOT_EXPR >>>>>>>>>> if (!captures[0]) >>>>>>>>>> captures[0] = op; >>>>>>>>>> else if (captures[0] != op) >>>>>>>>>> return false; >>>>>>>>>> >>>>>>>>>> // code-gen for S1, child of @0 >>>>>>>>>> < same as code generated by .gen_gimple_transform > >>>>>>>>>> return true; >>>>>>>>>> } >>>>>>>>>> // code gen for inner NEGATE_EXPR >>>>>>>>>> { >>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) >>>>>>>>>> <rest similar to the BIT_NOT case> >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> The following gets duplicated with the patch: >>>>>>>>>> { >>>>>>>>>> gimple_def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME) >>>>>>>>>> return false; >>>>>>>>>> if (is_gimple_assign (def_stmt) >>>>>>>>>> && gimple_assign_rhs_code (def_stmt) == <expr-code>) >>>>>>>>>> ... >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> Improving code-gen for expr: >>>>>>>>>> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get duplicated, >>>>>>>>>> while they could be factored out, similar to this: >>>>>>>>>> >>>>>>>>>> { >>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0); >>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME) >>>>>>>>>> return false; >>>>>>>>>> if (!is_gimple_assign (def_stmt)) >>>>>>>>>> return false; >>>>>>>>>> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR) >>>>>>>>>> { >>>>>>>>>> // code-gen for BIT_NOT_EXPR subtree >>>>>>>>>> } >>>>>>>>>> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR) >>>>>>>>>> { >>>>>>>>>> // code-gen for NEGATE_EXPR subtree >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) != >>>>>>>>>> SSA_NAME" >>>>>>>>>> we could have that generated at the parent of expr's node rather >>>>>>>>>> than >>>>>>>>>> at expr. However that would be incorrect for cases where not all >>>>>>>>>> children >>>>>>>>>> of a node are expressions: >>>>>>>>>> >>>>>>>>>> Example: >>>>>>>>>> // patterns only for illustration >>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>> (negate @0) >>>>>>>>>> >>>>>>>>>> root >>>>>>>>>> | >>>>>>>>>> negate >>>>>>>>>> / \ >>>>>>>>>> bit_not @0 >>>>>>>>>> | >>>>>>>>>> @0 >>>>>>>>>> >>>>>>>>>> we cannot have the above code generated at negate, >>>>>>>>>> since it's not applicable negate's 2nd child (@0). >>>>>>>>>> >>>>>>>>>> This can be done by grouping together children that are >>>>>>>>>> expressions. >>>>>>>>>> However the patch does not do that. >>>>>>>>>> >>>>>>>>>> * Code-gen for simplification operand >>>>>>>>>> This involves code-gen for ifexpr and result of pattern. >>>>>>>>>> Calls gen_gimple_transform of ifexpr and result >>>>>>>>>> (dt_simplify::gen_gimple) >>>>>>>>>> So this is really code-gen off AST >>>>>>>>> >>>>>>>>> Right (modulo replacing captures with their replacements). >>>>>>>>> >>>>>>>>>> * Matching multiple patterns >>>>>>>>>> A pattern has following parts: match, ifexpr and result. >>>>>>>>>> If pattern fails in match operand, I guess we can safely return >>>>>>>>>> false ? >>>>>>>>>> We "club" together patterns that have same match operand, >>>>>>>>>> and use goto, if one of them fails in their (ifexpr/result) and >>>>>>>>>> then goto the >>>>>>>>>> (ifexpr/result) of the next operand. >>>>>>>>>> >>>>>>>>>> Example: >>>>>>>>>> /* x & 0 -> 0 */ >>>>>>>>>> (match_and_simplify >>>>>>>>>> (bit_and @0 @1) >>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == >>>>>>>>>> integer_zero_node)) >>>>>>>>>> { integer_zero_node; }) >>>>>>>>>> >>>>>>>>>> /* x & -1 -> x */ >>>>>>>>>> (match_and_simplify >>>>>>>>>> (bit_and @0 @1) >>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) >>>>>>>>>> && (@1 == integer_minus_one_node) >>>>>>>>>> @0) >>>>>>>>>> >>>>>>>>>> For both patterns match is same. >>>>>>>>>> >>>>>>>>>> Decision Tree: >>>>>>>>>> bit_and >>>>>>>>>> / \ >>>>>>>>>> @0 @1 >>>>>>>>>> | >>>>>>>>>> S1, S2 >>>>>>>>> >>>>>>>>> I think it's worth adding a diagnostic to genmach whenever exactly >>>>>>>>> same matches appear. But I'd say generally we'd support this >>>>>>>>> by testing the ifexpr of the next pattern. >>>>>>>>> >>>>>>>>>> S1 represents <ifexpr, result> of pattern-1, and S2 represents >>>>>>>>>> <ifexpr, result> >>>>>>>>>> of pattern-2 respectively. >>>>>>>>>> S1, S2 would be stored as children of @1 (the last operand of >>>>>>>>>> n-ary operator), >>>>>>>>>> in dt_operand::kids vector. >>>>>>>>>> >>>>>>>>>> The code would be generated as: >>>>>>>>>> >>>>>>>>>> matching code. >>>>>>>>>> if (! pattern-1 ifexpr condition) >>>>>>>>>> goto simplify2; // next pattern with the same "match" operand. >>>>>>>>>> transform code for pattern 1 >>>>>>>>>> >>>>>>>>>> simplify2: >>>>>>>>>> if (! pattern-2 ifexpr condition) >>>>>>>>>> return false; // last pattern >>>>>>>>>> transform code for pattern 2. >>>>>>>>>> >>>>>>>>>> If matching itself fails, that is neither of the decisions get >>>>>>>>>> matched, >>>>>>>>>> I believe we can then return false as it cannot match any other >>>>>>>>>> pattern ? >>>>>>>>>> >>>>>>>>>> * patterns needing hacks like cond_expr or GENERIC support >>>>>>>>>> I haven't given thought to this yet. I suppose we can look to >>>>>>>>>> handle >>>>>>>>>> these after adding support for GENERIC. Instead of generating >>>>>>>>>> GENERIC >>>>>>>>>> matching in >>>>>>>>>> gimple_match_and_simplify, could we then call >>>>>>>>>> generic_match_and_simplify from >>>>>>>>>> within gimple_match_and_simplify ? >>>>>>>>> >>>>>>>>> Yes (that's what's currently done btw). >>>>>>>>> >>>>>>>>>> * Tests >>>>>>>>>> The patch transformed the following patterns: >>>>>>>>>> >>>>>>>>>> (match_and_simplify >>>>>>>>>> (negate (bit_not @0)) >>>>>>>>>> if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) >>>>>>>>>> (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); })) >>>>>>>>>> >>>>>>>>>> (match_and_simplify >>>>>>>>>> (negate (negate @0)) >>>>>>>>>> @0) >>>>>>>>>> >>>>>>>>>> I have attached test-case I tried it with (negate.c) >>>>>>>>>> >>>>>>>>>> * Conclusion >>>>>>>>>> Does it sound reasonable ? I am going to be re-writing the >>>>>>>>>> decision tree from scratch, but is the basic idea fine ? Or should >>>>>>>>>> we >>>>>>>>>> take a different approach ? >>>>>>>>>> >>>>>>>>>> Thanks and Regards, >>>>>>>>>> Prathamesh >