Re: [GSoC] decision tree first steps

Prathamesh Kulkarni Wed, 11 Jun 2014 07:09:43 -0700

On 6/11/14, Richard Biener <richard.guent...@gmail.com> wrote:
> On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener
>>> <richard.guent...@gmail.com> wrote:
>>>> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni
>>>> <bilbotheelffri...@gmail.com> wrote:
>>>>> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener
>>>>> <richard.guent...@gmail.com> wrote:
>>>>>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni
>>>>>>> <bilbotheelffri...@gmail.com> wrote:
>>>>>>>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener
>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni
>>>>>>>>> <bilbotheelffri...@gmail.com> wrote:
>>>>>>>>>> I have few questions regarding genmatch:
>>>>>>>>>>
>>>>>>>>>> a) Why is 4 hard-coded here: ?
>>>>>>>>>> in write_nary_simplifiers:
>>>>>>>>>>  fprintf (f, "      tree captures[4] = {};\n");
>>>>>>>>>
>>>>>>>>> Magic number (this must be big enough for all cases ...).
>>>>>>>>> Honestly
>>>>>>>>> this should be improved (but requires another scan over the matcher
>>>>>>>>> IL
>>>>>>>>> to figure out the max N used in @N).
>>>>>>>>>
>>>>>>>>>> b) Should we add syntax for a symbol to denote multiple operators
>>>>>>>>>> ?
>>>>>>>>>> For exampleim in simplify_rotate:
>>>>>>>>>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2 ==
>>>>>>>>>> bitsize of type of X).
>>>>>>>>>
>>>>>>>>> Something to enhance the IL with, yes.  I'd say we support
>>>>>>>>>
>>>>>>>>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR)
>>>>>>>>>
>>>>>>>>> thus,
>>>>>>>>>
>>>>>>>>> (define_op <operator-name> op...)
>>>>>>>>>
>>>>>>>>>> c) Remove for parsing capture in parse_expr since we reject
>>>>>>>>>> outermost
>>>>>>>>>> captured expressions ?
>>>>>>>>>
>>>>>>>>> but parse_expr is also used for inner expressions, no?
>>>>>>>>>
>>>>>>>>> (plus (minus@2 @0 @1) @3)
>>>>>>>>>
>>>>>>>>> should still work
>>>>>>>>>
>>>>>>>>>> d) I am not able to follow this comment in match.pd:
>>>>>>>>>> /* Patterns required to avoid SCCVN testsuite regressions.  */
>>>>>>>>>>
>>>>>>>>>> /* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
>>>>>>>>>>    complicated here, it does
>>>>>>>>>>      Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
>>>>>>>>>>      (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
>>>>>>>>>>      if the new mask might be further optimized.  */
>>>>>>>>>> (match_and_simplify
>>>>>>>>>>   (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
>>>>>>>>>>   if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1)
>>>>>>>>>> == 0)
>>>>>>>>>>   @0)
>>>>>>>>>
>>>>>>>>> The comment is literally copied from the case I extracted the
>>>>>>>>> (simplified) variant from fold-const.c.  See lines 11961-12056 in
>>>>>>>>> fold-const.c.
>>>>>>>>> It'll be a challenge to implement the equivalent in a pattern ;)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Decision Tree.
>>>>>>>>>>     I have tried to come up with a prototype for decision tree
>>>>>>>>>> (patch attached).
>>>>>>>>>> For simplicity, it handles patterns involving only unary operators
>>>>>>>>>> and
>>>>>>>>>> no predicates
>>>>>>>>>> and returns false when the pattern fails to match (no goto to
>>>>>>>>>> match
>>>>>>>>>> another pattern).
>>>>>>>>>> I meant to post it only for illustration, and I have not really
>>>>>>>>>> paid
>>>>>>>>>> attention to code quality (bad formatting, memory leaks, etc.).
>>>>>>>>>>
>>>>>>>>>> * Basic Idea
>>>>>>>>>> A pattern consists of following parts: match, ifexpr and result.
>>>>>>>>>> Let's call <ifexpr, result> as "simplification" operand.
>>>>>>>>>> The common prefix between different match operands would be
>>>>>>>>>> represented
>>>>>>>>>> by same nodes in the decision tree.
>>>>>>>>>>
>>>>>>>>>> Example:
>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>> S1
>>>>>>>>>>
>>>>>>>>>> (negate (negate @0))
>>>>>>>>>> S2
>>>>>>>>>>
>>>>>>>>>> S1, S2 denote simplifications for the above patterns
>>>>>>>>>> respectively.
>>>>>>>>>>
>>>>>>>>>> The decision tree would look something like
>>>>>>>>>> (that's the way it gets constructed with the patch):
>>>>>>>>>>
>>>>>>>>>>                 dummy/root
>>>>>>>>>>                         |
>>>>>>>>>>            NEGATE_EXPR
>>>>>>>>>>              /                  \
>>>>>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>>>>>            |                         |
>>>>>>>>>>          @0                     @0
>>>>>>>>>>            |                         |
>>>>>>>>>>          S1                      S2
>>>>>>>>>>
>>>>>>>>>> a) The children of an internal node are number of decisions that
>>>>>>>>>> can be taken at that node. In the above case it's 2 for outer
>>>>>>>>>> NEGATE_EXPR.
>>>>>>>>>> b) Simplification operand represents leaves of the decision tree
>>>>>>>>>> c) Instead of having list of heads, I have added one dummy node,
>>>>>>>>>> and heads become children of these dummy root node.
>>>>>>>>>> d) Code-gen for non-simplification operands involves generating,
>>>>>>>>>> "matching" code and for simplification operands involves
>>>>>>>>>> generating
>>>>>>>>>> "transform" code
>>>>>>>>>>
>>>>>>>>>> * Overall Flow:
>>>>>>>>>> I guess we would build the decision tree from the AST.
>>>>>>>>>> So the flow would be like:
>>>>>>>>>> source -> struct simplify (match, ifexpr, result) -> decision tree
>>>>>>>>>> -> c code.
>>>>>>>>>>
>>>>>>>>>> Something like (in main):
>>>>>>>>>> decision_tree dt;
>>>>>>>>>> while (there is another pattern)
>>>>>>>>>> {
>>>>>>>>>>   simplify *s = parse_match_and_simplify ();
>>>>>>>>>>   insert s into decision tree;
>>>>>>>>>> };
>>>>>>>>>> So parsing routines are concerned with parsing and building the
>>>>>>>>>> AST (operand),
>>>>>>>>>> and not with the decision tree. Is that fine ?
>>>>>>>>>
>>>>>>>>> Yes, that's good.
>>>>>>>>>
>>>>>>>>>> * Representation of decision tree.
>>>>>>>>>> A decision tree would need a way to represent language constructs
>>>>>>>>>> like capture, predicate, etc. so in some ways it would be similar
>>>>>>>>>> to AST.
>>>>>>>>>> It
>>>>>>>>>> In the patch, I have created the following heirarchy:
>>>>>>>>>> dt_operand: represents a general base "operand" of in decision
>>>>>>>>>> tree
>>>>>>>>>> dt_expr: for representing expression. Expression contains
>>>>>>>>>> operation
>>>>>>>>>> to be performed (e_operation).
>>>>>>>>>> dt_capture: analogous to capture.
>>>>>>>>>> dt_simplify: for representing "simplification" operand.
>>>>>>>>>> simplification consists of ifexpr and result
>>>>>>>>>> dt_head: to represent "dummy" root. Maybe a separate class is not
>>>>>>>>>> needed.
>>>>>>>>>>
>>>>>>>>>> * Constructing decision tree from AST
>>>>>>>>>> The algorithm i have used is similar to inserting string in a
>>>>>>>>>> trie
>>>>>>>>>> outlined here: http://en.wikipedia.org/wiki/Trie
>>>>>>>>>> The difference shall be to traverse AST depth-first rather than
>>>>>>>>>> traversing the string.
>>>>>>>>>> Apart from that I guess it would be the same (naturally find and
>>>>>>>>>> compare operations would be different).
>>>>>>>>>> I haven't given much thought about this. Currently, I construct
>>>>>>>>>> decision tree for only patterns with unary operators in an
>>>>>>>>>> ugly way by "flattening" the AST by walking it in pre-order and
>>>>>>>>>> storing the nodes in vector in the order they are visited
>>>>>>>>>>
>>>>>>>>>> So to construct decision tree from the following AST:
>>>>>>>>>>           negate
>>>>>>>>>>               |
>>>>>>>>>>           bit_not
>>>>>>>>>>               |
>>>>>>>>>>              @0
>>>>>>>>>>
>>>>>>>>>> AST is flattened by walking in preorder (by walk_operand_preorder)
>>>>>>>>>> and stored
>>>>>>>>>> as: [ expr, expr, capture ]
>>>>>>>>>> and this vector is used to construct decision tree.
>>>>>>>>>> I did it as a "quick and dirty" way, it has to be changed.
>>>>>>>>>> And it won't work for patterns with n-ary operators.
>>>>>>>>>>
>>>>>>>>>> We should visit each node of AST in preorder, and add that node
>>>>>>>>>> during traversal to decision tree. I am not yet clear on way of
>>>>>>>>>> doing that.
>>>>>>>>>>
>>>>>>>>>> * Comparing operands
>>>>>>>>>> How do we compare operands ? We shall need to do this while
>>>>>>>>>> inserting
>>>>>>>>>> in decision tree, since if the operand is already inserted we do
>>>>>>>>>> not create
>>>>>>>>>> a new node.
>>>>>>>>>> In the patch (cmp_operand), I have used following rules:
>>>>>>>>>> a) if types not equal, they are clearly not equal.
>>>>>>>>>> b) if type is expr, compare operation->op->id
>>>>>>>>>> c) if type is capture, compare the number.
>>>>>>>>>> d) for predicate, compare on ident ?
>>>>>>>>>>
>>>>>>>>>> * Representing patterns with n-ary operators.
>>>>>>>>>> Consider following two match operands:
>>>>>>>>>> (plus (minus @0 @1) @1)
>>>>>>>>>> (plus (negate @0) @1)
>>>>>>>>>>
>>>>>>>>>> In decision tree, it would be represented as:
>>>>>>>>>>
>>>>>>>>>>             *********plus*********
>>>>>>>>>>                /    \          /        \
>>>>>>>>>>           minus  @1  negate  @1
>>>>>>>>>>             /     \           |
>>>>>>>>>>         @0   @1       @0
>>>>>>>>>>
>>>>>>>>>> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1.
>>>>>>>>>> However (minus @0 @1) and @1 are not separate "decisions", but
>>>>>>>>>> children of plus within the same expression.
>>>>>>>>>>
>>>>>>>>>> For calculating number of decisions at an expression node:
>>>>>>>>>> number of decisions =  number of children / arity of operator.
>>>>>>>>>> in the above case, number of decision = 4 / 2 = 2
>>>>>>>>>>
>>>>>>>>>> For other cases, for instance capture node (the children would be
>>>>>>>>>> simplification operand),
>>>>>>>>>> number of decisions = number of children.
>>>>>>>>>
>>>>>>>>> Hmm.  I think it should be only two children from the plus,
>>>>>>>>> as pre-order for the first pattern is for example
>>>>>>>>> [plus minus @0 @1 @1] so you'd have
>>>>>>>>>
>>>>>>>>>    plus
>>>>>>>>>    /     \
>>>>>>>>> minus  negate
>>>>>>>>> |           |
>>>>>>>>> @0     @0
>>>>>>>>> |           |
>>>>>>>>> @1     @1
>>>>>>>>> |
>>>>>>>>> @1
>>>>>>>>>
>>>>>>>>> instead?
>>>>>>>>>
>>>>>>>>> Note that you probably have to deal with non-matching capture
>>>>>>>>> IDs by re-writing them on-the-fly.  That is, the two
>>>>>>>>> pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...]
>>>>>>>>> should commonize with renaming the non-matching capture
>>>>>>>>> somehow.
>>>>>>>> Um, could you please elaborate on that (non-matching capture ?),
>>>>>>>> I am not sure if I understood it fully, thanks.
>>>>>>>
>>>>>>> I'm worried about
>>>>>>>
>>>>>>> (plus (minus@1 @2 @3) @2)
>>>>>>>
>>>>>>> vs.
>>>>>>>
>>>>>>> (plus (minus @1 @2) @1)
>>>>>>>
>>>>>>> (just made-up examples, of course).  Here we'd like to
>>>>>>> commonize the path checking for (plus (minus ...) ... but
>>>>>>> of course we can't easily process the captures at that point
>>>>>>> as the first has two captures (it captures the minus itself while
>>>>>>> the 2nd doesn't) and the captures that are at the same position
>>>>>>> use different indexes.  To finally match the minus we have to
>>>>>>> verify that in one case @2 == @2 and in the other case that
>>>>>>> @1 == @1.
>>>>>>>
>>>>>>> Re-numbering capture indexes in pre-order would improve things
>>>>>>> for the case where capture positions are in the same places but
>>>>>>> the indexes do not match.  It won't handle the above case
>>>>>>> of course.
>>>>> Thanks, I understand it now -:)
>>>>
>>>> I suppose that it would be best (in the end) to not insert the
>>>> captures into the decision tree separately but leave them
>>>> "combined" and then only after the decision tree is fully built
>>>> decide on re-numbering.  And treat those that are required for
>>>> the matching process differently (those that assert that two
>>>> ops are equal).
>>>>
>>>> I think we can defer the issue to later.
>>>>
>>>>>>>
>>>>>>>> I have attached patch for patterns with binary operators (in
>>>>>>>> principle
>>>>>>>> should work for n-ary operators,
>>>>>>>> but have only tested upto 2), that creates decision tree for
>>>>>>>> patterns (excluding predicates, and multiple matching). I don't
>>>>>>>> intend
>>>>>>>> to commit this (contains many hacks!).
>>>>>>>>
>>>>>>>> * Constructing decision tree from AST
>>>>>>>> We can define our decision tree to store prefixes of preorder
>>>>>>>> traversals of diffferent AST.
>>>>>>>> Insertion happens as follows (decision_tree::insert)
>>>>>>>> a) Get preorder traversal of AST into vector.
>>>>>>>> b) Insert AST nodes from vector into decision tree.
>>>>>>>> Currently, I obtain preorder traversal into a vector. We can later
>>>>>>>> change it to compute
>>>>>>>> next preorder successor lazily.
>>>>>>>
>>>>>>> Just make the code easy to understand, optimizing it should be
>>>>>>> secondary.
>>>>>>>
>>>>>>>> * Code-gen
>>>>>>>> For n-ary operators, the patch generates code as follows:
>>>>>>>> if (code == <expr code>)
>>>>>>>> {
>>>>>>>>    match operand 0
>>>>>>>>      match operand 1
>>>>>>>>         ....
>>>>>>>>           match operand n-1
>>>>>>>>             transform
>>>>>>>>             return true;
>>>>>>>> }
>>>>>>>>
>>>>>>>> * Adding parent, level, pos fields to AST (operand).
>>>>>>>> One change (hack), I have made to code-gen, is naming of operands
>>>>>>>> that are
>>>>>>>> assigned to gimple_assign_rhs in code-gen of expressions.
>>>>>>>> I am not sure if this is really needed.
>>>>>>>>
>>>>>>>> in AST, we have following representation for expressions:
>>>>>>>>                       expr-
>>>>>>>>                      /       \
>>>>>>>>                   left      right
>>>>>>>>                operand  operand
>>>>>>>>
>>>>>>>> code-gen off AST (expr::gen_gimple_match) produces code like:
>>>>>>>> {
>>>>>>>>   tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>   // code-gen for left operand
>>>>>>>> }
>>>>>>>> {
>>>>>>>>   tree op = gimple_assign_rhs2 (def_stmt);
>>>>>>>>   // code-gen for right operand
>>>>>>>> }
>>>>>>>>
>>>>>>>> We can do this since, in expr::gen_gimple_match, we call
>>>>>>>> left_operand->gen_gimple_match,
>>>>>>>> come back to expr::gen_gimple_match (after
>>>>>>>> left_operand->gen_gimple_match() returns), and
>>>>>>>> then call right_operand->gen_gimple_match().
>>>>>>>>
>>>>>>>> However in decision tree, the expression gets represented as
>>>>>>>> follows:
>>>>>>>>            expr
>>>>>>>>              |
>>>>>>>>            left-operand
>>>>>>>>              |
>>>>>>>>            right-operand
>>>>>>>>
>>>>>>>> dt_expr::gen_gimple calls left_operand->gen_gimple, which calls
>>>>>>>> right_operand->gen_gimple,
>>>>>>>> so we return to dt_expr::gen_gimple, after code is generated for
>>>>>>>> the
>>>>>>>> whole subtree of expr.
>>>>>>>>
>>>>>>>> So I assign operands at the start before generating code for the
>>>>>>>> subtree:
>>>>>>>> tree op00 = gimple_assign_rhs1 (def_stmt);
>>>>>>>> tree op10 = gimple_assign_rhs2 (def_stmt);
>>>>>>>> <code for left-operand>
>>>>>>>>   <code for right-operand>
>>>>>>>> each of the operands know their positions, so they know which
>>>>>>>> operand
>>>>>>>> to use (get_op_name()).
>>>>>>>> the names are generated as:
>>>>>>>> op<position><level>, position = index of operand in it's parent's
>>>>>>>> expr's vector (vec<operand *> ops).
>>>>>>>> level: level of AST, at which the operand is stored.
>>>>>>>>
>>>>>>>> For this I added three fields to operand (unsigned pos, and
>>>>>>>> unsigned
>>>>>>>> level, operand *parent), and accordingly
>>>>>>>> made changes to expr::append_op. In a way this is abusing the AST.
>>>>>>>> This probably works (works with test-cases i tried), but I don't
>>>>>>>> like it much.
>>>>>>>
>>>>>>> The naming is ok, but indeed it doesn't feel very clean.  Somehow
>>>>>>> this
>>>>>>> belongs to the decision tree instead... (but then you need to lookup
>>>>>>> the decision tree operand info from code-gen which happens off the
>>>>>>> AST...).
>>>>>>>
>>>>>>>> * Order of matching.
>>>>>>>> Currently the order of matching operands is strictly 0, 1, .. n - 1
>>>>>>>> for n-ary operator.
>>>>>>>> However this might not be the best choice.
>>>>>>>>
>>>>>>>> For example, consider following patterns:
>>>>>>>> (operator  op1  op) S1
>>>>>>>> (operator  op2 op)  S2
>>>>>>>>
>>>>>>>> Both have common operator, and 1st common operand, they differ in
>>>>>>>> 0th operand.
>>>>>>>> With patch, the code generated is:
>>>>>>>> if (code == <expr-code>)
>>>>>>>> {
>>>>>>>>   match op1
>>>>>>>>      match op
>>>>>>>>        S1
>>>>>>>>
>>>>>>>>   match op2
>>>>>>>>     match op
>>>>>>>>       S2
>>>>>>>> }
>>>>>>>>
>>>>>>>> A better ordering would be to match the 1st operand and then
>>>>>>>> respective 0th operands of both patterns:
>>>>>>>>
>>>>>>>> if (code == <expr-code>
>>>>>>>> {
>>>>>>>>   match op
>>>>>>>>     match op1
>>>>>>>>       S1
>>>>>>>>     match op0
>>>>>>>>       S2
>>>>>>>> }
>>>>>>>>
>>>>>>>> This complicates insertion into decision tree.
>>>>>>>
>>>>>>> I'd say we leave "optimizing" the decision tree to a later stage -
>>>>>>> it's a global
>>>>>>> optimization problem after all (I believe you can't do optimally with
>>>>>>> just
>>>>>>> looking at the present decision tree and the AST you want to
>>>>>>> insert).
>>>>>>>
>>>>>>>> * hack: only one generated gimple_match_and_simplify function
>>>>>>>> In the patch, I generate code for gimple_match_and_simplify with 3
>>>>>>>> operands version (op0, op1, op2)
>>>>>>>> and the other two (unary and binary versions), call it with
>>>>>>>> NULL_TREE
>>>>>>>> for extra operands.
>>>>>>>> I will soon change that.
>>>>>>>>
>>>>>>>> * Testing patterns.
>>>>>>>> I think I have written most of the test-cases wrongly in match-2.c
>>>>>>>> For example this test-case doesn't match
>>>>>>>> (i haven't checked in "fix testsuite fallout" patch, so this might
>>>>>>>> not be true
>>>>>>>> anymore)
>>>>>>>>
>>>>>>>> (match_and_simplify
>>>>>>>>   plus (minus @0 @1) @1)
>>>>>>>>   @0)
>>>>>>>>
>>>>>>>> for this test-case:
>>>>>>>> /* (x - y) + y -> x */
>>>>>>>> int f6(int x, int y)
>>>>>>>> {
>>>>>>>>   int t1 = x - y;
>>>>>>>>   return t1 + y;
>>>>>>>> }
>>>>>>>> /* { dg-final { scan-tree-dump "gimple_match_and_simplified to
>>>>>>>> \[^\n\r\]*= x_\\d\+\\(D\\)" "forwprop1" } } */
>>>>>>>>
>>>>>>>> I get following output (tree-forwprop-details):
>>>>>>>>
>>>>>>>> ;; Function f6 (f, funcdef_no=0, decl_uid=1744, symbol_order=0)
>>>>>>>>
>>>>>>>> gimple_match_and_simplified to _4 = y_2(D) + t1_3;
>>>>>>>> f (int x, int y)
>>>>>>>> {
>>>>>>>>   int t1;
>>>>>>>>   int _4;
>>>>>>>>
>>>>>>>>   <bb 2>:
>>>>>>>>   t1_3 = x_1(D) - y_2(D);
>>>>>>>>   _4 = x_1(D);
>>>>>>>>   return _4;
>>>>>>>>
>>>>>>>> }
>>>>>>>>
>>>>>>>> Shouldn't it show: gimple_match_and_simplified to _4 = x_1(D)
>>>>>>>> instead ?
>>>>>>>
>>>>>>> Yeah.  I suppose you hit the issue that we don't commutate patterns,
>>>>>>> so try with a commutated pattern inserted.
>>>>>>>
>>>>>>>> The issue is this test-case fails in isolation, however it passes
>>>>>>>> when
>>>>>>>> it is placed with other
>>>>>>>> test cases that PASS and have same regex in scan-tree-dump as this
>>>>>>>> test-case:
>>>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>>>>> x_\\d\+\\(D\\)"
>>>>>>>>
>>>>>>>> Summary:
>>>>>>>> I think we need to (at-least) deal with following problems for
>>>>>>>> decision tree:
>>>>>>>
>>>>>>> I have to leave for a while now, I'll come back later and have a
>>>>>>> detailed
>>>>>>> look at the patch to try answer the questions below.
>>>>>>
>>>>>> Now for the remainder.
>>>>>>
>>>>>>>> a) Representation and construction of decision tree from AST -
>>>>>>>> We do this by storing prefixes of preorder traversals of AST in
>>>>>>>> decision tree. Do we finalize on that ?
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>> b) Order of operand matching
>>>>>>
>>>>>> I think that's the same as a) - the visiting order of the AST
>>>>>> determines
>>>>>> the order of operand matching.
>>>>>>
>>>>>>>> c) Generating patterns for built-in functions (I guess this will be
>>>>>>>> similar to expr-gen).
>>>>>>
>>>>>> It should really be transparent.
>>>>>>
>>>>>>>> d) Multiple matching patterns
>>>>>>
>>>>>> The decision tree phase leaves us with a sequence of
>>>>>>
>>>>>>   if (if-expr of pattern1)
>>>>>>     transform-pattern1
>>>>>>   else (if-expr of pattern2)
>>>>>>     transform-pattern2
>>>>>> ...
>>>>>>
>>>>>> etc.  thus that part should be basically unchanged (only the match
>>>>>> part is driven by a decision tree).
>>>>>>
>>>>>>>> e) Commutative ops
>>>>>>
>>>>>> I'd do these as a pass over the AST of each pattern, duplicating
>>>>>> patterns.
>>>>>>
>>>>>> Btw, I see you mirror the AST structure with
>>>>>> dt_{node,operand,expr,...}.
>>>>>> It seems to me that you can simply stick an AST operand * into
>>>>>> dt_node and be done with just having dt_node?  That is, the AST
>>>>>> operand refered to contains all necessary information already.
>>>>>>
>>>>>>>>
>>>>>>>> Pattern Enhancements:
>>>>>>>> Some of these were already discussed, I am jotting them here:
>>>>>>>> a) Symbol for multiple operators
>>>>>>>> b) grouping patterns by ifexpr
>>>>>>>> c) should we have MATCH_FAILED or FAIL to explicitly denote failure
>>>>>>>> in
>>>>>>>> manual transform (c_expr) ?
>>>>>>>> d) making operators to be commutative.
>>>>>>
>>>>>> I'd say d) would be really nice to have.  Either with a grammar
>>>>>> extension to say which expression should have commutation applied
>>>>>> to (for example: (plus! (minus @1 @2) @2) would say that the
>>>>>> operands of plus should be commutated), or with a simple heuristic.
>>>>>> I'd say explicitly specifying is better.
>>>>>>
>>>>>> a) will probably be needed to avoid pattern explosion.  But similar to
>>>>>> d)
>>>>>> I'd do it as a pre-pass on the AST, duplicating the patterns.  It's
>>>>>> also a good question how to define its semantic.  The mode-iterators
>>>>>> that the machine description supports would suggest we do instead
>>>>>> sth like
>>>>>>
>>>>>> (for OP in (plus minus mult)
>>>>>>   (match_and_simplify
>>>>>>     (OP @1 integer_zerop)
>>>>>>     ....))
>>>>>>
>>>>>> c) should be easy, I'm going to try sth like that - also setting
>>>>>> temporaries
>>>>>> from the ifexpr and re-using them in the result like
>>>>>>
>>>>>>  (match_and_simplify
>>>>>>    (plus @1 @2)
>>>>>>    if (INTEGRAL_TYPE_P (TREE_TYPE (@2))
>>>>>>       && find_better_ops (&@3, &@4, @1, @2))
>>>>>>    (plus @3 @4))
>>>>>>
>>>>>> which also requires us to scan the AST for the highest capture index
>>>>>> (a good thing anyway, given my hack to statically set the size of
>>>>>> the captures array ;))
>>>>>>
>>>>>> b) I'm not sure it's worth the trouble (apart from doing less typing
>>>>>> in match.pd).  We're probably going to have GENERIC defined
>>>>>> when creating the generic routines and GIMPLE when creating
>>>>>> the gimple routines so we can easily disable patterns for one or
>>>>>> the other by doing
>>>>>>
>>>>>> #ifdef GIMPLE
>>>>>>  (match_and_simplify
>>>>>>    ...)
>>>>>> #endif
>>>>>>
>>>>>>>> Once the decision tree is done, I can try working on these.
>>>>>>>> Till Sunday, I will attempt to have another prototype, that covers
>>>>>>>> most of the patterns in match.pd
>>>>>>>> (except for cond_expr, and those requiring GENERIC support), and
>>>>>>>> accompanying correct test-case
>>>>>>>> for each pattern. And next-week start working on a fair patch.
>>>>>>
>>>>>> Sounds good.
>>>>> I have attached a patch, that fires most of the patterns from match.pd
>>>>> except for those that include
>>>>> non-matching capture or require GENERIC support, and added a test-case
>>>>> match-decision-tree.c
>>>>> (fires around 33 patterns).
>>>>>
>>>>> * Representation of decision tree:
>>>>> I clubbed dt_capture, dt_predicate, dt_expr into dt_operand, and added
>>>>> new gen_gimple_match() function in AST
>>>>> to generates matching code. dt_operand contains operand *op as member,
>>>>> and merely calls op->gen_gimple_match.
>>>>> (that's the intent anyway, in the patch, i needed to put some hacks to
>>>>> get it working).
>>>>> In the patch, i have kept: dt_node, dt_operand (dt_node), dt_simplify
>>>>> (dt_node), to separate matching operand
>>>>> from simplification (ifexpr, result).
>>>>>
>>>>> * Constructing decision tree from AST
>>>>> Same as previous patch, stores prefixes of preorder traversals of AST.
>>>>> decision_tree::print() pretty-prints the decision tree.
>>>>>
>>>>> I hope we won't need to change much up-to this point in the patch.
>>>>>
>>>>> * Code-gen
>>>>> code gen of (operator op0 op1 ..opN) simplify takes the following
>>>>> form:
>>>>> if (code == <expr code>)
>>>>> {
>>>>>   match op0
>>>>>     match op1
>>>>>        ...
>>>>>          match opN
>>>>>             simplify
>>>>>             return true;
>>>>> }
>>>>>
>>>>> a) Scope of temporaries (the ones that are assigned to
>>>>> gimple_assign_rhsX):
>>>>> for code-gen of expressions off the AST, we create a temporary:
>>>>>
>>>>> {
>>>>> tree op = gimple_assign_rhsX ();
>>>>> generate code for X'th operand
>>>>> }
>>>>>
>>>>> In the decision tree, an expression can only directly access it's 1st
>>>>> operand (since it's preorder traversal).
>>>>> So I decided to first create all temporaries and then generate code
>>>>> for operands:
>>>>>
>>>>> {
>>>>> tree op0 = gimple_assign_rhs1 ();
>>>>> tree op1 = gimple_assign_rhs2 ();
>>>>> tree opX-1 = gimple_assign_rhsX ();
>>>>>
>>>>> code for matching 0th operand
>>>>> code for matching 1st operand
>>>>> ...
>>>>> code for matching Nth operand
>>>>> }
>>>>>
>>>>> Unfortunately, this creates a problem if the operand itself is an
>>>>> expression,
>>>>> since it's temporaries would also be named op0, op1 ... they would
>>>>> shadow the outer op0, op1 .. temporaries.
>>>>> So we need distinct names for each temporary (for all temps within one
>>>>> pattern).
>>>>> The temporary name is created by:
>>>>> op<position of child operand><level of tree>;
>>>>> and each child of AST knows it's position and level, so it can
>>>>> correctly access it's temp.
>>>>> This requires adding members operand *parent, unsigned pos, unsigned
>>>>> level to struct operand.
>>>>> I have done it this way in the patch, however it doesn't feel clean, I
>>>>> will try to see if we can change this.
>>>>>
>>>>> b) Short-ranged gotos
>>>>> consider: (negate @0) S1
>>>>> The code is generated as (with the patch):
>>>>>
>>>>> if (code == PLUS)
>>>>> {
>>>>>   if (!captures [0])
>>>>>    {
>>>>>       captures[0] = op0;
>>>>>       goto L0;
>>>>>    }
>>>>>   else if (captures[0] == op0)
>>>>>    {
>>>>> L0:
>>>>>       // simplification code S1
>>>>>       return true;
>>>>>    }
>>>>> }
>>>>>
>>>>> Instead of:
>>>>> if (!captures[0])
>>>>>   captures[0] = op0;
>>>>> else if (captures[0] != op0)
>>>>>   goto L0;
>>>>>
>>>>> // Simplification code S1
>>>>> return true;
>>>>>
>>>>> L0:  // match next pattern
>>>>>
>>>>> Generating code the former way is more easier, since location of label
>>>>> immediately follows
>>>>> capture code (if (!captures[0]) ...).  In the latter case, I probably
>>>>> need to store label name in the sibling's node in
>>>>> the decision tree and then retrieve that.
>>>>> Similarly for valueize (expr::gen_valueize).
>>>>
>>>> But then why not generate
>>>>
>>>>   if (!captures[0]
>>>>       || captures[0] == op0)
>>>>     {
>>>>       captures[0] = op0;
>>>>       // simplification code S1
>>>>       return true;
>>>>     }
>>>>
>>>> and go without label and goto?  Or did I miss something?
>>>>
>>>>> c) Shared captures between all patterns:
>>>>> In the patch all patterns share the capture array tree captures[4] =
>>>>> {}
>>>>> (since all match patterns get interleaved in generated code off
>>>>> decision tree).
>>>>> This requires extra code to be generated to make sure captures are
>>>>> correctly restored for another pattern.
>>>
>>> Btw, I see that you back-track the decision tree.  That is, for
>>>
>>> root
>>> |--operand: MINUS_EXPR
>>> |----operand: PLUS_EXPR
>>> |------operand: @0
>>> |--------operand: @1
>>> |----------operand: @0
>>> |------------simplify
>>> |----------operand: @1
>>> |------------simplify
>>>
>>> you do
>>>
>>>   if (code == MINUS_EXPR)
>>>     {
>>>       tree captures[4] = {};
>>>       if (TREE_CODE (op0) == SSA_NAME)
>>>         {
>>>           gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>           if (is_gimple_assign (def_stmt) && gimple_assign_rhs_code
>>> (def_stmt) == PLUS_EXPR)
>>>             {
>>>               tree op01 = gimple_assign_rhs1 (def_stmt);
>>>               if (valueize && TREE_CODE (op01) == SSA_NAME)
>>>                 op01 = valueize (op01);
>>>               else
>>>                 goto L0;
>>>               if (op01)
>>>                 {
>>> L0:
>>>                   tree op11 = gimple_assign_rhs2 (def_stmt);
>>>                   if (valueize && TREE_CODE (op11) == SSA_NAME)
>>>                     op11 = valueize (op11);
>>>                   else
>>>                     goto L1;
>>>                   if (op11)
>>>                     {
>>> L1:
>>>                       tree captures_0 = captures[0];
>>>                       if (!captures[0])
>>>                         {
>>>                           captures[0] = op01;
>>>                           goto L2;
>>>                         }
>>>                       else if (captures[0] == op01)
>>>                         {
>>> L2:
>>>                           tree captures_1 = captures[1];
>>>                           if (!captures[1])
>>>                             {
>>>                               captures[1] = op11;
>>>                               goto L3;
>>>                             }
>>>                           else if (captures[1] == op11)
>>>                             {
>>> L3:
>>>                               tree captures_2 = captures[0];
>>>                               if (!captures[0])
>>>                                 {
>>>                                   captures[0] = op1;
>>>                                   goto L4;
>>>                                 }
>>>                               else if (captures[0] == op1)
>>>                                 {
>>> L4:
>>>                                   res_ops[0] = captures[1];
>>>                                   *res_code = TREE_CODE (res_ops[0]);
>>>                                   return true;
>>>                                 }
>>>                               captures [0] = captures_2;
>>>                               tree captures_3 = captures[1];
>>>                               if (!captures[1])
>>>                                 {
>>>                                   captures[1] = op1;
>>>                                   goto L5;
>>>                                 }
>>>                               else if (captures[1] == op1)
>>>                                 {
>>> L5:
>>>                                   res_ops[0] = captures[0];
>>>                                   *res_code = TREE_CODE (res_ops[0]);
>>>                                   return true;
>>>                                 }
>>>                               captures [1] = captures_3;
>>>                             }
>>>                           captures [1] = captures_1;
>>>                         }
>>>                       captures [0] = captures_0;
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>
>>> but if you processed all children of a decision tree node and didn't
>>> hit one of the simplifies (which return true) then you know nothing
>>> else will match and you can just return false.  That early out seems
>>> to be missing completely and we fall through processing all siblings
>>> of the parents.
>>>
>>> But maybe I am missing something?  I see that if I make capture
>>> IDs not matching that we'd miss to detect things then (as a
>>> "first" capture always "matches").
>>>
>>> root
>>> |--operand: MINUS_EXPR
>>> |----operand: PLUS_EXPR
>>> |------operand: @0
>>> |--------operand: @1
>>> |----------operand: @0
>>> |------------simplify
>>> |------operand: @1
>>> |--------operand: @0
>>> |----------operand: @0
>>> |------------simplify
>>>
>>> So I suppose we really have to distinguish "first" captures from
>>> "matching" captures, basically not treating the "first" ones as
>>> nodes of the decision tree and only populating the capture
>>> array once we hit the code generation part of a pattern (well,
>>> or the if-expr).
>>
>> It's probably a good idea to pre-parse the AST to classify
>> captures and matches differently.  Given for example
>>
>> (plus @0 (minus@0 @1 @2))
>>
>> this would say (well, if we support this), that the plus has two
>> "same" operands (on GIMPLE the "same" valueized SSA name)
>> and that this operand is defined by a (minus @1 @2).  But with
>> the current operator ordering in the decision tree we'd meet
>> the no-op capture first and the match in an odd position.
>>
>> In the RTL machine description the matches are more explicit
>> like
>>
>> (plus (match @0) (minus@0 @1 @2))
>>
>> so maybe we want to auto-detect those and rewrite the AST.
>> Just distinguish between capture-def and capture-ref, where
>> expression captures are always -def but leafs may be -refs as well.
>>
>> In the decision tree builder we'd "ignore" -defs and only make
>> -refs explicit (and the -refs would need to "materialize" the -defs
>> just like the ifexpr or transform stage).
>>
>> So I'd say add a flag to struct capture telling whether it's a ref
>> and compute that via an AST traversal (that should then error
>> for invalid refs).
>>
>> In the decision tree I'd keep a vector of operand * and record
>> a name of a temporary the operand can be accessed via
>> (in case any of the AST operands represented by the decision tree
>> node has a capture associated).
>>
>> You'd still have "empty" leaf captures that would then only record
>> the value into a temporary.
>>
>> Eventually you need a back-ref to the decision tree node for
>> each capture when you insert a new AST traversal so you
>> can decide whether a match (a capture-ref) refers to the same
>> decision tree node or not.
>>
>> Richard.
>
> Ok, let me re-phrase after some more thinking and talking with
> a collegue.
>
> When you do the AST traversal computing the array of AST operands *
> you'd initialize book-keeping like the following
>
>  for operand in AST do
>    if (operand is a capture)
>     {
>       if (this capture index was not yet seen)
>         {
>           record for this pattern (in its simplify node created at the
>           decision tree leaf) that the capture index refers to the
>           operand with the current index
>           if (capture has a sub-expression)
>             recurse
>           else
>             insert "true" op (matches everything)
>         }
>       else
>         {
>           generate a decision tree node that matches the decision
>           tree index of the capture
>           recurse
>         }
>     }
>   else
>     record op and recurse
>
> So when you generate code you implicitely capture all operands
> in automatic variables named like 'captureN' with 'N' being the
> current level of the decision tree (the index of the operand array
> from the AST walk).  Once you hit a 'match-X' node you
> simply match against the operand from the refered to decision
> tree index.
>
> Once you reach the ifexpr point you populate the operands[] array
> from the side-information you initialized in the AST traversal and
> the automatic variables initialized from the decision tree walk.
>
> So
>
> (plus (minus@2 @1 @3) @2)
>
> would have the decision tree
>
> plus - minus - 'true' - 'true' - match(1) - simplify
>
> and the generated code would look like (simplified)
>
>   [can't capture outermost expr]
>   if (code == plus)
>     {
>       op1 = gimple_assign_rhs1 (plus-stmt);
>       if (TREE_CODE (op1) == minus)
>         {
>           op2 = gimple_assign_rhs1 (minus-stmt);
>           if (true)
>             {
>               op3 = gimple_assign_rhs2 (minus-stmt);
>               if (true)
>                 {
>                    op4 = gimple_assign_rhs2 (plus-stmt);
>                    if (op4 == op1)
>                      {
>                         operands[0] = op2;
>                         operands[1] = op1;
>                         operands[2] = op3;
Did you mean captures ?
captures[0] = op2;
captures[1] = op1;
captures[2] = op3
>                         <simplify>
>
> where the 'true' cases would be matched last (as "else") in case there
> are multiplie choices here if you'd have also
>
> (plus @1 (minus @2 @3))
>
> the decision tree would become
>
> plus - minus - 'true' - 'true' - match(1) - simplify
>         \
>           'true' - minus - 'true' - 'true' - simplify
>
Thanks, I will get started on this. Today I added a patch that handles
"matching-captures" but does not commonize, I am not not posting since
it's of no
use now. The above algorithm will also get rid of generating temporary
as op<pos><level>.


suppose these are patterns:
(plus @0 @1) S1
(plus @1 @0) S2

then how would be decision tree represented as ?

plus - true - true - S1
                        \
                          S2

I suppose pattern-1 would always match (even with earlier code gen.
> Richard.
>
>>> There are also opportunities to optimize the generated code, but
>>> basic correctness first I guess.
>>>
>>> I suppose we have to work a bit on the capture stuff.
>>>
>>> Btw, you can easily play with the code generator by doing inside
>>> the gcc build directory
>>>
>>> obj/gcc> build/genmatch test.pd > test.c
>>>
>>> with a small test.pd. I used the following for the above examples:
>>>
>>> (match_and_simplify
>>>   (MINUS_EXPR (PLUS_EXPR @0 @1) @0)
>>>   @1)
>>> (match_and_simplify
>>>   (MINUS_EXPR (PLUS_EXPR @1 @0) @0)
>>>   @1)
>
>>> Richard.
>>>
>>>>> I will change this to have capture per pattern
>>>>> tree captures1[4] = {};  // for pattern-1
>>>>> tree captures2[4] = {};
>>>>
>>>> Hmm, is this the matching captures issue I mentioned?  Btw, I see
>>>> you do
>>>>
>>>> +void
>>>> +walk_operand_preorder(vec<operand *>& operands, operand *op)
>>>> +{
>>>> +  if (op->type == operand::OP_CAPTURE || op->type ==
>>>> operand::OP_PREDICATE || op->type == operand::OP_C_EXPR)
>>>> +    {
>>>> +      operands.safe_push (op);
>>>> +      return;
>>>> +    }
>>>>
>>>> but that leaves captured expressions as a single operand?
>>>>
>>>> (plus (minus@1 @2 @3) @2)
>>>>
>>>> would have a decision tree
>>>>
>>>>  plus -> minus -> @2
>>>>
>>>> correct?
>>>>
>>>>> d) Matching multiple patterns.
>>>>> Code for patterns with same match, but different transforms is
>>>>> generated as follows:
>>>>> code for match operand.
>>>>> if (if-expr of pattern-1)
>>>>> {
>>>>>   code for result of pattern-1
>>>>>   return true;
>>>>> }
>>>>> if (if-expr of pattern-2)
>>>>> {
>>>>>   code for result of pattern-2
>>>>>   return true;
>>>>> }
>>>>
>>>> good.
>>>>
>>>>> ...
>>>>> Should we emit a warning for patterns that have same match operand but
>>>>> no if-expr and no manual transform ?
>>>>>
>>>>> for eg:
>>>>> (match_and_simplify
>>>>>   (plus (minus @0 @1) @1)
>>>>>   @0
>>>>>
>>>>> (match_and_simplify
>>>>>   (plus (minus @0 @1) @1)
>>>>>   @1  // just for illustration
>>>>>
>>>>> Since the matching is ambiguous (same match, no if-expr, no manual
>>>>> transform).
>>>>
>>>> Yeah, I suppose we should.
>>>>
>>>>> we are left to choose between result of pattern-1 and result of
>>>>> pattern-2.
>>>>> We can emit warning and choose result of pattern-1 (first-match rule
>>>>> as in flex).
>>>>>
>>>>> e) Non-matching captures:
>>>>> Haven't thought about this yet.
>>>>>
>>>>> * Should we add "negative" predicates that match only if the predicate
>>>>> fails ?
>>>>> for eg:
>>>>> (match_and_simplify
>>>>>   trunc_mod integer_zerop@0 !integer_zerop)
>>>>>   @0)
>>>>
>>>> well, there could simply be a integer_not_zerop predicate.
>>>>
>>>>> * Testing
>>>>> Sorry to bring this up again, but I am still not clear what regex to
>>>>> write in scan-tree-dump.
>>>>>
>>>>> Suppose we have these two patterns in match.pd:
>>>>> /* (x + y) - y -> x */
>>>>> (match_and_simplify
>>>>>   (minus (plus @0 @1) @1)
>>>>>   @0)
>>>>>
>>>>> /* (x - y) + y -> x */
>>>>> (match_and_simplify
>>>>>   (plus (minus @0 @1) @1)
>>>>>   @0)
>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>> x_\\d\+\\(D\\)"
>>>>>
>>>>> I have following test-cases:
>>>>> int f1(int x, int y)
>>>>> {
>>>>>   int t1 = x + y;
>>>>>   return t1 - y;
>>>>> }
>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>> x_\\d\+\\(D\\)"
>>>>>
>>>>> int f2(int x, int y)
>>>>> {
>>>>>   int t1 = x - y;
>>>>>   return t1 + y;
>>>>> }
>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>> x_\\d\+\\(D\\)"
>>>>>
>>>>> both the test-cases have same regex.
>>>>> Tested in isolation f1 passes (first pattern if fired) and f2 fails
>>>>> (second pattern doesn't fire, it does after
>>>>> adding it's commutative variant, but that's irrelevant for this case).
>>>>>
>>>>> However when both test-cases are put in one file both the test cases
>>>>> PASS.
>>>>> I think that's because both of them have same regex: \[^\n\r\]*=
>>>>> x_\\d\+\\(D\\)
>>>>> so in f1 and f2's regex both match the dump for f1 function in
>>>>> forwprop dump file:
>>>>> "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)
>>>>>
>>>>> As a quick hack i rewrote f1 and f2 as:
>>>>> int f1(int x, int y)
>>>>> {
>>>>>   int t1 = x + y;
>>>>>   int f1_val = t1 - y;
>>>>>   return f1_val;
>>>>> }
>>>>> scan-tree-dump "gimple_match_and_simplified to f1_val_\\d\+ =
>>>>> x_\\d\+\\(D\\)"
>>>>>
>>>>> int f2(int x, int y)
>>>>> {
>>>>>   int t1 = x - y;
>>>>>   int f2_val = t1 + y;
>>>>>   return f2_val;
>>>>> }
>>>>> scan-tree-dump "gimple_match_and_simplified to f2_val_\\d\+ =
>>>>> x_\\d\+\\(D\\)"
>>>>> so both f1 and f2's scan-tree-dump have different regexes.
>>>>> and f2's regex does not match dump of f1's function.
>>>>> This matches all patterns in match-decision-tree.c however this is not
>>>>> ideal,
>>>>> since it does not check for matching dump across newlines.
>>>>> Could you suggest a better way ?
>>>>
>>>> There isn't a good better way (the others explicitely do _not_ match
>>>> against
>>>> a newline - see the ^ in the \[\] group).  Well, apart from splitting
>>>> the testcase
>>>> into multiple files of course.
>>>>
>>>> Richard.
>>>>
>>>>> Thanks and Regards,
>>>>> Prathamesh
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Prathamesh
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> * Code generation.
>>>>>>>>>> Code shall be generated by walking the decision tree.
>>>>>>>>>> The way it's constructed, there's no difference between code
>>>>>>>>>> generation
>>>>>>>>>> for "matching" and code generation for "transform". For
>>>>>>>>>> non-simplificaton
>>>>>>>>>> operands, "matching" code is generated, and for "simplification"
>>>>>>>>>> operands,
>>>>>>>>>> "transform" code is generated. The tree shall be walked twice,
>>>>>>>>>> once to generate GIMPLE code and second time for GENERIC.
>>>>>>>>>> For simplicity, I currently return false whenever there's a fail
>>>>>>>>>> in match,
>>>>>>>>>> instead of trying to match the next pattern.
>>>>>>>>>>
>>>>>>>>>> Code-gen for capture - same as capture::gen_gimple_match.
>>>>>>>>>>
>>>>>>>>>> Code-gen for predicate -  I haven't added support for predicate
>>>>>>>>>> in
>>>>>>>>>> decision tree yet, but I guess that would be the same as
>>>>>>>>>> predicate::gen_gimple_match
>>>>>>>>>>
>>>>>>>>>> Code-gen for expr.
>>>>>>>>>> There are two types of code-gen for expr.
>>>>>>>>>> The patch generates following code:
>>>>>>>>>> Type 1 - expr is child of root node.
>>>>>>>>>> the only code that gets generated is (in
>>>>>>>>>> decision_tree::gen_gimple):
>>>>>>>>>> if (code == <expr code>)
>>>>>>>>>> {
>>>>>>>>>> tree captures[4] = {}
>>>>>>>>>> <generated code for children>
>>>>>>>>>> }
>>>>>>>>>> This is similar to generating matching code in
>>>>>>>>>> write_nary_simplifiers.
>>>>>>>>>>
>>>>>>>>>> Type 2 - expr_1 is a child of another expr_0 node.
>>>>>>>>>> The code gets generated as follows (dt_expr::gen_gimple):
>>>>>>>>>> {
>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op);
>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr_1-code>)
>>>>>>>>>> {
>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>>>>>> {
>>>>>>>>>>   op = valueize (op);
>>>>>>>>>>   if (!op) return false;
>>>>>>>>>> }
>>>>>>>>>> <code-gen for children of expr_1 node>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Example:
>>>>>>>>>> (negate (negate @0))
>>>>>>>>>> S1
>>>>>>>>>>
>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>> S2
>>>>>>>>>>
>>>>>>>>>> decision tree:
>>>>>>>>>>
>>>>>>>>>>                 dummy/root
>>>>>>>>>>                         |
>>>>>>>>>>            NEGATE_EXPR
>>>>>>>>>>              /                  \
>>>>>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>>>>>            |                         |
>>>>>>>>>>          @0                     @0
>>>>>>>>>>            |                         |
>>>>>>>>>>          S1                      S2
>>>>>>>>>>
>>>>>>>>>> // code-gen for NEGATE_EXPR (child of root):
>>>>>>>>>> if (code == NEGATE_EXPR)
>>>>>>>>>> {
>>>>>>>>>> tree captures[4] = {};
>>>>>>>>>> // code gen for BIT_NOT_EXPR
>>>>>>>>>> {
>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>>>>>> {
>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>>>>>> {
>>>>>>>>>>   op = valueize (op);
>>>>>>>>>>   if (!op) return false;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> // code-gen for @0, child of BIT_NOT_EXPR
>>>>>>>>>> if (!captures[0])
>>>>>>>>>>   captures[0] = op;
>>>>>>>>>> else if (captures[0] != op)
>>>>>>>>>>   return false;
>>>>>>>>>>
>>>>>>>>>> // code-gen for S1, child of @0
>>>>>>>>>> < same as code generated by .gen_gimple_transform >
>>>>>>>>>> return true;
>>>>>>>>>> }
>>>>>>>>>> // code gen for inner NEGATE_EXPR
>>>>>>>>>> {
>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>>>>>> <rest similar to the BIT_NOT case>
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> The following gets duplicated with the patch:
>>>>>>>>>> {
>>>>>>>>>> gimple_def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>>>>>   return false;
>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr-code>)
>>>>>>>>>> ...
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Improving code-gen for expr:
>>>>>>>>>> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get duplicated,
>>>>>>>>>> while they could be factored out, similar to this:
>>>>>>>>>>
>>>>>>>>>> {
>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>>>>>   return false;
>>>>>>>>>> if (!is_gimple_assign (def_stmt))
>>>>>>>>>>   return false;
>>>>>>>>>> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>>>>>> {
>>>>>>>>>>   // code-gen for BIT_NOT_EXPR subtree
>>>>>>>>>> }
>>>>>>>>>> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>>>>>> {
>>>>>>>>>>   // code-gen for NEGATE_EXPR subtree
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) !=
>>>>>>>>>> SSA_NAME"
>>>>>>>>>> we could have that generated at the parent of expr's node rather
>>>>>>>>>> than
>>>>>>>>>> at expr. However that would be incorrect for cases where not all
>>>>>>>>>> children
>>>>>>>>>> of a node are expressions:
>>>>>>>>>>
>>>>>>>>>> Example:
>>>>>>>>>> // patterns only for illustration
>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>> (negate @0)
>>>>>>>>>>
>>>>>>>>>>                    root
>>>>>>>>>>                      |
>>>>>>>>>>                   negate
>>>>>>>>>>                     /       \
>>>>>>>>>>                 bit_not   @0
>>>>>>>>>>                     |
>>>>>>>>>>                   @0
>>>>>>>>>>
>>>>>>>>>> we cannot have the above code generated at negate,
>>>>>>>>>> since it's not applicable negate's 2nd child (@0).
>>>>>>>>>>
>>>>>>>>>> This can be done by grouping together children that are
>>>>>>>>>> expressions.
>>>>>>>>>> However the patch does not do that.
>>>>>>>>>>
>>>>>>>>>> * Code-gen for simplification operand
>>>>>>>>>> This involves code-gen for ifexpr and result of pattern.
>>>>>>>>>> Calls gen_gimple_transform of ifexpr and result
>>>>>>>>>> (dt_simplify::gen_gimple)
>>>>>>>>>> So this is really code-gen off AST
>>>>>>>>>
>>>>>>>>> Right (modulo replacing captures with their replacements).
>>>>>>>>>
>>>>>>>>>> * Matching multiple patterns
>>>>>>>>>> A pattern has following parts: match, ifexpr and result.
>>>>>>>>>> If pattern fails in match operand, I guess we can safely return
>>>>>>>>>> false ?
>>>>>>>>>> We "club" together patterns that have same match operand,
>>>>>>>>>> and use goto, if one of them fails in their (ifexpr/result) and
>>>>>>>>>> then goto the
>>>>>>>>>> (ifexpr/result) of the next operand.
>>>>>>>>>>
>>>>>>>>>> Example:
>>>>>>>>>> /* x & 0 -> 0 */
>>>>>>>>>> (match_and_simplify
>>>>>>>>>>   (bit_and @0 @1)
>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 ==
>>>>>>>>>> integer_zero_node))
>>>>>>>>>>   { integer_zero_node; })
>>>>>>>>>>
>>>>>>>>>> /* x & -1 -> x */
>>>>>>>>>> (match_and_simplify
>>>>>>>>>>   (bit_and @0 @1)
>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>>>>>>>>>      && (@1 == integer_minus_one_node)
>>>>>>>>>>   @0)
>>>>>>>>>>
>>>>>>>>>> For both patterns match is same.
>>>>>>>>>>
>>>>>>>>>> Decision Tree:
>>>>>>>>>>                 bit_and
>>>>>>>>>>                 /        \
>>>>>>>>>>              @0      @1
>>>>>>>>>>                            |
>>>>>>>>>>                         S1,  S2
>>>>>>>>>
>>>>>>>>> I think it's worth adding a diagnostic to genmach whenever exactly
>>>>>>>>> same matches appear.  But I'd say generally we'd support this
>>>>>>>>> by testing the ifexpr of the next pattern.
>>>>>>>>>
>>>>>>>>>> S1 represents <ifexpr, result> of pattern-1, and S2 represents
>>>>>>>>>> <ifexpr, result>
>>>>>>>>>> of pattern-2 respectively.
>>>>>>>>>> S1, S2 would be stored as children of @1 (the last operand of
>>>>>>>>>> n-ary operator),
>>>>>>>>>> in dt_operand::kids vector.
>>>>>>>>>>
>>>>>>>>>> The code would be generated as:
>>>>>>>>>>
>>>>>>>>>> matching code.
>>>>>>>>>> if (! pattern-1 ifexpr condition)
>>>>>>>>>>   goto simplify2;  // next pattern with the same "match" operand.
>>>>>>>>>> transform code for pattern 1
>>>>>>>>>>
>>>>>>>>>> simplify2:
>>>>>>>>>> if (! pattern-2 ifexpr condition)
>>>>>>>>>>   return false;  // last pattern
>>>>>>>>>> transform code for pattern 2.
>>>>>>>>>>
>>>>>>>>>> If matching itself fails, that is neither of the decisions get
>>>>>>>>>> matched,
>>>>>>>>>> I believe we can then return false as it cannot match any other
>>>>>>>>>> pattern ?
>>>>>>>>>>
>>>>>>>>>> * patterns needing hacks like cond_expr or GENERIC support
>>>>>>>>>> I haven't given thought to this yet. I suppose we can look to
>>>>>>>>>> handle
>>>>>>>>>> these after adding support for GENERIC. Instead of generating
>>>>>>>>>> GENERIC
>>>>>>>>>> matching in
>>>>>>>>>> gimple_match_and_simplify, could we then call
>>>>>>>>>> generic_match_and_simplify from
>>>>>>>>>> within gimple_match_and_simplify ?
>>>>>>>>>
>>>>>>>>> Yes (that's what's currently done btw).
>>>>>>>>>
>>>>>>>>>> * Tests
>>>>>>>>>> The patch transformed the following patterns:
>>>>>>>>>>
>>>>>>>>>> (match_and_simplify
>>>>>>>>>>   (negate (bit_not @0))
>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>>>>>>>   (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); }))
>>>>>>>>>>
>>>>>>>>>> (match_and_simplify
>>>>>>>>>>   (negate (negate @0))
>>>>>>>>>>   @0)
>>>>>>>>>>
>>>>>>>>>> I have attached test-case I tried it with (negate.c)
>>>>>>>>>>
>>>>>>>>>> * Conclusion
>>>>>>>>>> Does it sound reasonable ? I am going to be re-writing the
>>>>>>>>>> decision tree from scratch, but is the basic idea fine ? Or should
>>>>>>>>>> we
>>>>>>>>>> take a different approach ?
>>>>>>>>>>
>>>>>>>>>> Thanks and Regards,
>>>>>>>>>> Prathamesh
>

Re: [GSoC] decision tree first steps

Reply via email to