I haven't thought about whether it's algebraically correct, but otherwise your pattern looks fine to me.
If you haven't noticed already, I added some commented out code to nir_replace_instr() that will print out each pattern that's matched. The first thing I'd do is move that up to the beginning of the function, so that it prints potential matches instead of actual matches. If your pattern shows up as a potential match, then it's a problem with match_expression() and not the automaton, and at that point you can start setting breakpoints and stepping through it. If it's not a potential match because the automaton filtered it out, then debugging is currently a little harder. You'll have to add some debugging code to ${pass_name}_pre_block() that prints out the assigned state for each ALU instruction. The state is an integer which is conceptually an index into TreeAutomaton.states for the constructed TreeAutomaton. So if an instruction has state n, then automaton.states[n] should be a set of potential partial matches for that instruction. Note that variables like a, b, c etc. are converted into __wildcard while constants are all converted into __constant. So for example, ssa_2 should have a set { __const, __wildcard } as it can be matched as a variable or as a constant (we actually explicitly construct this set in TreeAutomaton._build_table). ssa_83 should have a set { __wildcard, (neg __wildcard) } since it can be matched either as a variable or as something like (neg a). (yes, this is very similar to the subset construction for getting a DFA from an NFA...). Unless there's a bug, each of these "match sets" should contain the appropriate subset of your pattern until ssa_97 which should have the full pattern as one of the entries in the set. Let us know the details if that's not the case. I think that we could definitely do better when it comes to debugging why the automaton didn't match something. We could emit the automaton's state list in C, and then have a debugging option to print the match set for each instruction so you'd know where something went awry. I didn't do that earlier since I didn't have a need for it while bringing up the automaton, but we could add it if it helps. That being said, hopefully you won't need it this time :) Best, Connor On Sat, Jun 22, 2019 at 2:26 AM Ian Romanick <i...@freedesktop.org> wrote: > > I have encountered what I believe to be a bug in nir_algebraic. Since > the rewrite to use automata, I'm not sure how to begin debugging it. > I'm looking for some suggestions... even if the suggestion is, "Fix your > patterns." > > I have added a pattern like: > > (('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)), > ('fadd', 1.0, ('fneg', a))), > ('fmul', ('flrp', a, 1.0, a), b)), > ('flrp', 1.0, b, a), '!options->lower_flrp32'), > > While using NIR_PRINT=1, I see this in my instruction stream: > > vec1 32 ssa_2 = load_const (0x3f800000 /* 1.000000 */) > ... > vec1 32 ssa_196 = intrinsic load_uniform (ssa_195) (68, 4, 160) > vec1 32 ssa_83 = fneg ssa_196 > vec1 32 ssa_84 = fadd ssa_83, ssa_2 > vec1 32 ssa_85 = fmul ssa_84, ssa_84 > ... > vec1 32 ssa_95 = flrp ssa_196, ssa_2, ssa_196 > vec1 32 ssa_96 = fmul ssa_78, ssa_95 > vec1 32 ssa_97 = fadd ssa_96, ssa_85 > > But nir_opt_algebraic does not make any progress. It sure looks like it > should trigger with a = ssa_196 and b = ssa_78. > > However, progress is made if I change the pattern to > > (('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)), > c), > ('fmul', ('flrp', a, 1.0, a), b)), > ('flrp', 1.0, b, a), '!options->lower_flrp32'), > > ssa_85 is definitely ('fmul', ssa_84, ssa_84), and ssa_84 is definitely > ('fadd', 1.0, ('fneg', ssa_196))... both times. :) _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev