On Wed, Mar 18, 2015 at 7:22 AM, Connor Abbott <cwabbo...@gmail.com> wrote: > On Wed, Mar 18, 2015 at 1:56 AM, Matt Turner <matts...@gmail.com> wrote: >> Thanks. Looked through stats and at some of the regressions. >> >> Some of the areas I noticed we were doing worse: >> >> We generate two CMPs for discard_if; only one without NIR. I think you >> had an idea about solving this. >> >> SEL peephole interference -- we knew about this in general, but I >> found an interesting failure mode: >> >> NIR generates >> >> (+f0) if(8) JIP: 4 UIP: 5 >> else(8) JIP: 3 >> mov(8) g124<1>F -g124<8,8,1>F >> endif(8) JIP: 6 >> >> We should of course improve dead-control-flow-elimination to remove >> the else and set the if's predicate_inverse, but non-NIR generates >> >> (+f0) sel(8) g124<1>F g25<8,8,1>F -g25<8,8,1>F >> >> which is probably better even than a predicated MOV since it's not a >> partial write and subject to optimization restrictions. The assembly >> matches the NIR exactly, so I'm assuming NIR is removing a >> self-assignment in the if-then block, preventing it from turning it >> into a conditional select. > > I think this is because the sel peephole (along with all the other > optimizations) in NIR only works before we lower to source mods, where > this code probably looked something like: > > ssa_1 = ...; > if ... { > } else { > ssa_2 = fneg ssa_1; > } > ssa_3 = phi ssa_1, ssa_2;
More specifically, we probably got b = foo ? a : -a from the GLSL source which NIR turns into if (foo) { ssa_1 = a; } else { ssa_2 = -a; } ssa_3 = phi(ssa_1, ssa_2) which gets copy-propagated to what Connor mentioned above. > So we can probably fix it just by adding fneg/ineg to > block_check_for_allowed_instrs(). That was my first thought but we're starting to get into the world of backend-specific knowledge there. Maybe this is where we just need to pass in Eric's backend info struct and predicate doing so on whether or not neg is free? But I guess the only compiler we care about at the moment where that isn't true is vc4 which can't do control-flow right now anyway. Thoughts? We may also want to simply make the backend's peephole smarter. >> >> Also, since the fs's SEL peephole only recognizes MOVs to the same >> destination in order, it doesn't handle tropico-5/387 properly, which >> already improved from 110 to 79 instructions. >> >> I also noticed an interesting pattern in defense-grid/537 (and some >> others, I think). NIR generates >> >> mul(8) g4<1>F g2<8,8,1>F g3<8,8,1>F >> mul(8) g6<1>F g5<8,8,1>F g4<8,8,1>F >> mad(8) g127<1>F g6<4,4,1>F g3<4,4,1>F g2<4,4,1>F >> >> whereas without NIR we get: >> >> mul(8) g5<1>F g2<8,8,1>F g4<8,8,1>F >> mad(8) g127<1>F g5<4,4,1>F g3<4,4,1>F g5<4,4,1>F >> >> Again, the assembly matches the NIR, so NIR is doing something bad. >> >> The changes to sun-temple/209 (and a lot of other sun temple shaders) >> look wrong. I'm not sure if the translation from NIR is doing >> something bad or what, but we should have 8 texture fetches and we >> only end up with 2. >> >> >> Some highlights -- >> >> Some shaders are improved because NIR recognizes exp2(log(x) * y) as >> pow(x, y) when the GLSL optimizations couldn't, because its dead code >> elimination is much too stupid to allow tree grafting to put the IR in >> the form we expect. This accounts for a low of the high percentage >> gains. >> >> It cuts some Kerbal Space Program shaders in half (~1000 -> ~500 >> instructions) and removes ~100 spills and ~100 fills from each. Not >> sure how it actually accomplished this. Too hard to read the assembly >> with all of the spills. >> >> Code generated by tropico-5/252 is really stupid. We MOVs both before >> and in an if statement that do the same thing. NIR cleans that up >> easily as a side-effect of SSA. >> >> In strike-suit-zero/156 we emit more MADs that we couldn't recognize >> from the GLSL IR, since again, its DCE is terrible and we can't >> combine trees. >> >> In the-binding-of-isaac-rebirth/18, it's able to CSE some duplicate >> rcp expressions in different basic blocks. I'm not sure why the >> backend isn't doing this. >> >> >> At this point I've looked through all of the shaders helped by >15% or more. >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev