On Wed, Mar 18, 2015 at 1:56 AM, Matt Turner <matts...@gmail.com> wrote: > Thanks. Looked through stats and at some of the regressions. > > Some of the areas I noticed we were doing worse: > > We generate two CMPs for discard_if; only one without NIR. I think you > had an idea about solving this. > > SEL peephole interference -- we knew about this in general, but I > found an interesting failure mode: > > NIR generates > > (+f0) if(8) JIP: 4 UIP: 5 > else(8) JIP: 3 > mov(8) g124<1>F -g124<8,8,1>F > endif(8) JIP: 6 > > We should of course improve dead-control-flow-elimination to remove > the else and set the if's predicate_inverse, but non-NIR generates > > (+f0) sel(8) g124<1>F g25<8,8,1>F -g25<8,8,1>F > > which is probably better even than a predicated MOV since it's not a > partial write and subject to optimization restrictions. The assembly > matches the NIR exactly, so I'm assuming NIR is removing a > self-assignment in the if-then block, preventing it from turning it > into a conditional select.
I think this is because the sel peephole (along with all the other optimizations) in NIR only works before we lower to source mods, where this code probably looked something like: ssa_1 = ...; if ... { } else { ssa_2 = fneg ssa_1; } ssa_3 = phi ssa_1, ssa_2; So we can probably fix it just by adding fneg/ineg to block_check_for_allowed_instrs(). > > Also, since the fs's SEL peephole only recognizes MOVs to the same > destination in order, it doesn't handle tropico-5/387 properly, which > already improved from 110 to 79 instructions. > > I also noticed an interesting pattern in defense-grid/537 (and some > others, I think). NIR generates > > mul(8) g4<1>F g2<8,8,1>F g3<8,8,1>F > mul(8) g6<1>F g5<8,8,1>F g4<8,8,1>F > mad(8) g127<1>F g6<4,4,1>F g3<4,4,1>F g2<4,4,1>F > > whereas without NIR we get: > > mul(8) g5<1>F g2<8,8,1>F g4<8,8,1>F > mad(8) g127<1>F g5<4,4,1>F g3<4,4,1>F g5<4,4,1>F > > Again, the assembly matches the NIR, so NIR is doing something bad. > > The changes to sun-temple/209 (and a lot of other sun temple shaders) > look wrong. I'm not sure if the translation from NIR is doing > something bad or what, but we should have 8 texture fetches and we > only end up with 2. > > > Some highlights -- > > Some shaders are improved because NIR recognizes exp2(log(x) * y) as > pow(x, y) when the GLSL optimizations couldn't, because its dead code > elimination is much too stupid to allow tree grafting to put the IR in > the form we expect. This accounts for a low of the high percentage > gains. > > It cuts some Kerbal Space Program shaders in half (~1000 -> ~500 > instructions) and removes ~100 spills and ~100 fills from each. Not > sure how it actually accomplished this. Too hard to read the assembly > with all of the spills. > > Code generated by tropico-5/252 is really stupid. We MOVs both before > and in an if statement that do the same thing. NIR cleans that up > easily as a side-effect of SSA. > > In strike-suit-zero/156 we emit more MADs that we couldn't recognize > from the GLSL IR, since again, its DCE is terrible and we can't > combine trees. > > In the-binding-of-isaac-rebirth/18, it's able to CSE some duplicate > rcp expressions in different basic blocks. I'm not sure why the > backend isn't doing this. > > > At this point I've looked through all of the shaders helped by >15% or more. > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev