On Thu, Oct 17, 2013 at 1:53 PM, Chia-I Wu <olva...@gmail.com> wrote:
> Hi Eric,
>
> On Sat, Oct 12, 2013 at 3:18 AM, Eric Anholt <e...@anholt.net> wrote:
>> Chia-I Wu <olva...@gmail.com> writes:
>>
>>> Hi Eric,
>>> The frame rate of Unigine Tropics (with low shader quality) dropped
>>> from 40.8 to 23.5 after this change.
>>
>> Thanks for the note. I see the regression as well, and I see a shader
>> that's started spilling. It looks like we can drop the regs_written <=
>> 1 check on gen7+'s pre-regalloc scheduling to fix the problem (the MRF
>> setup thing is no longer an issue, and its presence is now making us
>> pessimize instead of optimize in general in the pre-regalloc
>> scheduling). I'll want to run a few more tests to make sure that this
>> doesn't regress something else.
> Are you looking at this issue? The change you suggested does not
> avoid spilling.
>
> I think the problem can be demonstrated with this snippet:
>
> vec4 val = vec4(0.0);
>
> vec4 tmp_001 = texture(tex, texcoord * 0.01);
> val += tmp_001;
> vec4 tmp_002 = texture(tex, texcoord * 0.02);
> val += tmp_002;
> vec4 tmp_003 = texture(tex, texcoord * 0.03);
> val += tmp_003;
> ...
> vec4 tmp_099 = texture(tex, texcoord * 0.99);
> val += tmp_099;
> vec4 tmp_100 = texture(tex, texcoord * 1.00);
> val += tmp_100;
>
> gl_FragColor = val;
>
> Before the change, the scheduler saw a dependency between any two
> texture() calls (because of the use of MRF). It was inclined to keep
> the accumulation of tmp_xxx between texture() calls even though the
> accumulation also had a dependency on the last texture() call.
>
> After the change, the dependencies between texture()s are gone. The
> scheduler sees a chance to move all the high latency texture()
> together and generate something like this:

Ah, I started looking at post-reg-alloc scheduling partway through...
My reasoning was wrong. The correct one is:
It worked before this change because there were dependencies between
texture() calls, so those calls had to be scheduled in program order.
Accumulations were scheduled as soon as they became available, and
were thus intermixed with the texture() calls.

It does not work now because the dependencies between texture() calls
are gone. Since the scheduler schedules in FILO order, the texture()
calls are scheduled in reverse order. The first accumulation needs
tmp_001, whose texture() call now comes last, so accumulations become
available only after all texture() calls have been scheduled. This
remains true with the suggested fix (the fix is still desirable, but
it is only a partial fix).

The problem can be demonstrated with the attached fragment shader.

> vec4 tmp_003 = texture(tex, texcoord * 0.03);
> ...
> vec4 tmp_099 = texture(tex, texcoord * 0.99);
> vec4 tmp_100 = texture(tex, texcoord * 1.00);
>
> val += tmp_001;
> val += tmp_002;
> val += tmp_003;
> ...
> val += tmp_099;
> val += tmp_100;
>
> Since there are not enough registers to hold all tmp_xxx, the register
> allocation starts spilling.
>
>>
>> This shader is also in bad shape now that we don't have the redundant
>> MRF move optimization, and we need to look into grf_size > 1 CSE. That
>> would probably also have avoided the problem on this shader, though the
>> scheduling problem is more general than this one shader.
>
>
>
> --
> o...@lunarg.com
> val = texture(tex, texcoord * 1.0);

--
o...@lunarg.com
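A minimal sketch of the register-pressure argument above, written in Python rather than GLSL so that it can be run directly. This is not Mesa's scheduler: peak_live_temps, interleaved, and textures_first_reversed are hypothetical helpers that only count how many tmp_xxx values are live at once under the two instruction orders described in this message. The count ignores val, texcoord, and how many hardware registers a vec4 actually occupies.

```python
def peak_live_temps(order):
    """Return the peak number of texture results that are live at once.

    `order` is a list of ("tex", i) and ("add", i) tuples (an illustrative
    encoding, not anything from Mesa). ("tex", i) defines tmp_i and
    ("add", i) is the only, and therefore last, use of tmp_i.
    """
    live = set()
    peak = 0
    for op, i in order:
        if op == "tex":
            live.add(i)
            peak = max(peak, len(live))
        else:
            # The accumulation consumes tmp_i; raises KeyError if the
            # order is invalid (an add placed before its texture fetch).
            live.remove(i)
    return peak


def interleaved(n):
    """Old behaviour: each accumulation directly follows its texture()."""
    order = []
    for i in range(1, n + 1):
        order.append(("tex", i))
        order.append(("add", i))
    return order


def textures_first_reversed(n):
    """New behaviour: all texture() calls first, in reverse order, then the
    accumulations in program order (they form a chain through val)."""
    fetches = [("tex", i) for i in range(n, 0, -1)]
    adds = [("add", i) for i in range(1, n + 1)]
    return fetches + adds


if __name__ == "__main__":
    n = 100
    print("interleaved:   ", peak_live_temps(interleaved(n)))
    print("textures first:", peak_live_temps(textures_first_reversed(n)))
```

With 100 fetches, the interleaved order never holds more than one tmp_xxx at a time, while the textures-first order holds all 100 before the first accumulation can run, which is the situation that forces spilling.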
465.frag
Description: Binary data
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev