When checking the VS output of the glsl-mat-from-int-ctor-03 piglit test, I got the following (partial) code:

  mov(8)          g19<1>.xyzF     g6<4,4,1>.xyzzD { align16 1Q };
  dp4(8)          g115<1>.wF      g4<4,4,1>F      g2.4<0,4,1>F { align16 NoDDChk 1Q };
  cmp.nz.f0(8)    null<1>F        g11<4,4,1>.xyzzF g19<4,4,1>.xyzzF { align16 1Q switch };
  cmp.nz.f0(8)    null<1>D        g7<4,4,1>D      0D { align16 1Q switch };
  (+f0.any4h) mov(8) g21<1>.xUD   0xffffffffUD { align16 1Q };

Clearly the first cmp can be removed, because its result is overwritten by the second one.

Investigating why this is not happening, I saw that in brw_vec4, after running opt_vector_float(), we run the remaining optimizations just once, instead of in a loop until no more progress is made. I am not sure whether there is a reason to keep them separate from the previous loop.

Tracking back through the history, it seems that originally they were not in the loop because opt_vector_float() wasn't written as an optimization, and no optimizations were run after it. Later someone suggested running some optimizations if opt_vector_float() succeeded, which led to adding a conditional. In the final commits opt_vector_float() was converted into a true optimization, but it was still kept out of the loop. At this point I'm not sure whether there was a good reason for that (I found no explanation) or whether it was just to minimize code changes. Maybe someone can shed some light here.

So I merged those optimizations into the previous loop (second commit). But this made some piglit tests never terminate. Looking into this, I saw that some optimizations were being reverted by others (specifically, CSE was being reverted by copy propagation), so the loop kept reporting progress. So I added a minor change to CSE (first commit) that prevents applying it when the common expression is just an immediate (it saves nothing, and adds a new instruction).

The free shaders in shader-db are neither improved nor hurt. But testing against the non-free shaders we get a small improvement:

total instructions in shared programs: 6819828 -> 6819468 (-0.01%)
instructions in affected programs:     30516 -> 30156 (-1.18%)
total loops in shared programs:        1971 -> 1971 (0.00%)
helped:                                154
HURT:                                  0
GAINED:                                0
LOST:                                  0

Juan A. Suarez Romero (2):
  i965: Do not apply CSE opt to MOV immediate
  i965: run brw_vec4 optimizations in loop

 src/mesa/drivers/dri/i965/brw_vec4.cpp     | 10 +++-------
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 13 ++++++++++++-
 2 files changed, 15 insertions(+), 8 deletions(-)

-- 
2.5.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
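
To illustrate the non-termination described in this cover letter, here is a minimal, self-contained sketch of how a CSE pass and a copy-propagation pass can undo each other inside a run-until-no-progress loop, and how skipping immediates lets the loop settle. This is a toy model, not the brw_vec4 code: the Mov struct, FIRST_TEMP, the pass functions and the max_iters guard are all hypothetical names invented for the example, and the toy CSE only ever considers MOVs of immediates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR (hypothetical): every instruction is "MOV dst, src". Registers below
// FIRST_TEMP are live outputs; registers at or above it are temporaries.
// Each register is assumed to be written at most once.
struct Mov { int dst; bool src_is_imm; int src; };
using Program = std::vector<Mov>;
constexpr int FIRST_TEMP = 100;

// Toy CSE: two MOVs of the same immediate are rewritten to share a temporary,
// which costs one extra MOV. With skip_imm set, immediates are never treated
// as common expressions, so this toy pass does nothing.
bool cse(Program &p, bool skip_imm) {
    if (skip_imm) return false;
    for (std::size_t i = 0; i < p.size(); ++i) {
        for (std::size_t j = i + 1; j < p.size(); ++j) {
            if (!p[i].src_is_imm || !p[j].src_is_imm || p[i].src != p[j].src)
                continue;
            int tmp = FIRST_TEMP;                  // pick an unused temporary
            for (const Mov &m : p) if (m.dst >= tmp) tmp = m.dst + 1;
            const int dst_i = p[i].dst;
            p[i].dst = tmp;                        // MOV tmp, imm
            p[j] = Mov{p[j].dst, false, tmp};      // MOV dst_j, tmp
            p.insert(p.begin() + static_cast<std::ptrdiff_t>(i) + 1,
                     Mov{dst_i, false, tmp});      // extra MOV dst_i, tmp
            return true;
        }
    }
    return false;
}

// Toy copy propagation: a MOV from a register that holds a known immediate
// is rewritten to load that immediate directly.
bool copy_propagate(Program &p) {
    bool progress = false;
    for (std::size_t j = 0; j < p.size(); ++j) {
        if (p[j].src_is_imm) continue;
        for (std::size_t i = 0; i < j; ++i) {
            if (p[i].dst == p[j].src && p[i].src_is_imm) {
                p[j] = Mov{p[j].dst, true, p[i].src};
                progress = true;
                break;
            }
        }
    }
    return progress;
}

// Toy dead-code elimination: drop one temporary that nobody reads.
bool dead_code_eliminate(Program &p) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        if (p[i].dst < FIRST_TEMP) continue;
        bool read = false;
        for (const Mov &m : p)
            if (!m.src_is_imm && m.src == p[i].dst) read = true;
        if (!read) {
            p.erase(p.begin() + static_cast<std::ptrdiff_t>(i));
            return true;
        }
    }
    return false;
}

// Run all passes until none reports progress. max_iters is only a safety net
// for the demonstration; it returns the number of iterations executed.
int optimize(Program &p, bool skip_imm, int max_iters) {
    assert(max_iters > 0);
    int iters = 0;
    bool progress;
    do {
        progress = false;
        progress = cse(p, skip_imm) || progress;
        progress = copy_propagate(p) || progress;
        progress = dead_code_eliminate(p) || progress;
        ++iters;
    } while (progress && iters < max_iters);
    return iters;
}
```

Starting from two MOVs of the same immediate, optimize() with skip_imm set to false only stops because of the max_iters guard: every iteration the toy CSE introduces a temporary, copy propagation folds the immediate back in, and dead-code elimination removes the temporary, so the program returns to its initial state while each pass reports progress. With skip_imm set to true the loop stops after a single iteration.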