Caveat: I have not tested this on < Gen7 (platforms that send from MRF). My guess is that I've broken something on those, but that it shouldn't be a fundamental problem.
Extending this to MRF platforms seems like it'll take a little bit of extra work, from my 2 minutes of thinking about it.

This series is available in my tree:

  git://people.freedesktop.org/~mattst88/mesa tex-sources

  i965/fs: Add and use an fs_inst copy constructor.
  i965/fs: Disable fs_inst assignment operator.
  i965/fs: Combine fs_inst constructors using default
  i965/fs: ralloc fs_inst's fs_reg sources.
  i965/fs: Store the number of sources an fs_inst has.
  i965/fs: Add a function to resize fs_inst's sources
  i965/fs: Add fs_inst constructor that takes a list of
  i965/fs: Loop from 0 to inst->sources, not 0 to 3.

These change fs_inst to contain a pointer to a ralloc'd array containing the instruction's sources. I sent some of these a few months ago hoping to go ahead and get them reviewed, but didn't have much luck.

  i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD.
  i965/fs: Lower LOAD_PAYLOAD and clean up.
  i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().
  i965/fs: Apply cube map array fixup and restore the

These add and use the load_payload instruction. It takes a variable number of arguments and loads them into a large virtual GRF. This will let us avoid partial updates to large virtual registers when SSA comes. It also allows us to do some cool optimizations later in the series.

  i965/fs: Only consider real sources when comparing
  i965/fs: Emit load_payload instead of multiple MOVs for
  i965/fs: Support register coalescing on LOAD_PAYLOAD
  i965/fs: Perform CSE on load_payload instructions if
  i965/fs: Copy propagate from load_payload.

These add support for combining duplicate load_payload instructions, register coalescing their sources, and copy propagating through them.

With an unreleased title removed from shader-db, the results from before load_payload to this point are

  total instructions in shared programs: 1614283 -> 1614100 (-0.01%)
  instructions in affected programs:     3957 -> 3774 (-4.62%)
  GAINED:                                9
  LOST:                                  1

with a single 299 -> 300 instruction regression in the LOST program.

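For readers who have not opened the patches, the load_payload idea described above can be sketched roughly as follows. This is an illustration only, not the code from the series: Reg, Inst, Opcode and lower_load_payload() are hypothetical stand-ins for fs_reg, fs_inst and the real lowering pass, and std::vector stands in for the ralloc'd source array. It also assumes that lowering turns each source into one MOV to a consecutive offset of the destination, which is the simplest reading of "loads them into a large virtual GRF".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for fs_reg: a virtual GRF number plus a
// register offset inside that virtual GRF.
struct Reg {
    int vgrf;
    int offset;
};

enum class Opcode { MOV, LOAD_PAYLOAD };

// Hypothetical stand-in for fs_inst. The point is that the source
// list has a variable length instead of a fixed three entries.
struct Inst {
    Opcode op;
    Reg dst;
    std::vector<Reg> src;
};

// Assumed lowering: one MOV per source, each writing the next offset
// of the destination virtual GRF. Any other opcode passes through.
std::vector<Inst> lower_load_payload(const Inst &inst) {
    if (inst.op != Opcode::LOAD_PAYLOAD)
        return {inst};

    std::vector<Inst> out;
    for (std::size_t i = 0; i < inst.src.size(); i++) {
        Reg dst{inst.dst.vgrf, inst.dst.offset + static_cast<int>(i)};
        out.push_back(Inst{Opcode::MOV, dst, {inst.src[i]}});
    }
    return out;
}
```

Until it is lowered, the payload is written by a single instruction, so passes such as CSE or copy propagation can treat it as one full definition of the large register rather than a string of partial writes.
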
  i965/fs: Perform CSE on texture operations.

This allows us to CSE duplicate texture operations!

  i965/fs: Optimize SEL with the same sources into a MOV.

And a small optimization I noticed when I looked at the results of texture op CSE.
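The SEL optimization mentioned above amounts to a very small peephole. A minimal sketch, assuming a simplified instruction with exactly two sources: Operand, Instruction, Op and opt_sel_same_sources() are hypothetical names, not the ones used in the patch, and the sketch ignores predication, conditional modifiers and source modifiers, which a real pass would have to handle.

```cpp
#include <cassert>

// Hypothetical stand-in for a register operand.
struct Operand {
    int vgrf;
    int offset;
};

inline bool same_operand(const Operand &a, const Operand &b) {
    return a.vgrf == b.vgrf && a.offset == b.offset;
}

enum class Op { MOV, SEL, ADD };

// Hypothetical two-source instruction; for a MOV only src0 is used.
struct Instruction {
    Op op;
    Operand dst;
    Operand src0;
    Operand src1;
};

// A SEL picks src0 or src1. If both are the same register the choice
// cannot matter, so the instruction can become a plain MOV of src0.
// Returns true if the instruction was rewritten.
bool opt_sel_same_sources(Instruction &inst) {
    if (inst.op != Op::SEL || !same_operand(inst.src0, inst.src1))
        return false;
    inst.op = Op::MOV;
    return true;
}
```

Returning a flag follows the usual convention for optimization passes that report whether they made progress, so a caller can rerun other passes when something changed.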