This is a v2 of the series that I sent out a month or two ago to fix up the LOAD_PAYLOAD instruction in the i965 FS backend compiler. This new version incorporates comments from a variety of people.

The last patch in the series is one that I think at least three of us have independently written to enable copy propagation from the ATTR register file. This helps the old fs_visitor code some, but it helps NIR a *lot*. The end result is that NIR is now on par with the old fs_visitor code from a shader-db perspective.

Shader-db results across all but the last patch:

total instructions in shared programs: 7440514 -> 7440378 (-0.00%)
instructions in affected programs:     11954 -> 11818 (-1.14%)
helped:                                68
HURT:                                  0
GAINED:                                3
LOST:                                  0

All 68 helped shaders are in unigen heaven ultra 4xmsaa. It appears that it helps us to copy-propagate uses of omask. The omask is weird because it's a single-register header that's a 16-wide UW type. The previous guessing code was actually getting it wrong and copying it with an 8-wide UD move that did not have force_writemask_all set. Now it is copied with force_writemask_all set correctly (because it's a header source), so it seems that copy propagation can now come through and clean it up. (At least that's my theory; I didn't look at it that hard.)

Francisco Jerez (1):
  i965/fs: Fix passing an immediate to half().

Jason Ekstrand (13):
  i965/fs: Make half(fs_reg, unsigned) handle register files more explicitly
  i965/fs_cse: Factor out code to create copy instructions
  i965: Change header_present to header_size in backend_instruction
  i965/fs_inst: Add an is_copy_payload helper
  i965/fs: Make emit_single_fb_write take an explicit exec_size
  i965/fs: Make LOAD_PAYLOAD take a header size
  i965/fs: Rework fs_visitor::LOAD_PAYLOAD
  SQUASH: i965/fs: Make destinations of load_payload have the appropreate width
  SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
  SQUASH: i965/fs_cse: Support the new-style LOAD_PAYLOAD
  SQUASH: i965/fs_inst::is_copy_payload: Support the new-style LOAD_PAYLOAD
  SQUASH: i965/fs: Simplify setup_color_payload
  i965/fs_inst: Get rid of the effective_width field

Kenneth Graunke (1):
  i965/fs: Allow copy propagation on ATTR file registers.
 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp    |   4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp               | 270 +++++++++++----------
 src/mesa/drivers/dri/i965/brw_fs.h                 |   9 +-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  12 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp           | 117 ++++-----
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp     |  20 +-
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  17 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       | 237 +++++++-----------
 src/mesa/drivers/dri/i965/brw_ir_fs.h              |  30 ++-
 src/mesa/drivers/dri/i965/brw_shader.h             |   4 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp             |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  10 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp     |  16 +-
 13 files changed, 338 insertions(+), 410 deletions(-)

-- 
2.3.6