This patch series is kind of a potluck of dword multiplication patches. It started off as an attempt to make sure we're adhering to the rigorous requirements on Cherryview, but turned into making sure we could properly do SIMD16 multiplies across the board.
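
For readers who want the arithmetic spelled out, here is a minimal sketch in plain C of the two ideas involved. It is not the i965 code. It assumes a multiplier that consumes only 16 bits of its second operand per operation, so a full 32x32 product has to be assembled from two partial products, and it models a SIMD16 multiply as two SIMD8 halves. All function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define SIMD8_WIDTH 8

/* Hypothetical model of a multiplier that only reads 16 bits of src1. */
static uint64_t mul_32x16(uint32_t a, uint16_t b)
{
    return (uint64_t)a * b;
}

/* Full 32x32 -> 64-bit product built from two 32x16 partial products. */
static uint64_t mul_32x32(uint32_t a, uint32_t b)
{
    uint64_t lo = mul_32x16(a, (uint16_t)(b & 0xffff));
    uint64_t hi = mul_32x16(a, (uint16_t)(b >> 16));

    /* The high partial product is weighted by 2^16. */
    return lo + (hi << 16);
}

/* Low 32 bits of the product (what a plain integer multiply returns). */
static uint32_t mul_low(uint32_t a, uint32_t b)
{
    return (uint32_t)mul_32x32(a, b);
}

/* High 32 bits of the product (what an imul_high style operation returns). */
static uint32_t mul_high(uint32_t a, uint32_t b)
{
    return (uint32_t)(mul_32x32(a, b) >> 32);
}

/* One 8-lane multiply over dword operands. */
static void mul_dw_simd8(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    for (int i = 0; i < SIMD8_WIDTH; i++)
        dst[i] = mul_low(a[i], b[i]);
}

/* A 16-lane multiply expressed as two 8-lane halves. */
static void mul_dw_simd16(uint32_t *dst, const uint32_t *a, const uint32_t *b)
{
    mul_dw_simd8(dst, a, b);
    mul_dw_simd8(dst + SIMD8_WIDTH, a + SIMD8_WIDTH, b + SIMD8_WIDTH);
}
```

The point of the split in mul_dw_simd16 is that only the multiply drops to 8 lanes; the rest of a program could stay 16 wide.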

The series primarily accomplishes 3 things:
1.  Enable ARB_gpu_shader5 on gen8+
1b. Enable the new QW type for full 32x32 multiplies on GEN8
2.  Enable SIMD16 programs with dword multiplies (falling back as needed to
    SIMD8 for just the mul operations)
3.  Provide an internal dword multiply interface
3b. Add a bunch of assertions for the various cases.

I don't have performance data, and I don't expect measurable improvements.
It is possible, though, that #1b and #2 actually do improve things.

Oddly, I was expecting to see some tests get fixed by the patch series, but
other than enabling 1242 new passing tests (and 20 new failures that match
other generations), I see no change. I am, however, optimistic that the series
will prevent future failures.

(BDW does add 1 new pass, but I am not sure if it's real):
spec/ARB_texture_multisample/texelFetch fs sampler2DMSArray 4 98x1x9-98x129x9:fail pass

Ben Widawsky (19):
  i965/vec4: Correct MUL destination hazard
  i965: Fix assertion in brw_reg_type_letters
  i965: Add QWORD sizes to type_sz macro
  i965/fs: Disallow SIMD16 multiplies on GEN8
  i965/fs: Implement SIMD16 64-bit integer multiplies on Gen 8.
  i965/vec4: implement imul_high for gen8
  i965: Extract is_dword test
  i965/vec4: Address mul/mach constraints
  i965: Add assertions for MACH instruction
  i965: Extract scalar region checking logic
  i965/eu: Add new MUL assertions
  i965: Disallow saturation for MACH operations
  i965: Enable ARB_GPU_SHADER5 for gen8+
  i965/fs: Extract dword multiplies
  i965/fs: Add W*W mul shortcut
  i965/vec4: Extract dword multiplies
  i965: Add some dw mul operands fixes
  i965/fs: Add users of emit_mul_dw
  i965/vec4: Add users of emit_mul_dw

Matt Turner (2):
  i965/fs: Implement SIMD16 integer multiplies on Gen 7.
  i965/fs: Implement SIMD16 64-bit integer multiplies on Gen 7.

 src/mesa/drivers/dri/i965/brw_defines.h            |   3 +
 src/mesa/drivers/dri/i965/brw_eu.c                 |   2 +-
 src/mesa/drivers/dri/i965/brw_eu.h                 |  21 +++
 src/mesa/drivers/dri/i965/brw_eu_emit.c            | 148 +++++++++++++++++----
 src/mesa/drivers/dri/i965/brw_fs.cpp               | 125 +++++++++++++++++
 src/mesa/drivers/dri/i965/brw_fs.h                 |   5 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp     |  37 ++----
 .../dri/i965/brw_fs_saturate_propagation.cpp       |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp       |  72 ++--------
 src/mesa/drivers/dri/i965/brw_reg.h                |  22 +++
 src/mesa/drivers/dri/i965/brw_shader.cpp           |  12 +-
 src/mesa/drivers/dri/i965/brw_shader.h             |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp             | 103 ++++++++++++--
 src/mesa/drivers/dri/i965/brw_vec4.h               |   6 +
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |   3 +-
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  12 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp     |  38 ++----
 src/mesa/drivers/dri/i965/intel_extensions.c       |   5 +-
 18 files changed, 457 insertions(+), 161 deletions(-)

-- 
2.2.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev