We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can.
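
As a rough illustration of why CSE eligibility matters here, the following is a toy sketch and not Mesa's actual CSE pass. The names `Inst`, `toy_cse`, `is_cse_candidate`, and `two_loads` are hypothetical. The rule that only LOAD_PAYLOAD is a candidate is an assumption taken from the statement above. The model also assumes each destination is written exactly once.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy IR for illustration only; this is NOT Mesa's fs_inst.
enum class Op { MOV, LOAD_PAYLOAD, URB_READ };

struct Inst {
   Op op;
   std::string dst;
   std::vector<std::string> srcs;
};

// Assumption mirroring the commit message: the pass is willing to
// merge duplicate LOAD_PAYLOADs but never touches MOVs.
static bool is_cse_candidate(Op op)
{
   return op == Op::LOAD_PAYLOAD;
}

// Drop any candidate instruction that recomputes an earlier
// (opcode, sources) pair and redirect later readers to the first copy.
// Assumes every destination is written exactly once.
static std::vector<Inst> toy_cse(const std::vector<Inst> &in)
{
   std::map<std::pair<Op, std::vector<std::string>>, std::string> seen;
   std::map<std::string, std::string> rename;
   std::vector<Inst> out;

   for (Inst inst : in) {
      // Apply renames produced by earlier eliminations.
      for (std::string &s : inst.srcs) {
         auto r = rename.find(s);
         if (r != rename.end())
            s = r->second;
      }

      if (is_cse_candidate(inst.op)) {
         auto key = std::make_pair(inst.op, inst.srcs);
         auto hit = seen.find(key);
         if (hit != seen.end()) {
            // Duplicate header: reuse the earlier result instead.
            rename[inst.dst] = hit->second;
            continue;
         }
         seen.emplace(key, inst.dst);
      }
      out.push_back(inst);
   }
   return out;
}

// Two input loads, each building its own header from g0.0.
static std::vector<Inst> two_loads(Op header_op)
{
   return {
      {header_op, "hdr0", {"g0.0"}},
      {Op::URB_READ, "in0", {"hdr0"}},
      {header_op, "hdr1", {"g0.0"}},
      {Op::URB_READ, "in1", {"hdr1"}},
   };
}
```

In this model, `toy_cse(two_loads(Op::MOV))` keeps all four instructions, while `toy_cse(two_loads(Op::LOAD_PAYLOAD))` keeps three and the second read consumes `hdr0`.
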
All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions.

shader-db statistics for files containing tessellation shaders:

total instructions in shared programs: 186923 -> 184358 (-1.37%)
instructions in affected programs: 30536 -> 27971 (-8.40%)
helped: 226
HURT: 0

total cycles in shared programs: 1009850 -> 1005356 (-0.45%)
cycles in affected programs: 168206 -> 163712 (-2.67%)
helped: 226
HURT: 0

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 827dbee..788315f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1850,8 +1850,11 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld,
          fs_inst *inst;
          if (indirect_offset.file == BAD_FILE) {
             /* Replicate the patch handle to all enabled channels */
+            const fs_reg srcs[] = {
+               retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD)
+            };
             fs_reg patch_handle = bld.vgrf(BRW_REGISTER_TYPE_UD, 1);
-            bld.MOV(patch_handle, retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD));
+            bld.LOAD_PAYLOAD(patch_handle, srcs, ARRAY_SIZE(srcs), 0);
             inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dest, patch_handle);
             inst->mlen = 1;
-- 
2.6.4