This implementation avoids two unneeded MOVs for each 64-bit component. One was done in the old shuffle to guard against src/dst overlap, which cannot happen here. The other, the MOV removed by this patch, was already being done in the shuffle.

Copy propagation wasn't able to remove them because shuffle destination values are defined with partial writes because they have stride == 2.
---
 src/intel/compiler/brw_fs_nir.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 780a9e228de..6abc7c0174d 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -2305,11 +2305,11 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
          }
 
          if (type_sz(dst.type) == 8) {
-            shuffle_32bit_load_result_to_64bit_data(
-               bld, tmp_dst, retype(tmp_dst, BRW_REGISTER_TYPE_F), num_components);
-
-            for (unsigned c = 0; c < num_components; c++)
-               bld.MOV(offset(dst, bld, iter * 2 + c), offset(tmp_dst, bld, c));
+            shuffle_from_32bit_read(bld,
+                                    offset(dst, bld, iter * 2),
+                                    retype(tmp_dst, BRW_REGISTER_TYPE_D),
+                                    0,
+                                    num_components);
          }
 
          if (num_iterations > 1) {
-- 
2.17.1
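
For readers unfamiliar with the shuffle discussed in the commit message, the following is a minimal standalone sketch of the idea, not the Mesa implementation. It assumes a 32-bit read returns, for each 64-bit component, the low dwords of all channels followed by the high dwords of all channels, and that the 64-bit destination wants the two halves of each channel side by side. The helper name shuffle_32bit_pairs_to_64bit and the constant kWidth are hypothetical and exist only for this illustration. Each inner loop writes every second dword of the destination, which is the stride == 2 partial write that the message says copy propagation could not see through.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SIMD width for the illustration (channels per register).
constexpr std::size_t kWidth = 8;

// Hypothetical helper, not part of Mesa. `src` holds 2 * num_components
// groups of kWidth dwords, assumed to be laid out as
// [comp0 low][comp0 high][comp1 low][comp1 high]...
// Returns num_components * kWidth 64-bit values, one per channel.
std::vector<uint64_t> shuffle_32bit_pairs_to_64bit(
    const std::vector<uint32_t> &src, std::size_t num_components)
{
    // Destination viewed as dwords: two consecutive dwords per channel.
    std::vector<uint32_t> dwords(2 * num_components * kWidth);

    for (std::size_t c = 0; c < num_components; c++) {
        for (std::size_t half = 0; half < 2; half++) {
            // One strided pass: only every second dword is written, so
            // each pass on its own is a partial write of the destination.
            for (std::size_t ch = 0; ch < kWidth; ch++) {
                dwords[(c * kWidth + ch) * 2 + half] =
                    src[(2 * c + half) * kWidth + ch];
            }
        }
    }

    // Combine each low/high dword pair explicitly, so the result does not
    // depend on host endianness.
    std::vector<uint64_t> out(num_components * kWidth);
    for (std::size_t i = 0; i < out.size(); i++) {
        out[i] = static_cast<uint64_t>(dwords[2 * i]) |
                 (static_cast<uint64_t>(dwords[2 * i + 1]) << 32);
    }
    return out;
}
```

In this model, writing the shuffled halves straight into their final location, rather than into a temporary that is then copied with a full-width move per component, corresponds to what the patch does by passing offset(dst, bld, iter * 2) as the shuffle destination.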