Also... it looks like only a few patches in this series got reviewed. Did the whole thing actually land?
On 11/28/2017 05:54 AM, Andres Gomez wrote: > Matt, this looks like a good candidate to nominate for inclusion in > the 17.2 stable queue. > > What do you think? > > On Thu, 2017-09-28 at 23:05 -0700, Matt Turner wrote: >> 64-bit operations on Atom parts have additional restrictions over their >> big-core counterparts (validated by later patches). >> >> Specifically, the restriction that "Source and Destination horizontal >> stride must be aligned to the same qword" is violated by most shift >> operations since NIR uses a 32-bit value as the shift count argument, >> and this causes instructions like >> >> shl(8) g19<1>Q g5<4,4,1>Q g23<4,4,1>UD >> >> where src1 has a 32-bit stride, but the dest and src0 have a 64-bit >> stride. >> >> This caused ~4 pixels in the ARB_shader_ballot piglit test >> fs-readInvocation-uint.shader_test to be incorrect. Unfortunately no >> ARB_gpu_shader_int64 test hit this case because they operate on >> uniforms, and their scalar regions are an exception to the restriction. >> >> We work around this by effectively unpacking the shift count, so that we >> can read it with a 64-bit stride in the shift instruction. Unfortunately >> the unpack (a MOV with a dst stride of 2) is a partial write, and cannot >> be copy-propagated or CSE'd. >> --- >> src/intel/compiler/brw_fs_nir.cpp | 34 ++++++++++++++++++++++++++++------ >> 1 file changed, 28 insertions(+), 6 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs_nir.cpp >> b/src/intel/compiler/brw_fs_nir.cpp >> index d760946e62..8ea4297ceb 100644 >> --- a/src/intel/compiler/brw_fs_nir.cpp >> +++ b/src/intel/compiler/brw_fs_nir.cpp >> @@ -1267,14 +1267,36 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, >> nir_alu_instr *instr) >> unreachable("not reached: should have been lowered"); >> >> case nir_op_ishl: >> - bld.SHL(result, op[0], op[1]); >> - break; >> case nir_op_ishr: >> - bld.ASR(result, op[0], op[1]); >> - break; >> - case nir_op_ushr: >> - bld.SHR(result, op[0], op[1]); >> + case nir_op_ushr: { >> + fs_reg shift_count = op[1]; >> + >> + if (devinfo->is_cherryview || gen_device_info_is_9lp(devinfo)) { >> + if (op[1].file == VGRF && >> + (result.type == BRW_REGISTER_TYPE_Q || >> + result.type == BRW_REGISTER_TYPE_UQ)) { >> + shift_count = fs_reg(VGRF, alloc.allocate(dispatch_width / 4), >> + BRW_REGISTER_TYPE_UD); >> + shift_count.stride = 2; >> + bld.MOV(shift_count, op[1]); >> + } >> + } >> + >> + switch (instr->op) { >> + case nir_op_ishl: >> + bld.SHL(result, op[0], shift_count); >> + break; >> + case nir_op_ishr: >> + bld.ASR(result, op[0], shift_count); >> + break; >> + case nir_op_ushr: >> + bld.SHR(result, op[0], shift_count); >> + break; >> + default: >> + unreachable("not reached"); >> + } >> break; >> + } >> >> case nir_op_pack_half_2x16_split: >> bld.emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, result, op[0], op[1]); _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev