Ilia Mirkin <imir...@alum.mit.edu> writes:

> FYI there's already a lowering pass that does this in the GLSL IR
> (CARRY_TO_ARITH in lower_instructions). Perhaps the right place to do
> this is NIR though, just wanted to let you know.
>
Ah, I wasn't aware of that flag, that seems even better. I just tried it
and it seems to generate one instruction more per op than my assembly
code (apparently because our implementation of b2i is suboptimal, could
probably be fixed), but it would also work to get rid of the no16()
calls, which is all I care about right now.
I'll resend using your approach tomorrow.

> On Thu, Jul 9, 2015 at 3:51 PM, Francisco Jerez <curroje...@riseup.net> wrote:
>> This gets rid of two no16() fall-backs and should allow better
>> scheduling of the generated IR. There are no uses of usubBorrow() or
>> uaddCarry() in shader-db so no changes are expected. However the
>> "arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and
>> "arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit
>> tests go from 40 to 28 instructions. The reason is that the plain ADD
>> instruction can easily be CSE'ed with the original addition, and the
>> negation can easily be propagated into the source modifier of another
>> instruction, so effectively both operations can be performed with just
>> one instruction.
>>
>> No piglit regressions.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 33 +++++++++++++-------------------
>>  1 file changed, 13 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 6d9e9d3..3b6aa0a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -829,29 +829,22 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
>>        bld.emit(SHADER_OPCODE_INT_QUOTIENT, result, op[0], op[1]);
>>        break;
>>
>> -   case nir_op_uadd_carry: {
>> -      if (devinfo->gen >= 7)
>> -         no16("SIMD16 explicit accumulator operands unsupported\n");
>> -
>> -      struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
>> -                                  BRW_REGISTER_TYPE_UD);
>> -
>> -      bld.ADDC(bld.null_reg_ud(), op[0], op[1]);
>> -      bld.MOV(result, fs_reg(acc));
>> +   case nir_op_uadd_carry:
>> +      /* Use signed operands for the ADD to be easily CSE'ed with the original
>> +       * addition (e.g. in case we're implementing the uaddCarry() GLSL
>> +       * built-in).
>> +       */
>> +      bld.ADD(result, retype(op[0], BRW_REGISTER_TYPE_D),
>> +                      retype(op[1], BRW_REGISTER_TYPE_D));
>> +      bld.CMP(result, retype(result, BRW_REGISTER_TYPE_UD), op[0],
>> +              BRW_CONDITIONAL_L);
>> +      bld.MOV(result, negate(result));
>>        break;
>> -   }
>>
>> -   case nir_op_usub_borrow: {
>> -      if (devinfo->gen >= 7)
>> -         no16("SIMD16 explicit accumulator operands unsupported\n");
>> -
>> -      struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
>> -                                  BRW_REGISTER_TYPE_UD);
>> -
>> -      bld.SUBB(bld.null_reg_ud(), op[0], op[1]);
>> -      bld.MOV(result, fs_reg(acc));
>> +   case nir_op_usub_borrow:
>> +      bld.CMP(result, op[0], op[1], BRW_CONDITIONAL_L);
>> +      bld.MOV(result, negate(result));
>>        break;
>> -   }
>>
>>     case nir_op_umod:
>>        bld.emit(SHADER_OPCODE_INT_REMAINDER, result, op[0], op[1]);
>> --
>> 2.4.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
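
For reference, the arithmetic behind the ADD/CMP/MOV sequence in the quoted patch can be sketched in plain C++ roughly as follows. This is only an illustration, not driver code: the function names are made up, and it assumes the comparison writes all-ones for "true" (inferred from the negate() in the patch), so that negating the result gives the 0/1 value the GLSL built-ins return.

```cpp
#include <cstdint>

// Hypothetical helper: carry-out of a 32-bit unsigned add without a
// carry flag or accumulator. The sum wraps modulo 2^32, so it ends up
// smaller than the first operand exactly when the addition overflowed.
static uint32_t uadd_carry(uint32_t a, uint32_t b)
{
    const uint32_t sum = a + b;                      // plain ADD
    // Assumption: models a CMP.l that writes ~0 on true, 0 on false.
    const uint32_t mask = (sum < a) ? ~0u : 0u;
    return 0u - mask;                                // negate: ~0 -> 1
}

// Hypothetical helper: borrow-out of a - b, which is needed exactly
// when a < b as unsigned values.
static uint32_t usub_borrow(uint32_t a, uint32_t b)
{
    const uint32_t mask = (a < b) ? ~0u : 0u;        // CMP.l
    return 0u - mask;                                // negate: ~0 -> 1
}
```

Since the sum is computed by an ordinary add, a shader that also needs a + b can share that instruction, which is the CSE opportunity the commit message refers to.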