On 04/11/15 20:13, Matt Turner wrote: > On Fri, Oct 23, 2015 at 7:17 AM, Alejandro Piñeiro <apinhe...@igalia.com> > wrote: >> Equivalent to commit 4eebeb but with sel operations. In this case > That commit sha doesn't exist. I suspect you meant commit 8ac3b525c.
Yes, probably that sha was the one on my wip branches. >> we select the PredCtrl based on the writemask. >> >> This change allows cmod propagation to optimize out several >> instructions. > Okay, so this patch helps a case like: I don't think that the case you pointed is a good one to explain why this is helping. Because this seems to suggest that this case is without the patch but... > > cmp.g.f0.0 vgrf32.0.x:F, vgrf31.xxxx:F, 0.000000F > cmp.nz.f0.0 null.x:D, vgrf32.xxxx:D, 0D ... this patch is needed to arrive to this point (null destination writemask being updated). > (+f0.0) sel vgrf2.0.x:F, |vgrf27.xxxx|:F, 0.000000F > > where our code thinks the SEL reads all four channels of the flag > register when it actually only reads .x because of the writemask. > > Adding something like this to the commit message would be good. So if we want to go into the details of why this change helps to reduce instructions, what about something like: "This patch helps on cases like this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't update directly instruction #2 writemask because our code thinks that sel reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becames this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Now only the x channel of the flag is used, allowing dead code eliminate to update the writemask at the second instruction: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD So now cmod propagation can simplify out #2: 1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F 2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD " Although not sure if this explanation is too long. > >> Shader-db numbers: >> total instructions in shared programs: 6235835 -> 6228008 (-0.13%) >> instructions in affected programs: 219850 -> 212023 (-3.56%) >> total loops in shared programs: 1979 -> 1979 (0.00%) >> helped: 1192 >> HURT: 0 >> --- >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 18 +++++++++++++++++- >> 1 file changed, 17 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> index 0f04f65..bc86be6 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> @@ -1437,8 +1437,24 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) >> case nir_op_bcsel: >> emit(CMP(dst_null_d(), op[0], src_reg(0), BRW_CONDITIONAL_NZ)); >> inst = emit(BRW_OPCODE_SEL, dst, op[1], op[2]); >> - inst->predicate = BRW_PREDICATE_NORMAL; >> + switch (dst.writemask) { >> + case WRITEMASK_X: >> + inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_X; >> + break; >> + case WRITEMASK_Y: >> + inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Y; >> + break; >> + case WRITEMASK_Z: >> + inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_Z; >> + break; >> + case WRITEMASK_W: >> + inst->predicate = BRW_PREDICATE_ALIGN16_REPLICATE_W; >> + break; >> + default: >> + inst->predicate = BRW_PREDICATE_NORMAL; >> break; > This break, and the one two lines below it are indented incorrectly. Ok. >> + } >> + break; >> >> case nir_op_fdot_replicated2: >> inst = emit(BRW_OPCODE_DP2, dst, op[0], op[1]); >> -- >> 2.1.4 >> > With the commit message fixed/updated and the breaks properly indented, As I changed totally the use case and description, I would prefer to confirm that I explained myself properly before pushing. > > Reviewed-by: Matt Turner <matts...@gmail.com> > > Thanks! > Thanks for reviewing. BR -- Alejandro Piñeiro (apinhe...@igalia.com) _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev