On Fri, Aug 14, 2015 at 4:56 AM, Antia Puentes <apuen...@igalia.com> wrote: > If the register types do not match and the instruction > that contains the final destination is saturated, register > coalescing generated non-equivalent code. > > This did not happen when using IR because types usually > matched, but it is visible in nir-vec4. > > For example, > mov vgrf7:D vgrf2:D > mov.sat m4:F vgrf7:F > > is coalesced to: > mov.sat m4:D vgrf2:D > > The patch prevents coalescing in such scenario, unless the > instruction we want to coalesce into is a MOV. In that case, > the patch sets the register types to the type of the final > destination. > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 18 ++++++++++++++++++ > 1 file changed, 18 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index f18915a..52982c3 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1053,6 +1053,15 @@ vec4_visitor::opt_register_coalesce() > } > } > > + /* This doesn't handle saturation on the instruction we > + * want to coalesce away if the register types do not match. > + * But if scan_inst is a 'mov' we can fix the types later. > + */ > + if (inst->saturate && > + inst->dst.type != scan_inst->dst.type && > + scan_inst->opcode != BRW_OPCODE_MOV) > + break; > + > /* If we can't handle the swizzle, bail. */ > if (!scan_inst->can_reswizzle(inst->dst.writemask, > inst->src[0].swizzle, > @@ -1128,6 +1137,15 @@ vec4_visitor::opt_register_coalesce() > scan_inst->dst.file = inst->dst.file; > scan_inst->dst.reg = inst->dst.reg; > scan_inst->dst.reg_offset = inst->dst.reg_offset; > + if (inst->saturate && > + inst->dst.type != scan_inst->dst.type) { > + /* If we have reached this point, scan_inst is a 'mov' and > + * we can modify its register types to match the ones in > inst. > + * Otherwise, we could have an incorrect saturation result. > + */ > + scan_inst->dst.type = inst->dst.type; > + scan_inst->src[0].type = inst->src[0].type;
This is *almost* correct. However, if scan_inst is a type-converting move, we get mov b:D, a:F mov.sat c:F, b:F this will map to mov.sat c:F, a:F which is obviously not correct. Now, why anyone would want to do that conversion is beyond me, but it's theoretically possible. If we're going to try and change the types to make the sat work, then we need to also make sure that the two types match. So the condition you check should be something like if (inst->saturate && inst->dst.type != scan_inst->dst.type && !(scan_inst->opcode == BRW_OPCODE_MOV && scan_inst->dst.type == scan_inst->src[0].type) break; Does that work? With that changed, Reviewed-by: Jason Ekstrand <jason.ekstr...@intel.com> Also, please add the following before pushing: Cc: "10.6 11.0" <mesa-sta...@lists.freedesktop.org> > + } > scan_inst->saturate |= inst->saturate; > } > scan_inst = (vec4_instruction *)scan_inst->next; > -- > 2.1.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev