Jason Ekstrand <ja...@jlekstrand.net> writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <curroje...@riseup.net> > wrote: > >> Because the "low" temporary needs to be accessed with word type and >> twice the original stride, attempting to preserve the alignment of the >> original destination can potentially lead to instructions with illegal >> destination stride greater than four. Because the CHV/BXT alignment >> restrictions are now being enforced by the regioning lowering pass run >> after lower_integer_multiplication(), there is no real need to >> preserve the original strides anymore. >> >> Note that this bug can be reproduced on stable branches, but >> back-porting would be non-trivial, because the fix relies on the >> regioning lowering pass recently introduced. >> --- >> src/intel/compiler/brw_fs.cpp | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp >> index f475b617df2..5768e0d6542 100644 >> --- a/src/intel/compiler/brw_fs.cpp >> +++ b/src/intel/compiler/brw_fs.cpp >> @@ -3962,13 +3962,11 @@ fs_visitor::lower_integer_multiplication() >> regions_overlap(inst->dst, inst->size_written, >> inst->src[0], inst->size_read(0)) || >> regions_overlap(inst->dst, inst->size_written, >> - inst->src[1], inst->size_read(1))) { >> + inst->src[1], inst->size_read(1)) || >> + inst->dst.stride >= 4) { >> > > It would be nice to throw in a quick comment as to why we're adding a > temporary when stride >= 4. >
There seemed to be no pre-existing comment about any of the other conditions to allocate a temporary, so I've added the following: + /* Get a new VGRF for the "low" 32x16-bit multiplication result if + * reusing the original destination is impossible due to hardware + * restrictions, source/destination overlap, or it being the null + * register. + */ > >> needs_mov = true; >> - /* Get a new VGRF but keep the same stride as inst->dst */ >> low = fs_reg(VGRF, alloc.allocate(regs_written(inst)), >> inst->dst.type); >> - low.stride = inst->dst.stride; >> - low.offset = inst->dst.offset % REG_SIZE; >> } >> >> /* Get a new VGRF but keep the same stride as inst->dst */ >> -- >> 2.19.2 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev