Jason Ekstrand <ja...@jlekstrand.net> writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <curroje...@riseup.net> > wrote: > >> Strides up to 32B can be implemented for the source regions of most >> instructions by leveraging either the vertical or the horizontal >> stride of the hardware Align1 region. The main motivation for this is >> that currently the lower_integer_multiplication() pass will happily >> double the stride of one of the 32-bit sources, which can blow up if >> the stride of the original source was already the maximum value >> allowed by the hardware. >> > > I thought this looked familiar so I did some digging... > > On Nov 2 of 2017, I wrote almost exactly this same patch which was > committed on Nov 7 as e8c9e65185de3e821e1 > On Nov 14, Matt reverted it in a31d0382084c8aa8 because it wasn't needed > anymore and he wasn't sure of its correctness. >
That's funny, I wasn't aware of e8c9e65185de3e821e1 nor of its revert. Change was certainly still needed on Nov 14 due to lower_integer_multiplication(). > And here we are again.... > > I still believe it to be correct so it is > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> > > My one major request is that you include some of the history of this change > in the commit message. As far as the patch itself goes, it's identical to > mine except for the unneeded whitespace change and one additional assert > which I believe to be a good addition. > Added a comment locally about your previous attempt to do the same thing. > I've also CC'd matt in case he wants to throw in his $.02 > > --Jason > > An alternative would be to use the regioning legalization pass in >> order to lower such strides into the composition of multiple legal >> strides, but that would be somewhat less efficient. >> >> This showed up as a regression from my commit cbea91eb57a501bebb1ca2 >> in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a >> pre-existing problem that had affected conformance on other platforms >> without native support for integer multiplication. CHV/BXT were >> getting around it because the code I removed in that commit had the >> "fortunate" side effect of emitting narrower regions that didn't hit >> the hardware stride limit after lowering. Beyond fixing the >> regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on >> ICL (that's why this patch is marked for inclusion in mesa-stable even >> though the original regressing patch was not). >> >> Cc: mesa-sta...@lists.freedesktop.org >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328 >> Reported-by >> <https://bugs.freedesktop.org/show_bug.cgi?id=109328Reported-by>: Mark >> Janes <mark.a.ja...@intel.com> >> --- >> src/intel/compiler/brw_fs_generator.cpp | 14 +++++++++++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs_generator.cpp >> b/src/intel/compiler/brw_fs_generator.cpp >> index 5fc6cf5f8cc..b169eacf15b 100644 >> --- a/src/intel/compiler/brw_fs_generator.cpp >> +++ b/src/intel/compiler/brw_fs_generator.cpp >> @@ -90,9 +90,17 @@ brw_reg_from_fs_reg(const struct gen_device_info >> *devinfo, fs_inst *inst, >> * different execution size when the number of components >> * written to each destination GRF is not the same. >> */ >> - const unsigned width = MIN2(reg_width, phys_width); >> - brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr, >> 0); >> - brw_reg = stride(brw_reg, width * reg->stride, width, >> reg->stride); >> + if (reg->stride > 4) { >> + assert(reg != &inst->dst); >> + assert(reg->stride * type_sz(reg->type) <= REG_SIZE); >> + brw_reg = brw_vecn_reg(1, brw_file_from_reg(reg), reg->nr, 0); >> + brw_reg = stride(brw_reg, reg->stride, 1, 0); >> + >> > > Extra whitespace? > > >> + } else { >> + const unsigned width = MIN2(reg_width, phys_width); >> + brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), >> reg->nr, 0); >> + brw_reg = stride(brw_reg, width * reg->stride, width, >> reg->stride); >> + } >> >> if (devinfo->gen == 7 && !devinfo->is_haswell) { >> /* From the IvyBridge PRM (EU Changes by Processor >> Generation, page 13): >> -- >> 2.19.2 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev