On Thu, 2017-01-12 at 16:45 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
> > From: "Juan A. Suarez Romero" <jasua...@igalia.com>
> >
> > Previous to Broadwell, we have 8 registers for MOV_INDIRECT. But if
> > IVB/VLV deal with DFs, we will duplicate the exec_size from 8 to
> > 16.
> >
> > This patch limits the SIMD width to 4 in this case.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index cfce364..45d320d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -4959,8 +4959,13 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
> >        return MIN2(8, inst->exec_size);
> >
> >     case SHADER_OPCODE_MOV_INDIRECT:
> > -      /* Prior to Broadwell, we only have 8 address subregisters */
> > -      return MIN3(devinfo->gen >= 8 ? 16 : 8,
> > +      /* Prior to Broadwell, we only have 8 address subregisters. Special case
> > +       * for IVB/VLV and DF types: set to 4 (exec_size will be later
> > +       * duplicated).
>
> The comment seems rather misleading, exec size doubling is unlikely to
> have anything to do with this problem.
>
> > +       */
> > +      return MIN3(devinfo->gen >= 8 ? 16 : ((devinfo->gen == 7 &&
> > +                                             !devinfo->is_haswell &&
> > +                                             inst->exec_data_size() == 8) ? 4 : 8),
> >                    2 * REG_SIZE / (inst->dst.stride * type_sz(inst->dst.type)),
> >                    inst->exec_size);
>
> I'm amazed that this works at all on HSW, according to the IVB and HSW
> PRMs:
>
> "2.When the destination requires two registers and the sources are
> indirect, the sources must use 1x1 regioning mode. In addition, the
> sources must be assembled from GRF registers each accessed by adjacent
> index registers in 1x1 regioning modes."
>
> So for DF instructions the execution size is not limited by the number
> of address registers you have available, but by the EU decompression
> logic not handling VxH indirect addressing correctly.
>
Right. I will change the comment and the commit log too.

> I think this should be something along the lines of:
>
> > const unsigned max_size = (devinfo->gen >= 8 ? 2 : 1) * REG_SIZE;
> > return MIN3(devinfo->gen >= 8 ? 16 : 8,
> >             max_size / (inst->dst.stride * type_sz(inst->dst.type)),
> >             inst->exec_size);
>

OK, thanks.

Sam

> > --
> > 2.9.3
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
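
For reference, a standalone sketch of the calculation suggested above, so the resulting widths can be checked by hand. It is not the driver code: the function name mov_indirect_simd_width is hypothetical, plain integer parameters stand in for the devinfo and inst fields, and REG_SIZE is assumed to be 32 bytes (one GRF register).

```cpp
#include <algorithm>
#include <cassert>

// Assumption: one GRF register is 32 bytes, as REG_SIZE is in the i965 backend.
constexpr unsigned REG_SIZE = 32;

// Hypothetical stand-in for the SHADER_OPCODE_MOV_INDIRECT case of
// get_lowered_simd_width(). Parameters replace the devinfo/inst fields:
//   gen        - hardware generation (devinfo->gen)
//   dst_stride - destination stride in elements (inst->dst.stride)
//   type_size  - bytes per channel (type_sz(inst->dst.type))
//   exec_size  - current execution size (inst->exec_size)
unsigned mov_indirect_simd_width(int gen, unsigned dst_stride,
                                 unsigned type_size, unsigned exec_size)
{
    // The sketch assumes a non-zero stride and type size.
    assert(dst_stride != 0 && type_size != 0);

    // Before gen8 the destination may span only one register, per the
    // restriction quoted from the PRMs; gen8+ may span two.
    const unsigned max_size = (gen >= 8 ? 2 : 1) * REG_SIZE;

    // Before gen8 there are only 8 address subregisters.
    const unsigned addr_limit = gen >= 8 ? 16u : 8u;

    return std::min({addr_limit,
                     max_size / (dst_stride * type_size),
                     exec_size});
}
```

Under these assumptions, a DF destination (8 bytes per channel, stride 1) on gen7 comes out as SIMD4, because one 32-byte register holds four doubles, while on gen8 the same instruction can run at SIMD8. A 4-byte float destination is unaffected: it stays at SIMD8 on gen7 and SIMD16 on gen8.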