Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:

> From: "Juan A. Suarez Romero" <jasua...@igalia.com>
>
> Previous to Broadwell, we have 8 registers for MOV_INDIRECT. But if
> IVB/VLV deal with DFs, we will duplicate the exec_size from 8 to 16.
>
> This patch limits the SIMD width to 4 in this case.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index cfce364..45d320d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4959,8 +4959,13 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
>        return MIN2(8, inst->exec_size);
>
>     case SHADER_OPCODE_MOV_INDIRECT:
> -      /* Prior to Broadwell, we only have 8 address subregisters */
> -      return MIN3(devinfo->gen >= 8 ? 16 : 8,
> +      /* Prior to Broadwell, we only have 8 address subregisters. Special case
> +       * for IVB/VLV and DF types: set to 4 (exec_size will be later
> +       * duplicated).

The comment seems rather misleading; exec size doubling is unlikely to
have anything to do with this problem.

> +       */
> +      return MIN3(devinfo->gen >= 8 ? 16 : ((devinfo->gen == 7 &&
> +                                              !devinfo->is_haswell &&
> +                                              inst->exec_data_size() == 8) ? 4 : 8),
>                    2 * REG_SIZE / (inst->dst.stride * type_sz(inst->dst.type)),
>                    inst->exec_size);

I'm amazed that this works at all on HSW. According to the IVB and HSW
PRMs:

 "2.When the destination requires two registers and the sources are
  indirect, the sources must use 1x1 regioning mode. In addition, the
  sources must be assembled from GRF registers each accessed by
  adjacent index registers in 1x1 regioning modes."

So for DF instructions the execution size is not limited by the number
of address registers you have available, but by the EU decompression
logic not handling VxH indirect addressing correctly. I think this
should be something along the lines of:

| const unsigned max_size = (devinfo->gen >= 8 ? 2 : 1) * REG_SIZE;
| return MIN3(devinfo->gen >= 8 ? 16 : 8,
|             max_size / (inst->dst.stride * type_sz(inst->dst.type)),
|             inst->exec_size);

>
> --
> 2.9.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
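
To make the arithmetic in the suggested expression concrete, here is a
minimal standalone sketch. It is not the driver code: the function name
and its plain integer parameters are hypothetical stand-ins for the
devinfo and inst fields, and REG_SIZE is assumed to be 32 bytes (one
GRF register), as in the i965 backend.

```cpp
#include <algorithm>

// Assumption: one GRF register is 32 bytes, mirroring REG_SIZE in i965.
constexpr unsigned REG_SIZE = 32;

// Hypothetical stand-in for the suggested MOV_INDIRECT case.
// gen        ~ devinfo->gen
// dst_stride ~ inst->dst.stride (assumed >= 1 here, to avoid dividing by 0)
// type_size  ~ type_sz(inst->dst.type), in bytes
// exec_size  ~ inst->exec_size
constexpr unsigned
mov_indirect_simd_width(int gen, unsigned dst_stride,
                        unsigned type_size, unsigned exec_size)
{
   // Before Gen8 the destination is limited to a single register, so
   // the instruction never needs to be decompressed by the EU.
   const unsigned max_size = (gen >= 8 ? 2 : 1) * REG_SIZE;

   return std::min({gen >= 8 ? 16u : 8u,
                    max_size / (dst_stride * type_size),
                    exec_size});
}

// Gen7, DF (8 bytes), packed: 32 / 8 = 4 channels fit in one register.
static_assert(mov_indirect_simd_width(7, 1, 8, 8) == 4, "Gen7 DF");
// Gen7, float (4 bytes): 32 / 4 = 8, same as the address register limit.
static_assert(mov_indirect_simd_width(7, 1, 4, 16) == 8, "Gen7 F");
// Gen8, DF: 64 / 8 = 8 channels across two registers.
static_assert(mov_indirect_simd_width(8, 1, 8, 16) == 8, "Gen8 DF");
```

With these assumptions the pre-Gen8 DF case comes out as 4 without any
special-casing of IVB/VLV versus HSW, while the 32-bit behaviour stays
at 8.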