Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > On Mon, 2017-01-09 at 16:18 -0800, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: >> >> > From: Iago Toral Quiroga <ito...@igalia.com> >> > >> > It seems to use 1 channel por DF, just like later hardware. The >> > docs say things >> > like: >> > >> > "Each DF operand uses a pair of channels and all masking and >> > swizzling >> > should be adjusted appropriately." >> > >> > "In Align16, all regioning parameters must use the syntax of a pair >> > of packed >> > floats, including channel selects and channel enables." >> > >> > In these cases, masking and channel select/enables seem to refere >> > only to >> > Align16's swizzle and writemasks respectively, not to execution >> > masking. >> > >> > In any case, this means that execution masking controls and >> > execsize go >> > in different units and we need to adjust the assertion on the >> > relation >> > between the two accordingly. >> > --- >> > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 6 +++++- >> > 1 file changed, 5 insertions(+), 1 deletion(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > index 90ee7c1..ac2d8ad 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > @@ -1646,7 +1646,11 @@ fs_generator::generate_code(const cfg_t >> > *cfg, int dispatch_width) >> > brw_set_default_exec_size(p, cvt(inst->exec_size) - 1); >> > >> > assert(inst->force_writemask_all || inst->exec_size >= 4); >> > - assert(inst->force_writemask_all || inst->group % inst- >> > >exec_size == 0); >> > + assert(inst->force_writemask_all || >> > + ((devinfo->gen != 7 || devinfo->is_haswell) && >> > + inst->group % inst->exec_size == 0) || >> > + ((devinfo->gen == 7 && !devinfo->is_haswell) && >> > + (2 * inst->group) % inst->exec_size == 0)); >> >> NAK, exec size doubling happens at the codegen level so the IR group >> and >> exec_size controls should still be expressed in the same units. >> > > Actually the patch is right because the exec size doubling already > happened at this point, so this is fixing the difference in units. >
It may match the expectations of PATCH 3 from this series, but at the same time it breaks the expectations of the rest of the back-end IR infrastructure. E.g. regs_read() will now return bogus results from the point you whacked the execution size of DF instructions. I don't want to have to think which portion of the public IR interface still behaves sensibly and which portion will give me garbage values when I use it From the generator. To make the matter scarier this happens from the back-end code generator which isn't even supposed to modify its input in a semantics-preserving way (it's supposed to emit machine code matching the IR given as input). PATCH 3 causes it to destroy its input (try invoking the code generator a second time on the same shader and you'll get garbage as result...). > One alternative to this patch is to move the exec_size doubling code to > just after this assert, so we can discard this patch. > > What do you think? > Please let's not modify the IR from the codegen pass [this goes as modification request for PATCH 3 too]. Just have the generator emit machine code with twice the IR execution size for double-precision instructions on IVB. > Sam > >> > assert(inst->base_mrf + inst->mlen <= BRW_MAX_MRF(devinfo- >> > >gen)); >> > assert(inst->mlen <= BRW_MAX_MSG_LENGTH); >> > >> > -- >> > 2.9.3 >> > >> > _______________________________________________ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev