On Fri, 2018-12-07 at 15:06 +0200, Pohjolainen, Topi wrote:
> On Tue, Dec 04, 2018 at 08:16:58AM +0100, Iago Toral Quiroga wrote:
> > We use ALign16 mode for this, since it is more convenient, but the PRM
> > for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
> > restrictions', Section '1. Special Restrictions':
> >
> >    "In Align16 mode, the channel selects and channel enables apply to a
> >     pair of half-floats, because these parameters are defined for DWord
> >     elements ONLY. This is applicable when both source and destination
> >     are half-floats."
> >
> > This means that we cannot select individual HF elements using swizzles
> > like we do with 32-bit floats so we can't implement the required
> > regioning for this.
> >
> > Use the gen11 path for this instead, which uses Align1 mode.
> >
> > The restriction is not present in gen9 of gen10, where the Align16
>
> or?

Right, the issue is exclusive to gen8.

Iago

> > implementation seems to work just fine.
> > ---
> >  src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> > index d8e4bae17e0..ba7ed07e692 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
> >     const uint32_t type_size = type_sz(src.type);
> >
> >     if (inst->opcode == FS_OPCODE_DDY_FINE) {
> > -      /* produce accurate derivatives */
> > -      if (devinfo->gen >= 11) {
> > +      /* produce accurate derivatives. We can do this easily in Align16
> > +       * but this is not supported in gen11+ and gen8 Align16 swizzles
> > +       * for Half-Float operands work in units of 32-bit and always
> > +       * select pairs of consecutive half-float elements, so we can't use
> > +       * use it for this.
> > +       */
> > +      if (devinfo->gen >= 11 ||
> > +          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
> >           src = stride(src, 0, 2, 1);
> >           struct brw_reg src_0 = byte_offset(src, 0 * type_size);
> >           struct brw_reg src_2 = byte_offset(src, 2 * type_size);
> > --
> > 2.17.1
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
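
For readers who want to see why the quoted PRM restriction rules out the Align16 path, here is a rough host-side sketch. It is not Mesa code and it does not claim to reproduce the hardware exactly: the names Swizzle, swizzle_elements and swizzle_pairs are made up for illustration, and the "pairs" model is only my reading of the PRM sentence that channel selects are defined for DWord elements. It contrasts a swizzle that picks single elements (the 32-bit float case) with one that can only pick DWords, that is, pairs of adjacent 16-bit values.

```cpp
#include <array>
#include <cstddef>

// Channel indices of a 4-wide Align16 swizzle: X=0, Y=1, Z=2, W=3.
// (Hypothetical helper type, not a Mesa or hardware definition.)
using Swizzle = std::array<std::size_t, 4>;

// Per-element swizzle: each channel select picks exactly one element.
// This models what we rely on for 32-bit floats.
template <typename T>
std::array<T, 4> swizzle_elements(const std::array<T, 4> &v, const Swizzle &s)
{
   return std::array<T, 4>{{v[s[0]], v[s[1]], v[s[2]], v[s[3]]}};
}

// Simplified model of the half-float restriction: each channel select
// picks a DWord, so it always drags along a pair of adjacent 16-bit
// elements. Eight packed 16-bit elements form four DWords.
template <typename T>
std::array<T, 8> swizzle_pairs(const std::array<T, 8> &v, const Swizzle &s)
{
   std::array<T, 8> out{};
   for (std::size_t ch = 0; ch < 4; ++ch) {
      out[2 * ch]     = v[2 * s[ch]];      // low half of the selected DWord
      out[2 * ch + 1] = v[2 * s[ch] + 1];  // high half of the selected DWord
   }
   return out;
}
```

With elements labelled by their index, an XYXY swizzle in the per-element model yields 0 1 0 1, which is the kind of replication a fine derivative needs. In the pair model the same swizzle over eight packed values yields 0 1 2 3 0 1 2 3, so elements 2 and 3 cannot be dropped, and that is why the patch falls back to the Align1 regioning path for HF on gen8.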