I've received confirmation from the HW team that the extra doubling is only needed on Haswell GT3.
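
For illustration, narrowing the check to GT3 might look roughly like the sketch below. Everything named `sketch_*` is a hypothetical stand-in, including the `is_haswell` and `gt` fields; this is not the real `isl_device` layout or the actual blorp code, only the shape of a Haswell-GT3-only condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device description (an assumption,
 * not the real isl_device / device-info structures). */
struct sketch_devinfo {
   bool is_haswell;
   int gt; /* GT level: 1, 2, or 3 */
};

/* True only for the one configuration reported to need the doubling. */
static bool
sketch_needs_double_align(const struct sketch_devinfo *devinfo)
{
   return devinfo->is_haswell && devinfo->gt == 3;
}

/* Doubles the fast-clear rectangle alignment in place when required,
 * and leaves it untouched on every other platform. */
static void
sketch_fast_clear_align(const struct sketch_devinfo *devinfo,
                        uint32_t *x_align, uint32_t *y_align)
{
   assert(devinfo && x_align && y_align);

   if (sketch_needs_double_align(devinfo)) {
      *x_align *= 2;
      *y_align *= 2;
   }
}
```

With this shape, Haswell GT1/GT2 and all later platforms keep the undoubled alignment.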
On Tue, May 15, 2018 at 5:28 PM Jason Ekstrand <ja...@jlekstrand.net> wrote:

> The data in the commit message is a bit sketchy for Ivybridge. We don't
> run dEQP or any of the CTSs on Ivybridge in CI so all the data we have
> is piglit. On Haswell, piglit didn't catch anything so we don't have
> anything to go off of for Ivybridge besides the fact that the restriction
> wasn't added until Haswell.
> ---
>  src/intel/blorp/blorp_clear.c | 66 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 56 insertions(+), 10 deletions(-)
>
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 832e8ee..618625b 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -235,16 +235,62 @@ get_fast_clear_rect(const struct isl_device *dev,
>        x_scaledown = x_align / 2;
>        y_scaledown = y_align / 2;
>
> -      /* From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel
> -       * Backend > MCS Buffer for Render Target(s) [DevIVB+] > Table "Color
> -       * Clear of Non-MultiSampled Render Target Restrictions":
> -       *
> -       *   Clear rectangle must be aligned to two times the number of
> -       *   pixels in the table shown below due to 16x16 hashing across the
> -       *   slice.
> -       */
> -      x_align *= 2;
> -      y_align *= 2;
> +      if (ISL_DEV_IS_HASWELL(dev)) {
> +         /* The following text was added in the Haswell PRM, "3D Media GPGPU
> +          * Engine" >> "MCS Buffer for Render Target(s)" >> Table "Color Clear
> +          * of Non-MultiSampler Render Target Restrictions":
> +          *
> +          *    "Clear rectangle must be aligned to two times the number of
> +          *    pixels in the table shown below due to 16X16 hashing across the
> +          *    slice."
> +          *
> +          * It has persisted in the documentation for all platforms up until
> +          * Cannonlake and possibly even beyond. However, we believe that it
> +          * is only needed on Haswell.
> +          *
> +          * There are a couple possible explanations for this restriction:
> +          *
> +          * 1) If you assume that the hardware is writing to the CCS as
> +          *    bytes, then the x/y_align computed above gives you an alignment
> +          *    in the CCS of 8x8 bytes and, if 16x16 is needed for hashing, we
> +          *    need to multiply by 2.
> +          *
> +          * 2) Haswell is a bit unique in that it's CCS tiling does not line
> +          *    up with Y-tiling on a cache-line granularity. Instead, it has
> +          *    an extra bit of swizzling in bit 9. Also, bit 6 swizzling
> +          *    applies to the CCS on Haswell. This means that Haswell CTS
> +          *    does not match on a cache-line granularity but it does match on
> +          *    a 2x2 cache line granularity.
> +          *
> +          * Clearly, the first explanation seems to follow documentation the
> +          * best but they may be related. In any case, empirical evidence
> +          * seems to confirm that it is, indeed required on Haswell.
> +          *
> +          * On Broadwell things get a bit stickier. Broadwell adds support
> +          * for mip-mapped CCS with an alignment in the CCS of 256x128. For a
> +          * 32bpb main surface, the above computation will yield a x/y_align
> +          * of 128x128 for a Y-tiled main surface and 256x64 for X-tiled. In
> +          * either case, if we double the alignment, we will get an alignment
> +          * bigger than horizontal and vertical alignment of the CCS and fast
> +          * clears of one LOD may leak into others.
> +          *
> +          * Starting with Skylake, the image alignment for the CCS is only
> +          * 128x64 which is exactly the x/h_align computed above if the main
> +          * surface has a 32bpb format. Also, the "Render Target Resolve"
> +          * page in the bspec (not the PRM) says, "The Resolve Rectangle size
> +          * is same as Clear Rectangle size from SKL+". The x/y_align
> +          * computed above (without doubling) match the resolve rectangle
> +          * calculation perfectly.
> +          *
> +          * Finally, to confirm all this, a full test run was performed on
> +          * Feb. 9, 2018 with this doubling removed and the only platform
> +          * which seemed to be affected was Haswell. The run consisted of
> +          * piglit, dEQP, the Vulkan CTS 1.0.2, the OpenGL 4.5 CTS, and the
> +          * OpenGL ES 3.2 CTS.
> +          */
> +         x_align *= 2;
> +         y_align *= 2;
> +      }
>     } else {
>        assert(aux_surf->usage == ISL_SURF_USAGE_MCS_BIT);
>
> --
> 2.5.0.400.gff86faf
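
The Broadwell arithmetic in the quoted comment can be checked with a small sketch. The numbers (256x128 CCS alignment, 128x128 for Y-tiled and 256x64 for X-tiled 32bpb surfaces) come straight from the comment; the `sketch_*` names are hypothetical helpers, not anything in isl or blorp.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical width/height pair, in the units the patch comment uses. */
struct sketch_extent {
   uint32_t w;
   uint32_t h;
};

/* Returns the alignment doubled in both dimensions. */
static struct sketch_extent
sketch_doubled(struct sketch_extent align)
{
   struct sketch_extent out = { align.w * 2, align.h * 2 };
   return out;
}

/* True if the clear alignment is larger than the CCS alignment in
 * either dimension, i.e. an aligned clear could spill past one LOD. */
static bool
sketch_exceeds(struct sketch_extent clear_align,
               struct sketch_extent ccs_align)
{
   return clear_align.w > ccs_align.w || clear_align.h > ccs_align.h;
}
```

Fed the Broadwell values, the undoubled 128x128 and 256x64 both fit within 256x128, while the doubled 256x256 and 512x128 each exceed it in one dimension, which is the leak into neighboring LODs that the comment describes.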