From: Chia-I Wu <o...@lunarg.com>

Replicate the gradient of the top-left pixel to the other three pixels in
the subspan, the same way DDY is implemented.  Before, different gradients
were used for pixels in the top row and pixels in the bottom row.

This change results in a less accurate approximation.  However, it improves
the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
95.0% confidence) on Haswell.  No noticeable image quality difference was
observed.

No piglit gpu.tests regressions.

I failed to come up with an explanation for the performance difference.  The
change makes no difference on Ivy Bridge, which is why it is limited to
Haswell.  If anyone has any insight, please kindly enlighten me.  Performance
differences may also be observed in other games that call textureGrad and
dFdx.

Signed-off-by: Chia-I Wu <o...@lunarg.com>
---
 src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
index bfb3d33..c0d24a0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
@@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
 void
 fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
 {
+   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
+    * which gives much better performance when the result is used with
+    * sample_d
+    */
+   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
+                                          BRW_VERTICAL_STRIDE_2;
+   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
+                                        BRW_WIDTH_2;
+
    struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
                                  BRW_REGISTER_TYPE_F,
-                                 BRW_VERTICAL_STRIDE_2,
-                                 BRW_WIDTH_2,
+                                 vstride,
+                                 width,
                                  BRW_HORIZONTAL_STRIDE_0,
                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
    struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
                                  BRW_REGISTER_TYPE_F,
-                                 BRW_VERTICAL_STRIDE_2,
-                                 BRW_WIDTH_2,
+                                 vstride,
+                                 width,
                                  BRW_HORIZONTAL_STRIDE_0,
                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
    brw_ADD(p, dst, src0, negate(src1));
-- 
1.8.3.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
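
To make the effect of the region change in the patch easier to see, here is a
minimal standalone sketch that models the two source regions on the CPU.  It
is not Mesa code: Reg, read_region and ddx are hypothetical names introduced
for illustration, and the sketch assumes a SIMD8 register holding two subspans
laid out as tl, tr, bl, br each, as the comment in the patch suggests.

```cpp
#include <array>
#include <cassert>

// Hypothetical model of one SIMD8 register: subspan 0 as tl, tr, bl, br,
// followed by subspan 1 in the same order (assumed layout).
using Reg = std::array<float, 8>;

// Read eight channels through a <vstride; width, hstride> region that
// starts at element `start`.  Channel i is taken from
// start + (i / width) * vstride + (i % width) * hstride.
static Reg read_region(const Reg &r, int start, int vstride, int width,
                       int hstride)
{
   Reg out{};
   for (int i = 0; i < 8; i++) {
      const int idx = start + (i / width) * vstride + (i % width) * hstride;
      assert(idx >= 0 && idx < 8);
      out[i] = r[idx];
   }
   return out;
}

// Model of what generate_ddx computes: the region starting at element 1
// minus the region starting at element 0, with horizontal stride 0.
static Reg ddx(const Reg &src, bool is_haswell)
{
   const int vstride = is_haswell ? 4 : 2;
   const int width = is_haswell ? 4 : 2;
   const Reg a = read_region(src, 1, vstride, width, 0);
   const Reg b = read_region(src, 0, vstride, width, 0);

   Reg dst{};
   for (int i = 0; i < 8; i++)
      dst[i] = a[i] - b[i];
   return dst;
}
```

With the input {0, 1, 10, 13, 20, 22, 30, 35}, ddx(src, false) gives
{1, 1, 3, 3, 2, 2, 5, 5}: the top and bottom rows of each subspan get their
own gradient.  ddx(src, true) gives {1, 1, 1, 1, 2, 2, 2, 2}: the top-row
gradient is replicated to all four pixels of each subspan.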