On 09/30/2013 07:16 PM, Ian Romanick wrote:
> On 09/11/2013 10:00 PM, Chia-I Wu wrote:
>> From: Chia-I Wu <o...@lunarg.com>
>>
>> Replicate the gradient of the top-left pixel to the other three pixels in the
>> subspan, as how DDY is implemented.  Before, different gradients were used for
>> pixels in the top row and pixels in the bottom row.
>>
>> This change results in a less accurate approximation.  However, it improves
>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>> observed.
>>
>> No piglit gpu.tests regressions.
>>
>> I failed to come up with an explanation for the performance difference.  The
>> change does not make a difference on Ivy Bridge either.  If anyone has the
>> insight, please kindly enlighten me.  Performance differences may also be
>> observed on other games that call textureGrad and dFdx.
>
> After all the experiments and discussions with the hardware guys, let's
> go ahead and do this.  We should do a couple things, however.
>
> 1. Disable the optimization if the application explicitly sets
>    GL_FRAGMENT_SHADER_DERIVATIVE_HINT to GL_NICEST.

Urgh...I always hate adding more state-dependent recompiles...

To accomplish this, you'll have to:
- Add a new high_quality_derivatives flag to brw_wm_prog_key.
- In brw_wm_populate_key, add:

   /* _NEW_HINT */
   key->high_quality_derivatives =
      ctx->Hint.FragmentShaderDerivative == GL_NICEST;

- Add the _NEW_HINT dependency to brw_wm_prog's dirty flags.

> 2. Add a driconf option, as suggested by Chris, to disable the optimization.

...which means changing the key setup to:

   if (brw->disable_derivative_optimization) {
      key->high_quality_derivatives =
         ctx->Hint.FragmentShaderDerivative != GL_FASTEST;
   } else {
      key->high_quality_derivatives =
         ctx->Hint.FragmentShaderDerivative == GL_NICEST;
   }

and, in brw_fs_precompile, setting

   key->high_quality_derivatives = brw->disable_derivative_optimization;

This all seems pretty awful to me...but I guess there's not really any
getting around it.  If the register had worked out, we could've just
added a Hint() driver hook that programmed it appropriately.  But alas.

> 3. Use the same DDX / DDY calculation on all platforms.
>
> 4. Update the commit message and the comment in the code with the
>    explanation of the optimization (the HSW sample_d instruction does some
>    optimizations if the same LOD is used for all pixels, etc.).
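
For anyone following along, the difference between the two DDX region
encodings in the patch can be modeled in plain C.  This is only a sketch
under stated assumptions: region_read and ddx_region are hypothetical
helper names (not i965 functions), a register is modeled as an array of 8
floats holding two subspans in tl, tr, bl, br order, and the horizontal
stride is fixed at 0 as in the patch.

```c
#include <assert.h>

#define CHANNELS 8  /* assumed: two 2x2 subspans, one float per pixel */

/* Hypothetical model of reading channel i from a <vstride;width,0> region
 * that starts at sub-register subnr.  With a horizontal stride of 0, every
 * channel in a row reads that row's first element. */
static float region_read(const float *reg, int subnr,
                         int vstride, int width, int i)
{
    assert(width > 0 && i >= 0 && i < CHANNELS);
    int row = i / width;
    return reg[subnr + row * vstride];
}

/* Models the ADD in generate_ddx: (region at subnr 1) - (region at subnr 0).
 * vstride = width = 2 gives one gradient per pixel row of each subspan;
 * vstride = width = 4 replicates the top-row gradient to the whole subspan. */
static void ddx_region(const float *src, float *dst, int vstride, int width)
{
    for (int i = 0; i < CHANNELS; i++) {
        dst[i] = region_read(src, 1, vstride, width, i)
               - region_read(src, 0, vstride, width, i);
    }
}
```

Calling ddx_region(src, dst, 2, 2) corresponds to the existing <2;2,0>
regions, and ddx_region(src, dst, 4, 4) to the <4;4,0> regions the patch
selects on Haswell, i.e. ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4).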
>
>> Signed-off-by: Chia-I Wu <o...@lunarg.com>
>> ---
>>   src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
>>   1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> index bfb3d33..c0d24a0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>>   void
>>   fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
>>   {
>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>> +    * which gives much better performance when the result is used with
>> +    * sample_d
>> +    */
>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>> +                                          BRW_VERTICAL_STRIDE_2;
>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>> +                                        BRW_WIDTH_2;
>> +
>>      struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>                                    BRW_REGISTER_TYPE_F,
>> -                                  BRW_VERTICAL_STRIDE_2,
>> -                                  BRW_WIDTH_2,
>> +                                  vstride,
>> +                                  width,
>>                                    BRW_HORIZONTAL_STRIDE_0,
>>                                    BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>      struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>                                    BRW_REGISTER_TYPE_F,
>> -                                  BRW_VERTICAL_STRIDE_2,
>> -                                  BRW_WIDTH_2,
>> +                                  vstride,
>> +                                  width,
>>                                    BRW_HORIZONTAL_STRIDE_0,
>>                                    BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>      brw_ADD(p, dst, src0, negate(src1));
>>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev