On Wed, Feb 24, 2016 at 12:22 PM, Grigori Goronzy <g...@chown.ath.cx> wrote:
> On 2016-02-23 17:45, Marek Olšák wrote:
>>
>> From: Marek Olšák <marek.ol...@amd.com>
>>
>> This can increase perf for shaders that kill pixels (kill, alpha-test,
>> alpha-to-coverage).
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.h        |  1 +
>>  src/gallium/drivers/radeonsi/si_state.c         |  6 +++---
>>  src/gallium/drivers/radeonsi/si_state_shaders.c | 16 +++++++++++++---
>>  3 files changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.h
>> b/src/gallium/drivers/radeonsi/si_shader.h
>> index ff5c24d..637d264 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.h
>> +++ b/src/gallium/drivers/radeonsi/si_shader.h
>> @@ -365,6 +365,7 @@ struct si_shader {
>>         struct r600_resource            *scratch_bo;
>>         union si_shader_key             key;
>>         bool                            is_binary_shared;
>> +       unsigned                        z_order;
>>
>>         /* The following data is all that's needed for binary shaders. */
>>         struct radeon_shader_binary     binary;
>> diff --git a/src/gallium/drivers/radeonsi/si_state.c
>> b/src/gallium/drivers/radeonsi/si_state.c
>> index 2dfdbeb..b23b17a 100644
>> --- a/src/gallium/drivers/radeonsi/si_state.c
>> +++ b/src/gallium/drivers/radeonsi/si_state.c
>> @@ -1339,10 +1339,10 @@ static void si_emit_db_render_state(struct
>> si_context *sctx, struct r600_atom *s
>>                 sctx->ps_db_shader_control;
>>
>>         /* Bug workaround for smoothing (overrasterization) on SI. */
>> -       if (sctx->b.chip_class == SI && sctx->smoothing_enabled)
>> +       if (sctx->b.chip_class == SI && sctx->smoothing_enabled) {
>> +               db_shader_control &= C_02880C_Z_ORDER;
>>                 db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
>> -       else
>> -               db_shader_control |=
>> S_02880C_Z_ORDER(V_02880C_EARLY_Z_THEN_LATE_Z);
>> +       }
>>
>>         /* Disable the gl_SampleMask fragment shader output if MSAA is
>> disabled. */
>>         if (sctx->framebuffer.nr_samples <= 1 || (rs &&
>> !rs->multisample_enable))
>> diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c
>> b/src/gallium/drivers/radeonsi/si_state_shaders.c
>> index a6753a7..c220185 100644
>> --- a/src/gallium/drivers/radeonsi/si_state_shaders.c
>> +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
>> @@ -789,6 +789,13 @@ static void si_shader_ps(struct si_shader *shader)
>>                        S_00B02C_EXTRA_LDS_SIZE(shader->config.lds_size) |
>>                        S_00B02C_USER_SGPR(num_user_sgprs) |
>>
>> S_00B32C_SCRATCH_EN(shader->config.scratch_bytes_per_wave > 0));
>> +
>> +       /* Prefer RE_Z if the shader is complex enough. */
>> +       if (info->num_memory_instructions >= 2 ||
>> +           shader->binary.code_size > 100*4)
>> +               shader->z_order = V_02880C_EARLY_Z_THEN_RE_Z;
>> +       else
>> +               shader->z_order = V_02880C_EARLY_Z_THEN_LATE_Z;
>>  }
>>
>
> Are these thresholds for switching to re-Z based on measurements, feedback
> by the HW team or are they just a shot in the dark?
> Either way, the magic numbers don't look particularly nice. Maybe
> preprocessor constants should be introduced for them?

They are not so magic. The condition means either at least 2 memory
instructions, or a code size above 400 bytes (100*4), which corresponds
to an instruction count between 50 and 100, since an instruction is 4 or
8 bytes long. The thresholds are based on my estimates and expectations.
No benchmarking has been done, but there is potential to gain some
performance with shaders that kill pixels.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
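
For reference, here is a minimal standalone sketch of how the heuristic discussed above could look with named constants, as suggested in the review. It is not part of the patch. The constant names, the enum, and the helper functions are all hypothetical; the enum values are stand-ins for V_02880C_EARLY_Z_THEN_LATE_Z and V_02880C_EARLY_Z_THEN_RE_Z and do not reproduce the real register values. The instruction-count helpers assume each instruction occupies 4 or 8 bytes.

```c
#include <stddef.h>

/* Hypothetical names; the patch itself uses the literals 2 and 100*4. */
#define PS_RE_Z_MIN_MEMORY_INSTRUCTIONS 2
#define PS_RE_Z_CODE_SIZE_THRESHOLD_BYTES (100 * 4)

/* Stand-ins for the V_02880C_* register values (assumption: the real
 * numeric values are not reproduced here). */
enum z_order_choice {
	Z_ORDER_EARLY_Z_THEN_LATE_Z,
	Z_ORDER_EARLY_Z_THEN_RE_Z
};

/* Prefer RE_Z if the shader is complex enough: at least 2 memory
 * instructions, or more than 400 bytes of code. */
static enum z_order_choice
choose_ps_z_order(unsigned num_memory_instructions, size_t code_size_bytes)
{
	if (num_memory_instructions >= PS_RE_Z_MIN_MEMORY_INSTRUCTIONS ||
	    code_size_bytes > PS_RE_Z_CODE_SIZE_THRESHOLD_BYTES)
		return Z_ORDER_EARLY_Z_THEN_RE_Z;

	return Z_ORDER_EARLY_Z_THEN_LATE_Z;
}

/* Fewest instructions that can fill the given code size, assuming
 * every instruction uses the 8-byte encoding (rounded up). */
static size_t min_instruction_count(size_t code_size_bytes)
{
	return (code_size_bytes + 7) / 8;
}

/* Most instructions that can fit in the given code size, assuming
 * every instruction uses the 4-byte encoding. */
static size_t max_instruction_count(size_t code_size_bytes)
{
	return code_size_bytes / 4;
}
```

With these helpers, a 400-byte shader binary maps to somewhere between min_instruction_count(400) and max_instruction_count(400) instructions, which is where the "between 50 and 100" reading of the 100*4 threshold comes from.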