From: Marek Olšák <marek.ol...@amd.com>

The workaround causes a massive performance decrease on 1-SE parts.
(Cape Verde, Hainan, Oland)
The performance regression is already part of 17.0 and 17.1.

Cc: 17.0 17.1 <mesa-sta...@lists.freedesktop.org>
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index cd069e3..75e83ff 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -188,21 +188,21 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
 	 */
 	*num_patches = MIN2(*num_patches, 40);
 
 	if (sctx->b.chip_class == SI) {
 		/* SI bug workaround, related to power management. Limit LS-HS
 		 * threadgroups to only one wave.
 		 */
 		unsigned one_wave = 64 / MAX2(num_tcs_input_cp, num_tcs_output_cp);
 		*num_patches = MIN2(*num_patches, one_wave);
 
-		if (sctx->screen->b.info.max_se == 1) {
+		if (sctx->screen->b.info.max_se == 1 && tcs->info.uses_primid) {
 			/* The VGT HS block increments the patch ID unconditionally
 			 * within a single threadgroup. This results in incorrect
 			 * patch IDs when instanced draws are used.
 			 *
 			 * The intended solution is to restrict threadgroups to
 			 * a single instance by setting SWITCH_ON_EOI, which
 			 * should cause IA to split instances up. However, this
 			 * doesn't work correctly on SI when there is no other
 			 * SE to switch to.
 			 */
-- 
2.7.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev