On 06.06.2017 16:30, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>
The workaround causes a massive performance decrease on 1-SE parts
(Cape Verde, Hainan, Oland).
The performance regression is already part of 17.0 and 17.1.
Cc: 17.0 17.1 <mesa-sta...@lists.freedesktop.org>
---
src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index cd069e3..75e83ff 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -188,21 +188,21 @@ static void si_emit_derived_tess_state(struct si_context *sctx,
*/
*num_patches = MIN2(*num_patches, 40);
if (sctx->b.chip_class == SI) {
/* SI bug workaround, related to power management. Limit LS-HS
* threadgroups to only one wave.
*/
unsigned one_wave = 64 / MAX2(num_tcs_input_cp, num_tcs_output_cp);
*num_patches = MIN2(*num_patches, one_wave);
- if (sctx->screen->b.info.max_se == 1) {
+ if (sctx->screen->b.info.max_se == 1 && tcs->info.uses_primid) {
This is insufficient. All downstream shader stages are affected,
including TES, GS, and even PS (unless there's an API GS -- in that
case, the PS gets the GS primid output).
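As a rough illustration of the broader check this implies, here is a minimal, self-contained sketch. The struct and function names (shader_info, draw_shaders, patch_id_is_observed) are hypothetical stand-ins, not the actual radeonsi data structures, and the sketch only models which stages can observe the patch ID as described above:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-shader info; NOT the real radeonsi struct. */
struct shader_info {
    bool uses_primid;
};

/* Hypothetical bundle of the shaders bound for a tessellated draw. */
struct draw_shaders {
    const struct shader_info *tcs;
    const struct shader_info *tes;
    const struct shader_info *gs; /* NULL when there is no API GS */
    const struct shader_info *ps;
};

static bool stage_uses_primid(const struct shader_info *s)
{
    return s != NULL && s->uses_primid;
}

/* Returns true if any stage that can see the patch ID actually reads it. */
static bool patch_id_is_observed(const struct draw_shaders *d)
{
    if (stage_uses_primid(d->tcs) ||
        stage_uses_primid(d->tes) ||
        stage_uses_primid(d->gs))
        return true;

    /* With an API GS, the PS gets the GS primid output instead,
     * so the PS only matters when no GS is bound. */
    return d->gs == NULL && stage_uses_primid(d->ps);
}
```

Under these assumptions, the condition in the patch would test something like patch_id_is_observed() rather than only the TCS.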
Cheers,
Nicolai
/* The VGT HS block increments the patch ID unconditionally
* within a single threadgroup. This results in incorrect
* patch IDs when instanced draws are used.
*
* The intended solution is to restrict threadgroups to
* a single instance by setting SWITCH_ON_EOI, which
* should cause IA to split instances up. However, this
* doesn't work correctly on SI when there is no other
* SE to switch to.
*/
--
Learn how the world really is,
but never forget how it should be.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev