On 05/25/2017 07:04 PM, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>

Or do they? This doesn't hang, so it seems right, but I'm not 100% sure.
Setting VGPRS=256 (i.e. above the limit) with big threadgroups works fine.

shader-db: Spilled VGPRs: 107 -> 50 (-53.27 %)

DiRT Showdown and GRID Autosport have 100% reduction in VGPR spilling.
There are no other changes for shader-db.

That would be awesome, maybe this will fix the performance issue with advanced lighting and DiRT Showdown? Did you check?

Either way, patches 1, 3-5 are:

Reviewed-by: Samuel Pitoiset <samuel.pitoi...@gmail.com>

---
  src/gallium/drivers/radeonsi/si_shader.c | 9 +++++++++
  1 file changed, 9 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 61f1384..0ffe402 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -4040,20 +4040,29 @@ static unsigned si_get_max_workgroup_size(const struct 
si_shader *shader)
                       properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH] *
                       properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT] *
                       properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH];
if (!max_work_group_size) {
                /* This is a variable group size compute shader,
                 * compile it for the maximum possible group size.
                 */
                max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
        }
+
+       /* Compute shader threadgroups without LDS usage and barriers don't
+        * have to be stuck on the same compute unit, and so register usage
+        * doesn't have to be limited.
+        */
+       if (!shader->selector->local_size &&
+           !shader->selector->info.uses_barrier)
+               return MIN2(64, max_work_group_size);
+
        return max_work_group_size;
  }
static void declare_per_stage_desc_pointers(struct si_shader_context *ctx,
                                            LLVMTypeRef *params,
                                            unsigned *num_params,
                                            bool assign_params)
  {
        params[(*num_params)++] = si_const_array(ctx->v4i32,
                                                 SI_NUM_SHADER_BUFFERS + 
SI_NUM_CONST_BUFFERS);

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to