From: Marek Olšák <marek.ol...@amd.com> Or do they? This doesn't hang, so it seems right, but I'm not 100% sure. Setting VGPRS=256 (i.e. above the limit) with big threadgroups works fine.
shader-db: Spilled VGPRs: 107 -> 50 (-53.27 %) DiRT Showdown and GRID Autosport have 100% reduction in VGPR spilling. There are no other changes for shader-db. --- src/gallium/drivers/radeonsi/si_shader.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 61f1384..0ffe402 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -4040,20 +4040,29 @@ static unsigned si_get_max_workgroup_size(const struct si_shader *shader) properties[TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH] * properties[TGSI_PROPERTY_CS_FIXED_BLOCK_HEIGHT] * properties[TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH]; if (!max_work_group_size) { /* This is a variable group size compute shader, * compile it for the maximum possible group size. */ max_work_group_size = SI_MAX_VARIABLE_THREADS_PER_BLOCK; } + + /* Compute shader threadgroups without LDS usage and barriers don't + * have to be stuck on the same compute unit, and so register usage + * doesn't have to be limited. + */ + if (!shader->selector->local_size && + !shader->selector->info.uses_barrier) + return MIN2(64, max_work_group_size); + return max_work_group_size; } static void declare_per_stage_desc_pointers(struct si_shader_context *ctx, LLVMTypeRef *params, unsigned *num_params, bool assign_params) { params[(*num_params)++] = si_const_array(ctx->v4i32, SI_NUM_SHADER_BUFFERS + SI_NUM_CONST_BUFFERS); -- 2.7.4 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev