Ilia, for stable, this is quite a big patch, adding a new class and source file.
As per documentation, our criteria to accept a patch in the stable queue includes not being larger than 100 lines, which this ones exceeds by far: https://www.mesa3d.org/submittingpatches.html#criteria Hence, by now I'm inclined to reject it. Let me know what you think. On Sun, 2017-10-22 at 17:37 -0400, Ilia Mirkin wrote: > There are two issues with the current implementation. First, it relies > on the layout(local_size_*) happening in the same shader as the main > function, and secondly it doesn't work for variable group sizes. > > In both cases, the simplest fix is to move the setup of these derived > values to a later time, similar to how the gl_VertexID workarounds are > done. There already exist system values defined for both of the derived > values, so we use them unconditionally, and lower them after linking is > performed. > > While we're at it, we move to using gl_LocalGroupSizeARB instead of > gl_WorkGroupSize for variable group sizes. > > Also the dead code elimination avoidance can be removed, since there > can be situations where gl_LocalGroupSizeARB is needed but has not been > inserted for the shader with main function. As a result, the lowering > code has to insert its own copies of the system values if needed. > > Reported-by: Stephane Chevigny <stephane.chevi...@polymtl.ca> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393 > Cc: mesa-sta...@lists.freedesktop.org > Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> > --- > > Well this all turned out to be a deeper rabbit-hole than I was anticipating. > I'm going to write up some piglit tests for these cases as they are fairly > subtle, and unlikely to occur in practice. > > I figured it was easier to just re-add the system values instead of having > the dead code not-remove them, since gl_LocalGroupSizeARB might not be in > the shader with the main function (it's only added if the variable size ext > is enabled, which it might only be in a second shader which sets the > local_size_*). So we'd have to have it anyways, might as well do it for all. > > src/compiler/Makefile.sources | 1 + > src/compiler/glsl/builtin_variables.cpp | 94 +-------- > src/compiler/glsl/glsl_parser_extras.cpp | 2 - > src/compiler/glsl/ir.h | 4 - > src/compiler/glsl/ir_optimization.h | 1 + > src/compiler/glsl/linker.cpp | 3 + > src/compiler/glsl/lower_cs_derived.cpp | 234 > +++++++++++++++++++++++ > src/compiler/glsl/meson.build | 1 + > src/compiler/glsl/opt_dead_builtin_variables.cpp | 22 --- > 9 files changed, 244 insertions(+), 118 deletions(-) > create mode 100644 src/compiler/glsl/lower_cs_derived.cpp > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources > index 2724a41286e..27cc33ab835 100644 > --- a/src/compiler/Makefile.sources > +++ b/src/compiler/Makefile.sources > @@ -85,6 +85,7 @@ LIBGLSL_FILES = \ > glsl/lower_buffer_access.cpp \ > glsl/lower_buffer_access.h \ > glsl/lower_const_arrays_to_uniforms.cpp \ > + glsl/lower_cs_derived.cpp \ > glsl/lower_discard.cpp \ > glsl/lower_discard_flow.cpp \ > glsl/lower_distance.cpp \ > diff --git a/src/compiler/glsl/builtin_variables.cpp > b/src/compiler/glsl/builtin_variables.cpp > index ea2d897cc8e..00bc99dd619 100644 > --- a/src/compiler/glsl/builtin_variables.cpp > +++ b/src/compiler/glsl/builtin_variables.cpp > @@ -1295,15 +1295,10 @@ builtin_variable_generator::generate_cs_special_vars() > uvec3_t, "gl_LocalGroupSizeARB"); > } > > - if (state->ctx->Const.LowerCsDerivedVariables) { > - add_variable("gl_GlobalInvocationID", uvec3_t, ir_var_auto, 0); > - add_variable("gl_LocalInvocationIndex", uint_t, ir_var_auto, 0); > - } else { > - add_system_value(SYSTEM_VALUE_GLOBAL_INVOCATION_ID, > - uvec3_t, "gl_GlobalInvocationID"); > - add_system_value(SYSTEM_VALUE_LOCAL_INVOCATION_INDEX, > - uint_t, "gl_LocalInvocationIndex"); > - } > + add_system_value(SYSTEM_VALUE_GLOBAL_INVOCATION_ID, > + uvec3_t, "gl_GlobalInvocationID"); > + add_system_value(SYSTEM_VALUE_LOCAL_INVOCATION_INDEX, > + uint_t, "gl_LocalInvocationIndex"); > } > > > @@ -1474,84 +1469,3 @@ _mesa_glsl_initialize_variables(exec_list > *instructions, > break; > } > } > - > - > -/** > - * Initialize compute shader variables with values that are derived from > other > - * compute shader variable. > - */ > -static void > -initialize_cs_derived_variables(gl_shader *shader, > - ir_function_signature *const main_sig) > -{ > - assert(shader->Stage == MESA_SHADER_COMPUTE); > - > - ir_variable *gl_GlobalInvocationID = > - shader->symbols->get_variable("gl_GlobalInvocationID"); > - assert(gl_GlobalInvocationID); > - ir_variable *gl_WorkGroupID = > - shader->symbols->get_variable("gl_WorkGroupID"); > - assert(gl_WorkGroupID); > - ir_variable *gl_WorkGroupSize = > - shader->symbols->get_variable("gl_WorkGroupSize"); > - if (gl_WorkGroupSize == NULL) { > - void *const mem_ctx = ralloc_parent(shader->ir); > - gl_WorkGroupSize = new(mem_ctx) ir_variable(glsl_type::uvec3_type, > - "gl_WorkGroupSize", > - ir_var_auto); > - gl_WorkGroupSize->data.how_declared = ir_var_declared_implicitly; > - gl_WorkGroupSize->data.read_only = true; > - shader->ir->push_head(gl_WorkGroupSize); > - } > - ir_variable *gl_LocalInvocationID = > - shader->symbols->get_variable("gl_LocalInvocationID"); > - assert(gl_LocalInvocationID); > - > - /* gl_GlobalInvocationID = > - * gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID > - */ > - ir_instruction *inst = > - assign(gl_GlobalInvocationID, > - add(mul(gl_WorkGroupID, gl_WorkGroupSize), > - gl_LocalInvocationID)); > - main_sig->body.push_head(inst); > - > - /* gl_LocalInvocationIndex = > - * gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y + > - * gl_LocalInvocationID.y * gl_WorkGroupSize.x + > - * gl_LocalInvocationID.x; > - */ > - ir_expression *index_z = > - mul(mul(swizzle_z(gl_LocalInvocationID), swizzle_x(gl_WorkGroupSize)), > - swizzle_y(gl_WorkGroupSize)); > - ir_expression *index_y = > - mul(swizzle_y(gl_LocalInvocationID), swizzle_x(gl_WorkGroupSize)); > - ir_expression *index_y_plus_z = add(index_y, index_z); > - operand index_x(swizzle_x(gl_LocalInvocationID)); > - ir_expression *index_x_plus_y_plus_z = add(index_y_plus_z, index_x); > - ir_variable *gl_LocalInvocationIndex = > - shader->symbols->get_variable("gl_LocalInvocationIndex"); > - assert(gl_LocalInvocationIndex); > - inst = assign(gl_LocalInvocationIndex, index_x_plus_y_plus_z); > - main_sig->body.push_head(inst); > -} > - > - > -/** > - * Initialize builtin variables with values based on other builtin variables. > - * These are initialized in the main function. > - */ > -void > -_mesa_glsl_initialize_derived_variables(struct gl_context *ctx, > - gl_shader *shader) > -{ > - /* We only need to set CS variables currently. */ > - if (shader->Stage == MESA_SHADER_COMPUTE && > - ctx->Const.LowerCsDerivedVariables) { > - ir_function_signature *const main_sig = > - _mesa_get_main_function_signature(shader->symbols); > - > - if (main_sig != NULL) > - initialize_cs_derived_variables(shader, main_sig); > - } > -} > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp > b/src/compiler/glsl/glsl_parser_extras.cpp > index 764c05ad802..51d835bc535 100644 > --- a/src/compiler/glsl/glsl_parser_extras.cpp > +++ b/src/compiler/glsl/glsl_parser_extras.cpp > @@ -2009,8 +2009,6 @@ opt_shader_and_create_symbol_table(struct gl_context > *ctx, > break; > } > } > - > - _mesa_glsl_initialize_derived_variables(ctx, shader); > } > > void > diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h > index 27eafe8e22b..070d7b3ffdd 100644 > --- a/src/compiler/glsl/ir.h > +++ b/src/compiler/glsl/ir.h > @@ -2413,10 +2413,6 @@ _mesa_glsl_initialize_variables(exec_list > *instructions, > struct _mesa_glsl_parse_state *state); > > extern void > -_mesa_glsl_initialize_derived_variables(struct gl_context *ctx, > - gl_shader *shader); > - > -extern void > reparent_ir(exec_list *list, void *mem_ctx); > > extern void > diff --git a/src/compiler/glsl/ir_optimization.h > b/src/compiler/glsl/ir_optimization.h > index eb3ec3b0c7d..f44ddcb05be 100644 > --- a/src/compiler/glsl/ir_optimization.h > +++ b/src/compiler/glsl/ir_optimization.h > @@ -166,6 +166,7 @@ void optimize_dead_builtin_variables(exec_list > *instructions, > bool lower_tess_level(gl_linked_shader *shader); > > bool lower_vertex_id(gl_linked_shader *shader); > +bool lower_cs_derived(gl_linked_shader *shader); > bool lower_blend_equation_advanced(gl_linked_shader *shader); > > bool lower_subroutine(exec_list *instructions, struct _mesa_glsl_parse_state > *state); > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp > index 03eb05bf637..f72bff53bf5 100644 > --- a/src/compiler/glsl/linker.cpp > +++ b/src/compiler/glsl/linker.cpp > @@ -2374,6 +2374,9 @@ link_intrastage_shaders(void *mem_ctx, > if (ctx->Const.VertexID_is_zero_based) > lower_vertex_id(linked); > > + if (ctx->Const.LowerCsDerivedVariables) > + lower_cs_derived(linked); > + > #ifdef DEBUG > /* Compute the source checksum. */ > linked->SourceChecksum = 0; > diff --git a/src/compiler/glsl/lower_cs_derived.cpp > b/src/compiler/glsl/lower_cs_derived.cpp > new file mode 100644 > index 00000000000..f7ec2a48b47 > --- /dev/null > +++ b/src/compiler/glsl/lower_cs_derived.cpp > @@ -0,0 +1,234 @@ > +/* > + * Copyright © 2017 Ilia Mirkin > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. > + */ > + > +/** > + * \file lower_cs_derived.cpp > + * > + * For hardware that does not support the gl_GlobalInvocationID and > + * gl_LocalInvocationIndex system values, replace them with fresh > + * globals. Note that we can't rely on gl_WorkGroupSize or > + * gl_LocalGroupSizeARB being available, since they may only have been > defined > + * in a non-main shader. > + * > + * [ This can happen if only a secondary shader has the layout(local_size_*) > + * declaration. ] > + * > + * This is meant to be run post-linking. > + */ > + > +#include "glsl_symbol_table.h" > +#include "ir_hierarchical_visitor.h" > +#include "ir.h" > +#include "ir_builder.h" > +#include "linker.h" > +#include "program/prog_statevars.h" > +#include "builtin_functions.h" > + > +using namespace ir_builder; > + > +namespace { > + > +class lower_cs_derived_visitor : public ir_hierarchical_visitor { > +public: > + explicit lower_cs_derived_visitor(gl_linked_shader *shader) > + : progress(false), > + shader(shader), > + local_size_variable(shader->Program->info.cs.local_size_variable), > + gl_WorkGroupSize(NULL), > + gl_WorkGroupID(NULL), > + gl_LocalInvocationID(NULL), > + gl_GlobalInvocationID(NULL), > + gl_LocalInvocationIndex(NULL) > + { > + main_sig = _mesa_get_main_function_signature(shader->symbols); > + assert(main_sig); > + } > + > + virtual ir_visitor_status visit(ir_dereference_variable *); > + > + ir_variable *add_system_value( > + int slot, const glsl_type *type, const char *name); > + void find_sysvals(); > + void make_gl_GlobalInvocationID(); > + void make_gl_LocalInvocationIndex(); > + > + bool progress; > + > +private: > + gl_linked_shader *shader; > + bool local_size_variable; > + ir_function_signature *main_sig; > + > + ir_rvalue *gl_WorkGroupSize; > + ir_variable *gl_WorkGroupID; > + ir_variable *gl_LocalInvocationID; > + > + ir_variable *gl_GlobalInvocationID; > + ir_variable *gl_LocalInvocationIndex; > +}; > + > +} /* anonymous namespace */ > + > +ir_variable * > +lower_cs_derived_visitor::add_system_value( > + int slot, const glsl_type *type, const char *name) > +{ > + ir_variable *var = new(shader) ir_variable(type, name, > ir_var_system_value); > + var->data.how_declared = ir_var_declared_implicitly; > + var->data.read_only = true; > + var->data.location = slot; > + var->data.explicit_location = true; > + var->data.explicit_index = 0; > + shader->ir->push_head(var); > + > + return var; > +} > + > +void > +lower_cs_derived_visitor::find_sysvals() > +{ > + if (gl_WorkGroupSize != NULL) > + return; > + > + ir_variable *WorkGroupSize; > + if (local_size_variable) > + WorkGroupSize = shader->symbols->get_variable("gl_LocalGroupSizeARB"); > + else > + WorkGroupSize = shader->symbols->get_variable("gl_WorkGroupSize"); > + if (WorkGroupSize) > + gl_WorkGroupSize = new(shader) ir_dereference_variable(WorkGroupSize); > + gl_WorkGroupID = shader->symbols->get_variable("gl_WorkGroupID"); > + gl_LocalInvocationID = > shader->symbols->get_variable("gl_LocalInvocationID"); > + > + /* > + * These may be missing due to either dead code elimination, or, in the > + * case of the group size, due to the layout being declared in a non-main > + * shader. Re-create them. > + */ > + > + if (!gl_WorkGroupID) > + gl_WorkGroupID = add_system_value( > + SYSTEM_VALUE_WORK_GROUP_ID, glsl_type::uvec3_type, > "gl_WorkGroupID"); > + if (!gl_LocalInvocationID) > + gl_LocalInvocationID = add_system_value( > + SYSTEM_VALUE_LOCAL_INVOCATION_ID, glsl_type::uvec3_type, > + "gl_LocalInvocationID"); > + if (!WorkGroupSize) { > + if (local_size_variable) { > + gl_WorkGroupSize = new(shader) ir_dereference_variable( > + add_system_value( > + SYSTEM_VALUE_LOCAL_GROUP_SIZE, glsl_type::uvec3_type, > + "gl_LocalGroupSizeARB")); > + } else { > + ir_constant_data data; > + memset(&data, 0, sizeof(data)); > + for (int i = 0; i < 3; i++) > + data.u[i] = shader->Program->info.cs.local_size[i]; > + gl_WorkGroupSize = new(shader) ir_constant(glsl_type::uvec3_type, > &data); > + } > + } > +} > + > +void > +lower_cs_derived_visitor::make_gl_GlobalInvocationID() > +{ > + if (gl_GlobalInvocationID != NULL) > + return; > + > + find_sysvals(); > + > + /* gl_GlobalInvocationID = > + * gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID > + */ > + gl_GlobalInvocationID = new(shader) ir_variable( > + glsl_type::uvec3_type, "__GlobalInvocationID", ir_var_temporary); > + shader->ir->push_head(gl_GlobalInvocationID); > + > + ir_instruction *inst = > + assign(gl_GlobalInvocationID, > + add(mul(gl_WorkGroupID, gl_WorkGroupSize->clone(shader, NULL)), > + gl_LocalInvocationID)); > + main_sig->body.push_head(inst); > +} > + > +void > +lower_cs_derived_visitor::make_gl_LocalInvocationIndex() > +{ > + if (gl_LocalInvocationIndex != NULL) > + return; > + > + find_sysvals(); > + > + /* gl_LocalInvocationIndex = > + * gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y + > + * gl_LocalInvocationID.y * gl_WorkGroupSize.x + > + * gl_LocalInvocationID.x; > + */ > + gl_LocalInvocationIndex = new(shader) > + ir_variable(glsl_type::uint_type, "__LocalInvocationIndex", > ir_var_temporary); > + shader->ir->push_head(gl_LocalInvocationIndex); > + > + ir_expression *index_z = > + mul(mul(swizzle_z(gl_LocalInvocationID), > swizzle_x(gl_WorkGroupSize->clone(shader, NULL))), > + swizzle_y(gl_WorkGroupSize->clone(shader, NULL))); > + ir_expression *index_y = > + mul(swizzle_y(gl_LocalInvocationID), > swizzle_x(gl_WorkGroupSize->clone(shader, NULL))); > + ir_expression *index_y_plus_z = add(index_y, index_z); > + operand index_x(swizzle_x(gl_LocalInvocationID)); > + ir_expression *index_x_plus_y_plus_z = add(index_y_plus_z, index_x); > + ir_instruction *inst = > + assign(gl_LocalInvocationIndex, index_x_plus_y_plus_z); > + main_sig->body.push_head(inst); > +} > + > +ir_visitor_status > +lower_cs_derived_visitor::visit(ir_dereference_variable *ir) > +{ > + if (ir->var->data.mode == ir_var_system_value && > + ir->var->data.location == SYSTEM_VALUE_GLOBAL_INVOCATION_ID) { > + make_gl_GlobalInvocationID(); > + ir->var = gl_GlobalInvocationID; > + progress = true; > + } > + > + if (ir->var->data.mode == ir_var_system_value && > + ir->var->data.location == SYSTEM_VALUE_LOCAL_INVOCATION_INDEX) { > + make_gl_LocalInvocationIndex(); > + ir->var = gl_LocalInvocationIndex; > + progress = true; > + } > + > + return visit_continue; > +} > + > +bool > +lower_cs_derived(gl_linked_shader *shader) > +{ > + if (shader->Stage != MESA_SHADER_COMPUTE) > + return false; > + > + lower_cs_derived_visitor v(shader); > + v.run(shader->ir); > + > + return v.progress; > +} > diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build > index d1a75eb8c36..76fcafb9910 100644 > --- a/src/compiler/glsl/meson.build > +++ b/src/compiler/glsl/meson.build > @@ -124,6 +124,7 @@ files_libglsl = files( > 'lower_buffer_access.cpp', > 'lower_buffer_access.h', > 'lower_const_arrays_to_uniforms.cpp', > + 'lower_cs_derived.cpp', > 'lower_discard.cpp', > 'lower_discard_flow.cpp', > 'lower_distance.cpp', > diff --git a/src/compiler/glsl/opt_dead_builtin_variables.cpp > b/src/compiler/glsl/opt_dead_builtin_variables.cpp > index 03e578982b9..0d4e3a8f00a 100644 > --- a/src/compiler/glsl/opt_dead_builtin_variables.cpp > +++ b/src/compiler/glsl/opt_dead_builtin_variables.cpp > @@ -62,23 +62,6 @@ optimize_dead_builtin_variables(exec_list *instructions, > * information, so removing these variables from the user shader will > * cause problems later. > * > - * For compute shaders, gl_GlobalInvocationID has some dependencies, so > - * we avoid removing these dependencies. > - * > - * We also avoid removing gl_GlobalInvocationID at this stage because > it > - * might be used by a linked shader. In this case it still needs to be > - * initialized by the main function. > - * > - * gl_GlobalInvocationID = > - * gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID > - * > - * Similarly, we initialize gl_LocalInvocationIndex in the main > function: > - * > - * gl_LocalInvocationIndex = > - * gl_LocalInvocationID.z * gl_WorkGroupSize.x * > gl_WorkGroupSize.y + > - * gl_LocalInvocationID.y * gl_WorkGroupSize.x + > - * gl_LocalInvocationID.x; > - * > * Matrix uniforms with "Transpose" are not eliminated because there's > * an optimization pass that can turn references to the regular matrix > * into references to the transpose matrix. Eliminating the transpose > @@ -90,11 +73,6 @@ optimize_dead_builtin_variables(exec_list *instructions, > */ > if (strcmp(var->name, "gl_ModelViewProjectionMatrix") == 0 > || strcmp(var->name, "gl_Vertex") == 0 > - || strcmp(var->name, "gl_WorkGroupID") == 0 > - || strcmp(var->name, "gl_WorkGroupSize") == 0 > - || strcmp(var->name, "gl_LocalInvocationID") == 0 > - || strcmp(var->name, "gl_GlobalInvocationID") == 0 > - || strcmp(var->name, "gl_LocalInvocationIndex") == 0 > || strstr(var->name, "Transpose") != NULL) > continue; > -- Br, Andres _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev