On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote: > Iago Toral <ito...@igalia.com> writes: > > > On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote: > >> I wanted to question whether this was required, based on this text > >> from the extension spec: > >> > >> "The ability to write to buffer objects creates the potential for > >> multiple independent shader invocations to read and write the same > >> underlying memory. The same issue exists with the > >> ARB_shader_image_load_store extension provided in OpenGL 4.2, which > >> can write to texture objects and buffers. In both cases, the > >> specification makes few guarantees related to the relative order of > >> memory reads and writes performed by the shader invocations." > >> > >> But I'm not sure if we can reconcile CSE with 'memoryBarrier' and > >> 'barrier'. curro, any thoughts from image load/store? > > > > I think the problem is within the same thread, that text above talks > > about multiple invocations reading from and writing to the same > > location, but within the same invocation, the order of reads and writes > > must be preserved: > > > > "Buffer variable memory reads and writes within a single shader > > invocation are processed in order. However, the order of reads and > > writes performed in one invocation relative to those performed by > > another invocation is largely undefined." > > > > For example, if X is a shader storage buffer variable and we have code > > like this with just one invocation: > > > > ssbo_store(X, 1); > > a = ssbo_load(X) + 1 // a = 2 > > ssbo_store(X, 2); > > b = ssbo_load(X) + 1; // b = 3 > > > > CSE could mess it up like this: > > > > ssbo_store(X, 1); > > tmp = ssbo_load(X) + 1 // tmp = 2 > > a = tmp; > > ssbo_store(X, 2); > > b = tmp; > > > > which would be incorrect. I think I wrote this patch after seeing > > something like this happening. The CSE pass clearly states that it does > > not support write variables after all. > > > > Also, notice the same would apply if there are multiple invocations but > > the shader code used something like gl_VertexID or gl_FragCoord to make > > each invocation read from/write to a different address within the SSBO > > buffer (I imagine this is the usual way to operate with SSBOs). In these > > cases, even if we have multiple invocations, keeping the relative order > > of reads and writes within each one is necessary. > > > > AFAICT the reason why this (and many of the other changes in GLSL > optimization passes) is needed is because SSBO loads have been > implemented as ir_expression nodes instead of being lowered into > intrinsics (as other side-effectful operations do like > ARB_shader_image_load_store and ARB_shader_atomic_counters). This > surely broke the assumption of a number of optimization passes that > ir_expression nodes behave as pure functions. I guess the reason why > you've done it this way is because UBO loads were already being > represented as expressions, so I see why you may have wanted to use the > same approach for SSBOs even though there is a fundamental difference > between the two: UBO loads have no side effects and are constant for a > given set of arguments and a given shader execution, SSBO loads and > stores are not. SSBO stores couldn't be accommodated into the same > framework so easily, and you decided to create a separate ir node for > them, what seems inconsistent with loads. Intrinsics would probably > have been a good fit for both loads and stores, and would have made all > these optimization changes unnecessary... > > P.S.: Sorry for the late reply, I was on vacation when I was CC'ed.
Right, your assessment about the reasons behind the current implementation is correct. I did not realize of these issues when I decided to go with the current implementation, now it does look like going with GLSL intrinsics would have made things a bit easier. I suppose it would make sense to revisit the implementation in the near future taking your work on arb_shader_image_load_store as a reference. Iago > > Iago > > > >> -Jordan > >> > >> On 2015-06-03 00:01:13, Iago Toral Quiroga wrote: > >> > SSBOs are read/write and this CSE pass only handles read-only variables. > >> > --- > >> > src/glsl/opt_cse.cpp | 33 ++++++++++++++++++++++++++++++++- > >> > 1 file changed, 32 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp > >> > index 4b8e9a0..a05ab46 100644 > >> > --- a/src/glsl/opt_cse.cpp > >> > +++ b/src/glsl/opt_cse.cpp > >> > @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue > >> > *needle) > >> > } > >> > > >> > static bool > >> > +expression_contains_ssbo_load(ir_expression *expr) > >> > +{ > >> > + if (expr->operation == ir_binop_ssbo_load) > >> > + return true; > >> > + > >> > + for (unsigned i = 0; i < expr->get_num_operands(); i++) { > >> > + ir_rvalue *op = expr->operands[i]; > >> > + if (op->ir_type == ir_type_expression && > >> > + expression_contains_ssbo_load(op->as_expression())) { > >> > + return true; > >> > + } else if (op->ir_type == ir_type_swizzle) { > >> > + ir_swizzle *swizzle = op->as_swizzle(); > >> > + ir_expression *val = swizzle->val->as_expression(); > >> > + if (val && expression_contains_ssbo_load(val)) > >> > + return true; > >> > + } > >> > + } > >> > + > >> > + return false; > >> > +} > >> > + > >> > +static bool > >> > is_cse_candidate(ir_rvalue *ir) > >> > { > >> > /* Our temporary variable assignment generation isn't ready to handle > >> > @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir) > >> > * to variable-index array dereferences at some point. > >> > */ > >> > switch (ir->ir_type) { > >> > - case ir_type_expression: > >> > + case ir_type_expression: { > >> > + /* Skip expressions involving SSBO loads, since these operate > >> > on > >> > + * read-write variables, meaning that the same ssbo_load > >> > expression > >> > + * may return a different value if the underlying buffer > >> > storage > >> > + * is written in between. > >> > + */ > >> > + if (expression_contains_ssbo_load(ir->as_expression())) > >> > + return false; > >> > + } > >> > + break; > >> > case ir_type_texture: > >> > break; > >> > default: > >> > -- > >> > 1.9.1 > >> > > >> > _______________________________________________ > >> > mesa-dev mailing list > >> > mesa-dev@lists.freedesktop.org > >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > >> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev