On Mon, 2015-11-16 at 17:52 -0800, Jordan Justen wrote: > On 2015-11-16 04:27:55, Iago Toral wrote: > > On Sat, 2015-11-14 at 13:43 -0800, Jordan Justen wrote: > > > This class has code that will be shared by lower_ubo_reference and > > > lower_shared_reference. (lower_shared_reference will be used to > > > support compute shader shared variables.) > > > > > > Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com> > > > Cc: Samuel Iglesias Gonsalvez <sigles...@igalia.com> > > > Cc: Iago Toral Quiroga <ito...@igalia.com> > > > --- > > > src/glsl/Makefile.sources | 1 + > > > src/glsl/lower_buffer_access.cpp | 307 > > > +++++++++++++++++++++++++++++++++++++++ > > > src/glsl/lower_buffer_access.h | 56 +++++++ > > > src/glsl/lower_ubo_reference.cpp | 180 +---------------------- > > > 4 files changed, 367 insertions(+), 177 deletions(-) > > > create mode 100644 src/glsl/lower_buffer_access.cpp > > > create mode 100644 src/glsl/lower_buffer_access.h > > > > > > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources > > > index d4b02c1..f2c95c0 100644 > > > --- a/src/glsl/Makefile.sources > > > +++ b/src/glsl/Makefile.sources > > > @@ -155,6 +155,7 @@ LIBGLSL_FILES = \ > > > loop_analysis.h \ > > > loop_controls.cpp \ > > > loop_unroll.cpp \ > > > + lower_buffer_access.cpp \ > > > lower_clip_distance.cpp \ > > > lower_const_arrays_to_uniforms.cpp \ > > > lower_discard.cpp \ > > > diff --git a/src/glsl/lower_buffer_access.cpp > > > b/src/glsl/lower_buffer_access.cpp > > > new file mode 100644 > > > index 0000000..e0b5a2f > > > --- /dev/null > > > +++ b/src/glsl/lower_buffer_access.cpp > > > @@ -0,0 +1,307 @@ > > > +/* > > > + * Copyright (c) 2015 Intel Corporation > > > + * > > > + * Permission is hereby granted, free of charge, to any person obtaining > > > a > > > + * copy of this software and associated documentation files (the > > > "Software"), > > > + * to deal in the Software without restriction, including without > > > limitation > > > + * the rights to use, copy, modify, merge, publish, distribute, > > > sublicense, > > > + * and/or sell copies of the Software, and to permit persons to whom the > > > + * Software is furnished to do so, subject to the following conditions: > > > + * > > > + * The above copyright notice and this permission notice (including the > > > next > > > + * paragraph) shall be included in all copies or substantial portions of > > > the > > > + * Software. > > > + * > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > > > EXPRESS OR > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > > > MERCHANTABILITY, > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT > > > SHALL > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > > OTHER > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > > > ARISING > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > > + * DEALINGS IN THE SOFTWARE. > > > + */ > > > + > > > +/** > > > + * \file lower_buffer_access.cpp > > > + * > > > + * Helper for IR lowering pass to replace dereferences of buffer object > > > based > > > + * shader variables with intrinsic function calls. > > > + * > > > + * This helper is used by lowering passes for UBOs, SSBOs and compute > > > shader > > > + * shared variables. 
> > > + */ > > > + > > > +#include "ir.h" > > > +#include "ir_builder.h" > > > +#include "ir_rvalue_visitor.h" > > > +#include "main/macros.h" > > > +#include "util/list.h" > > > +#include "glsl_parser_extras.h" > > > +#include "lower_buffer_access.h" > > > + > > > +using namespace ir_builder; > > > + > > > +namespace lower_buffer_access { > > > + > > > +static inline int > > > +writemask_for_size(unsigned n) > > > +{ > > > + return ((1 << n) - 1); > > > +} > > > + > > > +/** > > > + * Takes LHS and emits a series of assignments into its components > > > + * from the shared variable storage. > > > > I find this part of the comment a bit confusing. This function breaks a > > dereference access into one or multiple accesses to the underlying > > buffer storage. Such dereference could be in a RHS expression, and in > > fact, that will always be the case for UBO and SSBO loads. > > Hmm. I may have copied this comment from lower_ubo_reference some time > back. Anyway, I intended to use the current comment from > lower_ubo_reference: > > /** > * Takes a deref and recursively calls itself to break the deref down to the > * point that the reads or writes generated are contiguous scalars or vectors. > */
Yeah, that looks better. > > > + * Recursively calls itself to break the deref down to the point that > > > + * the intrinsic calls are generated. > > > + */ > > > +void > > > +lower_buffer_access::emit_access(bool is_write, > > > + ir_dereference *deref, > > > + ir_variable *base_offset, > > > + unsigned int deref_offset, > > > + bool row_major, > > > + int matrix_columns, > > > + unsigned int packing, > > > + unsigned int write_mask) > > > +{ > > > > Why not pass mem_ctx as parameter instead of having it be a class > > member? I find it a bit odd that this class defines mem_ctx but never > > really takes care of initializing it, expecting that subclasses do that > > for it, so in that case why not just make them actually take care of > > passing the mem_ctx to use instead? > > > > If you rather keep mem_ctx defined here I'd at least suggest to add an > > assert to the functions that use it to check that it has indeed been > > initialized by the subclass. > > I think your comment applies to the current code in > lower_ubo_reference as well. It resets mem_ctx at various points. I > will try to get rid of mem_ctx as a member variable in all the related > classes and add it as a parameter instead. Yes, the handling of the memory context in lower_ubo_reference is a bit messy, if you can fix that too it'd be great. > Thanks, > > -Jordan > > > > + if (deref->type->is_record()) { > > > + unsigned int field_offset = 0; > > > + > > > + for (unsigned i = 0; i < deref->type->length; i++) { > > > + const struct glsl_struct_field *field = > > > + &deref->type->fields.structure[i]; > > > + ir_dereference *field_deref = > > > + new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, > > > NULL), > > > + field->name); > > > + > > > + field_offset = > > > + glsl_align(field_offset, > > > + field->type->std140_base_alignment(row_major)); > > > + > > > + emit_access(is_write, field_deref, base_offset, > > > + deref_offset + field_offset, > > > + row_major, 1, packing, > > > + > > > writemask_for_size(field_deref->type->vector_elements)); > > > + > > > + field_offset += field->type->std140_size(row_major); > > > + } > > > + return; > > > + } > > > + > > > + if (deref->type->is_array()) { > > > + unsigned array_stride = packing == GLSL_INTERFACE_PACKING_STD430 ? > > > + deref->type->fields.array->std430_array_stride(row_major) : > > > + glsl_align(deref->type->fields.array->std140_size(row_major), > > > 16); > > > + > > > + for (unsigned i = 0; i < deref->type->length; i++) { > > > + ir_constant *element = new(mem_ctx) ir_constant(i); > > > + ir_dereference *element_deref = > > > + new(mem_ctx) ir_dereference_array(deref->clone(mem_ctx, > > > NULL), > > > + element); > > > + emit_access(is_write, element_deref, base_offset, > > > + deref_offset + i * array_stride, > > > + row_major, 1, packing, > > > + > > > writemask_for_size(element_deref->type->vector_elements)); > > > + } > > > + return; > > > + } > > > + > > > + if (deref->type->is_matrix()) { > > > + for (unsigned i = 0; i < deref->type->matrix_columns; i++) { > > > + ir_constant *col = new(mem_ctx) ir_constant(i); > > > + ir_dereference *col_deref = > > > + new(mem_ctx) ir_dereference_array(deref->clone(mem_ctx, > > > NULL), col); > > > + > > > + if (row_major) { > > > + /* For a row-major matrix, the next column starts at the next > > > + * element. > > > + */ > > > + int size_mul = deref->type->is_double() ? 
8 : 4; > > > + emit_access(is_write, col_deref, base_offset, > > > + deref_offset + i * size_mul, > > > + row_major, deref->type->matrix_columns, packing, > > > + > > > writemask_for_size(col_deref->type->vector_elements)); > > > + } else { > > > + int size_mul; > > > + > > > + /* std430 doesn't round up vec2 size to a vec4 size */ > > > + if (packing == GLSL_INTERFACE_PACKING_STD430 && > > > + deref->type->vector_elements == 2 && > > > + !deref->type->is_double()) { > > > + size_mul = 8; > > > + } else { > > > + /* std140 always rounds the stride of arrays (and > > > matrices) to a > > > + * vec4, so matrices are always 16 between columns/rows. > > > With > > > + * doubles, they will be 32 apart when there are more > > > than 2 rows. > > > + * > > > + * For both std140 and std430, if the member is a > > > + * three-'component vector with components consuming N > > > basic > > > + * machine units, the base alignment is 4N. For vec4, base > > > + * alignment is 4N. > > > + */ > > > + size_mul = (deref->type->is_double() && > > > + deref->type->vector_elements > 2) ? 32 : 16; > > > + } > > > + > > > + emit_access(is_write, col_deref, base_offset, > > > + deref_offset + i * size_mul, > > > + row_major, deref->type->matrix_columns, packing, > > > + > > > writemask_for_size(col_deref->type->vector_elements)); > > > + } > > > + } > > > + return; > > > + } > > > + > > > + assert(deref->type->is_scalar() || deref->type->is_vector()); > > > + > > > + if (!row_major) { > > > + ir_rvalue *offset = > > > + add(base_offset, new(mem_ctx) ir_constant(deref_offset)); > > > + unsigned mask = > > > + is_write ? write_mask : (1 << deref->type->vector_elements) - 1; > > > + insert_buffer_access(deref, deref->type, offset, mask, -1); > > > + } else { > > > + unsigned N = deref->type->is_double() ? 8 : 4; > > > + > > > + /* We're dereffing a column out of a row-major matrix, so we > > > + * gather the vector from each stored row. > > > + */ > > > + assert(deref->type->base_type == GLSL_TYPE_FLOAT || > > > + deref->type->base_type == GLSL_TYPE_DOUBLE); > > > + /* Matrices, row_major or not, are stored as if they were > > > + * arrays of vectors of the appropriate size in std140. > > > + * Arrays have their strides rounded up to a vec4, so the > > > + * matrix stride is always 16. However a double matrix may either > > > be 16 > > > + * or 32 depending on the number of columns. > > > + */ > > > + assert(matrix_columns <= 4); > > > + unsigned matrix_stride = 0; > > > + /* Matrix stride for std430 mat2xY matrices are not rounded up to > > > + * vec4 size. From OpenGL 4.3 spec, section 7.6.2.2 "Standard > > > Uniform > > > + * Block Layout": > > > + * > > > + * "2. If the member is a two- or four-component vector with > > > components > > > + * consuming N basic machine units, the base alignment is 2N or 4N, > > > + * respectively." [...] > > > + * "4. If the member is an array of scalars or vectors, the base > > > alignment > > > + * and array stride are set to match the base alignment of a > > > single array > > > + * element, according to rules (1), (2), and (3), and rounded up > > > to the > > > + * base alignment of a vec4." [...] > > > + * "7. If the member is a row-major matrix with C columns and R > > > rows, the > > > + * matrix is stored identically to an array of R row vectors with C > > > + * components each, according to rule (4)." [...] 
> > > + * "When using the std430 storage layout, shader storage blocks > > > will be > > > + * laid out in buffer storage identically to uniform and shader > > > storage > > > + * blocks using the std140 layout, except that the base alignment > > > and > > > + * stride of arrays of scalars and vectors in rule 4 and of > > > structures in > > > + * rule 9 are not rounded up a multiple of the base alignment of a > > > vec4." > > > + */ > > > + if (packing == GLSL_INTERFACE_PACKING_STD430 && matrix_columns == > > > 2) > > > + matrix_stride = 2 * N; > > > + else > > > + matrix_stride = glsl_align(matrix_columns * N, 16); > > > + > > > + const glsl_type *deref_type = deref->type->base_type == > > > GLSL_TYPE_FLOAT ? > > > + glsl_type::float_type : glsl_type::double_type; > > > + > > > + for (unsigned i = 0; i < deref->type->vector_elements; i++) { > > > + ir_rvalue *chan_offset = > > > + add(base_offset, > > > + new(mem_ctx) ir_constant(deref_offset + i * > > > matrix_stride)); > > > + if (!is_write || ((1U << i) & write_mask)) > > > + insert_buffer_access(deref, deref_type, chan_offset, (1U << > > > i), i); > > > + } > > > + } > > > +} > > > + > > > +/** > > > + * Determine if a thing being dereferenced is row-major > > > + * > > > + * There is some trickery here. > > > + * > > > + * If the thing being dereferenced is a member of uniform block \b > > > without an > > > + * instance name, then the name of the \c ir_variable is the field name > > > of an > > > + * interface type. If this field is row-major, then the thing > > > referenced is > > > + * row-major. > > > + * > > > + * If the thing being dereferenced is a member of uniform block \b with > > > an > > > + * instance name, then the last dereference in the tree will be an > > > + * \c ir_dereference_record. If that record field is row-major, then the > > > + * thing referenced is row-major. 
> > > + */ > > > +static bool > > > +is_dereferenced_thing_row_major(const ir_dereference *deref) > > > +{ > > > + bool matrix = false; > > > + const ir_rvalue *ir = deref; > > > + > > > + while (true) { > > > + matrix = matrix || ir->type->without_array()->is_matrix(); > > > + > > > + switch (ir->ir_type) { > > > + case ir_type_dereference_array: { > > > + const ir_dereference_array *const array_deref = > > > + (const ir_dereference_array *) ir; > > > + > > > + ir = array_deref->array; > > > + break; > > > + } > > > + > > > + case ir_type_dereference_record: { > > > + const ir_dereference_record *const record_deref = > > > + (const ir_dereference_record *) ir; > > > + > > > + ir = record_deref->record; > > > + > > > + const int idx = ir->type->field_index(record_deref->field); > > > + assert(idx >= 0); > > > + > > > + const enum glsl_matrix_layout matrix_layout = > > > + > > > glsl_matrix_layout(ir->type->fields.structure[idx].matrix_layout); > > > + > > > + switch (matrix_layout) { > > > + case GLSL_MATRIX_LAYOUT_INHERITED: > > > + break; > > > + case GLSL_MATRIX_LAYOUT_COLUMN_MAJOR: > > > + return false; > > > + case GLSL_MATRIX_LAYOUT_ROW_MAJOR: > > > + return matrix || deref->type->without_array()->is_record(); > > > + } > > > + > > > + break; > > > + } > > > + > > > + case ir_type_dereference_variable: { > > > + const ir_dereference_variable *const var_deref = > > > + (const ir_dereference_variable *) ir; > > > + > > > + const enum glsl_matrix_layout matrix_layout = > > > + glsl_matrix_layout(var_deref->var->data.matrix_layout); > > > + > > > + switch (matrix_layout) { > > > + case GLSL_MATRIX_LAYOUT_INHERITED: > > > + case GLSL_MATRIX_LAYOUT_COLUMN_MAJOR: > > > + return false; > > > + case GLSL_MATRIX_LAYOUT_ROW_MAJOR: > > > + return matrix || deref->type->without_array()->is_record(); > > > + } > > > + > > > + unreachable("invalid matrix layout"); > > > + break; > > > + } > > > + > > > + default: > > > + return false; > > > + } > > > + } > > > + > > > + /* The tree must have ended with a dereference that wasn't an > > > + * ir_dereference_variable. That is invalid, and it should be > > > impossible. > > > + */ > > > + unreachable("invalid dereference tree"); > > > + return false; > > > +} > > > + > > > +} /* namespace lower_buffer_access */ > > > diff --git a/src/glsl/lower_buffer_access.h > > > b/src/glsl/lower_buffer_access.h > > > new file mode 100644 > > > index 0000000..3138963 > > > --- /dev/null > > > +++ b/src/glsl/lower_buffer_access.h > > > @@ -0,0 +1,56 @@ > > > +/* > > > + * Copyright (c) 2015 Intel Corporation > > > + * > > > + * Permission is hereby granted, free of charge, to any person obtaining > > > a > > > + * copy of this software and associated documentation files (the > > > "Software"), > > > + * to deal in the Software without restriction, including without > > > limitation > > > + * the rights to use, copy, modify, merge, publish, distribute, > > > sublicense, > > > + * and/or sell copies of the Software, and to permit persons to whom the > > > + * Software is furnished to do so, subject to the following conditions: > > > + * > > > + * The above copyright notice and this permission notice (including the > > > next > > > + * paragraph) shall be included in all copies or substantial portions of > > > the > > > + * Software. 
> > > + * > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > > > EXPRESS OR > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > > > MERCHANTABILITY, > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT > > > SHALL > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > > OTHER > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > > > ARISING > > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > > + * DEALINGS IN THE SOFTWARE. > > > + */ > > > + > > > +/** > > > + * \file lower_buffer_access.h > > > + * > > > + * Helper for IR lowering pass to replace dereferences of buffer object > > > based > > > + * shader variables with intrinsic function calls. > > > + * > > > + * This helper is used by lowering passes for UBOs, SSBOs and compute > > > shader > > > + * shared variables. > > > + */ > > > + > > > +#pragma once > > > +#ifndef LOWER_BUFFER_ACCESS_H > > > +#define LOWER_BUFFER_ACCESS_H > > > + > > > +namespace lower_buffer_access { > > > + > > > +class lower_buffer_access : public ir_rvalue_enter_visitor { > > > +public: > > > + virtual void > > > + insert_buffer_access(ir_dereference *deref, const glsl_type *type, > > > + ir_rvalue *offset, unsigned mask, int channel) = > > > 0; > > > + > > > + void emit_access(bool is_write, ir_dereference *deref, > > > + ir_variable *base_offset, unsigned int deref_offset, > > > + bool row_major, int matrix_columns, > > > + unsigned int packing, unsigned int write_mask); > > > + > > > + void *mem_ctx; > > > +}; > > > + > > > +} /* namespace lower_buffer_access */ > > > + > > > +#endif /* LOWER_BUFFER_ACCESS_H */ > > > diff --git a/src/glsl/lower_ubo_reference.cpp > > > b/src/glsl/lower_ubo_reference.cpp > > > index b8fcc8e..8de4f5e 100644 > > > --- a/src/glsl/lower_ubo_reference.cpp > > > +++ b/src/glsl/lower_ubo_reference.cpp > > > @@ -38,6 +38,7 @@ > > > #include "ir_rvalue_visitor.h" > > > #include "main/macros.h" > > > #include "glsl_parser_extras.h" > > > +#include "lower_buffer_access.h" > > > > > > using namespace ir_builder; > > > > > > @@ -132,7 +133,8 @@ is_dereferenced_thing_row_major(const ir_rvalue > > > *deref) > > > } > > > > > > namespace { > > > -class lower_ubo_reference_visitor : public ir_rvalue_enter_visitor { > > > +class lower_ubo_reference_visitor : > > > + public lower_buffer_access::lower_buffer_access { > > > public: > > > lower_ubo_reference_visitor(struct gl_shader *shader) > > > : shader(shader) > > > @@ -173,11 +175,6 @@ public: > > > void insert_buffer_access(ir_dereference *deref, const glsl_type > > > *type, > > > ir_rvalue *offset, unsigned mask, int > > > channel); > > > > > > - void emit_access(bool is_write, ir_dereference *deref, > > > - ir_variable *base_offset, unsigned int deref_offset, > > > - bool row_major, int matrix_columns, > > > - unsigned packing, unsigned write_mask); > > > - > > > ir_visitor_status visit_enter(class ir_expression *); > > > ir_expression *calculate_ssbo_unsized_array_length(ir_expression > > > *expr); > > > void check_ssbo_unsized_array_length_expression(class ir_expression > > > *); > > > @@ -195,7 +192,6 @@ public: > > > ir_call *check_for_ssbo_atomic_intrinsic(ir_call *ir); > > > ir_visitor_status visit_enter(ir_call *ir); > > > > > > - void *mem_ctx; > > > struct gl_shader *shader; > > > struct gl_uniform_buffer_variable *ubo_var; > > > ir_rvalue *uniform_block; > > > @@ -727,176 +723,6 @@ > > > 
lower_ubo_reference_visitor::insert_buffer_access(ir_dereference *deref, > > > } > > > } > > > > > > -static inline int > > > -writemask_for_size(unsigned n) > > > -{ > > > - return ((1 << n) - 1); > > > -} > > > - > > > -/** > > > - * Takes a deref and recursively calls itself to break the deref down to > > > the > > > - * point that the reads or writes generated are contiguous scalars or > > > vectors. > > > - */ > > > -void > > > -lower_ubo_reference_visitor::emit_access(bool is_write, > > > - ir_dereference *deref, > > > - ir_variable *base_offset, > > > - unsigned int deref_offset, > > > - bool row_major, > > > - int matrix_columns, > > > - unsigned packing, > > > - unsigned write_mask) > > > -{ > > > - if (deref->type->is_record()) { > > > - unsigned int field_offset = 0; > > > - > > > - for (unsigned i = 0; i < deref->type->length; i++) { > > > - const struct glsl_struct_field *field = > > > - &deref->type->fields.structure[i]; > > > - ir_dereference *field_deref = > > > - new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, > > > NULL), > > > - field->name); > > > - > > > - field_offset = > > > - glsl_align(field_offset, > > > - field->type->std140_base_alignment(row_major)); > > > - > > > - emit_access(is_write, field_deref, base_offset, > > > - deref_offset + field_offset, > > > - row_major, 1, packing, > > > - > > > writemask_for_size(field_deref->type->vector_elements)); > > > - > > > - field_offset += field->type->std140_size(row_major); > > > - } > > > - return; > > > - } > > > - > > > - if (deref->type->is_array()) { > > > - unsigned array_stride = packing == GLSL_INTERFACE_PACKING_STD430 ? > > > - deref->type->fields.array->std430_array_stride(row_major) : > > > - glsl_align(deref->type->fields.array->std140_size(row_major), > > > 16); > > > - > > > - for (unsigned i = 0; i < deref->type->length; i++) { > > > - ir_constant *element = new(mem_ctx) ir_constant(i); > > > - ir_dereference *element_deref = > > > - new(mem_ctx) ir_dereference_array(deref->clone(mem_ctx, > > > NULL), > > > - element); > > > - emit_access(is_write, element_deref, base_offset, > > > - deref_offset + i * array_stride, > > > - row_major, 1, packing, > > > - > > > writemask_for_size(element_deref->type->vector_elements)); > > > - } > > > - return; > > > - } > > > - > > > - if (deref->type->is_matrix()) { > > > - for (unsigned i = 0; i < deref->type->matrix_columns; i++) { > > > - ir_constant *col = new(mem_ctx) ir_constant(i); > > > - ir_dereference *col_deref = > > > - new(mem_ctx) ir_dereference_array(deref->clone(mem_ctx, > > > NULL), col); > > > - > > > - if (row_major) { > > > - /* For a row-major matrix, the next column starts at the next > > > - * element. > > > - */ > > > - int size_mul = deref->type->is_double() ? 8 : 4; > > > - emit_access(is_write, col_deref, base_offset, > > > - deref_offset + i * size_mul, > > > - row_major, deref->type->matrix_columns, packing, > > > - > > > writemask_for_size(col_deref->type->vector_elements)); > > > - } else { > > > - int size_mul; > > > - > > > - /* std430 doesn't round up vec2 size to a vec4 size */ > > > - if (packing == GLSL_INTERFACE_PACKING_STD430 && > > > - deref->type->vector_elements == 2 && > > > - !deref->type->is_double()) { > > > - size_mul = 8; > > > - } else { > > > - /* std140 always rounds the stride of arrays (and > > > matrices) to a > > > - * vec4, so matrices are always 16 between columns/rows. > > > With > > > - * doubles, they will be 32 apart when there are more > > > than 2 rows. 
> > > - * > > > - * For both std140 and std430, if the member is a > > > - * three-'component vector with components consuming N > > > basic > > > - * machine units, the base alignment is 4N. For vec4, base > > > - * alignment is 4N. > > > - */ > > > - size_mul = (deref->type->is_double() && > > > - deref->type->vector_elements > 2) ? 32 : 16; > > > - } > > > - > > > - emit_access(is_write, col_deref, base_offset, > > > - deref_offset + i * size_mul, > > > - row_major, deref->type->matrix_columns, packing, > > > - > > > writemask_for_size(col_deref->type->vector_elements)); > > > - } > > > - } > > > - return; > > > - } > > > - > > > - assert(deref->type->is_scalar() || deref->type->is_vector()); > > > - > > > - if (!row_major) { > > > - ir_rvalue *offset = > > > - add(base_offset, new(mem_ctx) ir_constant(deref_offset)); > > > - unsigned mask = > > > - is_write ? write_mask : (1 << deref->type->vector_elements) - 1; > > > - insert_buffer_access(deref, deref->type, offset, mask, -1); > > > - } else { > > > - unsigned N = deref->type->is_double() ? 8 : 4; > > > - > > > - /* We're dereffing a column out of a row-major matrix, so we > > > - * gather the vector from each stored row. > > > - */ > > > - assert(deref->type->base_type == GLSL_TYPE_FLOAT || > > > - deref->type->base_type == GLSL_TYPE_DOUBLE); > > > - /* Matrices, row_major or not, are stored as if they were > > > - * arrays of vectors of the appropriate size in std140. > > > - * Arrays have their strides rounded up to a vec4, so the > > > - * matrix stride is always 16. However a double matrix may either > > > be 16 > > > - * or 32 depending on the number of columns. > > > - */ > > > - assert(matrix_columns <= 4); > > > - unsigned matrix_stride = 0; > > > - /* Matrix stride for std430 mat2xY matrices are not rounded up to > > > - * vec4 size. From OpenGL 4.3 spec, section 7.6.2.2 "Standard > > > Uniform > > > - * Block Layout": > > > - * > > > - * "2. If the member is a two- or four-component vector with > > > components > > > - * consuming N basic machine units, the base alignment is 2N or 4N, > > > - * respectively." [...] > > > - * "4. If the member is an array of scalars or vectors, the base > > > alignment > > > - * and array stride are set to match the base alignment of a > > > single array > > > - * element, according to rules (1), (2), and (3), and rounded up > > > to the > > > - * base alignment of a vec4." [...] > > > - * "7. If the member is a row-major matrix with C columns and R > > > rows, the > > > - * matrix is stored identically to an array of R row vectors with C > > > - * components each, according to rule (4)." [...] > > > - * "When using the std430 storage layout, shader storage blocks > > > will be > > > - * laid out in buffer storage identically to uniform and shader > > > storage > > > - * blocks using the std140 layout, except that the base alignment > > > and > > > - * stride of arrays of scalars and vectors in rule 4 and of > > > structures in > > > - * rule 9 are not rounded up a multiple of the base alignment of a > > > vec4." > > > - */ > > > - if (packing == GLSL_INTERFACE_PACKING_STD430 && matrix_columns == > > > 2) > > > - matrix_stride = 2 * N; > > > - else > > > - matrix_stride = glsl_align(matrix_columns * N, 16); > > > - > > > - const glsl_type *deref_type = deref->type->base_type == > > > GLSL_TYPE_FLOAT ? 
> > > - glsl_type::float_type : glsl_type::double_type; > > > - > > > - for (unsigned i = 0; i < deref->type->vector_elements; i++) { > > > - ir_rvalue *chan_offset = > > > - add(base_offset, > > > - new(mem_ctx) ir_constant(deref_offset + i * > > > matrix_stride)); > > > - if (!is_write || ((1U << i) & write_mask)) > > > - insert_buffer_access(deref, deref_type, chan_offset, (1U << > > > i), i); > > > - } > > > - } > > > -} > > > - > > > void > > > lower_ubo_reference_visitor::write_to_memory(ir_dereference *deref, > > > ir_variable *var,
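
For illustration, a minimal sketch of what the mem_ctx-as-parameter approach discussed above could look like in lower_buffer_access.h. The signatures below are hypothetical (this is not the actual follow-up patch); only the idea of threading the ralloc context through explicitly, instead of keeping it as a member the subclasses must initialize, is taken from the thread:

#include "ir.h"
#include "ir_rvalue_visitor.h"

namespace lower_buffer_access {

class lower_buffer_access : public ir_rvalue_enter_visitor {
public:
   /* Callers pass the ralloc context they own, so there is no
    * partially-initialized mem_ctx member for a subclass to forget
    * to set up.
    */
   virtual void
   insert_buffer_access(void *mem_ctx, ir_dereference *deref,
                        const glsl_type *type, ir_rvalue *offset,
                        unsigned mask, int channel) = 0;

   void emit_access(void *mem_ctx, bool is_write, ir_dereference *deref,
                    ir_variable *base_offset, unsigned int deref_offset,
                    bool row_major, int matrix_columns,
                    unsigned int packing, unsigned int write_mask);
};

} /* namespace lower_buffer_access */

With that shape, emit_access's recursive calls and the insert_buffer_access overrides in lower_ubo_reference_visitor would simply forward the mem_ctx argument, and the assert Iago suggests for the member-variable variant becomes unnecessary.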