Hi, this series brings support for OpenGL 4.3 ARB_shader_storage_buffer_object [1]. It includes the Mesa/GLSL frontend bits as well as the Intel i965 driver implementation.

The extension provides a new kind of buffer called shader storage buffer object (SSBO), which is similar to UBOs but:

1. Is writable
2. Allows a number of atomic operations
3. Allows an optional unsized array at the bottom of its definitions

This series was developed by Samuel Iglesias and me, based on initial code by Kristian Høgsberg.

Development branch with the patches and some dependencies (*):

    git clone -b itoral-ARB_shader_storage_buffer_object-v1.0 https://github.com/Igalia/mesa.git

(*) The i965 implementation uses untyped read/write messages to implement SSBO reads/writes. These messages are also used in the implementation of ARB_shader_image_load_store that Curro is working on. The branch linked above includes these patches from Curro as well as a couple of general bugfixes (not necessarily SSBO-specific) from Tapani and Antia that are necessary for correct behavior in some scenarios.

Piglit repository including SSBO tests:

    git clone -b arb_shader_storage_buffer_object-v1 https://github.com/Igalia/piglit.git

=== General notes about the implementation ===

Because SSBOs are very similar to UBOs, the implementation attempts to reuse the code we already have for UBOs wherever we can. There is a lot of code in the GLSL compiler to deal with UBOs, so we do not want to duplicate it. An "is buffer" flag is added where needed when reusing UBO data structures so we can tell if a given instance represents uniforms or buffers.

The lower_ubo_reference pass is also updated to detect SSBO reads (lowered to ir_binop_ssbo_load expressions) and writes (lowered to a new IR node ir_ssbo_store that drivers can detect and implement).

Since SSBOs are writable, various optimization passes had to be altered accordingly. For example, we cannot kill dead assignments to buffer variables, or do CSE on SSBO load expressions, etc.

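To illustrate why the dead-assignment pass needs to know about buffer variables, here is a minimal sketch in C. It is not code from the series: the toy_* enum, struct and function are hypothetical names made up for this example, and the real passes work on the GLSL IR rather than on a flat struct like this one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical storage classes for this sketch only; they loosely
 * resemble, but are not, Mesa's variable modes. */
enum toy_var_mode {
   TOY_VAR_TEMPORARY,   /* local to the shader invocation */
   TOY_VAR_BUFFER       /* member of a shader storage block */
};

/* Hypothetical, heavily simplified variable record. */
struct toy_variable {
   const char *name;
   enum toy_var_mode mode;
   unsigned read_count;  /* reads of the variable seen in the shader */
};

/* Returns true if an assignment to 'var' may be dropped as dead code.
 *
 * A temporary that is never read can lose its assignments. A buffer
 * variable cannot: the store lands in buffer memory, where it can be
 * observed outside the shader even if the shader never reads it back. */
static bool
toy_can_remove_assignment(const struct toy_variable *var)
{
   assert(var != NULL);

   if (var->mode == TOY_VAR_BUFFER)
      return false;

   return var->read_count == 0;
}
```

With this check, an unread temporary is still eliminated, while a store to a buffer variable is always kept, which is the behavior the altered passes need to preserve.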
Other features include: interactions with ARB_program_interface_query, support for a new std430 layout mode specific to SSBOs, memory qualifiers from ARB_shader_image_load_store applicable to buffer variables (also at layout level) and new atomic operations that can be used with integer buffer variables.

Notice that NIR is not supported yet, so anyone wanting to test this on i965 needs to set INTEL_USE_NIR=0.

=== Comments for reviewers ===

The i965 implementation was developed and tested on Haswell and IvyBridge; other platforms might require small tweaks. Specifically, the message used in the implementation of the unsized array length() function requires a header in Skylake.

Both Mesa and i965 need to give default values for certain things like the maximum allowed size of a shader storage buffer, the maximum number of buffer bindings, the maximum number of combined shader storage blocks, etc. We are not sure what default values we should use for these in all cases and, except in a few cases, we generally copied the default values from uniforms. That may be fine or not, so let us know if we should use different values for any of these settings at the Mesa or i965 levels.

The i965 implementation gets SSBO reads/writes to unaligned offsets right, even in the cases where there is expression-based indexing into arrays. Notice that this does not currently work with UBOs, so maybe we want to extend the implementation based on untyped read messages to UBOs as well so we can fix this, merging the ssbo_load and ubo_load code paths in the visitor code. That would make UBO loads go through the data cache instead of using sampler messages, but I guess that should not be a problem.

We did not include GL_MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS and GL_MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS because Mesa does not support tessellation shaders yet.

We did not add support for glShaderStorageBlockBinding in display lists even though glUniformBlockBinding is supported.
This is because ARB_uniform_buffer_object explicitly mentions display lists for glUniformBlockBinding, but ARB_shader_storage_buffer_object has no equivalent mention for glShaderStorageBlockBinding.

Since memory qualifiers were introduced with ARB_shader_image_load_store, qualifier fields in some structures are prefixed with 'image_'. We did not change that here, but we probably want to rename them now that SSBOs can also use them. I think we can do this with a later patch.

=== Piglit ===

There are no piglit regressions except for one introduced with patch 61 that is actually a bug fix for incorrect behavior in the current UBO implementation. The regressed piglit test expects the old (incorrect) behavior. There is a patch [2] in the piglit mailing list to fix that test.

=== Quick patch reference ===

Patches 01-13: Extension bringup, compiler bits for SSBOs and buffer variables (mesa)
Patches 14-19: GL_SHADER_STORAGE_BUFFER target (mesa)
Patches 20-25: edit a few optimization passes to play well with buffer variables (mesa)
Patches 26-28: add the lowering of ssbo writes to ir_ssbo_store (mesa)
Patches 29-34: driver implementation bits for SSBO buffers and buffer variables (i965)
Patches 35-38: support for SSBO unsized arrays (mesa, i965)
Patches 39-40: buffer-related bugfixes for UBOs and SSBOs (i965)
Patches 41-44: add std430 layout mode for SSBOs (mesa)
Patches 45-46: shader storage buffer object resource usage limit checks
Patches 47-52: implement SSBO reads and writes (i965)
Patches 53-58: implement SSBO atomics (mesa, i965)
Patch 59: add glShaderStorageBlockBinding (mesa)
Patches 60-61: buffer queries for SSBOs and fix for the same queries on UBOs (mesa)
Patches 62-66: memory qualifiers with SSBOs (mesa)
Patch 67: interaction with ARB_program_interface_query
Patch 68: test tokens for ARB_shader_storage_buffer_object
Patch 69: GLAPI for ARB_shader_storage_buffer_object
Patch 70: getters for ARB_shader_storage_buffer_object max constants (mesa)
Patch 71: enable ARB_shader_storage_buffer_object for gen7+
Patch 72: Mark ARB_shader_storage_buffer_object as done for i965
Patch 73: Fix instance blocks with inactive elements (general bugfix)
Patch 74: Skip dependency control for opcodes emitting multiple instructions (i965/vec4, optional)

Patch 73 is not necessary, but fixes a bug that exists in master with instance blocks and UBOs that we also hit while testing instance blocks with SSBOs. It was part of one of our dEQP batches but was never reviewed.

Patch 74 is not necessary either. At some point during development I needed it, but the final implementation does not require it any more. Considering that we have that same patch on the FS and that the problem it fixes can be hit in vec4 for the same reasons, I felt like it could make sense to merge it anyway.

[1] https://www.opengl.org/registry/specs/ARB/shader_storage_buffer_object.txt
[2] http://lists.freedesktop.org/archives/piglit/2015-May/015972.html

Antia Puentes (1):
  glsl: Consider active all elements of a shared/std140 block array

Iago Toral Quiroga (41):
  glsl: Identify active uniform blocks that are buffer blocks as such.
  mesa: Add shader storage buffer support to struct gl_context
  mesa: Initialize and free shader storage buffers
  mesa: Implement _mesa_DeleteBuffers for target GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBuffersBase for target GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBuffersRange for target GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBufferBase for target GL_SHADER_STORAGE_BUFFER
  mesa: Implement _mesa_BindBufferRange for target GL_SHADER_STORAGE_BUFFER
  glsl: Don't do tree grafting on buffer variables
  glsl: Do not kill dead assignments to buffer variables or SSBO declarations.
  glsl: Do not do CSE for expressions involving SSBO loads
  glsl: Don't do constant propagation on buffer variables
  glsl: Don't do constant variable on buffer variables
  glsl: Don't do copy propagation on buffer variables
  mesa: Add new IR node ir_ssbo_store
  glsl: Lower shader storage buffer object writes to ir_ssbo_store
  glsl: Do constant folding on ir_ssbo_store
  i965: Use 16-byte offset alignment for shader storage buffers
  i965: Implement DriverFlags.NewShaderStorageBuffer
  i965: Set MaxShaderStorageBuffers for compute shaders
  i965: Upload Shader Storage Buffer Object surfaces
  i965: handle visiting of ir_var_buffer variables
  i965/fs: Do not split buffer variables
  i965/fs: Implement SSBO writes
  i965/fs: Implement SSBO reads
  i965/fs: Do not include the header with a pixel mask in untyped read messages
  i965/vec4: Implement SSBO writes
  i965/vec4: Implement SSBO reads
  glsl: Rename atomic counter functions
  glsl: Add atomic functions from ARB_shader_storage_buffer_object
  i965/vec4: Implement shader storage buffer object atomic intrinsics
  i965/fs: Implement shader storage buffer object atomic intrinsics
  glsl: First argument to atomic functions must be a buffer variable
  mesa: Add queries for GL_SHADER_STORAGE_BUFFER
  glsl: Allow use of memory qualifiers with ARB_shader_storage_buffer_object.
  glsl: Apply memory qualifiers to buffer variables
  glsl: Allow memory layout qualifiers on shader storage buffer objects
  glsl: Do not allow assignments to read-only variables
  glsl: Do not allow reads from write-only variables
  docs: Mark ARB_shader_storage_buffer_object as done for i965.
  i965/vec4: Skip dependency control for opcodes emitting multiple instructions

Kristian Høgsberg (7):
  glsl: Add ir_var_buffer
  glsl: Implement parser support for 'buffer' qualifier
  glsl: link buffer variables and shader storage buffer interface blocks
  glsl: Add ir_binop_ssbo_load expression operation.
  glsl: lower SSBO reads to ir_binop_ssbo_load expressions
  i965: do not emit_bool_to_cond_code with ssbo load expressions
  glsl: atomic counters can be declared as buffer-qualified variables

Samuel Iglesias Gonsalvez (25):
  mesa: define ARB_shader_storage_buffer_object extension
  mesa: add MaxShaderStorageBlocks to struct gl_program_constants
  glsl: enable binding layout qualifier usage for shader storage buffer objects
  glsl: shader buffer variables cannot have initializers
  glsl: buffer variables cannot be defined outside interface blocks
  glsl: fix error messages in invalid declarations of shader storage blocks
  glsl: add support for unsized arrays in shader storage blocks
  glsl: Add parser/compiler support for unsized array's length()
  i965/vec4: Implement unsized array's length calculation
  i965/fs: Implement unsized array's length calculation
  i965/wm: emit null buffer surfaces when null buffers are attached
  i965/wm: surfaces should have the API buffer size, not the drm buffer size
  glsl: Add parser/compiler support for std430 interface packing qualifier
  glsl: propagate interface packing information to arrays of scalars, vectors.
  glsl: propagate std430 packing qualifier to struct's members and array of structs
  glsl: add std430 interface packing support to ssbo writes and unsized array length
  glsl: a shader storage buffer must be smaller than the maximum size allowed
  glsl: number of active shader storage blocks must be within allowed limits
  mesa: add glShaderStorageBlockBinding()
  glsl: fix UNIFORM_BUFFER_START or UNIFORM_BUFFER_SIZE query when no buffer object is bound
  main: Add SHADER_STORAGE_BLOCK and BUFFER_VARIABLE support for ARB_program_interface_query
  main/tests: add ARB_shader_storage_buffer_object tokens to enum_strings
  glapi: add ARB_shader_storage_block_buffer_object
  mesa: Add getters for the GL_ARB_shader_storage_buffer_object max constants
  i965: Enable ARB_shader_storage_buffer_object extension for gen7+

 docs/GL3.txt | 2 +-
 src/glsl/ast.h | 12 +
 src/glsl/ast_array_index.cpp | 6 +-
 src/glsl/ast_function.cpp | 37 ++
 src/glsl/ast_to_hir.cpp | 362 ++++++++++--
 src/glsl/ast_type.cpp | 4 +-
 src/glsl/builtin_functions.cpp | 215 ++++++-
 src/glsl/builtin_types.cpp | 3 +-
 src/glsl/builtin_variables.cpp | 5 +-
 src/glsl/glcpp/glcpp-parse.y | 3 +
 src/glsl/glsl_lexer.ll | 11 +-
 src/glsl/glsl_parser.yy | 110 +++-
 src/glsl/glsl_parser_extras.cpp | 65 ++-
 src/glsl/glsl_parser_extras.h | 7 +
 src/glsl/glsl_symbol_table.cpp | 16 +-
 src/glsl/glsl_types.cpp | 203 +++++--
 src/glsl/glsl_types.h | 48 +-
 src/glsl/hir_field_selection.cpp | 15 +-
 src/glsl/ir.cpp | 14 +
 src/glsl/ir.h | 82 ++-
 src/glsl/ir_function.cpp | 1 +
 src/glsl/ir_hierarchical_visitor.cpp | 18 +
 src/glsl/ir_hierarchical_visitor.h | 2 +
 src/glsl/ir_hv_accept.cpp | 23 +
 src/glsl/ir_print_visitor.cpp | 15 +-
 src/glsl/ir_print_visitor.h | 1 +
 src/glsl/ir_reader.cpp | 2 +
 src/glsl/ir_rvalue_visitor.cpp | 21 +
 src/glsl/ir_rvalue_visitor.h | 3 +
 src/glsl/ir_uniform.h | 5 +
 src/glsl/ir_validate.cpp | 18 +
 src/glsl/ir_visitor.h | 2 +
 src/glsl/link_interface_blocks.cpp | 15 +-
 src/glsl/link_uniform_block_active_visitor.cpp | 24 +
 src/glsl/link_uniform_block_active_visitor.h | 1 +
 src/glsl/link_uniform_blocks.cpp | 36 +-
 src/glsl/link_uniform_initializers.cpp | 3 +-
 src/glsl/link_uniforms.cpp | 32 +-
 src/glsl/linker.cpp | 166 ++++--
 src/glsl/linker.h | 1 +
 src/glsl/loop_unroll.cpp | 1 +
 src/glsl/lower_named_interface_blocks.cpp | 5 +-
 src/glsl/lower_ubo_reference.cpp | 633 ++++++++++++++++++---
 src/glsl/lower_variable_index_to_cond_assign.cpp | 1 +
 src/glsl/nir/glsl_to_nir.cpp | 7 +
 src/glsl/opt_constant_folding.cpp | 16 +
 src/glsl/opt_constant_propagation.cpp | 8 +
 src/glsl/opt_constant_variable.cpp | 7 +
 src/glsl/opt_copy_propagation.cpp | 2 +-
 src/glsl/opt_cse.cpp | 33 +-
 src/glsl/opt_dead_code.cpp | 9 +-
 src/glsl/opt_structure_splitting.cpp | 5 +-
 src/glsl/opt_tree_grafting.cpp | 9 +-
 .../glapi/gen/ARB_shader_storage_buffer_object.xml | 36 ++
 src/mapi/glapi/gen/GL4x.xml | 18 +-
 src/mapi/glapi/gen/Makefile.am | 1 +
 src/mapi/glapi/gen/gl_API.xml | 6 +-
 src/mesa/drivers/dri/i965/brw_context.c | 2 +
 src/mesa/drivers/dri/i965/brw_context.h | 6 +
 src/mesa/drivers/dri/i965/brw_defines.h | 4 +
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 src/mesa/drivers/dri/i965/brw_fs.h | 5 +
 .../dri/i965/brw_fs_channel_expressions.cpp | 3 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 47 ++
 .../drivers/dri/i965/brw_fs_vector_splitting.cpp | 1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 349 ++++++++++--
 src/mesa/drivers/dri/i965/brw_shader.cpp | 6 +
 src/mesa/drivers/dri/i965/brw_state_upload.c | 1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 +
 src/mesa/drivers/dri/i965/brw_vec4.h | 8 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 35 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 361 +++++++++++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 70 ++-
 src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 src/mesa/main/bufferobj.c | 380 +++++++++++++
 src/mesa/main/config.h | 2 +
 src/mesa/main/context.c | 8 +
 src/mesa/main/extensions.c | 1 +
 src/mesa/main/get.c | 38 +-
 src/mesa/main/get_hash_params.py | 12 +
 src/mesa/main/mtypes.h | 57 +-
 src/mesa/main/program_resource.c | 7 +-
 src/mesa/main/shader_query.cpp | 265 ++++++++-
 src/mesa/main/tests/enum_strings.cpp | 15 +
 src/mesa/main/uniforms.c | 52 ++
 src/mesa/main/uniforms.h | 4 +
 src/mesa/program/ir_to_mesa.cpp | 10 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 16 +
 90 files changed, 3800 insertions(+), 380 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_shader_storage_buffer_object.xml

--
1.9.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev