I've been working for the past couple of weeks on the upcoming Vulkan SPIR-V subgroups extensions. No spec is public yet, so I can't send full patches, but I do have it working. :-) (I don't mind mentioning it because the existence of such extensions has been announced at GDC and other public forums.) While working on subgroups, I fixed a variety of bugs and needed to do a few refactors. This series contains those bug fixes and refactors. I'd like to get started on the review process and merge what we can now.
Some of the patches in here overlap a bit with stuff that Connor did in his series for radv. In particular, I've taken a different approach which I like better to sorting out uint64_t vs. uvec4 for ballot intrinsics.

Cc: Matt Turner <matts...@gmail.com>
Cc: Francisco Jerez <curroje...@riseup.net>
Cc: Connor Abbott <cwabbo...@gmail.com>

Alejandro Piñeiro (1):
  i965/fs: Add brw_reg_type_from_bit_size utility method

Jason Ekstrand (43):
  nir/opcodes: Fix constant-folding of ufind_msb
  nir: Get rid of the variable on vote intrinsics
  i965/fs/nir: Use the nir_src_bit_size helper
  i965/fs/nir: Simplify 64-bit store_output
  i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
  i965/fs/nir: Minor refactor of store_output
  i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
  intel/fs: Protect opt_algebraic from OOB BROADCAST indices
  intel/fs: Uniformize the index in readInvocation
  intel/fs: Retype dest to match value in read[First]Invocation
  intel/fs: Assign constant locations if they haven't been assigned
  intel/fs: Remove min_dispatch_width from fs_visitor
  intel/cs: Drop min_dispatch_width checks from compile_cs
  intel/cs: Stop setting dispatch_grf_start_reg
  intel/cs: Ignore runtime_check_aads_emit for CS
  intel/cs: Rework the way thread local ID is handled
  intel/cs: Re-run final NIR optimizations for each SIMD size
  intel/cs: Push subgroup ID instead of base thread ID
  intel/compiler/fs: Set up subgroup invocation as a system value
  intel/fs: Rework zero-length URB write handling
  intel/eu: Use EXECUTE_1 for JMPI
  intel/eu: Make automatic exec sizes a configurable option
  intel/eu: Explicitly set EXECUTE_1 where needed
  intel/fs: Explicitly set EXECUTE_1 where needed
  intel/fs: Don't use automatic exec size inference
  intel/eu/validate: Look up types on demand in execution_type()
  spirv: Add support for the HelperInvocation builtin
  anv/pipeline: Dump shader immedately after spirv_to_nir
  anv/pipeline: Drop nir_lower_clip_cull_distance_arrays
  nir/lower_wpos_ytransform: Support system value intrinsics
  i965/program: Move nir_lower_system_values higher up
  intel/compiler: Call nir_lower_system_values in brw_preprocess_nir
  nir/opt_intrinsics: Rework progress
  nir: Add a new subgroups lowering pass
  nir: Add a ssa_dest_init_for_type helper
  nir: Make ballot intrinsics variable-size
  nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
  nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
  nir,intel/compiler: Use a fixed subgroup size
  spirv: Add a vtn_constant_value helper
  spirv: Rework barriers
  nir: Validate base types on array dereferences
  compiler/nir_types: Handle vectors in glsl_get_array_element

 src/compiler/Makefile.sources                 |   2 +-
 src/compiler/glsl/glsl_to_nir.cpp             |   3 +-
 src/compiler/nir/nir.h                        |  25 ++-
 src/compiler/nir/nir_intrinsics.h             |  19 +-
 .../nir/nir_lower_read_invocation_to_scalar.c | 112 ----------
 src/compiler/nir/nir_lower_subgroups.c        | 247 +++++++++++++++++++++
 src/compiler/nir/nir_lower_system_values.c    |   4 +-
 src/compiler/nir/nir_lower_wpos_ytransform.c  |   4 +
 src/compiler/nir/nir_opcodes.py               |   2 +-
 src/compiler/nir/nir_opt_intrinsics.c         |  83 ++-----
 src/compiler/nir/nir_validate.c               |  18 +-
 src/compiler/nir_types.cpp                    |   2 +
 src/compiler/spirv/spirv_to_nir.c             | 132 +++++++++--
 src/compiler/spirv/vtn_private.h              |   6 +
 src/compiler/spirv/vtn_variables.c            |   5 +-
 src/intel/compiler/brw_compiler.c             |   4 -
 src/intel/compiler/brw_compiler.h             |   3 +-
 src/intel/compiler/brw_eu.c                   |   1 +
 src/intel/compiler/brw_eu.h                   |  10 +
 src/intel/compiler/brw_eu_emit.c              |  43 ++--
 src/intel/compiler/brw_eu_validate.c          |   6 +-
 src/intel/compiler/brw_fs.cpp                 | 225 ++++++++++---------
 src/intel/compiler/brw_fs.h                   |  19 +-
 src/intel/compiler/brw_fs_generator.cpp       |  14 +-
 src/intel/compiler/brw_fs_nir.cpp             | 241 +++++++++++---------
 src/intel/compiler/brw_fs_visitor.cpp         |  79 +++----
 src/intel/compiler/brw_nir.c                  |  13 +-
 src/intel/compiler/brw_nir.h                  |   3 +-
 src/intel/compiler/brw_nir_intrinsics.c       |  55 ++---
 src/intel/vulkan/anv_cmd_buffer.c             |  15 +-
 src/intel/vulkan/anv_pipeline.c               |  22 +-
 src/mesa/drivers/dri/i965/brw_cs.c            |   3 -
 src/mesa/drivers/dri/i965/brw_program.c       |   1 -
 src/mesa/drivers/dri/i965/gen7_cs_state.c     |  18 +-
 34 files changed, 867 insertions(+), 572 deletions(-)
 delete mode 100644 src/compiler/nir/nir_lower_read_invocation_to_scalar.c
 create mode 100644 src/compiler/nir/nir_lower_subgroups.c

-- 
2.5.0.400.gff86faf

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev