Hello,

This is the V2 series implementing the SPV_KHR_16bit_storage and VK_KHR_16bit_storage extensions on the anv Vulkan driver, together with the GLSL and NIR support they need.
The original series can be found here [1].

In short, V2 includes the following:

 * Updates to several patches after Jason Ekstrand's review of the original series. This includes some squashes.

 * Updates to several patches after rebasing against a more recent mesa master.

 * Four new patches with improvements over the original V1 series. We decided to keep them at the end of the series. They could probably be reordered, or even squashed into existing patches, but we think the review is easier this way.

Finally, an updated overview of the patches:

Patches 1-2 add 16-bit float, int and uint types to GLSL. This is needed because NIR uses GLSL types internally. We use the enums already defined by the AMD_gpu_shader_half_float and NV_gpu_shader extensions.

Patch 4 updates mesa/st, in order to avoid warnings for types not handled in a switch.

Patches 3-6 add NIR support for those new GLSL 16-bit types, conversion opcodes, and rounding modes for float to half-float conversions.

Patches 7-9 add SPIR-V (SPV_KHR_16bit_storage) to NIR support.

Patches 10-17 add general 16-bit support for i965. This includes handling the new types in several general-purpose methods, updating or removing some asserts, setting the stride to 2 in most cases (details in each patch), and support for copy propagation.

Patches 18-22 add support for 32 to 16-bit conversions for i965, including rounding mode opcodes (needed for float to half-float conversions), and an optimization that removes superfluous rounding mode sets.

Patch 23 adds 16-bit support for constant location.

Patches 24-28 add and use two new messages: byte scattered read and write. These were needed because the untyped surface message has a fixed 32-bit write size. The new messages are used for the 16-bit support of store SSBO, load SSBO, load UBO and load shared.

Patches 29-33 implement 16-bit vertex attribute input support on i965. These include changes to anv.
This was needed because 16-bit surface formats do implicit conversion to 32-bit. To work around this, we override the 16-bit surface formats and use 32-bit ones.

Patches 34-40 implement 16-bit store output support for fragment shaders on i965.

Patch 41 adds a custom optimization that helps reduce the pressure on the register allocator. Without this optimization, some CTS tests need several minutes to compile.

Patches 42-45 are the new patches since V1. Three of them are improvements over V1 that don't fix any execution problem, but are probably more performant because fewer scattered messages are used (not tested, though). The 16-bit CTS tests pass without them. The other one (patch 45) fixes a real problem, but unfortunately no CTS test catches it yet.

Patches 46-47 enable both extensions on the anv Vulkan driver.

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html

Alejandro Piñeiro (18):
  i965/vec4: Handle 16-bit types at type_size_xvec4
  i965/fs: Add brw_reg_type_from_bit_size utility method
  i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type
  i965/fs: Retype 16-bit/stride2 movs to UD on nir_op_vecX
  i965/fs: Need to allocate as minimum 32-bit register
  i965/fs: Update assertion on copy propagation
  i965/fs: Handle 32-bit to 16-bit conversions
  i965/fs: Define new shader opcodes to set rounding modes
  i965/fs: Enable rounding mode on f2f16 ops
  i965/fs: Add remove_extra_rounding_modes optimization
  i965/fs: Adjust type_size/type_slots on store_ssbo
  i965/fs: Use byte_scattered_write on 16-bit store_ssbo
  anv/pipeline: Use 32-bit surface formats for 16-bit formats
  anv/cmd_buffer: Add a padding to the vertex buffer
  i965/fs: Use half_precision data_format on 16-bit fb writes
  i965/fs: Add reuse_16bit_conversions_register optimization
  i965/fs: Predicate byte scattered writes if needed
  anv: Enable VK_KHR_16bit_storage

Eduardo Lima Mitev (8):
  glsl: Add 16-bit types
  mesa/st: Handle 16-bit types at st_glsl_storage_type_size()
  nir: Add support for 16-bit types (half float, int16 and uint16)
  nir: Populate conversion opcodes to/from 16-bit types
  spirv/nir: Handle 16-bit types
  spirv/nir: Add support for SPV_KHR_16bit_storage
  i965/fs: Optimize 16-bit SSBO stores by packing two into a 32-bit reg
  anv: Enable SPV_KHR_16bit_storage on gen 8+

Jose Maria Casanova Crespo (21):
  nir: Add rounding modes enum
  nir: Handle fp16 rounding modes at nir_type_conversion_op
  spirv: Enable FPRoundingMode decorator to nir operations
  i965: Support for 16-bit base types in helper functions
  i965/fs: Set stride 2 when dealing with 16-bit floats/ints
  i965: Add support for control register
  i965/fs: Support push constants of 16-bit types
  i965/fs: Add byte scattered write message and fs support
  i965/fs: Add byte scattered read message and fs support
  i965/fs: Use byte scattered read
  compiler: Mark when input/ouput attribute at VS uses 16-bit
  i965/compiler: includes 16-bit vertex input
  i965/fs: Unpack 16-bit from 32-bit components in VS load_input
  i965/fs: Enable Render Target Write for 16-bit outputs
  i965/fs: Include support for SEND data_format bit for Render Targets
  i965/disasm: Show half-precision data_format on rt_writes
  i965/fs: Mark 16-bit outputs on FS store_output
  i965/fs: 16-bit source payloads always use 1 register
  i965/fs: Enable 16-bit render target write on SKL and CHV
  i965/fs: Enables 16-bit load_ubo with sampler
  i965/fs: Use untyped_surface_read for 16-bit load_ssbo

 src/compiler/builtin_type_macros.h              |  26 ++
 src/compiler/glsl/ast_to_hir.cpp                |   3 +
 src/compiler/glsl/glsl_to_nir.cpp               |   3 +-
 src/compiler/glsl/ir_clone.cpp                  |   3 +
 src/compiler/glsl/link_uniform_initializers.cpp |   3 +
 src/compiler/glsl/lower_buffer_access.cpp       |   3 +-
 src/compiler/glsl_types.cpp                     |  93 +++++-
 src/compiler/glsl_types.h                       |  34 +-
 src/compiler/nir/nir.c                          |   6 +
 src/compiler/nir/nir.h                          |  22 +-
 src/compiler/nir/nir_gather_info.c              |  23 +-
 src/compiler/nir/nir_opcodes.py                 |  10 +-
 src/compiler/nir/nir_opcodes_c.py               |  17 +-
 src/compiler/nir/nir_split_var_copies.c         |   6 +
 src/compiler/nir_types.cpp                      |  24 ++
 src/compiler/nir_types.h                        |   9 +
 src/compiler/shader_info.h                      |   2 +
 src/compiler/spirv/nir_spirv.h                  |   1 +
 src/compiler/spirv/spirv_to_nir.c               |  53 +++-
 src/compiler/spirv/vtn_alu.c                    |  34 +-
 src/compiler/spirv/vtn_variables.c              |  21 ++
 src/intel/compiler/brw_compiler.h               |   1 +
 src/intel/compiler/brw_disasm.c                 |   4 +
 src/intel/compiler/brw_eu.h                     |  22 +-
 src/intel/compiler/brw_eu_defines.h             |  28 ++
 src/intel/compiler/brw_eu_emit.c                | 174 ++++++++++-
 src/intel/compiler/brw_fs.cpp                   | 210 ++++++++++++-
 src/intel/compiler/brw_fs.h                     |   2 +
 src/intel/compiler/brw_fs_builder.h             |   2 +-
 src/intel/compiler/brw_fs_copy_propagation.cpp  |   6 +-
 src/intel/compiler/brw_fs_generator.cpp         |  31 +-
 src/intel/compiler/brw_fs_nir.cpp               | 396 ++++++++++++++++++++++--
 src/intel/compiler/brw_fs_surface_builder.cpp   |  32 +-
 src/intel/compiler/brw_fs_surface_builder.h     |  14 +
 src/intel/compiler/brw_fs_visitor.cpp           |   6 +
 src/intel/compiler/brw_inst.h                   |   1 +
 src/intel/compiler/brw_ir_fs.h                  |   3 -
 src/intel/compiler/brw_nir.c                    |  16 +
 src/intel/compiler/brw_reg.h                    |   6 +
 src/intel/compiler/brw_shader.cpp               |  24 ++
 src/intel/compiler/brw_shader.h                 |   7 +
 src/intel/compiler/brw_vec4.cpp                 |   1 +
 src/intel/compiler/brw_vec4_generator.cpp       |   3 +-
 src/intel/compiler/brw_vec4_visitor.cpp         |   3 +
 src/intel/vulkan/anv_device.c                   |  13 +
 src/intel/vulkan/anv_extensions.py              |   1 +
 src/intel/vulkan/anv_pipeline.c                 |   1 +
 src/intel/vulkan/genX_cmd_buffer.c              |  20 +-
 src/intel/vulkan/genX_pipeline.c                |  47 +++
 src/mesa/program/ir_to_mesa.cpp                 |   6 +
 src/mesa/state_tracker/st_glsl_types.cpp        |   3 +
 51 files changed, 1390 insertions(+), 89 deletions(-)

-- 
2.11.0