Hi all,

this series implements ARB_shader_ballot for radeonsi, tested against a bunch of piglit tests I just sent out as well as an upcoming test in the GLCTS.
There are a bunch of gotchas in LLVM, and I'll probably be sending out patches for those next week. The basic functionality is working even with LLVM 4.0, but it's easy to run into trouble.

ARB_shader_ballot could be interesting for AZDO-style programming on AMD hardware. By default, texture instructions can only be applied to samplers that are dynamically uniform, i.e. the same texture needs to be sampled by all shader invocations within a draw call. Even ARB_bindless_texture doesn't relax this constraint.

However, using the readFirstInvocationARB builtin provided by ARB_shader_ballot, one could write a loop like:

   samplerXX textures[N];
   int idx = ...;

   for (;;) {
      int local_idx = readFirstInvocationARB(idx);
      if (local_idx != idx)
         continue;

      sample from textures[local_idx]
      break;
   }

or some equivalent using ARB_bindless_texture instead of indices, and have it work correctly. There's a bit of overhead to the loop, of course, but as long as _most_ shader waves have uniform values it could be a useful tool for reducing CPU overhead by building bigger batches and fewer draw calls.

Note that the spec language of ARB_shader_ballot doesn't guarantee that this trick works, but it does work on all AMD GCN hardware.

Please review!
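To make the loop's behaviour and cost concrete, here is a rough CPU-side model of a single wave running it. This is only a sketch under assumptions, not driver or shader code: the function name `waterfall` and the list-of-lanes representation are made up for illustration, and the model assumes GCN-like behaviour, where lanes that hit `continue` stay active for the next iteration and lanes that hit `break` drop out.

```python
def waterfall(lane_indices):
    """Model one wave executing the readFirstInvocationARB loop.

    lane_indices[lane] is the (possibly non-uniform) texture index of a lane.
    Returns (sampled, iterations): sampled[lane] is the index that lane
    sampled from, iterations is how often the wave ran the loop body.
    """
    sampled = [None] * len(lane_indices)
    active = list(range(len(lane_indices)))  # lanes still inside the loop
    iterations = 0
    while active:
        iterations += 1
        # Models readFirstInvocationARB(idx): every active lane sees the
        # value held by the first active lane (assumed GCN-like semantics).
        local_idx = lane_indices[active[0]]
        still_looping = []
        for lane in active:
            if lane_indices[lane] == local_idx:
                sampled[lane] = local_idx    # sample textures[local_idx]; break
            else:
                still_looping.append(lane)   # continue
        active = still_looping
    return sampled, iterations


uniform_wave = [3] * 8
mixed_wave = [3, 3, 5, 3, 5, 7, 3, 3]
print(waterfall(uniform_wave))
print(waterfall(mixed_wave))
```

In this model every lane ends up sampling from its own index, and the wave runs the loop body once per distinct index it contains: once for the uniform wave and three times for the mixed one. That is the overhead mentioned above, and why the trick pays off mainly when most waves are uniform.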
Nicolai
--
 docs/features.txt                            |   2 +-
 docs/relnotes/17.1.0.html                    |   1 +
 src/compiler/glsl/builtin_functions.cpp      |  77 +++++++++
 src/compiler/glsl/builtin_variables.cpp      |  22 +++
 src/compiler/glsl/glsl_parser_extras.cpp     |   1 +
 src/compiler/glsl/glsl_parser_extras.h       |   2 +
 src/compiler/glsl/ir.cpp                     |  12 ++
 src/compiler/glsl/ir_expression_operation.py |   7 +
 src/compiler/glsl/ir_validate.cpp            |  16 ++
 src/compiler/shader_enums.c                  |   7 +
 src/compiler/shader_enums.h                  |  59 +++++++
 src/gallium/auxiliary/tgsi/tgsi_info.c       |   6 +-
 src/gallium/auxiliary/tgsi/tgsi_strings.c    |   7 +
 src/gallium/docs/source/screen.rst           |   2 +
 src/gallium/docs/source/tgsi.rst             | 118 +++++++++++--
 src/gallium/drivers/etnaviv/etnaviv_screen.c |   1 +
 .../drivers/freedreno/freedreno_screen.c     |   1 +
 src/gallium/drivers/i915/i915_screen.c       |   1 +
 src/gallium/drivers/llvmpipe/lp_screen.c     |   1 +
 .../drivers/nouveau/nv30/nv30_screen.c       |   1 +
 .../drivers/nouveau/nv50/nv50_screen.c       |   1 +
 .../drivers/nouveau/nvc0/nvc0_screen.c       |   1 +
 src/gallium/drivers/r300/r300_screen.c       |   1 +
 src/gallium/drivers/r600/r600_pipe.c         |   1 +
 src/gallium/drivers/radeonsi/si_pipe.c       |   3 +
 src/gallium/drivers/radeonsi/si_shader.c     | 153 ++++++++++++++++-
 .../drivers/radeonsi/si_shader_internal.h    |   2 +-
 .../drivers/radeonsi/si_shader_tgsi_setup.c  |  27 ++-
 src/gallium/drivers/softpipe/sp_screen.c     |   1 +
 src/gallium/drivers/svga/svga_screen.c       |   1 +
 src/gallium/drivers/swr/swr_screen.cpp       |   1 +
 src/gallium/drivers/vc4/vc4_screen.c         |   1 +
 src/gallium/drivers/virgl/virgl_screen.c     |   1 +
 src/gallium/include/pipe/p_defines.h         |   1 +
 src/gallium/include/pipe/p_shader_tokens.h   |  13 +-
 src/mapi/glapi/registry/gl.xml               |   2 +-
 src/mesa/main/extensions_table.h             |   1 +
 src/mesa/main/mtypes.h                       |   1 +
 src/mesa/program/ir_to_mesa.cpp              |   3 +
 src/mesa/state_tracker/st_extensions.c       |   1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   |  25 +++
 41 files changed, 553 insertions(+), 32 deletions(-)

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev