On 05/19/2017 07:05 PM, Ilia Mirkin wrote:
Great work, Samuel! Is this available in a branch somewhere?

Thanks Ilia!

Here's the branch:
https://cgit.freedesktop.org/~hakzsam/mesa/log/?h=arb_bindless_texture


On Fri, May 19, 2017 at 12:52 PM, Samuel Pitoiset
<samuel.pitoi...@gmail.com> wrote:
Hi,

This series implements ARB_bindless_texture for RadeonSI.

Reminder: the GLSL compiler part is already upstream.

This series has been tested mainly with Feral games. Here's the list of
existing games that use ARB_bindless_texture (though not by default):

- DXMD
- Hitman
- Dirt Rally
- Mad Max

Today, Feral announced "Warhammer 40,000: Dawn of War III" (called DOW3) which
is going to be released next month. This game *requires* ARB_bindless_texture,
which explains why I did all this work. :-) So, we have ~3 weeks for merging
this whole series. It would be very nice to have DOW3 support on day one!

=== Tracking bindless problems ===

The following games have been successfully tested:

- Dirt Rally
- Hitman
- Mad Max
- DOW3

For these:

- No rendering issues
- No VM faults (i.e. with amdgpu.vm_debug=1)

However, DXMD is currently broken because the bindless_sampler layout qualifier
is missing, which results in a ton of INVALID_OPERATION errors. Note that Feral
implemented bindless support against NV_bindless_texture and not
ARB_bindless_texture. The main difference is that bindless_sampler is implicit
for NV_* while it's required for ARB_*. Feral plan to fix this soon.
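
To illustrate the difference, here is a minimal sketch that flags sampler
uniforms declared without the bindless_sampler layout qualifier. The two GLSL
snippets are illustrative only and are not taken from any Feral title, and the
helper name samplers_missing_bindless() is hypothetical. The check is a simple
regex and assumes one declaration per line; it does not handle a global
"layout(bindless_sampler) uniform;" default.

```python
import re

# Illustrative shader written against ARB_bindless_texture: the sampler
# uniform carries an explicit layout(bindless_sampler) qualifier.
ARB_STYLE = """\
#version 450
#extension GL_ARB_bindless_texture : require
layout(bindless_sampler) uniform sampler2D tex;
"""

# Illustrative shader written against NV_bindless_texture: no qualifier,
# because bindless is implicit there.
NV_STYLE = """\
#version 450
#extension GL_NV_bindless_texture : require
uniform sampler2D tex;
"""

# Matches "[layout(...)] uniform <sampler type> <name>;" at the start of a line.
SAMPLER_DECL = re.compile(
    r"^\s*(?:layout\s*\((?P<quals>[^)]*)\)\s*)?"
    r"uniform\s+\w*sampler\w*\s+(?P<name>\w+)\s*;",
    re.MULTILINE,
)


def samplers_missing_bindless(source):
    """Return names of sampler uniforms lacking layout(bindless_sampler)."""
    missing = []
    for match in SAMPLER_DECL.finditer(source):
        qualifiers = match.group("quals") or ""
        if "bindless_sampler" not in qualifiers:
            missing.append(match.group("name"))
    return missing


print(samplers_missing_bindless(ARB_STYLE))
print(samplers_missing_bindless(NV_STYLE))
```

Under ARB_bindless_texture, a sampler uniform flagged by this check is treated
as a bound sampler, so setting it with glUniformHandleui64ARB() is what reports
INVALID_OPERATION.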

All ARB_bindless_texture piglit tests pass with this series.

=== Tracking regressions/changes ===

- No regressions with the Intel CI system
- One piglit regression that needs to be fixed
   (arb_texture_multisample-sample-position)
- No shader-db changes
- No CPU overhead (glxgears and Heaven in low)

=== Performance results for DOW3 ===

DOW3 exposes two bindless texture modes:
- mode 1: all bindless (i.e. no bound samplers)
- mode 2: bound/bindless (i.e. only bindless when the limit is reached)

CPU: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
NVIDIA blob: 381.22

== GTX 1060 ==

LOW:
  - mode 1: 89 FPS
  - mode 2: 51 FPS

MEDIUM:
  - mode 1: 49 FPS
  - mode 2: 28 FPS

HIGH:
  - mode 1: 32 FPS
  - mode 2: 19 FPS

The GTX 1060 performs very well with the all bindless mode (default), while
the bound/bindless mode is not good at all.

== RX 480 ==

LOW:
  - mode 1: 67 FPS (-32%)
  - mode 2: 75 FPS (+32%)

MEDIUM:
  - mode 1: 38 FPS (-28%)
  - mode 2: 44 FPS (+57%)

HIGH:
  - mode 1: 26 FPS (-23%)
  - mode 2: 29 FPS (+52%)

The RX 480 performs very well with the bound/bindless mode (default), while
the all bindless mode still has to be improved.
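
The percentages in parentheses do not state which baseline they use. As a
minimal sketch, the snippet below recomputes the RX 480 change for each preset
and mode, assuming the GTX 1060 figure is the baseline. Under that assumption
the results will not necessarily match the parenthesized values. The FPS
numbers are copied from the lists above; the function name is hypothetical.

```python
# FPS figures from the lists above, keyed by preset and bindless mode.
GTX_1060 = {"LOW": {1: 89, 2: 51}, "MEDIUM": {1: 49, 2: 28}, "HIGH": {1: 32, 2: 19}}
RX_480 = {"LOW": {1: 67, 2: 75}, "MEDIUM": {1: 38, 2: 44}, "HIGH": {1: 26, 2: 29}}


def relative_change(value, baseline):
    """Percentage change of value with respect to baseline."""
    return 100.0 * (value - baseline) / baseline


for preset in ("LOW", "MEDIUM", "HIGH"):
    for mode in (1, 2):
        # Assumption: the GTX 1060 result is the baseline for the comparison.
        change = relative_change(RX_480[preset][mode], GTX_1060[preset][mode])
        print(f"{preset:6} mode {mode}: {RX_480[preset][mode]} FPS ({change:+.0f}%)")
```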

The most important bottleneck with the all bindless mode is the number of
buffers that have to be added for every command stream. The overhead in the
winsys and in the kernel (amdgpu_cs_ioctl) becomes significant in this
situation. This mode is still clearly CPU bound and should be improved (see
the "Future work" section).

Btw, without any optimisations, it was around 35 FPS in low (mode 1).

=== Performance results for other Feral titles ===

I didn't record any numbers because these games were initially
developed/tested against the NVIDIA blob, which is unaffected by a VERY large
number of resident handles, while the AMD stack is really slow in this
situation. That said, all Feral games that use bindless work fine; we just
need to improve perf on both sides.

=== Future work ===

I have some ideas to try in order to improve performance with RadeonSI. I will
work on this once this series is upstream.

Please review,
Thanks!

Samuel Pitoiset (65):
   mapi: add GL_ARB_bindless_texture entry points
   mesa: implement ARB_bindless_texture
   mesa: add support for unsigned 64-bit vertex attributes
   mesa: add support for glUniformHandleui64*ARB()
   mesa: refuse to update sampler parameters when a handle is allocated
   mesa: refuse to update tex parameters when a handle is allocated
   mesa: refuse to change textures when a handle is allocated
   mesa: refuse to change tex buffers when a handle is allocated
   mesa: keep track of the current variable in add_uniform_to_shader
   mesa: store bindless samplers as PROGRAM_UNIFORM
   mesa: add infrastructure for bindless samplers/images bound to units
   glsl: process uniform samplers declared bindless
   glsl: process uniform images declared bindless
   glsl: pass the ir_variable object to set_opaque_binding()
   glsl: set the explicit binding value for bindless samplers/images
   glsl: add ir_variable::is_bindless()
   mesa: add update_single_shader_texture_used() helper
   mesa: add update_single_program_texture_state() helper
   mesa: update textures for bindless samplers bound to texture units
   mesa: pass gl_program to _mesa_associate_uniform_storage()
   mesa: associate uniform storage to bindless samplers/images
   mesa: handle bindless uniforms bound to texture/image units
   mesa: get rid of a workaround for bindless in _mesa_get_uniform()
   gallium: add PIPE_CAP_BINDLESS_TEXTURE
   gallium: add ARB_bindless_texture interface
   ddebug: add ARB_bindless_texture support
   trace: add ARB_bindless_texture support
   tc: add ARB_bindless_texture support
   tgsi: add new Bindless flag to tgsi_instruction_texture
   tgsi: add new Bindless flag to tgsi_instruction_memory
   tgsi/ureg: accept TGSI_FILE_{CONSTANT,INPUT} for dst registers
   st/glsl_to_tgsi: add support for bindless samplers
   st/glsl_to_tgsi: add support for bindless images
   st/glsl_to_tgsi: add support for bindless pack/unpack operations
   st/glsl_to_tgsi: teach the DCE pass about bindless samplers/images
   st/glsl_to_tgsi: teach rename_temp_registers() about bindless samplers
   tgsi/scan: record bindless samplers/images usage
   st/mesa: implement ARB_bindless_texture
   st/mesa: make update_single_texture() non-static
   st/mesa: make convert_sampler_from_unit() non-static
   st/mesa: add st_convert_image_from_unit() helper
   st/mesa: add st_create_{texture,image}_handle_from_unit() helper
   st/mesa: add infrastructure for storing bound texture/image handles
   st/mesa: make bindless samplers/images bound to units resident
   st/mesa: do not release sampler views for resident textures
   st/mesa: disable per-context seamless cubemap when using texture
     handles
   st/mesa: enable ARB_bindless_texture
   radeonsi: add a slab allocator for resident descriptors
   radeonsi: add si_init_descriptor_list() helper
   radeonsi: add si_set_sampler_view_desc() helper
   radeonsi: add si_set_shader_image_desc() helper
   radeonsi: implement ARB_bindless_texture
   radeonsi: add all resident buffers to the current CS
   radeonsi: only add descriptors in presence of resident handles
   radeonsi: add si_update_check_render_feedback() helper
   radeonsi: decompress DCC for resident textures/images
   radeonsi: decompress resident textures/images before graphics/compute
   radeonsi: isolate real framebuffer changes from the decompression
     passes
   radeonsi: track use of bindless samplers/images from tgsi_shader_info
   radeonsi: only decompress resident textures/images when used
   radeonsi: upload new descriptors when resident buffers are invalidated
   radeonsi: invalidate buffers which are made resident if needed
   radeonsi: add support for loading bindless samplers
   radeonsi: add support for loading bindless images
   radeonsi: enable ARB_bindless_texture

  docs/features.txt                                  |   2 +-
  docs/relnotes/17.2.0.html                          |   1 +
  src/compiler/glsl/ir.h                             |  11 +
  src/compiler/glsl/ir_uniform.h                     |  12 +
  src/compiler/glsl/link_uniform_initializers.cpp    |  42 +-
  src/compiler/glsl/link_uniforms.cpp                | 156 +++-
  src/compiler/glsl/shader_cache.cpp                 |  47 +
  src/gallium/auxiliary/tgsi/tgsi_build.c            |   8 +
  src/gallium/auxiliary/tgsi/tgsi_scan.c             |  37 +
  src/gallium/auxiliary/tgsi/tgsi_scan.h             |   2 +
  src/gallium/auxiliary/tgsi/tgsi_ureg.c             |  21 +-
  src/gallium/auxiliary/tgsi/tgsi_ureg.h             |  16 +-
  src/gallium/auxiliary/util/u_threaded_context.c    | 147 ++++
  .../auxiliary/util/u_threaded_context_calls.h      |   4 +
  src/gallium/docs/source/screen.rst                 |   2 +
  src/gallium/drivers/ddebug/dd_context.c            |  61 ++
  src/gallium/drivers/etnaviv/etnaviv_screen.c       |   1 +
  src/gallium/drivers/freedreno/freedreno_screen.c   |   1 +
  src/gallium/drivers/i915/i915_screen.c             |   1 +
  src/gallium/drivers/llvmpipe/lp_screen.c           |   1 +
  src/gallium/drivers/nouveau/nv30/nv30_screen.c     |   1 +
  src/gallium/drivers/nouveau/nv50/nv50_screen.c     |   1 +
  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c     |   1 +
  src/gallium/drivers/r300/r300_screen.c             |   1 +
  src/gallium/drivers/r600/r600_pipe.c               |   1 +
  src/gallium/drivers/radeon/r600_pipe_common.h      |   4 +
  src/gallium/drivers/radeonsi/si_blit.c             | 131 ++-
  src/gallium/drivers/radeonsi/si_compute.c          |   2 +
  src/gallium/drivers/radeonsi/si_compute.h          |  14 +
  src/gallium/drivers/radeonsi/si_descriptors.c      | 943 +++++++++++++++++++--
  src/gallium/drivers/radeonsi/si_hw_context.c       |   1 +
  src/gallium/drivers/radeonsi/si_pipe.c             |  25 +
  src/gallium/drivers/radeonsi/si_pipe.h             |  68 ++
  src/gallium/drivers/radeonsi/si_shader.h           |  12 +
  src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c  |  48 +-
  src/gallium/drivers/radeonsi/si_state.c            |  10 +-
  src/gallium/drivers/radeonsi/si_state.h            |   9 +
  src/gallium/drivers/softpipe/sp_screen.c           |   1 +
  src/gallium/drivers/svga/svga_screen.c             |   1 +
  src/gallium/drivers/swr/swr_screen.cpp             |   1 +
  src/gallium/drivers/trace/tr_context.c             | 114 +++
  src/gallium/drivers/vc4/vc4_screen.c               |   1 +
  src/gallium/drivers/virgl/virgl_screen.c           |   1 +
  src/gallium/include/pipe/p_context.h               |  16 +
  src/gallium/include/pipe/p_defines.h               |   1 +
  src/gallium/include/pipe/p_shader_tokens.h         |   6 +-
  src/mapi/glapi/gen/ARB_bindless_texture.xml        | 100 +++
  src/mapi/glapi/gen/Makefile.am                     |   1 +
  src/mapi/glapi/gen/apiexec.py                      |   3 +
  src/mapi/glapi/gen/gl_API.xml                      |   4 +-
  src/mapi/glapi/gen/gl_genexec.py                   |   1 +
  src/mesa/Makefile.sources                          |   2 +
  src/mesa/main/api_loopback.c                       |  18 +
  src/mesa/main/api_loopback.h                       |   6 +
  src/mesa/main/bufferobj.c                          |   4 +-
  src/mesa/main/context.c                            |   3 +
  src/mesa/main/dd.h                                 |  19 +
  src/mesa/main/mtypes.h                             |  86 ++
  src/mesa/main/samplerobj.c                         |  48 ++
  src/mesa/main/shared.c                             |  12 +
  src/mesa/main/tests/dispatch_sanity.cpp            |  18 +
  src/mesa/main/teximage.c                           |  25 +-
  src/mesa/main/texobj.c                             |  12 +
  src/mesa/main/texparam.c                           |  61 ++
  src/mesa/main/texstate.c                           |  52 +-
  src/mesa/main/texturebindless.c                    | 902 ++++++++++++++++++++
  src/mesa/main/texturebindless.h                    |  96 +++
  src/mesa/main/uniform_query.cpp                    | 208 ++++-
  src/mesa/main/uniforms.c                           | 119 ++-
  src/mesa/main/uniforms.h                           |  16 +
  src/mesa/main/varray.c                             |  23 +
  src/mesa/main/varray.h                             |   3 +
  src/mesa/main/vtxfmt.c                             |   4 +
  src/mesa/program/ir_to_mesa.cpp                    |  36 +-
  src/mesa/program/ir_to_mesa.h                      |   4 +-
  src/mesa/program/program.c                         |   8 +
  src/mesa/state_tracker/st_atifs_to_tgsi.c          |   2 +-
  src/mesa/state_tracker/st_atom_constbuf.c          |   6 +
  src/mesa/state_tracker/st_atom_image.c             |  33 +-
  src/mesa/state_tracker/st_atom_sampler.c           |  32 +-
  src/mesa/state_tracker/st_atom_texture.c           |  15 +-
  src/mesa/state_tracker/st_cb_texture.c             |  84 ++
  src/mesa/state_tracker/st_context.c                |   2 +
  src/mesa/state_tracker/st_context.h                |  11 +
  src/mesa/state_tracker/st_extensions.c             |   1 +
  src/mesa/state_tracker/st_glsl_to_nir.cpp          |   3 +-
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 138 ++-
  src/mesa/state_tracker/st_mesa_to_tgsi.c           |   2 +-
  src/mesa/state_tracker/st_pbo.c                    |   2 +-
  src/mesa/state_tracker/st_sampler_view.c           |   6 +
  src/mesa/state_tracker/st_shader_cache.c           |   3 +-
  src/mesa/state_tracker/st_texture.c                | 213 +++++
  src/mesa/state_tracker/st_texture.h                |  28 +
  src/mesa/vbo/vbo_attrib_tmp.h                      |  28 +
  src/mesa/vbo/vbo_context.h                         |   2 +
  src/mesa/vbo/vbo_exec_api.c                        |  15 +-
  src/mesa/vbo/vbo_save_api.c                        |   3 +
  97 files changed, 4250 insertions(+), 260 deletions(-)
  create mode 100644 src/mapi/glapi/gen/ARB_bindless_texture.xml
  create mode 100644 src/mesa/main/texturebindless.c
  create mode 100644 src/mesa/main/texturebindless.h

--
2.13.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev