Great work, Samuel! Is this available in a branch somewhere?
On Fri, May 19, 2017 at 12:52 PM, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:
> Hi,
>
> This series implements ARB_bindless_texture for RadeonSI.
>
> Reminder: the GLSL compiler part is already upstream.
>
> This series has been mainly tested with Feral games. Here's the list of
> existing games that use ARB_bindless_texture (though not by default):
>
> - DXMD
> - Hitman
> - Dirt Rally
> - Mad Max
>
> Today, Feral announced "Warhammer 40,000: Dawn of War III" (called DOW3), which
> is going to be released next month. This game *requires* ARB_bindless_texture,
> which explains why I did all this work. :-) So, we have ~3 weeks for merging
> this whole series. It would be very nice to have DOW3 support on day one!
>
> === Tracking bindless problems ===
>
> The following games have been successfully tested:
>
> - Dirt Rally
> - Hitman
> - Mad Max
> - DOW3
>
> For these:
>
> - No rendering issues
> - No VM faults (i.e. amdgpu.vm_debug=1)
>
> However, DXMD is currently broken because the bindless_sampler layout qualifier
> is missing, which ends up reporting a ton of INVALID_OPERATION errors. Note
> that Feral implemented bindless support against NV_bindless_texture and not
> ARB_bindless_texture. The main difference is that bindless_sampler is implicit
> for NV_* while it's required for ARB_*. Feral plan to fix this soon.
>
> All ARB_bindless_texture piglit tests pass with this series.
>
> === Tracking regressions/changes ===
>
> - No regressions with the Intel CI system
> - One piglit regression that needs to be fixed
>   (arb_texture_multisample-sample-position)
> - No shader-db changes
> - No CPU overhead (glxgears and Heaven in low)
>
> === Performance results for DOW3 ===
>
> DOW3 exposes two bindless texture modes:
> - mode 1: all bindless (i.e. no bound samplers)
> - mode 2: bound/bindless (i.e. only bindless when the limit is reached)
>
> CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz
> NVIDIA blob: 381.22
>
> == GTX 1060 ==
>
> LOW:
> - mode 1: 89 FPS
> - mode 2: 51 FPS
>
> MEDIUM:
> - mode 1: 49 FPS
> - mode 2: 28 FPS
>
> HIGH:
> - mode 1: 32 FPS
> - mode 2: 19 FPS
>
> The GTX 1060 performs very well with the all bindless mode (default), while
> the bound/bindless mode is not good at all.
>
> == RX 480 ==
>
> LOW:
> - mode 1: 67 FPS (-32%)
> - mode 2: 75 FPS (+32%)
>
> MEDIUM:
> - mode 1: 38 FPS (-28%)
> - mode 2: 44 FPS (+57%)
>
> HIGH:
> - mode 1: 26 FPS (-23%)
> - mode 2: 29 FPS (+52%)
>
> The RX 480 performs very well with the bound/bindless mode (default), while
> the all bindless mode still has to be improved.
>
> The most important bottleneck with the all bindless mode is the number of
> buffers that have to be added for every command stream. The overhead in the
> winsys and in the kernel (amdgpu_cs_ioctl) becomes significant in this
> situation. This mode is still clearly CPU bound and should be improved (see
> the "Future work" section).
>
> Btw, without any optimisations, it was around 35 FPS in low (mode 1).
>
> === Performance results for other Feral titles ===
>
> I didn't record any numbers because these games were initially
> developed/tested against the NVIDIA blob, which is unaffected by a VERY large
> number of resident handles, while the AMD stack is really slow in this
> situation. Though, as I said, all Feral games that use bindless work fine; we
> just need to improve perf on both sides.
>
> === Future work ===
>
> I have some ideas to try in order to improve performance with RadeonSI. I will
> work on this once this series is upstream.
>
> Please review,
> Thanks!
>
> Samuel Pitoiset (65):
>   mapi: add GL_ARB_bindless_texture entry points
>   mesa: implement ARB_bindless_texture
>   mesa: add support for unsigned 64-bit vertex attributes
>   mesa: add support for glUniformHandleui64*ARB()
>   mesa: refuse to update sampler parameters when a handle is allocated
>   mesa: refuse to update tex parameters when a handle is allocated
>   mesa: refuse to change textures when a handle is allocated
>   mesa: refuse to change tex buffers when a handle is allocated
>   mesa: keep track of the current variable in add_uniform_to_shader
>   mesa: store bindless samplers as PROGRAM_UNIFORM
>   mesa: add infrastructure for bindless samplers/images bound to units
>   glsl: process uniform samplers declared bindless
>   glsl: process uniform images declared bindless
>   glsl: pass the ir_variable object to set_opaque_binding()
>   glsl: set the explicit binding value for bindless samplers/images
>   glsl: add ir_variable::is_bindless()
>   mesa: add update_single_shader_texture_used() helper
>   mesa: add update_single_program_texture_state() helper
>   mesa: update textures for bindless samplers bound to texture units
>   mesa: pass gl_program to _mesa_associate_uniform_storage()
>   mesa: associate uniform storage to bindless samplers/images
>   mesa: handle bindless uniforms bound to texture/image units
>   mesa: get rid of a workaround for bindless in _mesa_get_uniform()
>   gallium: add PIPE_CAP_BINDLESS_TEXTURE
>   gallium: add ARB_bindless_texture interface
>   ddebug: add ARB_bindless_texture support
>   trace: add ARB_bindless_texture support
>   tc: add ARB_bindless_texture support
>   tgsi: add new Bindless flag to tgsi_instruction_texture
>   tgsi: add new Bindless flag to tgsi_instruction_memory
>   tgsi/ureg: accept TGSI_FILE_{CONSTANT,INPUT} for dst registers
>   st/glsl_to_tgsi: add support for bindless samplers
>   st/glsl_to_tgsi: add support for bindless images
>   st/glsl_to_tgsi: add support for bindless pack/unpack operations
>   st/glsl_to_tgsi: teach the DCE pass about bindless samplers/images
>   st/glsl_to_tgsi: teach rename_temp_registers() about bindless samplers
>   tgsi/scan: record bindless samplers/images usage
>   st/mesa: implement ARB_bindless_texture
>   st/mesa: make update_single_texture() non-static
>   st/mesa: make convert_sampler_from_unit() non-static
>   st/mesa: add st_convert_image_from_unit() helper
>   st/mesa: add st_create_{texture,image}_handle_from_unit() helper
>   st/mesa: add infrastructure for storing bound texture/image handles
>   st/mesa: make bindless samplers/images bound to units resident
>   st/mesa: do not release sampler views for resident textures
>   st/mesa: disable per-context seamless cubemap when using texture handles
>   st/mesa: enable ARB_bindless_texture
>   radeonsi: add a slab allocator for resident descriptors
>   radeonsi: add si_init_descriptor_list() helper
>   radeonsi: add si_set_sampler_view_desc() helper
>   radeonsi: add si_set_shader_image_desc() helper
>   radeonsi: implement ARB_bindless_texture
>   radeonsi: add all resident buffers to the current CS
>   radeonsi: only add descriptors in presence of resident handles
>   radeonsi: add si_update_check_render_feedback() helper
>   radeonsi: decompress DCC for resident textures/images
>   radeonsi: decompress resident textures/images before graphics/compute
>   radeonsi: isolate real framebuffer changes from the decompression passes
>   radeonsi: track use of bindless samplers/images from tgsi_shader_info
>   radeonsi: only decompress resident textures/images when used
>   radeonsi: upload new descriptors when resident buffers are invalidated
>   radeonsi: invalidate buffers which are made resident if needed
>   radeonsi: add support for loading bindless samplers
>   radeonsi: add support for loading bindless images
>   radeonsi: enable ARB_bindless_texture
>
>  docs/features.txt | 2 +-
>  docs/relnotes/17.2.0.html | 1 +
>  src/compiler/glsl/ir.h | 11 +
>  src/compiler/glsl/ir_uniform.h | 12 +
>  src/compiler/glsl/link_uniform_initializers.cpp | 42 +-
>  src/compiler/glsl/link_uniforms.cpp | 156 +++-
>  src/compiler/glsl/shader_cache.cpp | 47 +
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 8 +
>  src/gallium/auxiliary/tgsi/tgsi_scan.c | 37 +
>  src/gallium/auxiliary/tgsi/tgsi_scan.h | 2 +
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 21 +-
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h | 16 +-
>  src/gallium/auxiliary/util/u_threaded_context.c | 147 ++++
>  .../auxiliary/util/u_threaded_context_calls.h | 4 +
>  src/gallium/docs/source/screen.rst | 2 +
>  src/gallium/drivers/ddebug/dd_context.c | 61 ++
>  src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 +
>  src/gallium/drivers/r300/r300_screen.c | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeon/r600_pipe_common.h | 4 +
>  src/gallium/drivers/radeonsi/si_blit.c | 131 ++-
>  src/gallium/drivers/radeonsi/si_compute.c | 2 +
>  src/gallium/drivers/radeonsi/si_compute.h | 14 +
>  src/gallium/drivers/radeonsi/si_descriptors.c | 943 +++++++++++++++++++--
>  src/gallium/drivers/radeonsi/si_hw_context.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c | 25 +
>  src/gallium/drivers/radeonsi/si_pipe.h | 68 ++
>  src/gallium/drivers/radeonsi/si_shader.h | 12 +
>  src/gallium/drivers/radeonsi/si_shader_tgsi_mem.c | 48 +-
>  src/gallium/drivers/radeonsi/si_state.c | 10 +-
>  src/gallium/drivers/radeonsi/si_state.h | 9 +
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c | 1 +
>  src/gallium/drivers/swr/swr_screen.cpp | 1 +
>  src/gallium/drivers/trace/tr_context.c | 114 +++
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/drivers/virgl/virgl_screen.c | 1 +
>  src/gallium/include/pipe/p_context.h | 16 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  src/gallium/include/pipe/p_shader_tokens.h | 6 +-
>  src/mapi/glapi/gen/ARB_bindless_texture.xml | 100 +++
>  src/mapi/glapi/gen/Makefile.am | 1 +
>  src/mapi/glapi/gen/apiexec.py | 3 +
>  src/mapi/glapi/gen/gl_API.xml | 4 +-
>  src/mapi/glapi/gen/gl_genexec.py | 1 +
>  src/mesa/Makefile.sources | 2 +
>  src/mesa/main/api_loopback.c | 18 +
>  src/mesa/main/api_loopback.h | 6 +
>  src/mesa/main/bufferobj.c | 4 +-
>  src/mesa/main/context.c | 3 +
>  src/mesa/main/dd.h | 19 +
>  src/mesa/main/mtypes.h | 86 ++
>  src/mesa/main/samplerobj.c | 48 ++
>  src/mesa/main/shared.c | 12 +
>  src/mesa/main/tests/dispatch_sanity.cpp | 18 +
>  src/mesa/main/teximage.c | 25 +-
>  src/mesa/main/texobj.c | 12 +
>  src/mesa/main/texparam.c | 61 ++
>  src/mesa/main/texstate.c | 52 +-
>  src/mesa/main/texturebindless.c | 902 ++++++++++++++++++++
>  src/mesa/main/texturebindless.h | 96 +++
>  src/mesa/main/uniform_query.cpp | 208 ++++-
>  src/mesa/main/uniforms.c | 119 ++-
>  src/mesa/main/uniforms.h | 16 +
>  src/mesa/main/varray.c | 23 +
>  src/mesa/main/varray.h | 3 +
>  src/mesa/main/vtxfmt.c | 4 +
>  src/mesa/program/ir_to_mesa.cpp | 36 +-
>  src/mesa/program/ir_to_mesa.h | 4 +-
>  src/mesa/program/program.c | 8 +
>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 2 +-
>  src/mesa/state_tracker/st_atom_constbuf.c | 6 +
>  src/mesa/state_tracker/st_atom_image.c | 33 +-
>  src/mesa/state_tracker/st_atom_sampler.c | 32 +-
>  src/mesa/state_tracker/st_atom_texture.c | 15 +-
>  src/mesa/state_tracker/st_cb_texture.c | 84 ++
>  src/mesa/state_tracker/st_context.c | 2 +
>  src/mesa/state_tracker/st_context.h | 11 +
>  src/mesa/state_tracker/st_extensions.c | 1 +
>  src/mesa/state_tracker/st_glsl_to_nir.cpp | 3 +-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 138 ++-
>  src/mesa/state_tracker/st_mesa_to_tgsi.c | 2 +-
>  src/mesa/state_tracker/st_pbo.c | 2 +-
>  src/mesa/state_tracker/st_sampler_view.c | 6 +
>  src/mesa/state_tracker/st_shader_cache.c | 3 +-
>  src/mesa/state_tracker/st_texture.c | 213 +++++
>  src/mesa/state_tracker/st_texture.h | 28 +
>  src/mesa/vbo/vbo_attrib_tmp.h | 28 +
>  src/mesa/vbo/vbo_context.h | 2 +
>  src/mesa/vbo/vbo_exec_api.c | 15 +-
>  src/mesa/vbo/vbo_save_api.c | 3 +
>  97 files changed, 4250 insertions(+), 260 deletions(-)
>  create mode 100644 src/mapi/glapi/gen/ARB_bindless_texture.xml
>  create mode 100644 src/mesa/main/texturebindless.c
>  create mode 100644 src/mesa/main/texturebindless.h
>
> --
> 2.13.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev