Dear all,

with the help of Nicolai's comments I rewrote the proposed patch set to
improve the register renaming.

The patch is related to bugs where shader compilation fails with
"- translation from TGSI failed!". Among these is

  https://bugs.freedesktop.org/show_bug.cgi?id=65448

which I can confirm is fixed when R600_DEBUG=nosb is set (with sb
enabled, compilation still fails with an assertion in the sb code).

Changes to the first patch set are:
- significantly cutting down on the memory allocations
- exposing only a minimal interface to register lifetime estimation and
  calculating the rename table.

The algorithm works as follows:
- first the program is scanned; the loop, switch, and if/else scopes are
  collected, and for each temporary the first and last reads and writes
  and the corresponding scopes are collected. It is also recorded
  whether a variable is written conditionally, and whether loops have
  continue or break statements.
- then, after the whole program has been scanned, the lifetimes are
  estimated by merging the read and write scopes for each temporary.
- the register mapping is evaluated
- applying the mapping is done with the rename_temp_registers method
  already in place.

The algorithm tracks optimal lifetimes for temporaries that are written
unconditionally. For temporaries written in if/else branches or switch
cases it is not (yet) tracked whether they are written in all branches,
and hence the estimated lifetime is not necessarily optimal.

Running piglit on the shaders shows no regressions, and marks one more
test as passing:

  spec@glsl-1.50@execution@variable-indexing@gs-input-array-vec2-index-rd

However, I don't think that my patch actually tackles the true problem
of this shader, i.e. the shader copies a large input block to temp
arrays and accesses these indirectly via a variable not controlled by
the shader, thereby making register renaming impossible for these
temporaries.

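To make the scan / estimate / map steps listed above a bit more concrete, here is a minimal sketch for straight-line code only. It is not the code from the patches: it ignores loops, switch and if/else scopes entirely, and all names (Lifetime, Rename, Instr, estimate_lifetimes, compute_rename_table) are hypothetical and chosen for illustration. The mapping step uses a plain sort plus a min-heap (interval partitioning), which is one way to stay within O(n log n); the approach in the patches may differ. It also assumes that a register may be reused by an instruction that reads its old value and writes the new one.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not the structures used in the patches.
struct Lifetime {
    int begin;  // index of the first write, -1 if the temporary is never written
    int end;    // index of the last access
};

struct Rename {
    bool valid = false;  // true if this temporary is folded into another one
    int new_reg = -1;    // register to use instead
};

// One instruction of a straight-line program: writes dst (-1 = none), reads src.
struct Instr {
    int dst;
    std::vector<int> src;
};

// Single pass over the program; assumes every temporary is written before it
// is read and that there is no control flow.
std::vector<Lifetime> estimate_lifetimes(const std::vector<Instr>& prog, int ntemps)
{
    std::vector<Lifetime> lt(ntemps, Lifetime{-1, -1});
    for (int i = 0; i < static_cast<int>(prog.size()); ++i) {
        for (int s : prog[i].src)
            if (s >= 0)
                lt[s].end = i;  // later reads extend the lifetime
        int d = prog[i].dst;
        if (d >= 0) {
            if (lt[d].begin < 0)
                lt[d].begin = i;
            lt[d].end = std::max(lt[d].end, i);
        }
    }
    return lt;
}

// Sort temporaries by first write, then reuse the register whose current
// occupant dies earliest, if it is already dead. Sort and heap operations
// together cost O(n log n).
std::vector<Rename> compute_rename_table(const std::vector<Lifetime>& lt)
{
    std::vector<int> order;
    for (int i = 0; i < static_cast<int>(lt.size()); ++i)
        if (lt[i].begin >= 0)
            order.push_back(i);
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return lt[a].begin != lt[b].begin ? lt[a].begin < lt[b].begin : a < b;
    });

    using Slot = std::pair<int, int>;  // (end of current occupant, register)
    std::priority_queue<Slot, std::vector<Slot>, std::greater<Slot>> busy;
    std::vector<Rename> table(lt.size());

    for (int t : order) {
        int reg = t;  // by default a temporary keeps its own register
        // Assumption: end <= begin is enough, i.e. the last read and the
        // first write may share one instruction.
        if (!busy.empty() && busy.top().first <= lt[t].begin) {
            reg = busy.top().second;
            busy.pop();
            table[t].valid = true;
            table[t].new_reg = reg;
        }
        busy.push(Slot(lt[t].end, reg));
    }
    return table;
}
```

For a chain such as T0 = in; T1 = f(T0); T2 = g(T1); out = T2, this sketch folds T1 and T2 into T0, while two temporaries that are both still read by a later instruction keep separate registers.
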
Checking the performance by running the shader-db

  perf record --call-graph ./run -j1 shaders

I get the following performance compared to the original
implementation:

                             current         patches applied
  self                        0.25            0.22
  - life-time estimation      0.03            0.12
  - evaluate mapping          (in self=0.17)  0.05
  - rename-registers          0.05            0.05

All numbers are in %, normalized to the corresponding number reported
for main. The reduction in evaluating the mapping comes from the fact
that the original implementation uses a brute-force O(n^2) algorithm,
whereas I use an O(n log n) algorithm to find renaming candidates.

Many thanks for any comments,
Gert

Gert Wollny (3):
  mesa/st: glsl_to_tgsi move some helper classes to extra files
  mesa/st: glsl_to_tgsi Implement a new lifetime tracker for temporaries
  mesa/st: glsl_to_tgsi: tie in the new register renaming approach

 configure.ac                                       |   1 +
 src/mesa/Makefile.am                               |   4 +-
 src/mesa/Makefile.sources                          |   4 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 299 +------
 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp | 202 +++++
 src/mesa/state_tracker/st_glsl_to_tgsi_private.h   | 164 ++++
 .../state_tracker/st_glsl_to_tgsi_temprename.cpp   | 674 +++++++++++++++
 .../state_tracker/st_glsl_to_tgsi_temprename.h     |  30 +
 src/mesa/state_tracker/tests/Makefile.am           |  40 +
 .../tests/test_glsl_to_tgsi_lifetime.cpp           | 959 +++++++++++++++++++++
 10 files changed, 2092 insertions(+), 285 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_private.h
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
 create mode 100644 src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h
 create mode 100644 src/mesa/state_tracker/tests/Makefile.am
 create mode 100644 src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp

-- 
2.13.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev