Hello,

This is the continuation of my ff_fragment_shader cache key optimizations. I have kept working on reducing the overhead of the make_state_key function, and it seems that I have gained a little.

As this is the first time I have ventured this far into the mesa codebase, it's possible that I did something wrong along the way. Please point out anything you find incorrect. For example, I was a little confused by the indentation used in some parts of the code (spaces only in some places, a mix of spaces and tabs in others). I tried to preserve the indentation of the files I modified.

As before, the number of patches might be a bit high, since some of them are very simple. I can squash some of them if you prefer.

The first 3 patches are a rebased resend of a previous series. I have kept Eric's r-by (I changed the commit messages of these slightly; I hope keeping the r-by was ok).

Patches 4-9 contain simple, self-contained improvements to the cache key and its computation.

Patches 10 and 11 try to move some of the state computation to the point where the state is changed. I have added a couple of compressed state fields to the context object.

Patches 12 and 13 use these new fields inside make_state_key, simplifying it a lot. Along the way, I have fixed an apparent bug (GL_ONE was not handled as a combine source), though it made no difference in a piglit quick run.

Finally, patch 14 uses the new compressed fog state for atifs state handling in st/mesa, since that was quite simple to modify. I didn't bother using the new state in the classic dri drivers.

I have run a piglit quick test on radeonsi before and after the series and there were no differences apart from some unstable test results.

As for performance measurements, I have run a simple minecraft apitrace through perf-record 5 times and have found that:

1. The apitrace replay fps measurement is too variable to show any difference. It can be considered "a wash".

2. perf-report shows something more encouraging. The time spent in _mesa_get_fixed_func_fragment_program has dropped from ~0.78% to ~0.37%. The standard deviation here is ~0.025%, so the performance gain is statistically significant.

Regards,
Gustaw

Gustaw Smolarczyk (14):
  mesa/main/ff_frag: Use correct constant.
  mesa/main/ff_frag: Remove enabled_units.
  mesa/main/ff_frag: Reduce the size of nr_enabled_units.
  mesa/main/ff_frag: Remove unused struct.
  mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
  mesa/main/ff_frag: Simplify get_fp_input_mask.
  mesa/main/ff_frag: Store nr_enabled_units only once.
  mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
  mesa/main/ff_frag: Don't retrieve format if not necessary.
  mesa/main: Maintain compressed fog mode.
  mesa/main: Maintain compressed TexEnv Combine state.
  mesa/main/ff_frag: Use compressed fog mode.
  mesa/main/ff_frag: Use compressed TexEnv Combine state.
  st/mesa: Use compressed fog mode for atifs.

 src/mesa/main/enable.c                    |   1 +
 src/mesa/main/ff_fragment_shader.cpp      | 506 ++++++++++------------------
 src/mesa/main/fog.c                       |   9 +
 src/mesa/main/mtypes.h                    |  97 ++++++
 src/mesa/main/texstate.c                  | 103 ++++++
 src/mesa/state_tracker/st_atifs_to_tgsi.c |   6 +-
 src/mesa/state_tracker/st_atom_shader.c   |  17 +-
 7 files changed, 388 insertions(+), 351 deletions(-)

--
2.12.1
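
To illustrate the approach taken in patches 10 to 13 (keeping a compressed copy of the state up to date at the point where it changes, so that make_state_key only has to copy it), here is a minimal, self-contained C sketch. It is not the code from the patches: every sketch_-prefixed name, the 2-bit encoding and the struct layout are assumptions made for illustration. Only the numeric GL enum values are taken from the GL headers.

```c
#include <stdint.h>

/* Numeric values of the GL fog mode enums, as in the GL headers. */
#define SKETCH_GL_EXP    0x0800
#define SKETCH_GL_EXP2   0x0801
#define SKETCH_GL_LINEAR 0x2601

/* Hypothetical compact encoding: fits in 2 bits of the cache key. */
enum sketch_fog_mode {
   SKETCH_FOG_NONE = 0,
   SKETCH_FOG_LINEAR,
   SKETCH_FOG_EXP,
   SKETCH_FOG_EXP2
};

/* Hypothetical stand-in for the fog part of the context object. */
struct sketch_fog_state {
   int enabled;                 /* GL_FOG enabled? */
   unsigned mode;               /* full GL enum, as set by the application */
   uint8_t packed_mode;         /* derived: compact form of mode */
   uint8_t packed_enabled_mode; /* derived: SKETCH_FOG_NONE when disabled */
};

/* Hypothetical stand-in for the fog part of the cache key. */
struct sketch_state_key {
   unsigned fog_mode:2;
};

/* Translate a GL fog mode enum into the compact encoding. */
static uint8_t sketch_pack_fog_mode(unsigned mode)
{
   switch (mode) {
   case SKETCH_GL_LINEAR: return SKETCH_FOG_LINEAR;
   case SKETCH_GL_EXP:    return SKETCH_FOG_EXP;
   case SKETCH_GL_EXP2:   return SKETCH_FOG_EXP2;
   default:               return SKETCH_FOG_NONE;
   }
}

/* Recompute the derived fields; called only from the state setters. */
static void sketch_update_packed(struct sketch_fog_state *fog)
{
   fog->packed_mode = sketch_pack_fog_mode(fog->mode);
   fog->packed_enabled_mode =
      fog->enabled ? fog->packed_mode : SKETCH_FOG_NONE;
}

/* Start disabled with GL_EXP, which is the GL default fog mode. */
static void sketch_fog_init(struct sketch_fog_state *fog)
{
   fog->enabled = 0;
   fog->mode = SKETCH_GL_EXP;
   sketch_update_packed(fog);
}

/* State change points: the translation cost is paid here, rarely. */
static void sketch_set_fog_mode(struct sketch_fog_state *fog, unsigned mode)
{
   fog->mode = mode;
   sketch_update_packed(fog);
}

static void sketch_set_fog_enabled(struct sketch_fog_state *fog, int enabled)
{
   fog->enabled = enabled;
   sketch_update_packed(fog);
}

/* Hot path: no switch and no enabled check, just a copy. */
static void sketch_make_state_key(const struct sketch_fog_state *fog,
                                  struct sketch_state_key *key)
{
   key->fog_mode = fog->packed_enabled_mode;
}
```

The trade-off in this sketch is that the enum translation runs once per state change rather than once per key computation, which pays off when the key is built much more often than the fog state is modified.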