Big thanks to Grazvydas Ignotas for helping test this version.

V4:
- lots of reworking of patches to remove code churn; should be much nicer now
- fixed fallback when shader has been detached
- fixed a couple of bugs with UBOs
- no more printfs, debug info is behind an environment var
- various cleanups, tweaks and fixes

V3:
- add support for geometry and tessellation stages
- cache clip planes
- reserve parameter storage before restoring list
- stop losing buffer blocks on cache fallback
- lots of little fixes I can't remember

V2:
- rebased on master
- add support for encoding doubles
- renamed skip_cache params to is_cache_fallback, and fix a related bug when
  disabling shader cache for xfb

This series is based on the great work done by Carl, Kristian and others.

There are no regressions after two runs of piglit with shader cache enabled
on my Broadwell machine.

This series enables the on-disk shader cache for all stages except compute
programs. For now, transform feedback and SSO programs skip using the cache;
these will be added as follow-ups.

My main goal with this series is to land something that passes piglit. There
are a number of optimisations that can still be done, such as skipping more
validation and state recreation when falling back to a full recompile, but I
would rather leave this until we have something fully working.

Games:

Enabling shader cache with the Shadow of Mordor benchmark makes things
noticeably smoother and helps consistently keep the min FPS at 15 on my
Skylake, whereas without it the min FPS can be anywhere between 4-15.

The Elemental demo, which Dave pointed out as also doing a bunch of compiles
during the demo, is also smoother, especially on the second run, but it's
really slow on my Skylake regardless. Maybe someone with a high-end Skylake
would like to give it a try.

Here are the shader-db times (from V2):

Cache disabled:

Thread 1 took 1360.47 seconds and compiled 13015 shaders (not including SIMD16) with 50 GL context switches
Thread 3 took 1349.85 seconds and compiled 12848 shaders (not including SIMD16) with 40 GL context switches
Thread 2 took 1362.94 seconds and compiled 12637 shaders (not including SIMD16) with 36 GL context switches
Thread 0 took 1352.41 seconds and compiled 12593 shaders (not including SIMD16) with 46 GL context switches

Cache enabled first run:

Thread 1 took 1410.30 seconds and compiled 12678 shaders (not including SIMD16) with 34 GL context switches
Thread 2 took 1421.35 seconds and compiled 12822 shaders (not including SIMD16) with 50 GL context switches
Thread 0 took 1410.49 seconds and compiled 12999 shaders (not including SIMD16) with 40 GL context switches
Thread 3 took 1426.67 seconds and compiled 12594 shaders (not including SIMD16) with 48 GL context switches

Cache enabled second run:

Thread 0 took 259.84 seconds and compiled 12817 shaders (not including SIMD16) with 40 GL context switches
Thread 3 took 257.03 seconds and compiled 12533 shaders (not including SIMD16) with 50 GL context switches
Thread 1 took 256.18 seconds and compiled 12828 shaders (not including SIMD16) with 40 GL context switches
Thread 2 took 261.31 seconds and compiled 12915 shaders (not including SIMD16) with 39 GL context switches

You can find the series in the shader-cache branch of:

https://github.com/tarceri/Mesa_arrays_of_arrays.git

MESA_GLSL_CACHE_ENABLE=1 - enables the cache.
MESA_GLSL=cache_info - enables some debug messages.
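
For anyone who wants a quick summary of the shader-db timings above, here is a rough Python sketch that averages the per-thread wall-clock times and derives the cold-cache overhead and warm-cache speedup. The numbers are copied from the three runs listed above; the function names (first_run_overhead, warm_speedup) are made up for this illustration and are not part of shader-db or Mesa. Note that each thread compiled a slightly different number of shaders, so treat the result as a ballpark figure: by this measure the first cached run costs about 4.5% extra and the second run is roughly 5.2x faster.

```python
from statistics import mean

# Per-thread wall-clock times in seconds, copied from the shader-db runs above.
CACHE_DISABLED = [1360.47, 1349.85, 1362.94, 1352.41]
CACHE_FIRST_RUN = [1410.30, 1421.35, 1410.49, 1426.67]
CACHE_SECOND_RUN = [259.84, 257.03, 256.18, 261.31]


def first_run_overhead(baseline, cold):
    """Fractional extra time of the cold-cache run relative to no cache."""
    return mean(cold) / mean(baseline) - 1.0


def warm_speedup(baseline, warm):
    """How many times faster the warm-cache run is than no cache."""
    return mean(baseline) / mean(warm)


if __name__ == "__main__":
    overhead = first_run_overhead(CACHE_DISABLED, CACHE_FIRST_RUN)
    speedup = warm_speedup(CACHE_DISABLED, CACHE_SECOND_RUN)
    print(f"mean, cache disabled:    {mean(CACHE_DISABLED):.2f} s")
    print(f"mean, cache first run:   {mean(CACHE_FIRST_RUN):.2f} s")
    print(f"mean, cache second run:  {mean(CACHE_SECOND_RUN):.2f} s")
    print(f"first-run overhead:      {overhead:.1%}")
    print(f"second-run speedup:      {speedup:.2f}x")
```

Averaging across threads is just one reasonable choice; comparing the slowest thread in each run would tell a similar story, since the four threads finish within a few seconds of each other.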