This series contains two independent discard-related optimizations:

PATCH 1-2 change the i965 back-end to do per-subspan instead of per-SIMD-thread discard jumps, which can save bandwidth and ALU cycles in some scenarios. This improves the FPS of a very simple microbenchmark that does a costly texturing operation after a conditional discard by roughly 3x (verified on HSW and SKL, but the improvement should affect all platforms from Gen6 up).

PATCH 3-5 change the scheduler heuristics to take into account the runtime benefit of scheduling exit nodes earlier in the program, in addition to the current latency- and register-pressure-sensitive heuristics. This results in a 10% to 20% FPS increase in the GfxBench Manhattan benchmark, and a roughly 5% FPS increase in Unigine Valley. The exact FPS change will be heavily dependent on the platform and settings, but I've observed comparable improvements on HSW, SKL, BDW, BSW and VLV.
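To illustrate why the jump granularity in PATCH 1-2 matters, here is a toy model in Python. It is a sketch under simplifying assumptions and not code from the series: a thread is modelled as a flat list of per-channel discard flags, a subspan is taken to be four consecutive channels (one 2x2 pixel block), and the number of channels still enabled after the discard is used as a rough proxy for the work done by the costly operation that follows. The function names are made up for this example.

```python
SUBSPAN_SIZE = 4  # assumption: one subspan = a 2x2 pixel block = 4 channels


def live_channels_per_thread(discarded):
    """Per-thread jump: the jump is only taken when every channel of the
    thread has been discarded, otherwise all channels keep going."""
    if all(discarded):
        return 0
    return len(discarded)


def live_channels_per_subspan(discarded):
    """Per-subspan jump: each fully discarded subspan jumps on its own,
    while subspans with at least one live pixel keep going."""
    live = 0
    for start in range(0, len(discarded), SUBSPAN_SIZE):
        subspan = discarded[start:start + SUBSPAN_SIZE]
        if not all(subspan):
            live += len(subspan)
    return live


# Hypothetical SIMD16 thread: subspans 0 and 2 fully discarded,
# subspan 1 partially discarded, subspan 3 fully live.
mask = ([True] * 4
        + [True, False, True, True]
        + [True] * 4
        + [False] * 4)

per_thread = live_channels_per_thread(mask)    # 16 channels stay enabled
per_subspan = live_channels_per_subspan(mask)  # 8 channels stay enabled
```

In this model the partially discarded subspan stays fully enabled in both cases, so the saving comes entirely from subspans in which every pixel was discarded. How much of that translates into FPS depends on the hardware and workload, as noted above.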
For a branch in testable form see:

https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-sched-discard

[PATCH 1/5] i965/fs: Drop bogus writemasking disable bit from HALT instructions.
[PATCH 2/5] i965/fs: Switch to per-subspan discard jumps.
[PATCH 3/5] i965/sched: Calculate the critical path of scheduling nodes non-recursively.
[PATCH 4/5] i965/sched: Assign a preferred exit node to each node of the dependency graph.
[PATCH 5/5] i965/sched: Change the scheduling heuristics to favor early program termination.