On Fri, May 27, 2016 at 7:05 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> This fixes the few code quality regressions from the previous series
> enabling SIMD32 CS codegen in the back-end -- AFAICT by the end of the
> series we can finally enable GL 4.3 on all Gen8+ hardware.
>
> Patches 1-8 delay the SIMD lowering pass after the bulk of
> optimization passes have been run, which should decrease the
> compilation time of mainly SIMD32 shaders and improve the code quality
> of SIMD32 shaders on all generations and shaders of any dispatch width
> on older generations (up to and including IVB) that use SIMD lowering
> more intensively to implement various workarounds.
>
> Patches 9-14 rework the SIMD lowering pass to avoid emitting the copy
> instructions used to zip and unzip register regions where possible,
> since the register coalesce and copy propagation passes seem to
> perform rather poorly at getting rid of them in some cases. In the
> long term we'll likely want to improve the register coalesce pass
> irrespective of these changes.
>
> Patches 15-20 improve the compute-to-mrf pass used on Gen4-6 to handle
> cases where the source of a VGRF-to-MRF copy is initialized by the
> shader using multiple single-GRF writes, which becomes far more common
> with the additional SIMD lowering going on after this series.
>
> Patches 21-24 are some other assorted changes improving code quality
> on older gens.
>
> I wanted to provide more detailed (e.g. per commit) shader-db stats
> with this series, but kind of ran out of time. Let me know if you
> would like to see more evidence that any of the changes below is
> improving code quality in case it's not clear from the commit alone.
>
> [PATCH 01/25] i965/fs: Let CSE handle logical sampler sends as expressions.
> [PATCH 02/25] i965/fs: Allow constant propagation into logical send sources.
> [PATCH 03/25] i965/fs: Add FS_OPCODE_FB_WRITE_LOGICAL to has_side_effects().
> [PATCH 04/25] i965/fs: Run SIMD and logical send lowering after the optimization loop.
> [PATCH 05/25] i965/fs: Take opt_redundant_discard_jumps out of the optimization loop.
> [PATCH 06/25] i965/fs: Fix UB list sentinel dereference in opt_sampler_eot().
> [PATCH 07/25] i965/fs: Implement opt_sampler_eot() in terms of logical sends.
> [PATCH 08/25] SQUASH: i965/fs: Add basic dataflow check to opt_sampler_eot().
>
> [PATCH 09/25] i965/fs: Refactor offset() into a separate function taking the width as argument.
> [PATCH 10/25] i965/fs: Generalize regions_overlap() from copy propagation to handle non-VGRF files.
> [PATCH 11/25] i965/fs: Factor out region zipping and unzipping from the SIMD lowering pass.
> [PATCH 12/25] i965/fs: Skip SIMD lowering source unzipping for regular scalar regions.
> [PATCH 13/25] i965/fs: Skip SIMD lowering destination zipping if possible.
> [PATCH 14/25] i965/fs: Reindent emit_zip().
>

9-14 Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>

> [PATCH 15/25] i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
> [PATCH 16/25] i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
> [PATCH 17/25] i965/fs: Fix compute-to-mrf VGRF region coverage condition.
> [PATCH 18/25] i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
> [PATCH 19/25] i965/fs: Teach compute_to_mrf about the COMPR4 address transformation.
> [PATCH 20/25] i965/fs: Extend compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.
> [PATCH 21/25] i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
>

15-21 scare me. A lot. They even make me think that forking the
compiler between SNB and IVB may be a good idea. :-/ MRFs are
annoying, but COMPR4 is such a gross hack that I really want to teach
as little of the compiler about it as possible.

So here's the million dollar question: Do we need them? And, more
importantly, do we need them now? I didn't see anything wrong in my
brief skimming, but don't call that a review.

> [PATCH 22/25] i965/fs: Fix constant combining for instructions that cannot accept source mods.
> [PATCH 23/25] i965/fs: Allow scalar source regions on SNB math instructions.
> [PATCH 24/25] i965/fs: Skip gen4 pre/post-send dependency workarounds for the first/last block.
> [PATCH 25/25] i965: Expose GL 4.3 on Gen8+.
>

22-25 are Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>

> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>