Jason Ekstrand <ja...@jlekstrand.net> writes:

> On Fri, May 20, 2016 at 10:47 PM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>
>> The purpose of this series is to improve the back-end infrastructure
>> so that lowering of most IR instructions that are too wide to execute
>> natively (which is far more common than usual in SIMD32 dispatch mode)
>> happens semi-automatically at the IR level.
>>
>> Patches 1-6 address some issues in a few optimization and lowering
>> passes that would otherwise lead to regressions in the following
>> changes of the series.  Patches 7-12 move the construction of several
>> messages into lower_logical_sends() so the SIMD lowering pass can deal
>> with them.  Patches 13-22 teach the SIMD lowering pass about a number
>> of additional ISA restrictions that can be enforced easily by
>> splitting SIMD instructions into smaller chunks.  Patches 23-31 are
>> mainly about removing generator code that wouldn't have worked on
>> SIMD32 but is no longer necessary given the infrastructure introduced
>> in the first part of the series.
>>
>> Some of the changes from this series that remove SIMD workarounds
>> currently implemented in the generator could potentially be left out
>> at least in the initial merge at the cost of losing ARB_compute_shader
>> support on VLV and low-end IVB which like Gen8+ don't have enough
>> threads per subslice to reach the workgroup size requirement specified
>> by the extension in SIMD16 mode.  Some other changes like the removal
>> of DDY unrolling from the generator are completely optional right now
>> although they will eventually be required for SIMD32 fragment shader
>> support and they seemed like a nice clean-up.
>>
>> Expect two more series of roughly the same size coming up soon-ish,
>> the second one will get the generator code in good shape for SIMD32,
>> and the third one will address some of the remaining issues of the
>> compiler back-end so we can start plumbing 32-wide compute shaders
>> through it and turn the GL 4.3 switch.
>>
>> [PATCH 01/31] i965/fs: Fix byte_offset() for MRF/ARF/FIXED_GRF regs.
>> [PATCH 02/31] i965/fs: Generalize is_uniform() to is_periodic().
>> [PATCH 03/31] i965/fs: No need to unzip SIMD-periodic sources during SIMD lowering.
>> [PATCH 04/31] i965/fs: Handle instruction predication in SIMD lowering pass.
>> [PATCH 05/31] i965/fs: Fix CSE temporary copy for some LOAD_PAYLOAD corner cases.
>> [PATCH 06/31] i965/fs: Avoid constant propagation when the type sizes don't match.
>>
>
> 5 and 6 are Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
>
>
>> [PATCH 07/31] i965/fs: Hide varying pull constant load message setup behind logical opcode.
>> [PATCH 08/31] i965/fs: Implement promotion of varying pull loads on Gen4 during SIMD lowering.
>> [PATCH 09/31] i965/fs: Rename Gen4 physical varying pull constant load opcode.
>> [PATCH 10/31] i965/fs: Add missing get_latency_gen7() cases for the Gen7 pull constant opcodes.
>> [PATCH 11/31] i965/fs: Lower math into Gen4-5 send-like instructions in lower_logical_sends.
>> [PATCH 12/31] i965/fs: Handle SAMPLEINFO consistently like other texturing instructions.
>>
>
>
>> [PATCH 13/31] i965/fs: Enforce extended math exec size limits during SIMD lowering.
>> [PATCH 14/31] i965/fs: Enforce common regioning restrictions by SIMD splitting.
>> [PATCH 15/31] i965/fs: Implement workaround for IVB CMP dependency race in the SIMD lowering pass.
>> [PATCH 16/31] i965/fs: Implement HSW BFI exec size workarounds in the SIMD lowering pass.
>> [PATCH 17/31] i965/fs: Assert that IF instruction with embedded compare has legal exec_size.
>> [PATCH 18/31] i965/fs: Calculate maximum execution size of MOV_INDIRECT correctly.
>> [PATCH 19/31] i965/fs: Apply usual FPU-like execution size restrictions to MULH.
>> [PATCH 20/31] i965/fs: Lower DDY instructions to SIMD8 during SIMD lowering time
>>
>
> 13-20 Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
>
>
>> [PATCH 21/31] i965/fs: Lower LOAD_PAYLOAD instructions of unsupported width.
>>
>
> This one is Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> with
> substantial reservations.  If we ever hit the asserts, we can fix the bugs
> then.
>
>> [PATCH 22/31] i965/fs: Limit SIMD width of various virtual opcodes to the maximum supported value.
>> [PATCH 23/31] i965/fs: Remove handcrafted math SIMD lowering from the generator.
>> [PATCH 24/31] i965/fs: Set default access mode to Align1 for all instructions in the generator.
>> [PATCH 25/31] i965/fs: Drop lowering code for a few three-source instructions from the generator.
>> [PATCH 26/31] i965/fs: Drop Gen7 CMP SIMD unrolling workaround from the generator.
>> [PATCH 27/31] i965/fs: Remove manual unrolling of BFI instructions from the generator.
>> [PATCH 28/31] i965/fs: Remove manual splitting of DDY ops in the generator.
>> [PATCH 29/31] i965: Define brw_int_type() helper.
>> [PATCH 30/31] i965/fs: Remove extract virtual opcodes.
>> [PATCH 31/31] i965/fs: Remove FS_OPCODE_PACK_STENCIL_REF virtual instruction.
>>
>
> 22-31 are Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
>

Thanks!
>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
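
For readers unfamiliar with the idea behind the SIMD lowering pass mentioned in the cover letter (splitting an instruction that is too wide to execute natively into smaller chunks), a minimal Python sketch follows. It is not the i965 code: the Inst class, its fields and lower_simd_width() are hypothetical names made up for this illustration, and the model only tracks execution size and channel group, ignoring regioning, predication and source unzipping.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Inst:
    """Toy IR instruction (hypothetical, not Mesa's fs_inst)."""
    opcode: str
    exec_size: int  # number of SIMD channels the instruction covers
    group: int = 0  # index of the first channel it covers


def lower_simd_width(inst: Inst, max_width: int) -> List[Inst]:
    """Split inst into copies that are at most max_width channels wide.

    Each copy keeps the opcode and covers a consecutive channel group,
    so together the copies cover the same channels as the original.
    """
    if max_width < 1:
        raise ValueError("max_width must be positive")
    if inst.exec_size <= max_width:
        # Already narrow enough, nothing to lower.
        return [inst]
    if inst.exec_size % max_width != 0:
        raise ValueError("exec_size must be a multiple of max_width")
    return [
        Inst(inst.opcode, max_width, inst.group + offset)
        for offset in range(0, inst.exec_size, max_width)
    ]


if __name__ == "__main__":
    # A 32-wide instruction limited to 8 channels becomes four copies.
    for chunk in lower_simd_width(Inst("DDY", 32), 8):
        print(chunk)
```

In this toy model a per-opcode restriction (such as the SIMD8 limit for DDY named in patch 20) would simply be expressed as the max_width argument passed for that opcode.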