This is an implementation of non-coherent framebuffer fetch as described here [1], working on most hardware generations supported by the i965 driver (from Gen5 to Gen8). My plan was to send the coherent framebuffer fetch implementation for SKL+ first, since it's actually simpler than the non-coherent path, but I've noticed some potential hardware issues that need further investigation. So here's the non-coherent path instead, so that it can hopefully get some review in the meantime. I plan to send the implementation of coherent framebuffer fetch next week.

Patches 01-11 get the compiler ready for non-coherent framebuffer fetch (some of the changes like the NIR fragment output location rework will also be useful for the coherent path). Patches 12-20 implement the required state setup logic and the new glBlendBarrier entry point. You can find the whole series along with the driver-independent changes for EXT_shader_framebuffer_fetch in my Mesa tree [2], but note that in order to test it you still need to add an additional entry to extensions_table.h manually since the non-coherent extension is not exposed yet.

[1] https://lists.freedesktop.org/archives/mesa-dev/2016-July/124028.html
[2] https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-fb-fetch

[PATCH 01/21] i965/fs: Get rid of fs_visitor::do_dual_src.
[PATCH 02/21] i965/fs: Add brw_wm_prog_key bit specifying whether FB reads should be coherent.
[PATCH 03/21] i965: Allocate space in the binding table for non-coherent FB fetch.
[PATCH 04/21] i965/fs: Force per-sample dispatch if the shader reads from a multisample FBO.
[PATCH 05/21] i965/fs: Emit interpolation setup if non-coherent framebuffer fetch is in use.
[PATCH 06/21] i965/fs: Implement non-coherent framebuffer fetch using the sampler unit.
[PATCH 07/21] i965/fs: Special-case nir_intrinsic_store_output for the fragment shader.
[PATCH 08/21] i965: Fix undefined signed overflow in INTEL_MASK for bitfields of 31 bits.
[PATCH 09/21] i965/fs: Rework representation of fragment output locations in NIR.
[PATCH 10/21] i965/fs: Allocate fragment output temporaries on demand.
[PATCH 11/21] i965/fs: Translate nir_intrinsic_load_output on a fragment output.
[PATCH 12/21] i965: Return whether the miptree was resolved from intel_miptree_resolve_color().
[PATCH 13/21] i965: Resolve color for non-coherent FB fetch at UpdateState time.
[PATCH 14/21] i965: Factor out isl_surf_dim/isl_dim_layout calculation into functions.
[PATCH 15/21] i965: Return the correct layout from get_isl_dim_layout for pre-ILK cube textures.
[PATCH 16/21] i965: Add missing has_surface_tile_offset flag to the Gen8+ device info structures.
[PATCH 17/21] i965: Massage argument list of brw_emit_surface_state().
[PATCH 18/21] i965: Implement support for overriding the texture target in brw_emit_surface_state.
[PATCH 19/21] i965: Upload surface state for non-coherent framebuffer fetch.
[PATCH 20/21] i965: Implement glBlendBarrier.
[PATCH 21/21] i965: Flip the non-coherent framebuffer fetch extension bit on G45-Gen8 hardware.
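
For anyone curious about the kind of problem PATCH 08 refers to, here is a minimal sketch in C. It assumes the mask is built by shifting a plain signed `1` left by the field width, which is undefined for a 32-bit int once the width reaches 31. The helper name mask_sketch() is made up for this note and is not the macro in the tree; the actual fix in the patch may look different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, NOT the INTEL_MASK definition from the driver.
 *
 * Returns a mask covering bits low..high inclusive (0 <= low <= high <= 31).
 *
 * A formulation like ((1 << width) - 1) << low uses a signed int constant,
 * so for width == 31 the shift 1 << 31 is not representable in a 32-bit
 * int and the behaviour is undefined.  Doing the arithmetic on an unsigned
 * 32-bit value keeps every intermediate result well-defined.
 */
static inline uint32_t
mask_sketch(unsigned high, unsigned low)
{
   assert(low <= high && high <= 31);

   const unsigned width = high - low + 1;

   /* A shift by 32 would itself be undefined, so handle the full-width
    * case separately. */
   const uint32_t bits =
      width == 32 ? UINT32_MAX : (UINT32_C(1) << width) - 1;

   return bits << low;
}
```

With this formulation a 31-bit field such as bits 30:0 or bits 31:1 yields the expected mask without relying on signed overflow.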