I first tried creating the auxiliary buffer at the same time as the color
buffer. That, however, led to a situation where the rest of the mip-levels
would be created later and compression would then need to be disabled (it
is only supported for single-level buffers).

Here we instead create it on demand, just before the hardware starts to
render. This is similar to what we do with fast clear buffers: their
creation is deferred until the first clear. This setup also gives the
opportunity to detect whether the miptree represents the temporary texture
used internally in the Mesa core. That texture is mostly written by the
CPU, and therefore enabling compression for it doesn't make much sense.

Note that a heuristic is included. Floating point formats are not enabled
yet as they have only been seen to hurt performance.

Some highlights with the window system driver kept fixed to default and
only the application driver changing:

  Manhattan: 8.32152% +/- 0.355881%
  Offscreen: 9.09713% +/- 0.340763%
  Glb trex:  8.46231% +/- 0.460624%
  Offscreen: 9.31872% +/- 0.463743%

Numbers from our system where the driver is also changed for the
windowing environment:

  GFXBench3_Manhattan                     41.8 FPS   12.0 %   46.8 FPS
  GFXBench3_Manhattan_OffScreen           48.7 FPS    9.0 %   53.1 FPS
  GLBenchmark_Trex_FixedTime             133.0 FPS    9.0 %  145.0 FPS
  GLBenchmark_Trex_FixedTime_OffScreen   168.0 FPS    9.5 %  184.0 FPS

Unigine-heaven regresses: -2.31021% +/- 0.217207%. There are no color
resolves needed during the run, so the hit comes from something else.
Perhaps the content doesn't really compress, while the additional work
the hardware does to maintain the associated metadata slows us down.

v2 (Ben): Re-use msaa layout type for single sampled case.

Signed-off-by: Topi Pohjolainen <topi.pohjolai...@intel.com>
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c |  9 +++++++++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c  | 23 ++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index d1c9b5a..f687418 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -485,6 +485,15 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
                  __func__, _mesa_get_format_name(rb_format));
    }
 
+   /* Consider if lossless compression is supported but the needed
+    * auxiliary buffer doesn't exist yet.
+    */
+   if (brw->gen >= 9 && mt->mcs_mt == NULL &&
+       intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) &&
+       intel_miptree_supports_non_msrt_fast_clear(brw, mt) &&
+       intel_miptree_supports_lossless_compressed(mt->format))
+      intel_miptree_alloc_non_msrt_mcs(brw, mt);
+
    struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
    const uint32_t aux_mode = gen8_get_aux_mode(brw, mt, surf_type);
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e9fbeeb..ee15a2f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -788,7 +788,8 @@ intel_miptree_create(struct brw_context *brw,
    /* If this miptree is capable of supporting fast color clears, set
     * fast_clear_state appropriately to ensure that fast clears will occur.
     * Allocation of the MCS miptree will be deferred until the first fast
-    * clear actually occurs.
+    * clear actually occurs or when compressed single sampled buffer is
+    * written by the GPU for the first time.
     */
    if (intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) &&
        intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
@@ -1609,6 +1610,26 @@ intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
                                         0 /* num_samples */,
                                         layout_flags);
 
+   /* From Gen9 onwards single-sampled (non-msrt) auxiliary buffers are
+    * used for lossless compression which requires similar initialisation
+    * as multi-sample compression.
+    */
+   if (brw->gen >= 9 &&
+       intel_miptree_supports_lossless_compressed(mt->format)) {
+      /* Hardware sets the auxiliary buffer to all zeroes when it does full
+       * resolve. Initialize it accordingly in case the first renderer is
+       * cpu (or other none compression aware party).
+       *
+       * This is also explicitly stated in the spec (MCS Buffer for Render
+       * Target(s)):
+       *   "If Software wants to enable Color Compression without Fast clear,
+       *    Software needs to initialize MCS with zeros."
+       */
+      intel_miptree_init_mcs(brw, mt, 0);
+      mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
+      mt->msaa_layout = INTEL_MSAA_LAYOUT_CMS;
+   }
+
    return mt->mcs_mt;
 }
 
-- 
2.5.0
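
For readers who want the deferred-allocation idea in isolation, here is a minimal standalone sketch of the pattern the patch follows: allocate the auxiliary buffer only when the GPU is about to render, zero-fill it, and mark the surface as resolved. The types and names (`struct surface`, `surface_alloc_aux`, `surface_prepare_render`, `format_compressible`) are hypothetical stand-ins for illustration, not the real i965 structures or functions, and the tiling and fast-clear checks from the patch are folded into a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's fast clear state. */
enum clear_state { STATE_NO_AUX, STATE_RESOLVED };

/* Hypothetical stand-in for a miptree; not the real Mesa struct. */
struct surface {
   int gen;                  /* hardware generation */
   bool format_compressible; /* assumed result of the format/tiling checks */
   uint8_t *aux;             /* auxiliary buffer, NULL until first needed */
   size_t aux_size;          /* size of the auxiliary buffer in bytes */
   enum clear_state state;
};

/* Allocate the auxiliary buffer zero-filled. All zeroes is the state a
 * full resolve leaves behind, so a writer that is not compression-aware
 * (for example the CPU) still sees consistent data.
 */
static bool
surface_alloc_aux(struct surface *s)
{
   s->aux = calloc(s->aux_size, 1);
   if (s->aux == NULL)
      return false;

   s->state = STATE_RESOLVED;
   return true;
}

/* Called just before the GPU renders to the surface. Creates the
 * auxiliary buffer on first use; returns true if one is available.
 */
static bool
surface_prepare_render(struct surface *s)
{
   if (s->gen >= 9 && s->aux == NULL && s->format_compressible)
      return surface_alloc_aux(s);

   return s->aux != NULL;
}
```

A caller would invoke `surface_prepare_render()` once per render setup; the second and later calls reuse the existing buffer, and surfaces that never reach the GPU never pay for the allocation.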