There is one thing I need a response on at the very bottom, the rest will be addressed in v2.
Thanks. On Mon, Nov 09, 2015 at 11:33:17AM -0800, Chad Versace wrote: > On Tue 03 Nov 2015, Ben Widawsky wrote: > > On Fri, Oct 16, 2015 at 04:05:22PM -0700, Chad Versace wrote: > > > On Tue 13 Oct 2015, Ben Widawsky wrote: > > > > Initially I had this planned as a patch to be squashed in to the > > > > enabling patch > > > > because there is no point enabling fast clears without this. However, > > > > Chad > > > > merged a patch which disables fast clears on gen9 explicitly, and so I > > > > can hide > > > > this behind the revert of that patch. This is a nice I really wanted > > > > this patch > > > > as a distinct patch for review. This is a new, weird, and poorly > > > > documented > > > > restriction for SKL. (In fact, I am still not 100% certain the > > > > restriction is > > > > entirely necessary, but there are around 30 piglit regressions without > > > > this). > > > > > > > > SKL adds compressible render targets and as a result mutates some of the > > > > programming for fast clears and resolves. There is a new internal > > > > surface type > > > > called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "The Auxiliary > > > > surface is > > > > a CCS (Color Control Surface) with compression disabled or an MCS with > > > > compression enabled, depending on number of multisamples. MCS > > > > (Multisample > > > > Control Surface) is a special type of CCS." > > > > > > > > Signed-off-by: Ben Widawsky <b...@bwidawsk.net> > > > > --- > > > > src/mesa/drivers/dri/i965/brw_context.h | 1 + > > > > src/mesa/drivers/dri/i965/brw_surface_formats.c | 27 > > > > +++++++++++++++++++++++++ > > > > src/mesa/drivers/dri/i965/gen8_surface_state.c | 8 ++++++-- > > > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 3 +++ > > > > 4 files changed, 37 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > > > > b/src/mesa/drivers/dri/i965/brw_context.h > > > > index e59478a..32b8250 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_context.h > > > > +++ b/src/mesa/drivers/dri/i965/brw_context.h > > > > @@ -1546,6 +1546,7 @@ struct brw_context > > > > > > > > uint32_t render_target_format[MESA_FORMAT_COUNT]; > > > > bool format_supported_as_render_target[MESA_FORMAT_COUNT]; > > > > + bool losslessly_compressable[MESA_FORMAT_COUNT]; > > > > > > I agree with Neil. It's a shame to increase the context size for static > > > information. And there is already a static array in > > > brw_surface_formats.c for exactly this type of information. > > > > As I said in the mail with Neil, the formats aren't static and change per > > GEN. I > > could do the base formats which are shared by all Gens, but I think you > > still > > end up needing something like this. Or did I misunderstand what is being > > asked > > of me? > > One of us misunderstood each other, but I don't know who. > > The table in brw_surface_formats.c does contain per-gen information. The > table lists, for each surface format and surface format capability, the > first gen that supports the capability. > > So, you can avoid fattening brw_context by adding a new column in the > brw_surface_formats table for losslessly_compressable, then setting each > item in the column to either X (no gen supports the capability) or 90 > (Skylake). > > Is there some reason why this wouldn't work? I misunderstood something Neil said. You're (and Neil) totally right here. > > > > > > > > > > > /* Interpolation modes, one byte per vue slot. > > > > * Used Gen4/5 by the clip|sf|wm stages. Ignored on Gen6+. > > > > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c > > > > b/src/mesa/drivers/dri/i965/brw_surface_formats.c > > > > index 97fff60..d706ecc 100644 > > > > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c > > > > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c > > > > @@ -693,6 +693,33 @@ brw_init_surface_formats(struct brw_context *brw) > > > > } > > > > } > > > > > > > > + if (brw->gen >= 9) { > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_FLOAT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_SINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_UINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_UNORM16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_SNORM16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_SINT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_UINT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_FLOAT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_FLOAT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_SINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_UINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBX_FLOAT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_B8G8R8A8_UNORM] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_R8G8B8A8_UNORM] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_R8G8B8A8_SNORM] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_SINT8] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RGBA_UINT8] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_SINT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_UINT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_RG_FLOAT16] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_R_UINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_R_SINT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_R_FLOAT32] = true; > > > > + brw->losslessly_compressable[MESA_FORMAT_B8G8R8X8_UNORM] = true; > > > > + } > > > > > > Properties of surface formats should go into the monster table that > > > occurs earlier in the file. Then you can replace > > > brw_context::losslessly_compressable with a query and keep brw_context > > > at its current size. > > G> > Same comment from above. > > And same again from me. > > > > > > > > > > + > > > > /* We will check this table for FBO completeness, but the surface > > > > format > > > > * table above only covered color rendering. > > > > */ > > > > diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c > > > > b/src/mesa/drivers/dri/i965/gen8_surface_state.c > > > > index 995b4dd..b19b492 100644 > > > > --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c > > > > +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c > > > > @@ -243,8 +243,10 @@ gen8_emit_texture_surface_state(struct brw_context > > > > *brw, > > > > * "When Auxiliary Surface Mode is set to AUX_CCS_D or > > > > AUX_CCS_E, HALIGN > > > > * 16 must be used." > > > > */ > > > > - if (brw->gen >= 9 || mt->num_samples == 1) > > > > + if (brw->gen >= 9 || mt->num_samples == 1) { > > > > assert(mt->halign == 16); > > > > + assert(mt->num_samples || > > > > brw->losslessly_compressable[mt->format] == true); > ^^^^^^^^^^^^^^^ > > Did you mean to test for `mt->num_samples > 1`? As written, this > assertion looks incorrect for `mt->num_samples == 1`. This comment also > applies to the assertion in the next hunk. > Yes. Interestingly, it was wrong even before I rebased on your patch - so I'm confused why nothing fails. It will be fixed on resend. > > > > + } > > > > } > > > > > > > > const uint32_t surf_type = translate_tex_target(target); > > > > @@ -488,8 +490,10 @@ gen8_update_renderbuffer_surface(struct > > > > brw_context *brw, > > > > * "When Auxiliary Surface Mode is set to AUX_CCS_D or > > > > AUX_CCS_E, HALIGN > > > > * 16 must be used." > > > > */ > > > > - if (brw->gen >= 9 || mt->num_samples == 1) > > > > + if (brw->gen >= 9 || mt->num_samples == 1) { > > > > assert(mt->halign == 16); > > > > + assert(mt->num_samples || > > > > brw->losslessly_compressable[mt->format] == true); > > > > > > This assert is wrong. It's perfectly safe to render into a format that > > > is not losslessly compressable, even when it has a MCS. > > > > > > > Could you explain how this can happen? I can't really fathom how it can. I > > am > > far from an expert, but I'm also surprised if that's true I never hit the > > assert. If/when we support other AUX types this is certainly an invalid > > assertion. > > It's not a matter of how "it happens". It's a matter of "does the > hardware allow this". The lossless compressability of a surface format > does not alter the hardware's ability to render into the surface in the > presence of a single-sample CCS_D. The lossless compressability of > a surface format affects the hardware's ability to *sample* from the > surface in the presence of a single-sample CCS_D. > > That's why in the first hunk, in gen8_emit_texture_surface_state, the > assertion's intent (but not the details) is correct. But the assertion's > intent in the second hunk, in gen8_update_renderbuffer_surface, is > incorrect. It's too restrictive and doesn't reflect what the hardware > requires. > asserts in driver code is exactly about how "it happens" and not about what the hardware supports. I don't care enough to argue further, I will remove the assert. > > > > > > + } > > > > } > > > > > > > > uint32_t *surf = allocate_surface_state(brw, &offset, surf_index); > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > > index 32f0ff7..f108b75 100644 > > > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > > @@ -265,6 +265,9 @@ intel_miptree_supports_non_msrt_fast_clear(struct > > > > brw_context *brw, > > > > if (!brw->format_supported_as_render_target[mt->format]) > > > > return false; > > > > > > > > + if (brw->gen >= 9 && !brw->losslessly_compressable[mt->format]) > > > > + return false; > > > > + > > > > > > This hunk needs a comment that explains why you're disabling 1x fast > > > clears for these formats. > > > > > > > If you insist on adding a comment, I'd rather add it where the formats are > > defined as opposed to here. The variable name, or lookup function would > > pretty > > clearly tell what it's doing. I honestly think only a spec reference is > > lacking > > though. > > Do you mean the variable name "losslessly_compressable" is sufficient to > explain what's happening here? I think the name itself is sufficient, yes. Perhaps you know something more intricate than I do though. I'm not trying to be coy, just tell me what you want as a comment, and I will add it. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev