On 11 July 2012 11:16, Paul Berry <stereotype...@gmail.com> wrote: > On 11 July 2012 11:01, Chad Versace <chad.vers...@linux.intel.com> wrote: > >> On 07/06/2012 03:29 PM, Paul Berry wrote: >> > This patch updates the blorp engine to properly handle the case where >> > the surface being textured from uses Gen7's CMS MSAA layout. The >> > following changes were necessary: >> > >> > - Before reading color values from the surface, we need to read from >> > the MCS buffer using the ld_mcs sampler message. This is done by >> > the mcs_fetch() function, and the result is stored in the mcs_data >> > register. This only needs to be done once per pixel, since the MCS >> > value is shared between all samples belonging to a pixel. >> > >> > - When reading color values from the surface, we need to use the >> > ld2dms sampler message instead of the ld2dss message, and we need to >> > provide the value read from the MCS buffer as an argument. >> > --- >> > src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 45 >> ++++++++++++++++++++++++- >> > 1 files changed, 43 insertions(+), 2 deletions(-) >> >> I see an easy optimization that's suitable for a small follow-up patch. >> But this >> code works and passes tests, so let's commit it as-is first. >> > > Cool. I'm curious to hear what your optimization idea is. >
Oh! The optimization you are referring to is the one you mention later in the email! I get it now. > > >> >> This patch is >> Reviewed-by: Chad Versace <chad.vers...@linux.intel.com> >> >> > @@ -1126,8 +1148,13 @@ brw_blorp_blit_program::texel_fetch(struct >> brw_reg dst) >> > break; >> > case 7: >> > if (key->tex_samples > 0) { >> > - texture_lookup(dst, GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DSS, >> > - gen7_ld2dss_args, >> ARRAY_SIZE(gen7_ld2dss_args)); >> > + if (key->tex_layout == INTEL_MSAA_LAYOUT_CMS) { >> > + texture_lookup(dst, GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DMS, >> > + gen7_ld2dms_args, >> ARRAY_SIZE(gen7_ld2dms_args)); >> > + } else { >> > + texture_lookup(dst, GEN7_SAMPLER_MESSAGE_SAMPLE_LD2DSS, >> > + gen7_ld2dss_args, >> ARRAY_SIZE(gen7_ld2dss_args)); >> > + } >> >> The hw docs suggest here "a simple optimization with probable large >> return in >> performance". If mcs_data == 0, which should be the common case for most >> fragments, then only sample array slice 0 needs to be accessed. That is, >> when >> tex_layout is CMS then we should emit code like this here: >> >> if (mcs_data == 0) { >> ld2dss(S, X, Y); >> } else { >> ld2dms(S, mcs_data, X, Y); >> } >> > > I actually believe this will be slightly faster: > > ld2dms(0, mcs_data, X, Y); > if (mcs_data != 0) { > for (S = 1; S < num_samples; ++S) > ld2dms(S, mcs_data, X, Y); > average samples together; > } > } > > Because it will avoid issuing an extra sample instruction in the case > where some pixels have a zero value for mcs_data and others have a nonzero > value for mcs_data. I've got that queued up for a patch series after this > one is reviewed :) > > >> >> ---- >> Chad Versace >> chad.vers...@linux.intel.com >> >> >> >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev