On Fri, Oct 23, 2015 at 7:06 PM, Bas Nieuwenhuizen
<b...@basnieuwenhuizen.nl> wrote:
> On Fri, Oct 23, 2015 at 4:57 PM, Marek Olšák <mar...@gmail.com> wrote:
>> On Fri, Oct 23, 2015 at 1:57 PM, Bas Nieuwenhuizen
>> <b...@basnieuwenhuizen.nl> wrote:
>>> On Fri, Oct 23, 2015 at 1:52 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>> On Fri, Oct 23, 2015 at 1:30 PM, Bas Nieuwenhuizen
>>>> <b...@basnieuwenhuizen.nl> wrote:
>>>>> On Fri, Oct 23, 2015 at 12:50 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>>>> On Fri, Oct 23, 2015 at 12:17 PM, Bas Nieuwenhuizen
>>>>>> <b...@basnieuwenhuizen.nl> wrote:
>>>>>>> On Thu, Oct 22, 2015 at 12:12 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>>>>>>> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> index 5548cba3..a277fa5 100644
>>>>>>>>> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
>>>>>>>>> @@ -234,7 +234,7 @@ static void si_set_sampler_views(struct pipe_context *ctx,
>>>>>>>>>                 } else {
>>>>>>>>>                         samplers->depth_texture_mask &= ~(1 << slot);
>>>>>>>>>                 }
>>>>>>>>> -               if (rtex->cmask.size || rtex->fmask.size) {
>>>>>>>>> +               if (rtex->cmask.size || rtex->fmask.size || rtex->surface.dcc_enabled) {
>>>>>>>>>                         samplers->compressed_colortex_mask |= 1 << slot;
>>>>>>>>
>>>>>>>> I'd like this flag to be set only when dirty_level_mask is non-zero.
>>>>>>>> Setting this for all textures that have DCC is quite expensive in draw
>>>>>>>> calls.
>>>>>>>
>>>>>>> I think this code is incorrect even without considering DCC. If we do
>>>>>>> a fast clear on a surface which allocates a cmask and then use that
>>>>>>> surface as a texture without calling set_sampler_views in between
>>>>>>> (because it was bound before) we get a stale compressed_colortex_mask.
>>>>>>>
>>>>>>> Some testing shows that this can be triggered using OpenGL, although
>>>>>>> the GL_ARB_texture_barrier extension may be needed to make the result
>>>>>>> not undefined per the specification.
>>>>>>
>>>>>> In that case, we should decompress in texture_barrier and not in draw
>>>>>> calls.
>>>>>>
>>>>>> Marek
>>>>>
>>>>>
>>>>> texture_barrier does not need to be called though, the language
>>>>> changes might be needed.
>>>>>
>>>>> Basically the test is
>>>>>
>>>>> fbo1, fbo2 framebuffers with 1 color buffer each:
>>>>>
>>>>> bind fbo2 as texture
>>>>> clear fbo1 using shader
>>>>> bind fbo1 as texture
>>>>> clear fbo2 using shader
>>>>> clear fbo1 using clear (which results in cmask being allocated for fbo1)
>>>
>>>>> bind fbo2 as texture
>>>>> copy fbo2 to fbo1 using copy shader (which wrongly does not decompress
>>>>> fbo1)
>>>
>>> My apologies, these two lines should just be a copy fbo1 to fbo2,
>>> which does need to eliminate the cmask fast clear.
>>
>> That sounds like a texture barrier is required.
>>
>> Marek
>
> I think it is valid even without ARB_texture_barrier, as the only place
> where we could have a rendering feedback loop is the clear. The shader
> clears and the copy do not have the same fbo as texture and therefore
> no render feedback loop.
>
> I am not sure if a clear classifies as a GL rendering operation. If it
> is not, we have no render feedback loop. If it is, it is still not a
> render feedback loop as the active fragment and vertex shaders (the
> clear shader) do not contain instructions that sample from that
> texture.

The texture barrier ensures that the previous writes are visible to
the next read of the texture. The previous reads are irrelevant.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
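
For readers following the stale-mask concern raised earlier in this thread, the scenario can be modelled in a few lines of C. This is only a toy sketch, not the radeonsi code: every name below (`toy_texture`, `toy_samplers`, `toy_set_sampler_view`, `toy_fast_clear`, `toy_recompute_mask`) is hypothetical, and the model assumes that the compressed-texture bit is derived only when a view is bound, as in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_SLOTS 16

/* Hypothetical stand-in for a texture: cmask_size becomes non-zero
 * once a fast clear has allocated a CMASK for it. */
struct toy_texture {
    unsigned cmask_size;
};

/* Hypothetical stand-in for the per-stage sampler state. */
struct toy_samplers {
    const struct toy_texture *views[TOY_NUM_SLOTS];
    uint32_t compressed_colortex_mask;
};

/* Bind a texture to a slot. The mask bit is computed here and only
 * here, mirroring the bind-time check in the quoted diff. */
static void toy_set_sampler_view(struct toy_samplers *s, unsigned slot,
                                 const struct toy_texture *tex)
{
    assert(slot < TOY_NUM_SLOTS);
    s->views[slot] = tex;
    if (tex && tex->cmask_size)
        s->compressed_colortex_mask |= 1u << slot;
    else
        s->compressed_colortex_mask &= ~(1u << slot);
}

/* A fast clear allocates a CMASK if the texture has none yet.
 * The size of 64 is an arbitrary placeholder value. */
static void toy_fast_clear(struct toy_texture *tex)
{
    if (!tex->cmask_size)
        tex->cmask_size = 64;
}

/* Recompute the mask from the current texture state. Comparing this
 * against the cached mask reveals whether the cached value is stale. */
static uint32_t toy_recompute_mask(const struct toy_samplers *s)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < TOY_NUM_SLOTS; i++) {
        if (s->views[i] && s->views[i]->cmask_size)
            mask |= 1u << i;
    }
    return mask;
}
```

In this model, binding a texture, calling `toy_fast_clear` on it, and then reading `compressed_colortex_mask` without rebinding leaves the cached bit clear while `toy_recompute_mask` reports it set. That mismatch is the staleness described in the thread. Where the real driver should refresh the bit (at draw time, at fast clear, or in texture_barrier) is the open question of the discussion, and the sketch does not settle it.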