On Wed, Oct 05, 2016 at 04:30:51PM -0700, Nanley Chery wrote:
> On Tue, Sep 27, 2016 at 02:47:22PM -0700, Nanley Chery wrote:
> > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> > > On Mon 26 Sep 2016, Nanley Chery wrote:
> > > > From: Nanley Chery <nanleych...@gmail.com>
> > > > 
> > > > > Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> > > > > demos.
> > > > 
> > > > > Clears that happen within a render pass via vkCmdClearAttachments are safe
> > > > > even if the clear color changes. This is because the meta implementation does
> > > > > not use LOAD_OP_CLEAR which avoids any conflicts with 3DSTATE_CLEAR_PARAMS.
> > > > 
> > > > Signed-off-by: Nanley Chery <nanley.g.ch...@intel.com>
> > > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
> > > > 
> > > > ---
> > > > 
> > > > v2. Update granularity comment for accuracy
> > > > 
> > > >  src/intel/vulkan/anv_pass.c        | 13 +++++++++++++
> > > >  src/intel/vulkan/gen8_cmd_buffer.c |  6 ++++++
> > > >  src/intel/vulkan/genX_cmd_buffer.c |  4 +---
> > > >  3 files changed, 20 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> > > > index 69c3c7e..595c2ea 100644
> > > > --- a/src/intel/vulkan/anv_pass.c
> > > > +++ b/src/intel/vulkan/anv_pass.c
> > > > @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
> > > >      VkRenderPass                                renderPass,
> > > >      VkExtent2D*                                 pGranularity)
> > > >  {
> > > > +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> > > > +
> > > > +   /* This granularity satisfies HiZ fast clear alignment requirements
> > > > +    * for all sample counts.
> > > > +    */
> > > > +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> > > > +      if (pass->subpasses[i].depth_stencil_attachment !=
> > > > +          VK_ATTACHMENT_UNUSED) {
> > > > +         *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> > > > +         return;
> > > > +      }
> > > > +   }
> > > > +
> > > >     *pGranularity = (VkExtent2D) { 1, 1 };
> > > >  }
> > > > > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c b/src/intel/vulkan/gen8_cmd_buffer.c
> > > > index 14e6a7b..96e972c 100644
> > > > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > > > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > > > @@ -479,6 +479,12 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer 
> > > > *cmd_buffer,
> > > >               cmd_state->render_area.extent.height % px_dim.h)
> > > >              return;
> > > >        }
> > > > +
> > > > +      anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), 
> > > > cp) {
> > > > +         cp.DepthClearValueValid = true;
> > > > +         cp.DepthClearValue =
> > > > +            
> > > > cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> > > > +      }
> > > >        break;
> > > >     case BLORP_HIZ_OP_DEPTH_RESOLVE:
> > > >        if (cmd_buffer->state.pass->attachments[ds].store_op !=
> > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
> > > > index 2cb1539..290fefc 100644
> > > > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > > > @@ -1320,9 +1320,6 @@ cmd_buffer_emit_depth_stencil(struct 
> > > > anv_cmd_buffer *cmd_buffer)
> > > >     } else {
> > > >        anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), 
> > > > sb);
> > > >     }
> > > > -
> > > > -   /* Clear the clear params. */
> > > > -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> > > 
> > > > We may need to preserve emission of 3DSTATE_CLEAR_PARAMS here. Two reasons:
> > > > 
> > > >     Reason 1. If hiz is enabled in the 3DSTATE_DEPTH_BUFFER, and the hiz
> > > >        surface has some bits in the clear state, and
> > > >        3DSTATE_CLEAR_PARAMS.DepthClearValueValid is 0,
> > > >        and we emit a draw call, what does the hardware do when it
> > > >        accesses a cleared pixel? I don't want to find out.
> > > 
> > 
> > Good point.
> > 
> 
> I thought about this some more and came to the conclusion that this
> shouldn't be a problem. In this V2, one of the two initialization
> requirements of the clear value (quoted below) is performed whenever
> a pixel is cleared in the depth buffer.
> 
> > >     Reason 2. The PRM says we have to (though, to be honest, I don't 
> > > trust the PRM's logic).
> > > 
> > >         From the Skylake PRM >> Vol7: 3D-Media-GPGUP >> Section: 
> > > Hierarchical Depth Buffer:
> > >         | 
> > >         |  If HiZ is enabled, you must initialize the clear value by 
> > > either:
> > >         | 
> > >         |     1. Perform a depth clear pass to initialize the clear value.
> > >         |     2. Send a 3dstate_clear_params packet with valid = 1.
> > >         | 
> > >         |  Without one of these events, context switching will fail, as 
> > > it will try
> > >         |  to save off a clear value even though no valid clear value has 
> > > been set.
> > >         |  When context restore happens, HW will restore an uninitialized 
> > > clear value.
> > >     
> > >         Even though the hardware docs claim we need 3DSTATE_CLEAR_PARAMS 
> > > when hiz is
> > >         enabled, the docs are vague about the consequences. Does context 
> > > switching
> > >         really fail, as claimed by #1? Or does context switching actually 
> > > succeed, but
> > >         context restore gives us an invalid clear value (which doesn't 
> > > hurt us), as
> > >         claimed by #2? Oh hw docs... :/
> > > 
> > 
> > I didn't trust the logic either, but I agree. It's good to keep the
> > diff as small as reasonably possible.
> > 
> 
> As Jason mentioned in another email, I think the PRM is saying that the
> saved value isn't trustworthy, so saving off the current context will
> fail in a sense. #2 discusses context restoration, implying that a
> context save did in fact occur and another context was switched to.
> 
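The PRM condition quoted above can be modeled as a simple predicate. This is only a minimal sketch of the rule as worded in the PRM text; the struct and function names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the PRM text talks about. */
struct hiz_state_model {
   bool hiz_enabled;
   bool depth_clear_pass_done;   /* option 1 in the PRM text */
   bool clear_params_valid_sent; /* option 2: 3DSTATE_CLEAR_PARAMS, valid = 1 */
};

/* True when the clear value counts as initialized per the quoted rule. */
static bool
model_clear_value_initialized(const struct hiz_state_model *s)
{
   assert(s != NULL);

   /* The rule only applies when HiZ is enabled. */
   if (!s->hiz_enabled)
      return true;

   return s->depth_clear_pass_done || s->clear_params_valid_sent;
}
```

Under this reading, either event on its own is enough, which matches the argument that a depth clear already satisfies the requirement.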
> > > > As a consequence of that reasoning, we should set
> > > > 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 whenever hiz is enabled, even if
> > > > we don't care about the actual clear value.
> 
> Testing shows that we cannot emit clear_params packets back-to-back,
> so unconditionally emitting an arbitrary value here will override the
> actual clear value we want to set later. Trying to emit the actual clear
> value here causes segfaults that seem to stem from an interaction with
> secondary command buffers. While I could try to add conditions to work
> around this, my current understanding is that changing this code is not
> necessary.
> 

Jason and I talked about this off-list and came to the conclusion that
we can likely stop secondary subpasses from emitting depth stencil
state. That would enable me to emit the actual clear value in
cmd_buffer_emit_depth_stencil(). Given my current understanding of
things, my comment about 3DSTATE_CLEAR_PARAMS will simply touch on the
IVB requirement that it be emitted with the other depth stencil packets
and the odd behavior I noticed on BDW+.
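
A rough sketch of that idea follows, as a standalone model and not the actual patch. All types and names here are hypothetical stand-ins for the real anv structures, and it assumes secondary command buffers skip depth stencil state entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real anv types. */
enum cmd_buffer_level_model { LEVEL_PRIMARY, LEVEL_SECONDARY };

struct clear_params_model {
   bool  depth_clear_value_valid;
   float depth_clear_value;
};

struct cmd_buffer_model {
   enum cmd_buffer_level_model level;
   bool  has_depth_attachment;
   float depth_clear_value; /* value supplied for the depth attachment */
};

/* Models the depth/stencil emission point. Returns true and fills *out when
 * a clear-params packet would be emitted, false when emission is skipped.
 */
static bool
model_emit_depth_stencil(const struct cmd_buffer_model *cb,
                         struct clear_params_model *out)
{
   assert(cb != NULL && out != NULL);

   /* Assumption: secondary command buffers no longer emit this state. */
   if (cb->level == LEVEL_SECONDARY)
      return false;

   /* Emit the actual clear value instead of an arbitrary one. */
   out->depth_clear_value_valid = cb->has_depth_attachment;
   out->depth_clear_value =
      cb->has_depth_attachment ? cb->depth_clear_value : 0.0f;
   return true;
}
```

In this model only the primary command buffer ever writes the clear value, so there is no second emission that could override it.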

> > 
> > In the V3, I plan to emit that packet once at device initialization time
> > on HSW+, and to always emit it (in the expected location) for IVB/BYT. Only
> > the latter platforms have the restriction that it must always be
> > programmed with the other depth/stencil commands.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
