On Wed, Oct 05, 2016 at 04:30:51PM -0700, Nanley Chery wrote: > On Tue, Sep 27, 2016 at 02:47:22PM -0700, Nanley Chery wrote: > > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote: > > > On Mon 26 Sep 2016, Nanley Chery wrote: > > > > From: Nanley Chery <nanleych...@gmail.com> > > > > > > > > Provides an FPS increase of ~30% on the Sascha triangle and > > > > multisampling > > > > demos. > > > > > > > > Clears that happen within a render pass via vkCmdClearAttachments are > > > > safe > > > > even if the clear color changes. This is because the meta > > > > implementation does > > > > not use LOAD_OP_CLEAR which avoids any conflicts with > > > > 3DSTATE_CLEAR_PARAMS. > > > > > > > > Signed-off-by: Nanley Chery <nanley.g.ch...@intel.com> > > > > Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> > > > > > > > > --- > > > > > > > > v2. Update granularity comment for accuracy > > > > > > > > src/intel/vulkan/anv_pass.c | 13 +++++++++++++ > > > > src/intel/vulkan/gen8_cmd_buffer.c | 6 ++++++ > > > > src/intel/vulkan/genX_cmd_buffer.c | 4 +--- > > > > 3 files changed, 20 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c > > > > index 69c3c7e..595c2ea 100644 > > > > --- a/src/intel/vulkan/anv_pass.c > > > > +++ b/src/intel/vulkan/anv_pass.c > > > > @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity( > > > > VkRenderPass renderPass, > > > > VkExtent2D* pGranularity) > > > > { > > > > + ANV_FROM_HANDLE(anv_render_pass, pass, renderPass); > > > > + > > > > + /* This granularity satisfies HiZ fast clear alignment requirements > > > > + * for all sample counts. > > > > + */ > > > > + for (unsigned i = 0; i < pass->subpass_count; ++i) { > > > > + if (pass->subpasses[i].depth_stencil_attachment != > > > > + VK_ATTACHMENT_UNUSED) { > > > > + *pGranularity = (VkExtent2D) { .width = 8, .height = 4 }; > > > > + return; > > > > + } > > > > + } > > > > + > > > > *pGranularity = (VkExtent2D) { 1, 1 }; > > > > } > > > > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c > > > > b/src/intel/vulkan/gen8_cmd_buffer.c > > > > index 14e6a7b..96e972c 100644 > > > > --- a/src/intel/vulkan/gen8_cmd_buffer.c > > > > +++ b/src/intel/vulkan/gen8_cmd_buffer.c > > > > @@ -479,6 +479,12 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer > > > > *cmd_buffer, > > > > cmd_state->render_area.extent.height % px_dim.h) > > > > return; > > > > } > > > > + > > > > + anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), > > > > cp) { > > > > + cp.DepthClearValueValid = true; > > > > + cp.DepthClearValue = > > > > + > > > > cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth; > > > > + } > > > > break; > > > > case BLORP_HIZ_OP_DEPTH_RESOLVE: > > > > if (cmd_buffer->state.pass->attachments[ds].store_op != > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > > > > b/src/intel/vulkan/genX_cmd_buffer.c > > > > index 2cb1539..290fefc 100644 > > > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > > > @@ -1320,9 +1320,6 @@ cmd_buffer_emit_depth_stencil(struct > > > > anv_cmd_buffer *cmd_buffer) > > > > } else { > > > > anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), > > > > sb); > > > > } > > > > - > > > > - /* Clear the clear params. */ > > > > - anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp); > > > > > > We may need to preserve emission of 3DSTATE_CLEAR_PARAMS here. Two > > > reasons: > > > > > > Reason 1. If hiz is enabled in the 3DSTATE_DEPTH_BUFFER, and the hiz > > > surface has some bits in the clear state, and > > > 3DSTATE_CLEAR_PARAMS.DepthClearValueValid is 0, > > > and we emit a draw call, what does the hardware do when it > > > accesses a cleard pixel? I don't want to find out. > > > > > > > Good point. > > > > I thought about this some more and came to the conclusion that this > shouldn't be a problem. In this V2, one of the two initialization > requirements of the clear value (quoted below) is performed whenever > a pixel is cleared in the depth buffer. > > > > Reason 2. The PRM says we have to (though, to be honest, I don't > > > trust the PRM's logic). > > > > > > From the Skylake PRM >> Vol7: 3D-Media-GPGUP >> Section: > > > Hierarchical Depth Buffer: > > > | > > > | If HiZ is enabled, you must initialize the clear value by > > > either: > > > | > > > | 1. Perform a depth clear pass to initialize the clear value. > > > | 2. Send a 3dstate_clear_params packet with valid = 1. > > > | > > > | Without one of these events, context switching will fail, as > > > it will try > > > | to save off a clear value even though no valid clear value has > > > been set. > > > | When context restore happens, HW will restore an uninitialized > > > clear value. > > > > > > Even though the hardware docs claim we need 3DSTATE_CLEAR_PARAMS > > > when hiz is > > > enabled, the docs are vague about the consequences. Does context > > > switching > > > really fail, as claimed by #1? Or does context switching actually > > > succeed, but > > > context restore gives us an invalid clear value (which doesn't > > > hurt us), as > > > claimed by #2? Oh hw docs... :/ > > > > > > > I didn't trust the logic as well, but I agree. It's good to keep the > > diff as small as reasonably possible. > > > > Like Jason mentioned in another email I think the PRM is saying that the > saved value isn't trustworthy and so saving off the current context will > fail in a sense. #2 discusses context restoration, implying that a > context save did in fact occur and another context was switched to. > > > > As a consequence of that reasoning, we should set > > > 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 > > > whenever hiz is enabled, even if we don't care about the actual clear > > > value. > > Testing shows that we cannot emit clear_params packets back-to-back, > so unconditionally emitting an arbitrary value here will override the > actual clear value we want to set later. Trying to emit the actual clear > value here causes segfaults that seem to stem from an interaction with > secondary command buffers. While I could try to add conditions to work > around this, my current understanding that changing this code is not > necessary. >
Jason and I talked about this off-list and came to the conclusion that we can likely stop secondary subpasses from emitting depth stencil state. That would enable me to emit the actual clear color in cmd_buffer_emit_depth_stencil(). Given my current understanding of things, my comment about 3DSTATE_CLEAR_PARAMS will simply touch on the IVB requirement of it needing to be with the other depth stencil packets and the odd behavior I noticed on BDW+. > > > > In the V3, I plan to emit that packet once at device initialization time > > HSW+, and to always emit it (in the expected location) for IVB/BYT. Only > > the latter platforms have the restriction that it must always be > > programmed with the other depth/stencil commands. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev