On Mon 26 Sep 2016, Nanley Chery wrote: > On Mon, Sep 19, 2016 at 01:49:09PM -0700, Nanley Chery wrote: > > On Fri, Sep 02, 2016 at 03:16:21PM -0700, Chad Versace wrote: > > > On Wed 31 Aug 2016, Nanley Chery wrote: > > > > From: Chad Versace <chad.vers...@intel.com> > > > > > > > > Nanley Chery: > > > > (rebase) > > > > - Resolve conflicts with new anv_batch_emit macro > > > > (amend) > > > > - Remove wip! tag and handle a QPitch TODO > > > > - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems > > > > - Only use HiZ for single-subpass renderpasses > > > > - Emit the HiZ instruction before the stencil instruction to follow the > > > > optimized clear sequence specified in the PRMs > > > > - Don't modify clear params > > > > - Enable resolves when a HiZ buffer is used to ensure depth buffer > > > > validity > > > > > > > > Provides an FPS increase of ~15% on the Sascha triangle and > > > > multisampling > > > > demos. > > > > > > > > Signed-off-by: Nanley Chery <nanley.g.ch...@intel.com> > > > > --- > > > > src/intel/vulkan/gen8_cmd_buffer.c | 4 ++++ > > > > src/intel/vulkan/genX_cmd_buffer.c | 41 > > > > ++++++++++++++++++++++++++++++++++---- > > > > 2 files changed, 41 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c > > > > b/src/intel/vulkan/gen8_cmd_buffer.c > > > > index 4f27350..7f65fe2 100644 > > > > --- a/src/intel/vulkan/gen8_cmd_buffer.c > > > > +++ b/src/intel/vulkan/gen8_cmd_buffer.c > > > > @@ -414,6 +414,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer > > > > *cmd_buffer, enum anv_hz_op op) > > > > if (iview == NULL || !anv_image_has_hiz(iview->image)) > > > > return; > > > > > > > > + /* FIXME: Implement multi-subpass HiZ */ > > > > + if (cmd_buffer->state.pass->subpass_count > 1) > > > > + return; > > > > + > > > > const uint32_t ds = cmd_state->subpass->depth_stencil_attachment; > > > > const bool full_surface_op = > > > > cmd_state->render_area.extent.width == > > > > iview->extent.width && > > > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > > > > b/src/intel/vulkan/genX_cmd_buffer.c > > > > index 95ed5f2..349d2a4 100644 > > > > --- a/src/intel/vulkan/genX_cmd_buffer.c > > > > +++ b/src/intel/vulkan/genX_cmd_buffer.c > > > > @@ -1040,6 +1040,7 @@ cmd_buffer_emit_depth_stencil(struct > > > > anv_cmd_buffer *cmd_buffer) > > > > anv_cmd_buffer_get_depth_stencil_view(cmd_buffer); > > > > const struct anv_image *image = iview ? iview->image : NULL; > > > > const bool has_depth = image && (image->aspects & > > > > VK_IMAGE_ASPECT_DEPTH_BIT); > > > > + const bool has_hiz = image != NULL && anv_image_has_hiz(image); > > > > const bool has_stencil = > > > > image && (image->aspects & VK_IMAGE_ASPECT_STENCIL_BIT); > > > > > > > > > > > @@ -1052,7 +1053,12 @@ cmd_buffer_emit_depth_stencil(struct > > > > anv_cmd_buffer *cmd_buffer) > > > > db.SurfaceType = SURFTYPE_2D; > > > > db.DepthWriteEnable = true; > > > > db.StencilWriteEnable = has_stencil; > > > > - db.HierarchicalDepthBufferEnable = false; > > > > + > > > > + if (cmd_buffer->state.pass->subpass_count == 1) { > > > > + db.HierarchicalDepthBufferEnable = has_hiz; > > > > + } else { > > > > + anv_finishme("Multiple-subpass HiZ not implemented"); > > > > + } > > > > > > > > db.SurfaceFormat = isl_surf_get_depth_format(&device->isl_dev, > > > > > > > > &image->depth_surface.isl); > > > > @@ -1104,6 +1110,34 @@ cmd_buffer_emit_depth_stencil(struct > > > > anv_cmd_buffer *cmd_buffer) > > > > } > > > > } > > > > > > > > + if (has_hiz) { > > > > > > Note: This codepath is hit sometimes when > > > 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false. > > > Specifically, when subpass_count > 1. It's weird, but I doubt it causes > > > any harm. After all, all the surface data programmed by > > > 3DSTATE_HIER_BUFFER is valid here regardless of the value of > > > HierarchicalDepthBufferEnable. > > > > > > > + anv_batch_emit(&cmd_buffer->batch, > > > > GENX(3DSTATE_HIER_DEPTH_BUFFER), hdb) { > > > > + hdb.HierarchicalDepthBufferObjectControlState = GENX(MOCS); > > > > + hdb.SurfacePitch = image->hiz_surface.isl.row_pitch - 1; > > > > + hdb.SurfaceBaseAddress = (struct anv_address) { > > > > + .bo = image->bo, > > > > + .offset = image->offset + image->hiz_surface.offset, > > > > + }; > > > > +#if GEN_GEN >= 8 > > > > + /* From the SKL PRM Vol2a: > > > > + * > > > > + * The interpretation of this field is dependent on > > > > Surface Type > > > > + * as follows: > > > > + * - SURFTYPE_1D: distance in pixels between array slices > > > > + * - SURFTYPE_2D/CUBE: distance in rows between array > > > > slices > > > > + * - SURFTYPE_3D: distance in rows between R - slices > > > > + * > > > > + * ISL implements HiZ surfaces for 1D depth buffers as 2D. > > > > Therefore > > > > + * the depth buffer needs to be checked for the dimension. > > > > + */ > > > > + hdb.SurfaceQPitch = > > > > + image->depth_surface.isl.dim == ISL_SURF_DIM_1D ? > > > > + isl_surf_get_array_pitch_el(&image->hiz_surface.isl) >> > > > > 2 : > > > > + > > > > isl_surf_get_array_pitch_el_rows(&image->hiz_surface.isl) >> 2; > > > > +#endif > > > > + } > > > > + } > > > > > > Here I expected to see: > > > > > > } else { > > > /* Disable hierarchial depth buffers. */ > > > anv_batch_emit(&cmd_buffer->batch, > > > GENX(3DSTATE_HIER_DEPTH_BUFFER), hz); > > > } > > > > > > I'm not insisting that you *should* do that; just that I expected it. > > > > > > Why? If the current depth surface has no hiz surface, then any > > > previously emitted 3DSTATE_HIER_DEPTH_BUFFER remains active. That > > > instruction persists across render passes. If the HiZ surface referenced > > > by the stale 3DSTATE_HIER_DEPTH_BUFFER no longer exists, what happens? > > > Does the GPU attempt to prefetch that surface before observing > > > 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, therefore > > > performing an illegal memory access? Does the GPU first observe that > > > 3DSTATE_DEPTH_BUFFER.HierarchicalDepthBufferEnable is false, and so > > > ignore the active (but stale) 3DSTATE_HIER_DEPTH_BUFFER instruction, > > > doesn't fetch the stale surface memory, and all is good? Or something > > > completely different that we haven't thought of? > > > > > > > Is there any documentation that mentions prefetching surfaces? I was > > only able to find some on state and instruction prefetching. > > > > I'd expect that the GPU would avoid accessing a region of memory that > > it knows to be undefined. There are some programming notes that agree > > with this expectation. For example, the depth resolve and HiZ > > ambiguation state fields in the BDW+ PRMs say: > > > > If Hierarchical Depth Buffer Enable is disabled, enabling this field > > will have no effect. > > > > Nonetheless, the overhead of the extra packet should be negligible so > > I can emit the empty packet if you'd like. > > > > Should I send out the HiZ v2? I'm currently waiting on a reply for this, > but I wouldn't mind just emitting the zero'ed HIER packet.
From #intel-gfx: 15:59:46 nchery | chadv: I think that was the last patch awaiting a reply. 16:00:51 chadv | nchery: i'm looking at it again now. 16:01:32 jekstrand | I think it's probably good form to emit an empty HIER_DEPTH_BUFFER if we're not using HiZ but I don't know that it matters. 16:02:05 Kayden | there are platforms where you have to emit DEPTH_BUFFER, HIER_DEPTH_BUFFER, STENCIL_BUFFER, and CLEAR_PARAMS all together, all at once 16:02:20 Kayden | I definitely think you should emit one to turn it off if it isn't in use. 16:02:28 Kayden | seems much safer 16:02:49 chadv | It may be safe to skip emitting HIER_DEPTH_BUFFER, but if it's not... then that's one doozy of a heisenbug. 16:02:52 * | Kayden hasn't seen any patches 16:02:57 Kayden | indeed 16:03:01 chadv | Let's just play it safe and zero it. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev