On Fri, Aug 01, 2014 at 12:53:43AM -0700, Jordan Justen wrote: > (bf25ee2 for gen6) > > Previously we would always find the 2D sub-surface of interest, > and then program the surface to this location. Now we always > program the 3DSTATE_DEPTH_BUFFER at the start of the surface. > To select the lod/slice, we utilize the lod & minimum array > element fields. > > We also must disable brw_workaround_depthstencil_alignment for > gen >= 6. Now the hardware will handle alignment when rendering > to additional slices/LODs. > > v3: > * Set depth_mt bo RELOC offset to 0, as was done in bf25ee2 > > Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56127 > --- > src/mesa/drivers/dri/i965/brw_misc_state.c | 4 +- > src/mesa/drivers/dri/i965/gen6_blorp.cpp | 71 > +++++++++------------------- > src/mesa/drivers/dri/i965/gen6_depth_state.c | 35 ++++++++++---- > 3 files changed, 51 insertions(+), 59 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c > b/src/mesa/drivers/dri/i965/brw_misc_state.c > index 76e22bd..e3980fc 100644 > --- a/src/mesa/drivers/dri/i965/brw_misc_state.c > +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c > @@ -261,10 +261,10 @@ brw_workaround_depthstencil_alignment(struct > brw_context *brw, > if (stencil_irb) > brw->depthstencil.stencil_mt = get_stencil_miptree(stencil_irb); > > - /* Gen7+ doesn't require the workarounds, since we always program the > + /* Gen6+ doesn't require the workarounds, since we always program the > * surface state at the start of the whole surface. > */ > - if (brw->gen >= 7) > + if (brw->gen >= 6) > return; > > /* Check if depth buffer is in depth/stencil format. If so, then it's > only > diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > index dca6bfc..5a56442 100644 > --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > @@ -787,10 +787,6 @@ static void > gen6_blorp_emit_depth_stencil_config(struct brw_context *brw, > const brw_blorp_params *params) > { > - struct gl_context *ctx = &brw->ctx; > - uint32_t draw_x = params->depth.x_offset; > - uint32_t draw_y = params->depth.y_offset; > - uint32_t tile_mask_x, tile_mask_y; > uint32_t surfwidth, surfheight; > uint32_t surftype; > unsigned int depth = MAX2(params->depth.mt->logical_depth0, 1); > @@ -814,12 +810,6 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context > *brw, > break; > } > > - brw_get_depthstencil_tile_masks(params->depth.mt, > - params->depth.level, > - params->depth.layer, > - NULL, > - &tile_mask_x, &tile_mask_y); > - > min_array_element = params->depth.layer; > > lod = params->depth.level - params->depth.mt->first_level; > @@ -838,55 +828,42 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context > *brw, > > /* 3DSTATE_DEPTH_BUFFER */ > { > - uint32_t tile_x = draw_x & tile_mask_x; > - uint32_t tile_y = draw_y & tile_mask_y; > - uint32_t offset = > - intel_miptree_get_aligned_offset(params->depth.mt, > - draw_x & ~tile_mask_x, > - draw_y & ~tile_mask_y, false); > - > - /* According to the Sandy Bridge PRM, volume 2 part 1, pp326-327 > - * (3DSTATE_DEPTH_BUFFER dw5), in the documentation for "Depth > - * Coordinate Offset X/Y": > - * > - * "The 3 LSBs of both offsets must be zero to ensure correct > - * alignment" > - * > - * We have no guarantee that tile_x and tile_y are correctly aligned, > - * since they are determined by the mipmap layout, which is only > aligned > - * to multiples of 4. > - * > - * So, to avoid hanging the GPU, just smash the low order 3 bits of > - * tile_x and tile_y to 0. This is a temporary workaround until we > come > - * up with a better solution. > - */ > - WARN_ONCE((tile_x & 7) || (tile_y & 7), > - "Depth/stencil buffer needs alignment to 8-pixel > boundaries.\n" > - "Truncating offset, bad rendering may occur.\n"); > - tile_x &= ~7; > - tile_y &= ~7; > - > intel_emit_post_sync_nonzero_flush(brw); > intel_emit_depth_stall_flushes(brw); > > BEGIN_BATCH(7); > + /* 3DSTATE_DEPTH_BUFFER dw0 */ > OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2)); > + > + /* 3DSTATE_DEPTH_BUFFER dw1 */ > OUT_BATCH((params->depth.mt->pitch - 1) | > params->depth_format << 18 | > 1 << 21 | /* separate stencil enable */ > 1 << 22 | /* hiz enable */ > BRW_TILEWALK_YMAJOR << 26 | > 1 << 27 | /* y-tiled */ > - BRW_SURFACE_2D << 29); > + surftype << 29); > + > + /* 3DSTATE_DEPTH_BUFFER dw2 */ > OUT_RELOC(params->depth.mt->bo, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > - offset); > + 0); > + > + /* 3DSTATE_DEPTH_BUFFER dw3 */ > OUT_BATCH(BRW_SURFACE_MIPMAPLAYOUT_BELOW << 1 | > - (params->depth.width + tile_x - 1) << 6 | > - (params->depth.height + tile_y - 1) << 19); > + (surfwidth - 1) << 6 | > + (surfheight - 1) << 19 | > + lod << 2); > + > + /* 3DSTATE_DEPTH_BUFFER dw4 */ > + OUT_BATCH((depth - 1) << 21 | > + min_array_element << 10 | > + (depth - 1) << 1);
Here we could have a comment saying that the latter "depth - 1" is render target view extent, and that the spec mandates it to always match the value of depth (the former "depth - 1"). The same applies for gen6_emit_depth_stencil_hiz() further down. This is up to you. > + > + /* 3DSTATE_DEPTH_BUFFER dw5 */ > OUT_BATCH(0); > - OUT_BATCH(tile_x | > - tile_y << 16); > + > + /* 3DSTATE_DEPTH_BUFFER dw6 */ > OUT_BATCH(0); > ADVANCE_BATCH(); > } > @@ -894,17 +871,13 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context > *brw, > /* 3DSTATE_HIER_DEPTH_BUFFER */ > { > struct intel_mipmap_tree *hiz_mt = params->depth.mt->hiz_mt; > - uint32_t hiz_offset = > - intel_miptree_get_aligned_offset(hiz_mt, > - draw_x & ~tile_mask_x, > - (draw_y & ~tile_mask_y) / 2, > false); > > BEGIN_BATCH(3); > OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2)); > OUT_BATCH(hiz_mt->pitch - 1); > OUT_RELOC(hiz_mt->bo, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > - hiz_offset); > + 0); > ADVANCE_BATCH(); > } > > diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c > b/src/mesa/drivers/dri/i965/gen6_depth_state.c > index 5e3981c..860233c 100644 > --- a/src/mesa/drivers/dri/i965/gen6_depth_state.c > +++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c > @@ -51,6 +51,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > unsigned int min_array_element; > GLenum gl_target = GL_TEXTURE_2D; > unsigned int lod; > + const struct intel_mipmap_tree *mt = depth_mt ? depth_mt : stencil_mt; > const struct intel_renderbuffer *irb = NULL; > const struct gl_renderbuffer *rb = NULL; > > @@ -112,8 +113,16 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > > lod = irb ? irb->mt_level - irb->mt->first_level : 0; > > + if (mt) { > + width = mt->logical_width0; > + height = mt->logical_height0; > + } > + > BEGIN_BATCH(7); > + /* 3DSTATE_DEPTH_BUFFER dw0 */ > OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2)); > + > + /* 3DSTATE_DEPTH_BUFFER dw1 */ > OUT_BATCH((depth_mt ? depth_mt->pitch - 1 : 0) | > (depthbuffer_format << 18) | > ((enable_hiz_ss ? 1 : 0) << 21) | /* separate stencil enable */ > @@ -121,22 +130,32 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > (BRW_TILEWALK_YMAJOR << 26) | > ((depth_mt ? depth_mt->tiling != I915_TILING_NONE : 1) > << 27) | > - (depth_surface_type << 29)); > + (surftype << 29)); > > + /* 3DSTATE_DEPTH_BUFFER dw2 */ > if (depth_mt) { > OUT_RELOC(depth_mt->bo, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > - depth_offset); > + 0); > } else { > OUT_BATCH(0); > } > > - OUT_BATCH(((width + tile_x - 1) << 6) | > - ((height + tile_y - 1) << 19)); > - OUT_BATCH(0); > + /* 3DSTATE_DEPTH_BUFFER dw3 */ > + OUT_BATCH(((width - 1) << 6) | > + ((height - 1) << 19) | > + lod << 2); > + > + /* 3DSTATE_DEPTH_BUFFER dw4 */ > + OUT_BATCH((depth - 1) << 21 | > + min_array_element << 10 | > + (depth - 1) << 1); > > - OUT_BATCH(tile_x | (tile_y << 16)); > + /* 3DSTATE_DEPTH_BUFFER dw5 */ > + OUT_BATCH(0); > + assert(tile_x == 0 && tile_y == 0); I noticed that the vtable function for gen < 6 is now the only one actually using the tile offsets - gen7 and gen8 have their own functions also and don't even assert. I don't expect any change in this patch on this regard, I'm just making a mental note for us to address this later on. Hence: Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com> > > + /* 3DSTATE_DEPTH_BUFFER dw6 */ > OUT_BATCH(0); > > ADVANCE_BATCH(); > @@ -158,7 +177,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > OUT_BATCH(hiz_mt->pitch - 1); > OUT_RELOC(hiz_mt->bo, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > - brw->depthstencil.hiz_offset); > + 0); > ADVANCE_BATCH(); > } else { > BEGIN_BATCH(3); > @@ -180,7 +199,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > OUT_BATCH(2 * stencil_mt->pitch - 1); > OUT_RELOC(stencil_mt->bo, > I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > - brw->depthstencil.stencil_offset); > + 0); > ADVANCE_BATCH(); > } else { > BEGIN_BATCH(3); > -- > 2.0.1 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev