On Fri, Aug 01, 2014 at 01:27:49PM -0700, Jordan Justen wrote: > On Fri, Aug 1, 2014 at 1:53 AM, Pohjolainen, Topi > <topi.pohjolai...@intel.com> wrote: > > On Fri, Aug 01, 2014 at 12:53:43AM -0700, Jordan Justen wrote: > >> (bf25ee2 for gen6) > >> > >> Previously we would always find the 2D sub-surface of interest, > >> and then program the surface to this location. Now we always > >> program the 3DSTATE_DEPTH_BUFFER at the start of the surface. > >> To select the lod/slice, we utilize the lod & minimum array > >> element fields. > >> > >> We also must disable brw_workaround_depthstencil_alignment for > >> gen >= 6. Now the hardware will handle alignment when rendering > >> to additional slices/LODs. > >> > >> v3: > >> * Set depth_mt bo RELOC offset to 0, as was done in bf25ee2 > >> > >> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56127 > >> --- > >> src/mesa/drivers/dri/i965/brw_misc_state.c | 4 +- > >> src/mesa/drivers/dri/i965/gen6_blorp.cpp | 71 > >> +++++++++------------------- > >> src/mesa/drivers/dri/i965/gen6_depth_state.c | 35 ++++++++++---- > >> 3 files changed, 51 insertions(+), 59 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c > >> b/src/mesa/drivers/dri/i965/brw_misc_state.c > >> index 76e22bd..e3980fc 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c > >> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c > >> @@ -261,10 +261,10 @@ brw_workaround_depthstencil_alignment(struct > >> brw_context *brw, > >> if (stencil_irb) > >> brw->depthstencil.stencil_mt = get_stencil_miptree(stencil_irb); > >> > >> - /* Gen7+ doesn't require the workarounds, since we always program the > >> + /* Gen6+ doesn't require the workarounds, since we always program the > >> * surface state at the start of the whole surface. > >> */ > >> - if (brw->gen >= 7) > >> + if (brw->gen >= 6) > >> return; > >> > >> /* Check if depth buffer is in depth/stencil format. If so, then it's > >> only > >> diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > >> b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > >> index dca6bfc..5a56442 100644 > >> --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp > >> +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp > >> @@ -787,10 +787,6 @@ static void > >> gen6_blorp_emit_depth_stencil_config(struct brw_context *brw, > >> const brw_blorp_params *params) > >> { > >> - struct gl_context *ctx = &brw->ctx; > >> - uint32_t draw_x = params->depth.x_offset; > >> - uint32_t draw_y = params->depth.y_offset; > >> - uint32_t tile_mask_x, tile_mask_y; > >> uint32_t surfwidth, surfheight; > >> uint32_t surftype; > >> unsigned int depth = MAX2(params->depth.mt->logical_depth0, 1); > >> @@ -814,12 +810,6 @@ gen6_blorp_emit_depth_stencil_config(struct > >> brw_context *brw, > >> break; > >> } > >> > >> - brw_get_depthstencil_tile_masks(params->depth.mt, > >> - params->depth.level, > >> - params->depth.layer, > >> - NULL, > >> - &tile_mask_x, &tile_mask_y); > >> - > >> min_array_element = params->depth.layer; > >> > >> lod = params->depth.level - params->depth.mt->first_level; > >> @@ -838,55 +828,42 @@ gen6_blorp_emit_depth_stencil_config(struct > >> brw_context *brw, > >> > >> /* 3DSTATE_DEPTH_BUFFER */ > >> { > >> - uint32_t tile_x = draw_x & tile_mask_x; > >> - uint32_t tile_y = draw_y & tile_mask_y; > >> - uint32_t offset = > >> - intel_miptree_get_aligned_offset(params->depth.mt, > >> - draw_x & ~tile_mask_x, > >> - draw_y & ~tile_mask_y, false); > >> - > >> - /* According to the Sandy Bridge PRM, volume 2 part 1, pp326-327 > >> - * (3DSTATE_DEPTH_BUFFER dw5), in the documentation for "Depth > >> - * Coordinate Offset X/Y": > >> - * > >> - * "The 3 LSBs of both offsets must be zero to ensure correct > >> - * alignment" > >> - * > >> - * We have no guarantee that tile_x and tile_y are correctly > >> aligned, > >> - * since they are determined by the mipmap layout, which is only > >> aligned > >> - * to multiples of 4. > >> - * > >> - * So, to avoid hanging the GPU, just smash the low order 3 bits of > >> - * tile_x and tile_y to 0. This is a temporary workaround until we > >> come > >> - * up with a better solution. > >> - */ > >> - WARN_ONCE((tile_x & 7) || (tile_y & 7), > >> - "Depth/stencil buffer needs alignment to 8-pixel > >> boundaries.\n" > >> - "Truncating offset, bad rendering may occur.\n"); > >> - tile_x &= ~7; > >> - tile_y &= ~7; > >> - > >> intel_emit_post_sync_nonzero_flush(brw); > >> intel_emit_depth_stall_flushes(brw); > >> > >> BEGIN_BATCH(7); > >> + /* 3DSTATE_DEPTH_BUFFER dw0 */ > >> OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2)); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw1 */ > >> OUT_BATCH((params->depth.mt->pitch - 1) | > >> params->depth_format << 18 | > >> 1 << 21 | /* separate stencil enable */ > >> 1 << 22 | /* hiz enable */ > >> BRW_TILEWALK_YMAJOR << 26 | > >> 1 << 27 | /* y-tiled */ > >> - BRW_SURFACE_2D << 29); > >> + surftype << 29); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw2 */ > >> OUT_RELOC(params->depth.mt->bo, > >> I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > >> - offset); > >> + 0); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw3 */ > >> OUT_BATCH(BRW_SURFACE_MIPMAPLAYOUT_BELOW << 1 | > >> - (params->depth.width + tile_x - 1) << 6 | > >> - (params->depth.height + tile_y - 1) << 19); > >> + (surfwidth - 1) << 6 | > >> + (surfheight - 1) << 19 | > >> + lod << 2); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw4 */ > >> + OUT_BATCH((depth - 1) << 21 | > >> + min_array_element << 10 | > >> + (depth - 1) << 1); > > > > Here we could have a comment saying that the latter "depth - 1" is render > > target view extent, and that the spec mandates it to always match the value > > of depth (the former "depth - 1"). The same applies for > > gen6_emit_depth_stencil_hiz() further down. This is up to you. > > Actually, I think we might be programming the field wrong for 3D textures. > "For 3D Surfaces: > This field indicates the extent of the accessible ???R??? coordinates > minus 1 on the LOD > currently being rendered to. > For 1D and 2D Surfaces: > This field must be set to the same value as the Depth field." > > So, I think we need a new variable: > const unsigned view_extent = > gl_target == GL_TEXTURE_3D ? > minify(depth, lod) : depth; > > Then programming this state packet should be clearer. (The distinction > between the depth and view extent fields.) > > Can I look into fixing this separately, since it probably impacts all gen >= > 6?
Sounds good to me - nice finding. I guess we are also missing tests. > > -Jordan > > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw5 */ > >> OUT_BATCH(0); > >> - OUT_BATCH(tile_x | > >> - tile_y << 16); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw6 */ > >> OUT_BATCH(0); > >> ADVANCE_BATCH(); > >> } > >> @@ -894,17 +871,13 @@ gen6_blorp_emit_depth_stencil_config(struct > >> brw_context *brw, > >> /* 3DSTATE_HIER_DEPTH_BUFFER */ > >> { > >> struct intel_mipmap_tree *hiz_mt = params->depth.mt->hiz_mt; > >> - uint32_t hiz_offset = > >> - intel_miptree_get_aligned_offset(hiz_mt, > >> - draw_x & ~tile_mask_x, > >> - (draw_y & ~tile_mask_y) / 2, > >> false); > >> > >> BEGIN_BATCH(3); > >> OUT_BATCH((_3DSTATE_HIER_DEPTH_BUFFER << 16) | (3 - 2)); > >> OUT_BATCH(hiz_mt->pitch - 1); > >> OUT_RELOC(hiz_mt->bo, > >> I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > >> - hiz_offset); > >> + 0); > >> ADVANCE_BATCH(); > >> } > >> > >> diff --git a/src/mesa/drivers/dri/i965/gen6_depth_state.c > >> b/src/mesa/drivers/dri/i965/gen6_depth_state.c > >> index 5e3981c..860233c 100644 > >> --- a/src/mesa/drivers/dri/i965/gen6_depth_state.c > >> +++ b/src/mesa/drivers/dri/i965/gen6_depth_state.c > >> @@ -51,6 +51,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > >> unsigned int min_array_element; > >> GLenum gl_target = GL_TEXTURE_2D; > >> unsigned int lod; > >> + const struct intel_mipmap_tree *mt = depth_mt ? depth_mt : stencil_mt; > >> const struct intel_renderbuffer *irb = NULL; > >> const struct gl_renderbuffer *rb = NULL; > >> > >> @@ -112,8 +113,16 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > >> > >> lod = irb ? irb->mt_level - irb->mt->first_level : 0; > >> > >> + if (mt) { > >> + width = mt->logical_width0; > >> + height = mt->logical_height0; > >> + } > >> + > >> BEGIN_BATCH(7); > >> + /* 3DSTATE_DEPTH_BUFFER dw0 */ > >> OUT_BATCH(_3DSTATE_DEPTH_BUFFER << 16 | (7 - 2)); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw1 */ > >> OUT_BATCH((depth_mt ? depth_mt->pitch - 1 : 0) | > >> (depthbuffer_format << 18) | > >> ((enable_hiz_ss ? 1 : 0) << 21) | /* separate stencil enable > >> */ > >> @@ -121,22 +130,32 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > >> (BRW_TILEWALK_YMAJOR << 26) | > >> ((depth_mt ? depth_mt->tiling != I915_TILING_NONE : 1) > >> << 27) | > >> - (depth_surface_type << 29)); > >> + (surftype << 29)); > >> > >> + /* 3DSTATE_DEPTH_BUFFER dw2 */ > >> if (depth_mt) { > >> OUT_RELOC(depth_mt->bo, > >> I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > >> - depth_offset); > >> + 0); > >> } else { > >> OUT_BATCH(0); > >> } > >> > >> - OUT_BATCH(((width + tile_x - 1) << 6) | > >> - ((height + tile_y - 1) << 19)); > >> - OUT_BATCH(0); > >> + /* 3DSTATE_DEPTH_BUFFER dw3 */ > >> + OUT_BATCH(((width - 1) << 6) | > >> + ((height - 1) << 19) | > >> + lod << 2); > >> + > >> + /* 3DSTATE_DEPTH_BUFFER dw4 */ > >> + OUT_BATCH((depth - 1) << 21 | > >> + min_array_element << 10 | > >> + (depth - 1) << 1); > >> > >> - OUT_BATCH(tile_x | (tile_y << 16)); > >> + /* 3DSTATE_DEPTH_BUFFER dw5 */ > >> + OUT_BATCH(0); > >> + assert(tile_x == 0 && tile_y == 0); > > > > I noticed that the vtable function for gen < 6 is now the only one actually > > using the tile offsets - gen7 and gen8 have their own functions also and > > don't > > even assert. > > > > I don't expect any change in this patch on this regard, I'm just making a > > mental note for us to address this later on. Hence: > > > > Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com> > > > >> > >> + /* 3DSTATE_DEPTH_BUFFER dw6 */ > >> OUT_BATCH(0); > >> > >> ADVANCE_BATCH(); > >> @@ -158,7 +177,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > >> OUT_BATCH(hiz_mt->pitch - 1); > >> OUT_RELOC(hiz_mt->bo, > >> I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > >> - brw->depthstencil.hiz_offset); > >> + 0); > >> ADVANCE_BATCH(); > >> } else { > >> BEGIN_BATCH(3); > >> @@ -180,7 +199,7 @@ gen6_emit_depth_stencil_hiz(struct brw_context *brw, > >> OUT_BATCH(2 * stencil_mt->pitch - 1); > >> OUT_RELOC(stencil_mt->bo, > >> I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, > >> - brw->depthstencil.stencil_offset); > >> + 0); > >> ADVANCE_BATCH(); > >> } else { > >> BEGIN_BATCH(3); > >> -- > >> 2.0.1 > >> > >> _______________________________________________ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev