On Sat, Jul 19, 2014 at 07:08:59PM -0700, Jordan Justen wrote: > On Fri, Jul 18, 2014 at 1:58 AM, Pohjolainen, Topi > <topi.pohjolai...@intel.com> wrote: > > On Tue, Jul 15, 2014 at 06:32:12PM -0700, Jordan Justen wrote: > >> We now skip allocating a hiz miptree for gen7. Instead, we calculate > >> the required hiz buffer parameters and allocate a bo directly. > >> > >> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com> > >> --- > >> src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 95 > >> ++++++++++++++++++++++++++- > >> 1 file changed, 93 insertions(+), 2 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> index 8719c29..7e8bec8 100644 > >> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > >> @@ -823,7 +823,10 @@ intel_miptree_release(struct intel_mipmap_tree **mt) > >> drm_intel_bo_unreference((*mt)->bo); > >> intel_miptree_release(&(*mt)->stencil_mt); > >> if ((*mt)->hiz_buf) { > >> - intel_miptree_release(&(*mt)->hiz_buf->mt); > >> + if ((*mt)->hiz_buf->mt) > >> + intel_miptree_release(&(*mt)->hiz_buf->mt); > >> + else > >> + drm_intel_bo_unreference((*mt)->hiz_buf->bo); > >> free((*mt)->hiz_buf); > >> } > >> intel_miptree_release(&(*mt)->mcs_mt); > >> @@ -1374,6 +1377,89 @@ intel_miptree_level_enable_hiz(struct brw_context > >> *brw, > >> } > >> > >> > >> +/** > >> + * Helper for intel_miptree_alloc_hiz() that determines the required hiz > >> + * buffer dimensions and allocates a bo for the hiz buffer. > >> + */ > >> +static struct intel_miptree_aux_buffer * > >> +intel_gen7_hiz_buf_create(struct brw_context *brw, > >> + struct intel_mipmap_tree *mt) > >> +{ > >> + unsigned z_width = mt->logical_width0; > >> + unsigned z_height = mt->logical_height0; > >> + const unsigned z_depth = mt->logical_depth0; > >> + unsigned hz_width, hz_height, qpitch; > > > > Minor nit, qpitch could be called hz_qpitch for clarity as it is a result of > > hiz-specific rules just as hz_width and hz_height. Simple matter of taste > > and > > you choose the way that you feel the best. > > > >> + struct intel_miptree_aux_buffer *buf = calloc(sizeof(*buf), 1); > >> + > >> + if (!buf) > >> + return NULL; > >> + > >> + /* Gen7 PRM Volume 2, Part 1, 11.5.3 "Hierarchical Depth Buffer" > >> documents > >> + * adjustments required for Z_Height and Z_Width based on > >> multisampling. > >> + */ > >> + switch(mt->num_samples) { > >> + case 0: > >> + case 1: > >> + break; > >> + case 2: > >> + case 4: > >> + z_width *= 2; > >> + z_height *= 2; > >> + break; > >> + case 8: > >> + z_width *= 4; > >> + z_height *= 2; > >> + break; > >> + default: > >> + assert(!"Unsupported sample count!"); > >> + } > >> + > >> + const unsigned vertical_align = 8; /* 'j' in the docs */ > >> + const unsigned H0 = z_height; > >> + const unsigned h0 = ALIGN(H0, vertical_align); > >> + const unsigned h1 = ALIGN(minify(H0, 1), vertical_align); > >> + const unsigned Z0 = z_depth; > >> + > >> + /* HZ_Width (bytes) = ceiling(Z_Width / 16) * 16 */ > >> + hz_width = ALIGN(z_width, 16); > >> + > >> + if (mt->target == GL_TEXTURE_3D) { > >> + unsigned H_i = H0; > >> + unsigned Z_i = Z0; > >> + hz_height = 0; > >> + for (int level = mt->first_level; level <= mt->last_level; ++level) > >> { > >> + unsigned h_i = ALIGN(H_i, vertical_align); > >> + /* sum(i=0 to m; h_i * max(1, floor(Z_Depth/2**i))) */ > > > > I had to think for a second if you had typo here (2**i) but then realized > > you used it to mean power-of-two. I've also seen people using 2^i, would > > that > > make sense to you? > > > >> + hz_height += h_i * Z_i; > >> + H_i = minify(H_i, 1); > >> + Z_i = minify(Z_i, 1); > >> + } > >> + /* HZ_Height = > >> + * (1/2) * sum(i=0 to m; h_i * max(1, floor(Z_Depth/2**i))) > >> + */ > >> + hz_height = CEILING(hz_height, 2); > >> + } else { > >> + qpitch = h0 + h1 + (12 * vertical_align); > >> + /* HZ_Height (rows) = Ceiling ( ( Q_pitch * Z_depth/2) /8 ) * 8 */ > >> + hz_height = (ALIGN(qpitch, 8) / 2) * Z0; > > > > Here the ALIGN is no-op - qpitch is a sum of three already aligned numbers, > > and hence it is aligned itself. The final result in turn is not always > > aligned > > (althought is should be). For example, say > > > > qpitch = ALIGN(16, 8) + ALIGN(minify(16, 1), 8) + 12 * 8 = 15 * 8 > > ZO = z_depth = 1 > > > > => hz_height = (15 * 8 / 2) * 1 = 60 > > > > This particular case would probably fine as there is only one layer and > > still > > a lot of extra. But that may not be the case with higher odd layer numbers > > anymore. > > > > I would change this into: > > > > hz_height = ALIGN(qpitch * Z0 / 2, vertical_align); > > The comment above the assignment is from the docs, and it uses the > constant 8, rather than 'j', so I thought it would be better just to > use 8. Although, you are right that it is probably 8 because they set > 'j' as 8 for the purposes of hiz calculations. > > I also wanted to do the ALIGN before the integer / 2.
I'm missing the rational for this, the spec says that the result of the division should be aligned by eight, right? > > How do you feel about: > hz_height = ALIGN(qpitch * Z0, 8) / 2; With the example numbers the end result is not aligned by 8: hz_height = ALIGN(15 * 8 * 1, 8) / 2 = 120 / 2 = 60 > > I also think this alternative is closest to the docs: > hz_height = CEILING(qpitch * Z0, 2 * 8) * 8; This I agree, this would be in line with the spec: hz_height = CEILING(15 * 8 * 1, 2 * 8) * 8 = 64 > > -Jordan _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev