On 22 May 2012 09:21, Paul Berry <stereotype...@gmail.com
<mailto:stereotype...@gmail.com>> wrote:
On 22 May 2012 08:19, Kenneth Graunke <kenn...@whitecape.org
<mailto:kenn...@whitecape.org>> wrote:
On 05/11/2012 11:03 AM, Paul Berry wrote:
Starting in Gen7, there are two possible layouts for MSAA
surfaces:
- Interleaved, in which additional samples are accommodated
by scaling
up the width and height of the surface. This is the only
layout
available in Gen6. On Gen7 it is used for depth and stencil
surfaces only.
- Sliced, in which the surface is stored as a 2D array, with
array
slice n containing all pixel data for sample n. On Gen7
this layout
is used for color surfaces.
The "Sliced" layout has an additional requirement: it must
be used in
ARYSPC_LOD0 mode, which means that the surface doesn't leave
any extra
room between array slices for miplevels other than 0.
This patch modifies the surface allocation functions to use the
correct layout when allocating MSAA surfaces in Gen7, and to
set the
array offsets properly when using ARYSPC_LOD0 mode. It also
modifies
the code that populates SURFACE_STATE structures to ensure that
ARYSPC_LOD0 mode is selected in the appropriate circumstances.
---
src/mesa/drivers/dri/i965/brw___blorp.cpp | 1 +
src/mesa/drivers/dri/i965/brw___blorp.h | 5 +
src/mesa/drivers/dri/i965/brw___tex_layout.c | 10 ++-
src/mesa/drivers/dri/i965/brw___wm_surface_state.c | 3 +-
src/mesa/drivers/dri/i965/__gen7_blorp.cpp | 2 +
src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c | 10 ++
src/mesa/drivers/dri/intel/__intel_mipmap_tree.c | 102
++++++++++++++++-----
src/mesa/drivers/dri/intel/__intel_mipmap_tree.h | 23
+++++-
src/mesa/drivers/dri/intel/__intel_tex_image.c | 3 +-
src/mesa/drivers/dri/intel/__intel_tex_validate.c | 3 +-
10 files changed, 134 insertions(+), 28 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/__brw_blorp.cpp
b/src/mesa/drivers/dri/i965/__brw_blorp.cpp
index f6aff44..84f92b4 100644
--- a/src/mesa/drivers/dri/i965/__brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/__brw_blorp.cpp
@@ -58,6 +58,7 @@ brw_blorp_surface_info::set(__struct
intel_mipmap_tree *mt,
{
brw_blorp_mip_info::set(mt, level, layer);
this->num_samples = mt->num_samples;
+ this->array_spacing_lod0 = mt->array_spacing_lod0;
if (mt->format == MESA_FORMAT_S8) {
/* The miptree is a W-tiled stencil buffer. Surface
states can't be set
diff --git a/src/mesa/drivers/dri/i965/__brw_blorp.h
b/src/mesa/drivers/dri/i965/__brw_blorp.h
index 8dabf2c..3e37081 100644
--- a/src/mesa/drivers/dri/i965/__brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/__brw_blorp.h
@@ -99,6 +99,11 @@ public:
bool map_stencil_as_y_tiled;
unsigned num_samples;
+
+ /* Setting this flag indicates that the surface should
be set up in
+ * ARYSPC_LOD0 mode. Ignored prior to Gen7.
+ */
+ bool array_spacing_lod0;
};
diff --git a/src/mesa/drivers/dri/i965/__brw_tex_layout.c
b/src/mesa/drivers/dri/i965/__brw_tex_layout.c
index 8bf1d3d..f742131 100644
--- a/src/mesa/drivers/dri/i965/__brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/__brw_tex_layout.c
@@ -49,7 +49,10 @@ brw_miptree_layout_texture___array(struct
intel_context *intel,
h0 = ALIGN(mt->height0, mt->align_h);
h1 = ALIGN(minify(mt->height0), mt->align_h);
- qpitch = (h0 + h1 + (intel->gen>= 7 ? 12 : 11) *
mt->align_h);
+ if (mt->array_spacing_lod0)
+ qpitch = h0;
+ else
+ qpitch = (h0 + h1 + (intel->gen>= 7 ? 12 : 11) *
mt->align_h);
if (mt->compressed)
qpitch /= 4;
@@ -165,7 +168,10 @@ brw_miptree_layout(struct intel_context
*intel, struct intel_mipmap_tree *mt)
break;
default:
- i945_miptree_layout_2d(mt);
+ if (mt->num_samples> 0&& !mt->msaa_is_interleaved)
+ brw_miptree_layout_texture___array(intel, mt);
+ else
+ i945_miptree_layout_2d(mt);
break;
}
DBG("%s: %dx%dx%d\n", __FUNCTION__,
diff --git
a/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
b/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
index 849da85..6e745cf 100644
--- a/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/__brw_wm_surface_state.c
@@ -955,7 +955,8 @@ brw_update_renderbuffer___surface(struct
brw_context *brw,
intel_image->base.Base.Level,
width, height, depth,
true,
- 0 /* num_samples */);
+ 0 /* num_samples */,
+ false /*
msaa_is_interleaved */);
intel_miptree_copy_teximage(__intel, intel_image,
new_mt);
intel_miptree_reference(&irb->__mt, intel_image->mt);
diff --git a/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
b/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
index 3775eec..dfc2272 100644
--- a/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/__gen7_blorp.cpp
@@ -149,6 +149,8 @@ gen7_blorp_emit_surface_state(__struct
brw_context *brw,
surf->ss0.surface_format = format;
surf->ss0.surface_type = BRW_SURFACE_2D;
+ surf->ss0.surface_array___spacing =
surface->array_spacing_lod0 ?
+ GEN7_SURFACE_ARYSPC_LOD0 : GEN7_SURFACE_ARYSPC_FULL;
/* reloc */
surf->ss1.base_addr = region->bo->offset; /* No tile
offsets needed */
diff --git
a/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
b/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
index 5aa62bd..9a01ddf 100644
--- a/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/__gen7_wm_surface_state.c
@@ -142,6 +142,10 @@ gen7_update_texture_surface(__struct
gl_context *ctx, GLuint unit)
return;
}
+ /* We don't support MSAA for textures. */
+ assert(!mt->array_spacing___lod0);
+ assert(mt->num_samples == 0);
+
intel_miptree_get_dimensions___for_image(firstImage,&width,&__height,&depth);
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
@@ -296,6 +300,9 @@
gen7_update_renderbuffer___surface(struct brw_context *brw,
sizeof(*surf),
32,&brw->wm.surf_offset[unit])__;
memset(surf, 0, sizeof(*surf));
+ /* Render targets can't use MSAA interleaved layout */
+ assert(!irb->mt->msaa_is___interleaved);
+
if (irb->mt->align_h == 4)
surf->ss0.vertical_alignment = 1;
if (irb->mt->align_w == 8)
@@ -324,6 +331,9 @@
gen7_update_renderbuffer___surface(struct brw_context *brw,
}
surf->ss0.surface_type = BRW_SURFACE_2D;
+ surf->ss0.surface_array___spacing =
irb->mt->array_spacing_lod0 ?
+ GEN7_SURFACE_ARYSPC_LOD0 : GEN7_SURFACE_ARYSPC_FULL;
+
/* reloc */
surf->ss1.base_addr =
intel_renderbuffer_tile___offsets(irb,&tile_x,&tile_y);
surf->ss1.base_addr += region->bo->offset; /* reloc */
diff --git
a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
index 36d7b77..4132351 100644
--- a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.c
@@ -73,7 +73,8 @@ intel_miptree_create_internal(__struct
intel_context *intel,
GLuint height0,
GLuint depth0,
bool for_region,
- GLuint num_samples)
+ GLuint num_samples,
+ bool msaa_is_interleaved)
{
struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
int compress_byte = 0;
@@ -95,8 +96,14 @@ intel_miptree_create_internal(__struct
intel_context *intel,
mt->cpp = compress_byte ? compress_byte :
_mesa_get_format_bytes(mt->__format);
mt->num_samples = num_samples;
mt->compressed = compress_byte ? 1 : 0;
+ mt->msaa_is_interleaved = msaa_is_interleaved;
mt->refcount = 1;
+ /* array_spacing_lod0 is only used for non-interleaved
MSAA surfaces.
+ * TODO: can we use it elsewhere?
+ */
+ mt->array_spacing_lod0 = num_samples> 0&&
!msaa_is_interleaved;
+
if (target == GL_TEXTURE_CUBE_MAP) {
assert(depth0 == 1);
mt->depth0 = 6;
@@ -109,6 +116,8 @@ intel_miptree_create_internal(__struct
intel_context *intel,
(intel->must_use_separate___stencil ||
(intel->has_separate_stencil&&
intel->vtbl.is_hiz_depth___format(intel, format)))) {
+ /* MSAA stencil surfaces are always interleaved. */
+ bool msaa_is_interleaved = num_samples> 0;
mt->stencil_mt = intel_miptree_create(intel,
mt->target,
MESA_FORMAT_S8,
@@ -118,7 +127,8 @@ intel_miptree_create_internal(__struct
intel_context *intel,
mt->height0,
mt->depth0,
true,
- num_samples);
+ num_samples,
+
msaa_is_interleaved);
if (!mt->stencil_mt) {
intel_miptree_release(&mt);
return NULL;
@@ -165,7 +175,8 @@ intel_miptree_create(struct
intel_context *intel,
GLuint height0,
GLuint depth0,
bool expect_accelerated_upload,
- GLuint num_samples)
+ GLuint num_samples,
+ bool msaa_is_interleaved)
{
struct intel_mipmap_tree *mt;
uint32_t tiling = I915_TILING_NONE;
@@ -207,7 +218,7 @@ intel_miptree_create(struct
intel_context *intel,
mt = intel_miptree_create_internal(__intel, target, format,
first_level,
last_level, width0,
height0, depth0,
- false, num_samples);
+ false, num_samples,
msaa_is_interleaved);
/*
* pitch == 0 || height == 0 indicates the null texture
*/
@@ -243,7 +254,8 @@ intel_miptree_create_for___region(struct
intel_context *intel,
mt = intel_miptree_create_internal(__intel, target, format,
0, 0,
region->width,
region->height, 1,
- true, 0 /* num_samples
*/);
+ true, 0 /* num_samples */,
+ false /*
msaa_is_interleaved */);
if (!mt)
return mt;
@@ -252,6 +264,31 @@
intel_miptree_create_for___region(struct intel_context *intel,
return mt;
}
+/**
+ * Determine whether the MSAA surface being created should
use an interleaved
+ * layout or a sliced layout, based on the chip generation
and the surface
+ * type.
+ */
+static bool
+msaa_format_is_interleaved(__struct intel_context *intel,
gl_format format)
+{
+ /* Prior to Gen7, all surfaces used interleaved layout. */
+ if (intel->gen< 7)
+ return true;
+
+ /* In Gen7, interleaved layout is only used for depth
and stencil
+ * buffers.
+ */
+ switch (_mesa_get_format_base_format(__format)) {
+ case GL_DEPTH_COMPONENT:
+ case GL_STENCIL_INDEX:
+ case GL_DEPTH_STENCIL:
+ return true;
+ default:
+ return false;
+ }
+}
+
struct intel_mipmap_tree*
intel_miptree_create_for___renderbuffer(struct
intel_context *intel,
gl_format format,
@@ -260,25 +297,43 @@
intel_miptree_create_for___renderbuffer(struct intel_context
*intel,
uint32_t num_samples)
{
struct intel_mipmap_tree *mt;
-
- /* Adjust width/height for MSAA. Note: since samples
are interleaved in a
- * complex pattern that repeats for every 2x2 block of
pixels, we need to
- * expand the width and height to even numbers before
the width/height
- * adjustment, otherwise some of the samples in the last
row and column
- * will fall outside the boundary of the texture.
- */
- if (num_samples> 4) {
- num_samples = 8;
- width = ALIGN(width, 2) * 4;
- height = ALIGN(height, 2) * 2;
- } else if (num_samples> 0) {
- num_samples = 4;
- width = ALIGN(width, 2) * 2;
- height = ALIGN(height, 2) * 2;
+ uint32_t depth = 1;
+ bool msaa_is_interleaved = false;
+
+ if (num_samples> 0) {
+ /* Adjust width/height/depth for MSAA */
+ msaa_is_interleaved =
msaa_format_is_interleaved(__intel, format);
+ if (msaa_is_interleaved) {
+ /* Note: since samples are interleaved in a
complex pattern that
+ * repeats for every 2x2 block of pixels, we need
to expand the width
+ * and height to even numbers before the
width/height adjustment,
+ * otherwise some of the samples in the last row
and column will fall
+ * outside the boundary of the texture.
+ */
+ switch (num_samples) {
+ case 4:
+ width = ALIGN(width, 2) * 2;
+ height = ALIGN(height, 2) * 2;
+ break;
+ case 8:
+ width = ALIGN(width, 2) * 4;
+ height = ALIGN(height, 2) * 2;
+ break;
+ default:
+ /* num_samples should already have been
quantized to 0, 4, or
+ * 8.
+ */
+ assert(false);
+ }
+ } else {
+ /* Non-interleaved */
+ depth = num_samples;
+ }
}
mt = intel_miptree_create(intel, GL_TEXTURE_2D, format,
0, 0,
- width, height, 1, true,
num_samples);
+ width, height, depth, true,
num_samples,
+ msaa_is_interleaved);
return mt;
}
@@ -552,6 +607,8 @@ intel_miptree_alloc_hiz(struct
intel_context *intel,
GLuint num_samples)
{
assert(mt->hiz_mt == NULL);
+ /* MSAA HiZ surfaces are always interleaved. */
+ bool msaa_is_interleaved = num_samples> 0;
mt->hiz_mt = intel_miptree_create(intel,
mt->target,
MESA_FORMAT_X8_Z24,
@@ -561,7 +618,8 @@ intel_miptree_alloc_hiz(struct
intel_context *intel,
mt->height0,
mt->depth0,
true,
- num_samples);
+ num_samples,
+ msaa_is_interleaved);
if (!mt->hiz_mt)
return false;
diff --git
a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
index ca1666d..3883d2b 100644
--- a/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/intel/__intel_mipmap_tree.h
@@ -172,6 +172,26 @@ struct intel_mipmap_tree
GLuint num_samples;
bool compressed;
+ /**
+ * For 1D array, 2D array, cube, and 2D multisampled
surfaces on Gen7: true
+ * if the surface only contains LOD 0, and hence no
space is for LOD's
+ * other than 0 in between array slices.
+ *
+ * Corresponds to the surface_array_spacing bit in
gen7_surface_state.
+ */
+ bool array_spacing_lod0;
It seems like we ought to use ARYSPC_LOD0 mode for all
non-mipmapped renderbuffers and textures...it should save
memory. If we did that, I think we could probably just use
mt->first_level != mt->last_level rather than adding a new flag.
Then again, I haven't thought through all the ramifications of
that: I know there's some period of time where the user has
defined LOD0, and you're not sure whether they're going to
defined LOD1, 2, 3 too or just not do mipmapping. It would suck
to allocate something in ARYSPC_LOD0/non-mipmap-capable mode and
then have to redo it. But maybe we have to redo it already?
I think you're right that we have to redo it anyhow, because even
without using ARYSPC_LOD0, if we speculatively allocate a texture
that just has LOD0, then we haven't reserved enough memory for the
other mip levels of the last array slice.
Your suggestion seems reasonable. The only thing I can think of
that might get in the way is if there are restrictions in the bspec
requiring us to disable ARYSPC_LOD0 at certain times. I'll try to
dig through the bspec today and check that.
We discussed this further in person yesterday, and decided that it would
be better to wait and do it as a future optimization, on the grounds
that changing the layout of every non-mipmapped array texture is a
little more risk than we want to incur in the middle of this patch
series. I've added it to a list of items to revisit once the bulk of
MSAA is complete.
I also seem to recall that you found a subtlety about ARYSPC_LOD0 in the
bspec yesterday that we'll need to keep in mind when we do revisit this
topic. Do you remember what that was? I've forgotten the details.