It appears that Z16 on Intel hardware is in fact slower than Z24, so people get an unpleasant surprise when they pick Z16 as a performance-versus-precision tradeoff, or when they target GLES2 and Z16 is all they get.
GL 3.0+ have Z16 on the list of required exact format sizes, but GLES doesn't, so choose the better-performing layout in that case.

Improves GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB system.
---

I don't like that we aren't totally sure of the mechanism behind the performance improvement, but in the absence of any data against this, I think we should drop Z16 at this point.

 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index a74b2c7..f197639 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -566,7 +566,20 @@ brw_init_surface_formats(struct brw_context *brw)
    ctx->TextureFormatSupported[MESA_FORMAT_X8_Z24] = true;
    ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT] = true;
    ctx->TextureFormatSupported[MESA_FORMAT_Z32_FLOAT_X24S8] = true;
-   ctx->TextureFormatSupported[MESA_FORMAT_Z16] = true;
+
+   /* It appears that Z16 is slower than Z24 (on Intel Ivybridge and newer
+    * hardware at least), so there's no real reason to prefer it unless you're
+    * under memory (not memory bandwidth) pressure.  Our speculation is that
+    * this is due to either increased fragment shader execution from
+    * GL_LEQUAL/GL_EQUAL depth tests at the reduced precision, or due to
+    * increased depth stalls from a cacheline-based heuristic for detecting
+    * depth stalls.
+    *
+    * However, desktop GL 3.0+ require that you get exactly 16 bits when
+    * asking for DEPTH_COMPONENT16, so we have to respect that.
+    */
+   if (_mesa_is_desktop_gl(ctx))
+      ctx->TextureFormatSupported[MESA_FORMAT_Z16] = true;
 
    /* On hardware that lacks support for ETC1, we map ETC1 to RGBX
     * during glCompressedTexImage2D().  See intel_mipmap_tree::wraps_etc1.
-- 
1.7.10.4
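
To make the effect of the hunk easier to see in isolation, here is a minimal standalone sketch of the gating. It is not Mesa code: every `sketch_*` / `SKETCH_*` name is a hypothetical stand-in for the real context, API and format enums, and the fallback from a 16-bit depth request to X8_Z24 is an assumption about how format selection behaves once Z16 is left unsupported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Mesa's API and format enums. */
enum sketch_api {
   SKETCH_API_OPENGL_COMPAT,
   SKETCH_API_OPENGL_CORE,
   SKETCH_API_GLES1,
   SKETCH_API_GLES2
};

enum sketch_format {
   SKETCH_FORMAT_X8_Z24,
   SKETCH_FORMAT_Z32_FLOAT,
   SKETCH_FORMAT_Z16,
   SKETCH_FORMAT_COUNT
};

/* Hypothetical, heavily reduced model of the GL context. */
struct sketch_context {
   enum sketch_api api;
   bool texture_format_supported[SKETCH_FORMAT_COUNT];
};

/* Models the desktop-GL check used by the patch. */
static bool
sketch_is_desktop_gl(const struct sketch_context *ctx)
{
   return ctx->api == SKETCH_API_OPENGL_COMPAT ||
          ctx->api == SKETCH_API_OPENGL_CORE;
}

/* Mirrors the patched logic: Z16 is only advertised on desktop GL,
 * where an exact 16-bit depth format is required.
 */
static void
sketch_init_depth_formats(struct sketch_context *ctx)
{
   assert(ctx != NULL);

   ctx->texture_format_supported[SKETCH_FORMAT_X8_Z24] = true;
   ctx->texture_format_supported[SKETCH_FORMAT_Z32_FLOAT] = true;
   ctx->texture_format_supported[SKETCH_FORMAT_Z16] =
      sketch_is_desktop_gl(ctx);
}

/* Assumed selection rule for a DEPTH_COMPONENT16 request: take Z16 if
 * it is advertised, otherwise fall back to the 24-bit layout.
 */
static enum sketch_format
sketch_choose_depth16(const struct sketch_context *ctx)
{
   assert(ctx != NULL);

   if (ctx->texture_format_supported[SKETCH_FORMAT_Z16])
      return SKETCH_FORMAT_Z16;
   return SKETCH_FORMAT_X8_Z24;
}
```

Under these assumptions, a context created for desktop GL still resolves a 16-bit depth request to Z16, while a GLES2 context resolves the same request to X8_Z24, which is the faster layout per the measurement above.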