It turns out we can allow COHERENT storage/mappings all the time, regardless of LLC vs non-LLC. It just means never using temporary mappings to avoid GPU stalls, and on non-LLC we have to use the GTT instead of CPU mappings. If we were to use CPU maps on non-LLC (which might be useful if apps end up using buffer_storage on PBO reads, to avoid WC read slowness), those would be PERSISTENT but not COHERENT, and doing that would require us to drive the clflushes from userspace somehow.
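As a rough illustration of the rule described above, the mapping choice can be modeled as a pure function of the access flags, whether the GPU has LLC, and whether the BO is busy. This is only a sketch that mirrors the conditions this patch touches in intel_bufferobj_map_range(): the enum, the function name choose_map_path(), and the MAP_* macros are hypothetical (the macros copy the values of the corresponding GL_MAP_* bits so the snippet builds without GL headers), and the final fallthrough is assumed to be the CPU mapping path.

```c
#include <stdbool.h>

/* Local copies of the GL access bits (same values as GL_MAP_*_BIT). */
#define MAP_READ_BIT             0x0001
#define MAP_WRITE_BIT            0x0002
#define MAP_INVALIDATE_RANGE_BIT 0x0004
#define MAP_UNSYNCHRONIZED_BIT   0x0020
#define MAP_PERSISTENT_BIT       0x0040

/* Hypothetical names for the mapping paths; not identifiers from the driver. */
enum map_path {
   MAP_PATH_TEMPORARY_BO,   /* fresh BO, copied out at unmap or FlushRange */
   MAP_PATH_UNSYNCHRONIZED, /* map without waiting for the GPU */
   MAP_PATH_GTT,            /* mapping through the GTT */
   MAP_PATH_CPU             /* CPU mapping (assumed fallthrough) */
};

static enum map_path
choose_map_path(unsigned access, bool has_llc, bool bo_busy)
{
   /* A persistent mapping never gets the temporary-BO trick: take the
    * GPU stall instead of blitting at MemoryBarrier time.
    */
   if (!(access & (MAP_UNSYNCHRONIZED_BIT | MAP_PERSISTENT_BIT)) &&
       (access & MAP_INVALIDATE_RANGE_BIT) &&
       bo_busy)
      return MAP_PATH_TEMPORARY_BO;

   if (access & MAP_UNSYNCHRONIZED_BIT)
      return MAP_PATH_UNSYNCHRONIZED;

   /* On non-LLC, persistent mappings go through the GTT even for reads. */
   if (!has_llc && (!(access & MAP_READ_BIT) ||
                    (access & MAP_PERSISTENT_BIT)))
      return MAP_PATH_GTT;

   return MAP_PATH_CPU;
}
```

For example, choose_map_path(MAP_READ_BIT | MAP_PERSISTENT_BIT, false, true) picks the GTT path in this model, while the same read mapping without MAP_PERSISTENT_BIT falls through to the CPU path.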
Using this in glamor, I got a 29.5361% +/- 2.74092% improvement in x11perf -aa10text (n=489).
---
 docs/GL3.txt                                     | 2 +-
 src/mesa/drivers/dri/i965/intel_buffer_objects.c | 9 +++++++--
 src/mesa/drivers/dri/i965/intel_extensions.c     | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 5261eda..45db98f 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -170,7 +170,7 @@ GL 4.4:
 
   GLSL 4.4                                             not started
   GL_MAX_VERTEX_ATTRIB_STRIDE                          not started
-  GL_ARB_buffer_storage                                DONE (r300, r600, radeonsi)
+  GL_ARB_buffer_storage                                DONE (i965, r300, r600, radeonsi)
   GL_ARB_clear_texture                                 not started
   GL_ARB_enhanced_layouts                              not started
   GL_ARB_multi_bind                                    started (Fredrik Höglund)
diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index 260308a..96dacde 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -401,8 +401,12 @@ intel_bufferobj_map_range(struct gl_context * ctx,
     * doesn't require the current contents of that range, make a new
     * BO, and we'll copy what they put in there out at unmap or
     * FlushRange time.
+    *
+    * That is, unless they're looking for a persistent mapping -- we would
+    * need to do blits in the MemoryBarrier call, and it's easier to just do a
+    * GPU stall and do a mapping.
     */
-   if (!(access & GL_MAP_UNSYNCHRONIZED_BIT) &&
+   if (!(access & (GL_MAP_UNSYNCHRONIZED_BIT | GL_MAP_PERSISTENT_BIT)) &&
        (access & GL_MAP_INVALIDATE_RANGE_BIT) &&
        drm_intel_bo_busy(intel_obj->buffer)) {
       /* Ensure that the base alignment of the allocation meets the alignment
@@ -429,7 +433,8 @@ intel_bufferobj_map_range(struct gl_context * ctx,
 
    if (access & GL_MAP_UNSYNCHRONIZED_BIT)
       drm_intel_gem_bo_map_unsynchronized(intel_obj->buffer);
-   else if (!brw->has_llc && !(access & GL_MAP_READ_BIT)) {
+   else if (!brw->has_llc && (!(access & GL_MAP_READ_BIT) ||
+                              (access & GL_MAP_PERSISTENT_BIT))) {
       drm_intel_gem_bo_map_gtt(intel_obj->buffer);
       intel_bufferobj_mark_inactive(intel_obj);
    } else {
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index ef9aa55..860896a 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -162,6 +162,7 @@ intelInitExtensions(struct gl_context *ctx)
 
    assert(brw->gen >= 4);
 
+   ctx->Extensions.ARB_buffer_storage = true;
    ctx->Extensions.ARB_depth_buffer_float = true;
    ctx->Extensions.ARB_depth_clamp = true;
    ctx->Extensions.ARB_depth_texture = true;
-- 
1.9.rc1