Quoting Kenneth Graunke (2017-07-22 00:17:47)
> The non-LLC story was a horror show. We uploaded data via pwrite
> (drm_intel_bo_subdata), which would stall if the cache BO was in
> use (being read) by the GPU. Obviously, we wanted to avoid that.
> So, we tried to detect whether the buffer was busy, and if so, we'd
> allocate a new BO, map the old one read-only (hopefully not stalling),
> copy all shaders compiled since the dawn of time to the new buffer,
> upload our new one, toss the old BO, and let the state upload code
> know that our program cache BO changed. This was a lot of extra data
> copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
> STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.
>
> Not only that, but our rudimentary busy tracking consisted of a flag
> set at execbuf time, and not cleared until we threw out the program
> cache BO. So, the first shader upload after any drawing would hit this
> "abandon the cache and start over" copying path.
>
> This is largely unnecessary - it's just ancient and crufty code. We can
> use the same persistent mapping paths on all platforms. On non-ancient
> kernels, this will use a write combining map, which should be reasonably
> fast.
>
> One aspect that is worse: we do occasionally grow the program cache BO,
> and copy the old contents to the newer BO. This will suffer from UC
> readback performance now. To mitigate this, we use the MOVNTDQA based
> streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms). Gen4-5
> are unfortunately going to be penalized.
>
> v2: Add MOVNTDQA path, rebase on other map flag changes.

Don't forget cache->bo_used_by_gpu!

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 2acebaa820..a4a04daaa9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -373,7 +373,6 @@ struct brw_cache {
    GLuint size, n_items;
 
    uint32_t next_offset;
-   bool bo_used_by_gpu;
 };
 
 /* Considered adding a member to this struct to document which flags
diff --git a/src/mesa/drivers/dri/i965/brw_program_cache.c b/src/mesa/drivers/dri/i965/brw_program_cache.c
index 8d83af71d3..4dcfd5234d 100644
--- a/src/mesa/drivers/dri/i965/brw_program_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_program_cache.c
@@ -238,7 +238,6 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t new_size)
    brw_bo_unreference(cache->bo);
    cache->bo = new_bo;
    cache->map = map;
-   cache->bo_used_by_gpu = false;
 
    /* Since we have a new BO in place, we need to signal the units
     * that depend on it (state base address on gen5+, or unit state before).
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 4461a59b80..e2f208a3d1 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -465,12 +465,6 @@ brw_finish_batch(struct brw_context *brw)
                                           PIPE_CONTROL_CS_STALL);
       }
    }
-
-   /* Mark that the current program cache BO has been used by the GPU.
-    * It will be reallocated if we need to put new programs in for the
-    * next batch.
-    */
-   brw->cache.bo_used_by_gpu = true;
 }
 
 static void
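
For anyone skimming the thread, here is a toy model of the flag-based busy tracking the quoted commit message describes, and of why the leftover flag can simply go away. It is only a sketch under stated assumptions: `toy_cache`, `toy_upload`, `toy_submit_batch` and `toy_count_reallocs` are hypothetical names, a heap buffer stands in for the mapped BO, and nothing here is the actual i965 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the program cache; NOT the real struct brw_cache. */
struct toy_cache {
   uint8_t *map;           /* heap buffer standing in for the mapped BO */
   uint32_t size;
   uint32_t next_offset;
   bool used_by_gpu;       /* old scheme only: set at submit, cleared on realloc */
   unsigned realloc_count; /* how often everything was copied to a new buffer */
};

/* Replace the buffer, copying every program uploaded so far. */
static void
toy_cache_new_buffer(struct toy_cache *c, uint32_t new_size)
{
   uint8_t *new_map = calloc(1, new_size);
   assert(new_map != NULL && new_size >= c->next_offset);
   memcpy(new_map, c->map, c->next_offset);
   free(c->map);
   c->map = new_map;
   c->size = new_size;
   c->used_by_gpu = false;
   c->realloc_count++;
}

/* Stand-in for batch submission: the old scheme marked the cache busy here. */
static void
toy_submit_batch(struct toy_cache *c)
{
   c->used_by_gpu = true;
}

/* Append one program. With old_scheme set, a stale busy flag forces a
 * full copy into a fresh buffer even when there is plenty of room left.
 */
static uint32_t
toy_upload(struct toy_cache *c, const void *data, uint32_t len, bool old_scheme)
{
   if (c->next_offset + len > c->size) {
      uint32_t new_size = c->size * 2;
      while (new_size < c->next_offset + len)
         new_size *= 2;
      toy_cache_new_buffer(c, new_size);   /* genuine growth, both schemes */
   } else if (old_scheme && c->used_by_gpu) {
      toy_cache_new_buffer(c, c->size);    /* "abandon the cache" path */
   }

   uint32_t offset = c->next_offset;
   memcpy(c->map + offset, data, len);
   c->next_offset += len;
   return offset;
}

/* Alternate upload and submit `rounds` times; report how many reallocations
 * happened. The 1024-byte cache is large enough that growth never triggers
 * for small round counts.
 */
static unsigned
toy_count_reallocs(bool old_scheme, unsigned rounds)
{
   struct toy_cache c = { .map = calloc(1, 1024), .size = 1024 };
   const uint8_t program[16] = { 0 };
   assert(c.map != NULL);

   for (unsigned i = 0; i < rounds; i++) {
      toy_upload(&c, program, sizeof(program), old_scheme);
      toy_submit_batch(&c);
   }

   unsigned count = c.realloc_count;
   free(c.map);
   return count;
}
```

In this model, every upload that follows a submit reallocates under the old scheme, while the persistent-mapping scheme only reallocates when the buffer actually has to grow. Once nothing reads the flag, the remaining writes to it are dead code, which is all the diff above removes.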