On Fri, Sep 26, 2014 at 2:21 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On Friday, September 26, 2014 09:22:31 AM Kristian Høgsberg wrote:
>> On Fri, Aug 29, 2014 at 11:10:50PM -0700, Kenneth Graunke wrote:
>> > There's no reason to stall on pwrite - the CPU always appends to the
>> > buffer and never modifies existing contents, and the GPU never writes
>> > it.  Further, the CPU always appends new data before submitting a batch
>> > that requires it.
>> >
>> > This code predates the unsynchronized mapping feature, so we simply
>> > didn't have the option when it was written.
>> >
>> > Ideally, we would do this for non-LLC platforms too, but unsynchronized
>> > mapping support only exists for LLC systems.
>> >
>> > Saves repeated 0.001ms stalls on program upload.
>> >
>> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_state_cache.c | 34 +++++++++++++++++++++++------
>> >  1 file changed, 27 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c b/src/mesa/drivers/dri/i965/brw_state_cache.c
>> > index b9bb0fc..1d2d32f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_state_cache.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_state_cache.c
>> > @@ -172,14 +172,23 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t new_size)
>> >     drm_intel_bo *new_bo;
>> >
>> >     new_bo = drm_intel_bo_alloc(brw->bufmgr, "program cache", new_size, 64);
>> > +   if (brw->has_llc)
>> > +      drm_intel_gem_bo_map_unsynchronized(new_bo);
>> >
>> >     /* Copy any existing data that needs to be saved. */
>> >     if (cache->next_offset != 0) {
>> > -      brw_bo_map(brw, cache->bo, false, "program cache");
>> > -      drm_intel_bo_subdata(new_bo, 0, cache->next_offset, cache->bo->virtual);
>> > -      drm_intel_bo_unmap(cache->bo);
>> > +      if (brw->has_llc) {
>> > +         memcpy(new_bo->virtual, cache->bo->virtual, cache->next_offset);
>>
>> Move the drm_intel_gem_bo_map_unsynchronized() and drm_intel_bo_unmap()
>> calls into this block so they bracket the memcpy as for the subdata case
>> below?
>>
>> Other than that,
>>
>> Reviewed-by: Kristian Høgsberg <k...@bitplanet.net>
>
> That won't work---the point is to map new_bo, and leave it mapped...and unmap
> the old BO before throwing it away.  If I moved the map call into the if
> (cache->next_offset != 0) block, then the initial mapping would never occur.

Yup, that makes sense.

Kristian

>> > +      } else {
>> > +         brw_bo_map(brw, cache->bo, false, "program cache");
>> > +         drm_intel_bo_subdata(new_bo, 0, cache->next_offset,
>> > +                              cache->bo->virtual);
>> > +         drm_intel_bo_unmap(cache->bo);
>> > +      }
>> >     }
>> >
>> > +   if (brw->has_llc)
>> > +      drm_intel_bo_unmap(cache->bo);
>> >     drm_intel_bo_unreference(cache->bo);
>> >     cache->bo = new_bo;
>> >     cache->bo_used_by_gpu = false;
>> > @@ -222,9 +231,11 @@ brw_try_upload_using_copy(struct brw_cache *cache,
>> >           continue;
>> >        }
>> >
>> > -      brw_bo_map(brw, cache->bo, false, "program cache");
>> > +      if (!brw->has_llc)
>> > +         brw_bo_map(brw, cache->bo, false, "program cache");
>> >        ret = memcmp(cache->bo->virtual + item->offset, data, item->size);
>> > -      drm_intel_bo_unmap(cache->bo);
>> > +      if (!brw->has_llc)
>> > +         drm_intel_bo_unmap(cache->bo);
>> >        if (ret)
>> >           continue;
>> >
>> > @@ -257,7 +268,7 @@ brw_upload_item_data(struct brw_cache *cache,
>> >     /* If we would block on writing to an in-use program BO, just
>> >      * recreate it.
>> >      */
>> > -   if (cache->bo_used_by_gpu) {
>> > +   if (!brw->has_llc && cache->bo_used_by_gpu) {
>> >        perf_debug("Copying busy program cache buffer.\n");
>> >        brw_cache_new_bo(cache, cache->bo->size);
>> >     }
>> > @@ -280,6 +291,7 @@ brw_upload_cache(struct brw_cache *cache,
>> >                   uint32_t *out_offset,
>> >                   void *out_aux)
>> >  {
>> > +   struct brw_context *brw = cache->brw;
>> >     struct brw_cache_item *item = CALLOC_STRUCT(brw_cache_item);
>> >     GLuint hash;
>> >     void *tmp;
>> > @@ -320,7 +332,11 @@ brw_upload_cache(struct brw_cache *cache,
>> >     cache->n_items++;
>> >
>> >     /* Copy data to the buffer */
>> > -   drm_intel_bo_subdata(cache->bo, item->offset, data_size, data);
>> > +   if (brw->has_llc) {
>> > +      memcpy((char *) cache->bo->virtual + item->offset, data, data_size);
>> > +   } else {
>> > +      drm_intel_bo_subdata(cache->bo, item->offset, data_size, data);
>> > +   }
>> >
>> >     *out_offset = item->offset;
>> >     *(void **)out_aux = (void *)((char *)item->key + item->key_size);
>> > @@ -342,6 +358,8 @@ brw_init_caches(struct brw_context *brw)
>> >     cache->bo = drm_intel_bo_alloc(brw->bufmgr,
>> >                                    "program cache",
>> >                                    4096, 64);
>> > +   if (brw->has_llc)
>> > +      drm_intel_gem_bo_map_unsynchronized(cache->bo);
>> >
>> >     cache->aux_compare[BRW_VS_PROG] = brw_vs_prog_data_compare;
>> >     cache->aux_compare[BRW_GS_PROG] = brw_gs_prog_data_compare;
>> > @@ -408,6 +426,8 @@ brw_destroy_cache(struct brw_context *brw, struct brw_cache *cache)
>> >
>> >     DBG("%s\n", __FUNCTION__);
>> >
>> > +   if (brw->has_llc)
>> > +      drm_intel_bo_unmap(cache->bo);
>> >     drm_intel_bo_unreference(cache->bo);
>> >     cache->bo = NULL;
>> >     brw_clear_cache(brw, cache);
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
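The pattern the patch relies on can be sketched without any driver code: the program cache buffer stays persistently mapped, data is only ever appended (never rewritten), and growing the buffer means allocating a larger one, memcpy'ing the saved contents, and retiring the old mapping. Below is a minimal, self-contained illustration of that access pattern; the `toy_cache_*` names are invented for this sketch, and plain `malloc` memory stands in for the unsynchronized GEM mapping (`bo->virtual`) the real driver uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a persistently mapped program-cache buffer.  In the
 * driver, `virt` would be the CPU address from an unsynchronized GEM
 * mapping; here it is just heap memory. */
struct toy_cache {
   char *virt;           /* persistent CPU "mapping" of the buffer */
   uint32_t size;        /* total buffer size */
   uint32_t next_offset; /* append point; existing data is never modified */
};

/* Grow the cache: allocate a new buffer, copy the saved contents with a
 * plain memcpy (the LLC path in the patch), then free the old buffer
 * (analogous to unmapping and unreferencing the old BO). */
static void toy_cache_new_buffer(struct toy_cache *cache, uint32_t new_size)
{
   char *new_virt = malloc(new_size);
   if (cache->next_offset != 0)
      memcpy(new_virt, cache->virt, cache->next_offset);
   free(cache->virt);
   cache->virt = new_virt;
   cache->size = new_size;
}

/* Append-only upload: grow the buffer if needed, then memcpy into the
 * persistent mapping instead of issuing a (potentially stalling)
 * subdata/pwrite.  Returns the offset the data landed at. */
static uint32_t toy_cache_upload(struct toy_cache *cache,
                                 const void *data, uint32_t data_size)
{
   if (cache->next_offset + data_size > cache->size) {
      uint32_t new_size = cache->size;
      while (cache->next_offset + data_size > new_size)
         new_size *= 2;
      toy_cache_new_buffer(cache, new_size);
   }
   uint32_t offset = cache->next_offset;
   memcpy(cache->virt + offset, data, data_size);
   cache->next_offset += data_size;
   return offset;
}
```

Because writes only ever land past `next_offset` and a batch is always submitted after the data it needs has been appended, neither the grow-by-copy nor the append can race with a reader, which is what makes the unsynchronized (stall-free) mapping safe in the patch.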