On 03/26/2014 11:14 AM, Eric Anholt wrote: > Kenneth Graunke <kenn...@whitecape.org> writes: > >> On 03/25/2014 11:27 AM, Eric Anholt wrote: >>> The implementation kept a page-sized area for uploading data, and >>> uploaded chunks from that to a 64kb-sized streamed buffer. This wasted >>> cache footprint (and extra state tracking to do so) when we want to just >>> write our data into the buffer immediately. >>> >>> Instead, build it around an interface like brw_state_batch() that just >>> gets you a pointer to BO memory to upload your stuff immediately. >>> >>> Improves OpenArena on HSW by 1.62209% +/- 0.355299% (n=61) and on BYT by >>> 1.7916% +/- 0.415743% (n=31). >>> >>> v2: Rebase on Mesa master, drop old prototypes. Re-do performance >>> comparison on a kernel that doesn't punish CPU efficiency >>> improvements. >>> --- >>> src/mesa/drivers/dri/i965/brw_context.h | 5 +- >>> src/mesa/drivers/dri/i965/brw_draw_upload.c | 10 +- >>> src/mesa/drivers/dri/i965/intel_buffer_objects.h | 21 +-- >>> src/mesa/drivers/dri/i965/intel_upload.c | 167 >>> +++++++++-------------- >>> 4 files changed, 77 insertions(+), 126 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h >>> b/src/mesa/drivers/dri/i965/brw_context.h >>> index 04af5d0..e119682 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_context.h >>> +++ b/src/mesa/drivers/dri/i965/brw_context.h >>> @@ -1041,10 +1041,7 @@ struct brw_context >>> >>> struct { >>> drm_intel_bo *bo; >>> - GLuint offset; >>> - uint32_t buffer_len; >>> - uint32_t buffer_offset; >>> - char buffer[4096]; >>> + uint32_t next_offset; >>> } upload; >>> >>> /** >>> diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c >>> b/src/mesa/drivers/dri/i965/brw_draw_upload.c >>> index e261163..a579025 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c >>> +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c >>> @@ -381,21 +381,17 @@ copy_array_to_vbo_array(struct brw_context *brw, >>> const unsigned char *src = element->glarray->Ptr + min * src_stride; >>> int count = max - min + 1; >>> GLuint size = count * dst_stride; >>> + uint8_t *dst = intel_upload_space(brw, size, dst_stride, >>> + &buffer->bo, &buffer->offset); >>> >> >> It sure seems like these references will exist across batchbuffers, >> meaning we may hold on to the upload buffer containing our vertex data >> until we do copy_array_to_vbo_array on it again. That seems >> unfortunate. (Or, maybe I'm wrong?) > > copy_array_to_vbo_array() happens per draw call, so it won't persist > across batchbuffers.
Technically, I think it will persist across batches after the last draw call in a batch, and before the first in the second. But that's fine...we just need to avoid long-standing references which will keep it alive too long. I guess it's okay because brw_merge_inputs is called on every draw call, and does: for (i = 0; i < brw->vb.nr_buffers; i++) { drm_intel_bo_unreference(brw->vb.buffers[i].bo); brw->vb.buffers[i].bo = NULL; } and this is before switching out brw->vb.nr_buffers and brw->vb.buffers for the new ones...so we always free our old data. I think we're fine here after all. Thanks. --Ken
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev