On Tue, Oct 14, 2014 at 7:42 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was > stalling on nearly every glBufferSubData call, with very slightly > overlapping busy ranges. > > It turns out the draw upload code was accidentally including an extra > stride's worth of data in the vertex buffer size due to a simple > off-by-one error. We considered this extra bit of buffer space to be > busy (in use by the GPU), when it was actually idle. > > The new diagram should make it easier to understand the formula. It's > basically what I drew on paper when working through an actual > glDrawRangeElements call. > > Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel." > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > Cc: mesa-sta...@lists.freedesktop.org > --- > src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > No Piglit regressions on Haswell. This might help Dota 2 and Serious Sam 3 > as well, but I haven't checked. > > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > index 5a12439..6cb653c 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw) > offset = 0; > size = intel_buffer->Base.Size; > } else { > + /* Compute the size/amount of data referenced by the GPU. > + * If the data is interleaved, StrideB may be larger than > + * _ElementSize. As an example, assume we have 2 > interleaved > + * attributes A and B. The data is organized like this: > + * > + * Stride EltSize > + * _,,_ , > + * / \ / \ > + * A: --- --- --- --- --- --- > + * B: --- --- --- --- --- --- > + * > + * |===== 4 elts ======| (4-1) * Stride + EltSize > + * > + * max_index - min_index gives the number of elements that > + * will be referenced. Say we're drawing 4 elements. On
Hm... is that right? Intuitively, max_index - min_index + 1 gives the number of elements that will be referenced. Does the max already have the +1 built into it? -ilia > + * the first three, we need the full stride in order to get > + * to the next element. But on the last, we only want the > + * element size, since we don't actually read the other > + * interleaved vertex attributes stored beyond that. > + */ > offset = buffer->offset + min_index * buffer->stride; > - size = (buffer->stride * (max_index - min_index) + > + size = (buffer->stride * MAX2(max_index - min_index - 1, > 0) + > glarray->_ElementSize); > } > } > -- > 2.1.2 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev