On Tue, Oct 14, 2014 at 4:42 PM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was > stalling on nearly every glBufferSubData call, with very slightly > overlapping busy ranges. > > It turns out the draw upload code was accidentally including an extra > stride's worth of data in the vertex buffer size due to a simple > off-by-one error. We considered this extra bit of buffer space to be > busy (in use by the GPU), when it was actually idle. > Nice Catch! > > The new diagram should make it easier to understand the formula. It's > basically what I drew on paper when working through an actual > glDrawRangeElements call. > > Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel." > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > Cc: mesa-sta...@lists.freedesktop.org > --- > src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > No Piglit regressions on Haswell. This might help Dota 2 and Serious Sam 3 > as well, but I haven't checked. > > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > index 5a12439..6cb653c 100644 > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw) > offset = 0; > size = intel_buffer->Base.Size; > } else { > + /* Compute the size/amount of data referenced by the > GPU. > + * If the data is interleaved, StrideB may be larger > than > + * _ElementSize. As an example, assume we have 2 > interleaved > + * attributes A and B. The data is organized like this: > + * > + * Stride EltSize > + * _,,_ , > + * / \ / \ > + * A: --- --- --- --- --- --- > + * B: --- --- --- --- --- --- > + * > + * |===== 4 elts ======| (4-1) * Stride + EltSize > + * > + * max_index - min_index gives the number of elements > that > + * will be referenced. Say we're drawing 4 elements. > On > + * the first three, we need the full stride in order to > get > + * to the next element. But on the last, we only want > the > + * element size, since we don't actually read the other > + * interleaved vertex attributes stored beyond that. > + */ > offset = buffer->offset + min_index * buffer->stride; > - size = (buffer->stride * (max_index - min_index) + > + size = (buffer->stride * MAX2(max_index - min_index - > 1, 0) + > glarray->_ElementSize); > I'm not so sure about this new formula. Looking at the docs on glDrawRangeElements, the index range is inclusive on both ends meaning that the number of elements drawn is max_index - min_index + 1. If that's the case, then the old formula was correct. Are we using a half-open interval in mesa? > } > } > -- > 2.1.2 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev