On Wednesday, October 15, 2014 09:50:42 AM Jason Ekstrand wrote: > On Tue, Oct 14, 2014 at 4:42 PM, Kenneth Graunke <kenn...@whitecape.org> > wrote: > > > According to INTEL_DEBUG=perf, "Borderlands: The Pre-Sequel" was > > stalling on nearly every glBufferSubData call, with very slightly > > overlapping busy ranges. > > > > It turns out the draw upload code was accidentally including an extra > > stride's worth of data in the vertex buffer size due to a simple > > off-by-one error. We considered this extra bit of buffer space to be > > busy (in use by the GPU), when it was actually idle. > > > > Nice Catch! > > > > > > The new diagram should make it easier to understand the formula. It's > > basically what I drew on paper when working through an actual > > glDrawRangeElements call. > > > > Eliminates all glBufferSubData stalls in "Borderlands: The Pre-Sequel." > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > Cc: mesa-sta...@lists.freedesktop.org > > --- > > src/mesa/drivers/dri/i965/brw_draw_upload.c | 22 +++++++++++++++++++++- > > 1 file changed, 21 insertions(+), 1 deletion(-) > > > > No Piglit regressions on Haswell. This might help Dota 2 and Serious Sam 3 > > as well, but I haven't checked. > > > > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > > index 5a12439..6cb653c 100644 > > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > > @@ -486,8 +486,28 @@ brw_prepare_vertices(struct brw_context *brw) > > offset = 0; > > size = intel_buffer->Base.Size; > > } else { > > + /* Compute the size/amount of data referenced by the > > GPU. > > + * If the data is interleaved, StrideB may be larger > > than > > + * _ElementSize. As an example, assume we have 2 > > interleaved > > + * attributes A and B. The data is organized like this: > > + * > > + * Stride EltSize > > + * _,,_ , > > + * / \ / \ > > + * A: --- --- --- --- --- --- > > + * B: --- --- --- --- --- --- > > + * > > + * |===== 4 elts ======| (4-1) * Stride + EltSize > > + * > > + * max_index - min_index gives the number of elements > > that > > + * will be referenced. Say we're drawing 4 elements. > > On > > + * the first three, we need the full stride in order to > > get > > + * to the next element. But on the last, we only want > > the > > + * element size, since we don't actually read the other > > + * interleaved vertex attributes stored beyond that. > > + */ > > offset = buffer->offset + min_index * buffer->stride; > > - size = (buffer->stride * (max_index - min_index) + > > + size = (buffer->stride * MAX2(max_index - min_index - > > 1, 0) + > > glarray->_ElementSize); > > > > I'm not so sure about this new formula. Looking at the docs on > glDrawRangeElements, the index range is inclusive on both ends meaning that > the number of elements drawn is max_index - min_index + 1. If that's the > case, then the old formula was correct. Are we using a half-open interval > in mesa?
Yes, you're both right...I got tripped up by seeing (max_index - min_index) instead of (elements = max_index - min_index + 1) and then (elements - 1). Looking over the code, everything sure looks correct. Which leaves the question of why both Borderlands: TPS and Serious Sam 3 have all kinds of stalling with the current code, and work fine with no stalls with the patch. I wonder if the games are incorrectly translating DirectX's DrawIndexedPrimitive call, which takes MinIndex and NumVertices, where NumVertices is apparently supposed to be (Max-Min+1). I suppose I could ask...
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev