On 10/22/2013 04:30 AM, Marek Olšák wrote:
> On Fri, Oct 18, 2013 at 8:09 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
>> DrawTransformFeedback() needs to obtain the number of vertices written
>> to a particular stream during the last Begin/EndTransformFeedback block.
>> The new driver hook returns exactly that information.
>>
>> Gallium drivers already implement this functionality by passing the
>> transform feedback object to the drawing function. I prefer to avoid
>> this for two reasons:
>>
>> 1. Complexity:
>>
>>    Normally, the drawing function takes an array of _mesa_prim objects,
>>    each of which specifies a vertex count. If tfb_vertcount != NULL,
>>    however, there will only be one _mesa_prim object with an invalid
>>    vertex count (of 1), so it needs to be ignored.
>>
>>    Since the _mesa_prim pointers are const, you can't even override it to
>>    the proper value; you need to pass around extra "ignore that, here's
>>    the real count" parameters.
>>
>>    The drawing function is already terribly complicated, so I don't want to
>>    make it even more complicated.
>
> I don't understand this. Are you saying that the software emulation of
> the feature is always better because of complexity the real
> hardware-accelerated solution would have?

On Ivybridge hardware, I think that a GPU-only implementation of
DrawTransformFeedback would be very complicated, and probably less
efficient than this (extremely simple) software solution.

It might be possible to do a reasonable GPU-only implementation on
Haswell, but I haven't looked into the details yet. (See my reply to
Eric.)

At least for Ivybridge, I think I want this software path 100% of the
time. We may want to remove the stall on Haswell as a later optimization.

It sounds like for Gallium, you already have a decent GPU-only solution.
I tried to follow that code to understand how it works, and got lost
after jumping through around 5 files...which is probably just my poor
understanding of the Gallium architecture.

[snip]

>> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
>> index 1670409..11bb76a 100644
>> --- a/src/mesa/vbo/vbo_exec_array.c
>> +++ b/src/mesa/vbo/vbo_exec_array.c
>> @@ -1464,6 +1464,12 @@ vbo_draw_transform_feedback(struct gl_context *ctx, GLenum mode,
>>        return;
>>     }
>>
>> +   if (ctx->Driver.GetTransformFeedbackVertexCount) {
>> +      GLsizei n = ctx->Driver.GetTransformFeedbackVertexCount(ctx, obj, stream);
>> +      vbo_draw_arrays(ctx, mode, 0, n, numInstances, 0);
>> +      return;
>> +   }
>
> As you mentioned, the only issue is with primitive restart, so why is
> this done even if primitive restart is disabled? Drivers which will
> have to implement this just to make e.g. non-VBO vertex uploads work
> will suffer from the CPU-GPU synchronization this code forces.
>
> Marek

I hadn't thought about non-VBO vertex uploads. What does Gallium do in
that case? Has it just been broken this whole time?

I guess I figured drivers would either implement this hook, or do the
tfb_vertcount approach, but not both. Maybe that's a bad assumption.

--Ken
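
To make the two options in this exchange concrete, here is a minimal standalone sketch of the dispatch being discussed: an optional driver hook that returns the vertex count on the CPU, with a flag that restricts the hook to the primitive-restart case raised in the review question. Everything in it is a hypothetical stand-in (the mock_* types and functions, the only_when_restart flag, the PATH_* names); it is not Mesa code and does not reflect how any real driver is structured. Only the shape of the hook is modeled on GetTransformFeedbackVertexCount from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real Mesa structures. */
struct mock_tfb_object {
   int vertices_written[4];   /* per-stream counts from the last feedback block */
};

struct mock_context;

/* Optional hook, modeled on the GetTransformFeedbackVertexCount hook. */
typedef int (*get_tfb_vertex_count_fn)(struct mock_context *ctx,
                                       const struct mock_tfb_object *obj,
                                       unsigned stream);

struct mock_context {
   get_tfb_vertex_count_fn get_tfb_vertex_count; /* NULL if not implemented */
   bool primitive_restart;    /* is primitive restart currently enabled? */
   int last_draw_count;       /* count given to the last draw; -1 = left to the GPU */
   int cpu_readbacks;         /* how often the CPU had to read the count back */
};

enum draw_path { PATH_CPU_COUNT, PATH_GPU_COUNT };

/* Mock hook: a real driver would have to wait for the GPU here. */
int
mock_get_count(struct mock_context *ctx, const struct mock_tfb_object *obj,
               unsigned stream)
{
   ctx->cpu_readbacks++;
   return obj->vertices_written[stream];
}

/*
 * Choose how the draw obtains its vertex count.
 *
 * only_when_restart == false: use the hook whenever the driver provides it
 *                             (the behaviour of the hunk in the patch).
 * only_when_restart == true:  use the hook only while primitive restart is
 *                             enabled (the gating asked about in review).
 */
enum draw_path
mock_draw_transform_feedback(struct mock_context *ctx,
                             const struct mock_tfb_object *obj,
                             unsigned stream, bool only_when_restart)
{
   assert(ctx != NULL && obj != NULL && stream < 4);

   bool use_hook = ctx->get_tfb_vertex_count != NULL &&
                   (!only_when_restart || ctx->primitive_restart);

   if (use_hook) {
      /* Software path: fetch the count, then issue an ordinary draw. */
      ctx->last_draw_count = ctx->get_tfb_vertex_count(ctx, obj, stream);
      return PATH_CPU_COUNT;
   }

   /* Otherwise the feedback object is passed down and the count stays on the GPU. */
   ctx->last_draw_count = -1;
   return PATH_GPU_COUNT;
}
```

With only_when_restart set, a driver that provides the hook would pay for the readback only while primitive restart is enabled. Whether that gating is sufficient for a given driver is exactly the open question in this thread, so treat the flag as an illustration of the trade-off rather than a recommendation.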