On Wed, Nov 10, 2010 at 9:56 AM, Christian König <deathsim...@vodafone.de> wrote:
> I also started to profile the performance of the code a bit, as expected
> we spend far to much time deciding of how and where to draw something
> compared to really drawing something. Up to 50% of the whole cpu time is
> spend in gen_block_verts!!!
>
> My todo list now looks something like this:
> 1. Speed thinks up allot by using a different vertex buffer approach.
> 2. Implement different colour formats.
> 3. Maybe implement missing motion types (16x8 and dualprime)
> 4. Either do iDCT myself or relay on the work of somebody else.
> 5. Get drunk while watching an XvMC accelerated episode of Simpsons.

Keep in mind that the XvMC interface makes no guarantees about the order in which macroblocks are submitted. Also, we're already sorting macroblocks by I/P/B type in order to batch draw calls. Otherwise it would be possible to sort on x,y coords, using a sort that performs well on sorted/nearly sorted inputs, to take advantage of the fact that most clients always submit macroblocks in the obvious order. At the time I chose to sort on type to batch draw calls instead, and came up with the zero-block scheme to cut down on the per-frame data that needs to be generated.

You could use index buffers, but somehow I don't think that would be a win, especially on HW that doesn't actually do the index lookup in HW. If you have any better ideas they'd be welcome.

In the meantime, I suggest you check whether your vertex buffers are in system memory (preferably at least WC-ed if not cached); I don't recall spending that much time in gen_block_verts in Nouveau.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
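
For illustration only, the x,y alternative mentioned above could look roughly like the sketch below. This is not what the code currently does (it sorts by type). `struct mb_pos` and `sort_mbs_raster` are made-up names, "the obvious order" is assumed to mean raster order (rows first, then columns), and insertion sort is just one example of a sort that is cheap on sorted/nearly sorted input.

```c
#include <stddef.h>

/* Hypothetical stand-in for a submitted macroblock; only the fields
 * needed for ordering are shown. Not a real Mesa/XvMC type. */
struct mb_pos {
    unsigned x, y;   /* macroblock coordinates, in macroblock units */
};

/* Assumed "obvious" order: raster order, i.e. by row, then by column. */
static int mb_before(const struct mb_pos *a, const struct mb_pos *b)
{
    return a->y < b->y || (a->y == b->y && a->x < b->x);
}

/* Insertion sort: the work done is proportional to n plus the number of
 * out-of-order pairs, so input that is already in raster order costs one
 * comparison per element and no moves.
 * Returns the number of element moves, which is 0 for sorted input. */
static size_t sort_mbs_raster(struct mb_pos *mbs, size_t n)
{
    size_t moves = 0;

    for (size_t i = 1; i < n; ++i) {
        struct mb_pos key = mbs[i];
        size_t j = i;

        /* Shift larger elements right until key's slot is found. */
        while (j > 0 && mb_before(&key, &mbs[j - 1])) {
            mbs[j] = mbs[j - 1];
            --j;
            ++moves;
        }
        mbs[j] = key;
    }
    return moves;
}
```

A client that submits in raster order would hit the zero-move path every frame; one that submits in a scrambled order degrades toward quadratic time, which is the price of not being guaranteed any ordering by the interface.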