On Wed, Aug 31, 2016 at 7:00 PM, Michel Dänzer <mic...@daenzer.net> wrote:
> On 31/08/16 11:21 PM, Jason Ekstrand wrote:
> > On Aug 19, 2016 12:07 AM, "Michel Dänzer" <mic...@daenzer.net> wrote:
> >> From: Michel Dänzer <michel.daen...@amd.com>
> >>
> >> Always use 3 buffers when flipping. With only 2 buffers, we have to wait
> >> for a flip to complete (which takes non-0 time even with asynchronous
> >> flips) before we can start working on the next frame. We were previously
> >> only using 2 buffers for flipping if the X server supports asynchronous
> >> flips, even when we're not using asynchronous flips. This could result
> >> in bad performance (the referenced bug report is an extreme case, where
> >> the inter-frame stalls were preventing the GPU from reaching its maximum
> >> clocks).
> >
> > Sorry for the post-push review but I don't usually pay much attention to
> > the window system code. In any case, I believe you're doing your
> > counting wrong. When flipping with swapinterval=0, you need 4 buffers:
> >
> > 1. The buffer currently being scanned out (will be released at next vblank)
> > 2. The buffer X has queued for scanout but is waiting on vblank
>
> s/vblank/flip/g, since async flips may not wait for vblank, but yeah.
>
> > 3. The buffer the application has just submitted which X will queue next
> >    if it doesn't get another before the window closes.
> > 4. The buffer the application is using for rendering.
> >
> > With only 3, you get a stall during that window in which X has queued
> > another flip but we're waiting on vblank before the flip begins. Am I
> > missing something?
>
> Nothing, except maybe the paragraph below stating that I couldn't
> measure any benefit from using 4 buffers. :) I'm not exactly sure why,
> but I suspect it might be because even with just 3 buffers, the GPU can
> always work on at least one frame ahead of time.
>
> Also note that even before my change, we were only using 3 buffers when
> the X driver supports async flips (with swap interval 0; only 2 buffers
> with swap interval > 0).
>

Yes, because with async flips you don't have a buffer sitting queued in the
kernel waiting to be flipped which you can't cancel. That makes perfect
sense.

> That said, I'd be interested in hearing about any test cases where 4
> buffers provide a significant boost over 3.
>

A little history that may be useful: Quadbuffering was originally added for
DRI3+present here:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=f7a355556ef5fe23056299a77414f9ad8b5e5a1d

In Wayland, the change was made here:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=992a2dbba80aba35efe83202e1013bd6143f0dba

Unfortunately, neither of those specifies precise metrics. Eero's bug had
some very concrete numbers. Hopefully he can provide you with the details
you need for further analysis.

> >> I couldn't measure any performance boost using 4 buffers with flipping.
> >> Performance actually seemed to go down slightly, but that might have
> >> been just noise.
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
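
For readers following the counting argument in this thread, here is a minimal sketch of the worst-case accounting described above. It is not Mesa code. The names `Role`, `roles_in_flight`, `buffers_needed`, and `may_stall` are hypothetical, and the model assumes only what the thread states: one buffer per role in the numbered list, and no uncancellable kernel-queued buffer when async flips are used. It says nothing about measured performance, where no benefit from 4 buffers was observed.

```python
from enum import Enum


class Role(Enum):
    """Roles a buffer can occupy, following the numbered list in the thread."""
    SCANOUT = 1       # currently being scanned out
    QUEUED_FLIP = 2   # queued for scanout, waiting for the flip to happen
    PENDING = 3       # submitted by the application, X will queue it next
    RENDERING = 4     # the application is rendering into it


def roles_in_flight(async_flips):
    """Worst-case set of roles that must each hold a distinct buffer.

    Assumption taken from the thread: with async flips there is no buffer
    sitting queued in the kernel that cannot be cancelled.
    """
    roles = [Role.SCANOUT]
    if not async_flips:
        roles.append(Role.QUEUED_FLIP)
    roles.extend([Role.PENDING, Role.RENDERING])
    return roles


def buffers_needed(async_flips):
    """Buffers needed so the application always has a render target."""
    return len(roles_in_flight(async_flips))


def may_stall(num_buffers, async_flips):
    """True if, in this worst-case model, the application can run out of buffers."""
    return num_buffers < buffers_needed(async_flips)


if __name__ == "__main__":
    for async_flips in (False, True):
        for num_buffers in (2, 3, 4):
            print(f"async={async_flips} buffers={num_buffers} "
                  f"may_stall={may_stall(num_buffers, async_flips)}")
```

In this model, 3 buffers can stall only when flips are not async, which is the window described above. Whether that stall matters in practice is a separate question that the model does not answer.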
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev