Hi,

This patch series tries to improve amdgpu's below-the-range (BTR) behaviour with FreeSync, hopefully not only for my use case, but also for games etc.

Patch 1/4 adds a bit of debug output I found very useful, so maybe worth adding?
Patch 2/4 fixes a bug I found when reading over the FreeSync code.
Patches 3/4 and 4/4 are optimizations to improve stability in BTR.

My desired application of VRR for neuroscience/vision research is to control the timing of when frames show up onscreen, e.g., to show animations at different "unconventional" framerates, so I'm mostly interested in how well one can control the timing between successive OpenGL bufferswaps. This is a bit different from what a game wants to get out of VRR, and probably closer to what a movie player might want to do.

I spent quite a bit of time testing how FreeSync behaves when flipping at a rate below the display's VRR minimum refresh rate. For that, my own application submitted glXSwapBuffers() flip requests at different fps rates / time delays between successive flips.

The test script is for GNU/Octave + psychtoolbox-3, but in principle the C equivalent would be this pseudo-code:

    for (i = 0; i < n; i++) {
        // Wait for pending flip to complete, get pageflip timestamp t_last
        // of flip completion:
        glXWaitForSbcOML(...., &t_last[i],...);

        // Fetch some delay value until next flip tdelay[i]
        tdelay[i] = Some function of varying frame delay over samples.

        // Try to flip tdelay[i] secs after previous flip t_last[i]:
        t_next = t_last[i] + tdelay[i];
        clock_nanosleep(t_next);

        // Flip
        glXSwapBuffers(...);
    }

For tdelay[i] I used different test profiles, e.g., on a display with a VRR range from 30 Hz to 144 Hz ~ 7 msecs - 33 msecs:

    tdelay[i] = 0.050 // One flip each 50 msecs, want constant 20 fps.
    tdelay[i] = rand() // Some randomly chosen delay for each flip.
    tdelay[i] = 0.007 + some sin() sine profile of changing delays/fps.
    tdelay[i] = 0.007 + i * 0.001; linear increase in delay by 1 msec/flip, starting at 7 msecs.
    tdelay[i] = ... linear decrease by 1 msec/flip.. starting at 120 msecs.
    etc.
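
For readers who want something closer to compilable C, here is a rough sketch of the fully specified delay profiles from the list above, plus the "sleep until t_next" step. This is not the actual test script (that one is Octave + psychtoolbox-3). The function names, the 7 msec clamp on the decreasing ramp, the caller-supplied range of the random profile, and the choice of CLOCK_MONOTONIC are all assumptions made for illustration. The sine profile is left out because its amplitude and period are not given above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* All names below are hypothetical; delays are in seconds. */

/* One flip each 50 msecs, i.e. constant 20 fps. */
static double tdelay_constant(int i) { (void)i; return 0.050; }

/* Linear increase by 1 msec/flip, starting at 7 msecs. */
static double tdelay_ramp_up(int i) { return 0.007 + i * 0.001; }

/* Linear decrease by 1 msec/flip, starting at 120 msecs.
 * Clamping at 7 msecs is an assumption, not from the original test. */
static double tdelay_ramp_down(int i)
{
    double d = 0.120 - i * 0.001;
    return d < 0.007 ? 0.007 : d;
}

/* Random delay, uniform in [lo, hi]; the range is chosen by the caller. */
static double tdelay_random(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}

/* Convert a non-negative time in seconds to a struct timespec. */
static struct timespec secs_to_timespec(double secs)
{
    struct timespec ts;
    assert(secs >= 0.0);
    ts.tv_sec = (time_t)secs;
    ts.tv_nsec = (long)((secs - (double)ts.tv_sec) * 1e9);
    if (ts.tv_nsec > 999999999L)
        ts.tv_nsec = 999999999L;
    return ts;
}

/* Sleep until absolute time t_next (seconds), retrying if interrupted.
 * Assumes t_next is expressed on CLOCK_MONOTONIC; whether the flip
 * timestamps use that clock depends on the driver stack. */
static int sleep_until(double t_next)
{
    struct timespec ts = secs_to_timespec(t_next);
    int rc;
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
    } while (rc == EINTR);
    return rc;
}
```

In the loop above, this would be used roughly as sleep_until(t_last[i] + tdelay_ramp_up(i)) right before the swap. Sleeping to an absolute deadline (TIMER_ABSTIME) rather than for a relative duration avoids accumulating the time spent between reading t_last[i] and going to sleep.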

Then I plotted requested flip delays tdelay[] against actual flip delays (~ t_last[i+1] - t_last[i]) to see how well VRR can follow the requested fps.

Inside the VRR range ~ 7 msecs - 33 msecs, FreeSync behaved basically perfectly, with average errors of less than 0.1 msecs and jitter of less than 1 msec.

When going for tdelay values > 33 msecs, i.e., when low framerate compensation / BTR kicked in, my DCN-1 Raven Ridge APU behaved almost as well as within the VRR range (for reasonably smooth changes in fps).

When doing the same on DCE-8 and DCE-11 GPUs, BTR made much bigger errors between what was requested and what was measured.

Patch 3/4 helps avoid glitches on DCN when transitioning from the VRR range to below the minimum VRR range, and helps even more in avoiding glitches on DCE.

Patch 4/4 tries to improve behaviour on pre-DCE-12 hardware. It helps quite a lot when testing on DCE-8 and DCE-11, as described in the commit message. This makes sense, as the code has a TODO comment about the different hw behaviour of pre-DCE-12, mentioning that pre-DCE-12 hw only responds to programming different VTOTAL_MIN/MAX values with a lag of 1 frame. The patch tries to work around this hardware limitation, with some success. DCE-8/11 behaviour is still not as good as DCN-1 behaviour, but at least it is not totally useless for my type of application with this patchset.

I don't have a Vega discrete GPU with DCE-12, but I assume it would behave like DCN-1 if the comments in the code are correct? Therefore the patch switches BTR handling for < AMDGPU_FAMILY_AI vs. >= AMDGPU_FAMILY_AI.

Thanks,
-mario

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx