[All reviewed, sending for more acks.]

A small set of drm_syncobj optimisations which should make things a tiny bit
more efficient on the CPU side of things.

Improvement seems to be around 1.5% more FPS when observed with
"vkgears -present-mailbox" on a Steam Deck Plasma desktop, but I am reluctant
to make a definitive claim on the numbers since there is some run-to-run
variance. But, as suggested by Michel Dänzer, I did five ~100 second runs on
each kernel to be able to show the ministat analysis:

x before
+ after
[ministat ASCII histogram]

    N           Min           Max        Median           Avg        Stddev
x 135      21697.58     22809.467     22321.396     22307.707     198.75011
+ 118     22200.746      23277.09       22661.4     22671.442     192.10609
Difference at 95.0% confidence
	363.735 +/- 48.3345
	1.63054% +/- 0.216672%
	(Student's t, pooled s = 195.681)

Or when tested on Intel Alderlake, KDE Wayland:

x base
+ syncobj
[ministat ASCII histogram]

    N           Min           Max        Median           Avg        Stddev
x  55      7158.232      8058.753      7803.506     7754.5195     191.69526
+  55       7801.23      8272.271      8172.435     8150.6303     105.84085
Difference at 95.0% confidence
	396.111 +/- 57.8717
	5.10813% +/- 0.746296%
	(Student's t, pooled s = 154.838)

Scores may seem low, but I had to pin the CPU to a conservative frequency to
avoid some pretty strong thermal throttling causing wild swings within a run.
Nevertheless, the improvement is clearly shown here as well.

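For anyone wanting to sanity check the "Difference at 95.0% confidence" lines, the arithmetic can be redone from the N, Avg and Stddev columns alone. The sketch below is only an illustration: the function name `pooled_diff` is made up for this example, and the critical value of 1.960 is an assumption (the large-sample two-sided 95% value), not something stated in the results above.

```python
import math


def pooled_diff(n1, mean1, sd1, n2, mean2, sd2, t_crit=1.960):
    """Return (difference of means, CI half-width, pooled stddev).

    t_crit=1.960 is an assumed large-sample 95% critical value.
    """
    df = n1 + n2 - 2
    # Pooled standard deviation of the two samples.
    pooled_s = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    diff = mean2 - mean1
    # Half-width of the confidence interval for the difference of means.
    half_width = t_crit * pooled_s * math.sqrt(1.0 / n1 + 1.0 / n2)
    return diff, half_width, pooled_s


# Summary statistics copied from the two result tables.
deck = pooled_diff(135, 22307.707, 198.75011, 118, 22671.442, 192.10609)
adl = pooled_diff(55, 7754.5195, 191.69526, 55, 8150.6303, 105.84085)

for name, base_avg, (diff, hw, s) in (
    ("Steam Deck", 22307.707, deck),
    ("Alderlake", 7754.5195, adl),
):
    print(f"{name}: {diff:.3f} +/- {hw:.4f} "
          f"({100 * diff / base_avg:.3f}% +/- {100 * hw / base_avg:.3f}%), "
          f"pooled s = {s:.3f}")
```

The percentages are taken relative to the baseline ("x") average, which is how the relative difference lines above read.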
v2:
 * Implemented review feedback - see patch change logs.

v3:
 * Moved #define DRM_SYNCOBJ_FAST_PATH_ENTRIES one patch earlier for less churn.

v3.1:
 * Consolidated testing results.

Cc: Maíra Canal <mca...@igalia.com>

Tvrtko Ursulin (7):
  drm/syncobj: Remove unhelpful helper
  drm/syncobj: Do not allocate an array to store zeros when waiting
  drm/syncobj: Avoid one temporary allocation in drm_syncobj_array_find
  drm/syncobj: Use put_user in drm_syncobj_query_ioctl
  drm/syncobj: Avoid temporary allocation in drm_syncobj_timeline_signal_ioctl
  drm/syncobj: Add a fast path to drm_syncobj_array_wait_timeout
  drm/syncobj: Add a fast path to drm_syncobj_array_find

 drivers/gpu/drm/drm_syncobj.c | 286 ++++++++++++++++++----------------
 1 file changed, 154 insertions(+), 132 deletions(-)

-- 
2.48.0