On 28/03/2025 16:46, Tvrtko Ursulin wrote:
A small set of drm_syncobj optimisations which should make things a tiny bit
more efficient on the CPU side of things.
The improvement seems to be around 1.5%* more FPS when observed with "vkgears
-present-mailbox" on a Steam Deck Plasma desktop, but I am reluctant to make a
definitive claim on the numbers since there is some run-to-run variance. But, as
suggested by Michel Dänzer, I did five ~100 second runs on each kernel to be
able to show the ministat analysis:
x before
+ after
+------------------------------------------------------------+
| x + |
| x x + |
| x xx ++++ |
| x x xx x ++++ |
| x xx x xx x+ ++++ |
| xxxxx xxxxxx+ ++++ + + |
| xxxxxxx xxxxxx+x ++++ +++ |
| x xxxxxxxxxxx*xx+* x++++++++ ++ |
| x x xxxxxxxxxxxx**x*+*+*++++++++ ++++ + |
| xx x xxxxxxxxxx*x****+***+**+++++ ++++++ |
|x xxx x xxxxx*x****x***********+*++**+++++++ + + +|
| |_______A______| |
| |______A_______| |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x 135      21697.58     22809.467     22321.396     22307.707     198.75011
+ 118     22200.746      23277.09       22661.4     22671.442     192.10609
Difference at 95.0% confidence
363.735 +/- 48.3345
1.63054% +/- 0.216672%
(Student's t, pooled s = 195.681)
Intel Alderlake laptop, KDE Wayland:
x base
+ syncobj
+--------------------------------------------------------------+
| + |
| + + |
| + + |
| + ++ |
| ++ ++ |
| x ++ ++ |
| x x + ++ ++ |
| x xx xx x x +++++++ |
| x x xx xxx xxxx*xxx +++++++++ |
|x xx x x x xx xxxxxxxxxx*xxx****xxx +x+ ++++++++++|
| |__________A_M_______| |____A_M___| |
+--------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  55      7158.232      8058.753      7803.506     7754.5195     191.69526
+  55       7801.23      8272.271      8172.435     8150.6303     105.84085
Difference at 95.0% confidence
396.111 +/- 57.8717
5.10813% +/- 0.746296%
(Student's t, pooled s = 154.838)
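For anyone wanting to sanity check the "Difference at 95.0% confidence" lines, below is a rough sketch which recomputes them from the N, Avg and Stddev columns alone. The summarise helper name is made up for illustration, and the 1.96 multiplier is an assumption: it is the large-sample Student's t critical value, which seems to be what applies at these sample sizes, but it would be wrong for small N.

```sh
#!/bin/sh
# Recompute the pooled standard deviation and the 95% confidence
# interval of the difference between two sample means.
# Arguments: n1 avg1 sd1 n2 avg2 sd2 ("before" first, "after" second).
summarise() {
    awk -v n1="$1" -v a1="$2" -v s1="$3" \
        -v n2="$4" -v a2="$5" -v s2="$6" 'BEGIN {
        # Pooled standard deviation over both samples.
        pooled = sqrt(((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / (n1 + n2 - 2))
        diff = a2 - a1
        # Assumption: 1.96 is the large-sample t critical value for 95%.
        half = 1.96 * pooled * sqrt(1 / n1 + 1 / n2)
        printf "diff=%.3f +/- %.2f (%.2f%%), pooled s=%.2f\n",
               diff, half, 100 * diff / a1, pooled
    }'
}

summarise 135 22307.707 198.75011 118 22671.442 192.10609  # Steam Deck
summarise 55 7754.5195 191.69526 55 8150.6303 105.84085    # Alderlake laptop
```

Up to rounding, both lines should agree with the ministat figures quoted above.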
* Scores may seem low but I had to fix CPU freq to avoid some pretty
strong thermal throttling causing wild swings within a run.
Benchmarking script:
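#!/bin/sh

# Label for the kernel under test, used in the log file name.
base=$1

# Five runs, appending all output to one log per kernel.
for i in $(seq 5)
do
    timeout 60 vkgears -present-mailbox | tee -a "vkbench-$base.log"
    sleep 5
done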
Regards,
Tvrtko
v2:
* Implemented review feedback - see patch change logs.
v3:
* Moved #define DRM_SYNCOBJ_FAST_PATH_ENTRIES one patch earlier for less
  churn.
Cc: Maíra Canal <mca...@igalia.com>
Tvrtko Ursulin (7):
  drm/syncobj: Remove unhelpful helper
  drm/syncobj: Do not allocate an array to store zeros when waiting
  drm/syncobj: Avoid one temporary allocation in drm_syncobj_array_find
  drm/syncobj: Use put_user in drm_syncobj_query_ioctl
  drm/syncobj: Avoid temporary allocation in
    drm_syncobj_timeline_signal_ioctl
  drm/syncobj: Add a fast path to drm_syncobj_array_wait_timeout
  drm/syncobj: Add a fast path to drm_syncobj_array_find

 drivers/gpu/drm/drm_syncobj.c | 286 ++++++++++++++++++----------------
 1 file changed, 154 insertions(+), 132 deletions(-)