On 28/07/2025 10:28, Pierre-Eric Pelloux-Prayer wrote:
Le 24/07/2025 à 16:19, Tvrtko Ursulin a écrit :
GPUs generally don't implement preemption, and the DRM scheduler definitely
does not support it at the front-end scheduling level. This means
execution quanta can be quite long and are contr
On 24/07/2025 15:19, Tvrtko Ursulin wrote:
As a summary, the new scheduling algorithm is inspired by the original Linux
CFS, and so far no scheduling regressions have been found relative to FIFO.
There are improvements in fairness and scheduling of interactive clients when
running in parallel
On 02/06/2025 17:22, Michal Koutný wrote:
Hello.
Hi and apologies for the delay.
On Fri, May 02, 2025 at 01:32:53PM +0100, Tvrtko Ursulin
wrote:
From: Tvrtko Ursulin
Similar to CPU and IO scheduling, implement a concept of weight in the DRM
cgroup controller.
Individual drivers are
the
tree only containing runnable entities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_rq.c | 28 +---
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a
Now that the run queue to scheduler relationship is always 1:1 we can
embed it (the run queue) directly in the scheduler struct and save on
some allocation error handling code and such.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp
so rename it to just the tree.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 23 ++--
drivers/gpu/drm/scheduler/sched_entity.c | 29 +
drivers/gpu/drm/scheduler
way to look at this is that it is adding a little bit of limited
random round-robin behaviour to the fair scheduling algorithm.
Net effect is a significant improvement in the scheduling unit tests which
check the scheduling quality for the interactive client running in
parallel with GPU hogs.
S
.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
---
drivers/gpu/drm/scheduler/sched_entity.c | 28 ++--
drivers/gpu/drm/scheduler/sched_internal.h | 7 +-
drivers/gpu/drm/scheduler/sched_main.c
e has been calculated.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_entity.c | 39 +
drivers/gpu/drm/scheduler/sched_internal.h | 66 ++
drivers/gpu/drm/sche
the submission side of things.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_main.c | 13 ++---
1 file changed, 2 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/scheduler
Let's move all the code dealing with struct drm_sched_rq into a separate
compilation unit. The advantage is that sched_main.c is left with a clearer
set of responsibilities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu
ng for 9ms (effective 10% GPU load). Here we can see
the PCT and CPS reflecting real progress.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
Acked-by: Christian König
---
drivers/gpu/drm/scheduler/test
Move the code dealing with entities entering and exiting run queues to
helpers to logically separate it from jobs entering and exiting entities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler
to a single code path and remove a
bunch of code. The downside is that round-robin mode now needs to lock on
the job pop path, but that should not be visible.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_ent
(x100)=60
Every 100ms for the duration of the test, the test logs how many jobs each
client had completed, prefixed by the minimum, average and maximum numbers.
When finished, the overall average delta between max and min is output as a
rough indicator of scheduling fairness.
Signed-off-by: Tvrtko Ursulin
Cc
w Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
Cc: Michel Dänzer
Tvrtko Ursulin (12):
drm/sched: Add some scheduling quality unit tests
drm/sched: Add some more scheduling quality unit tests
drm/sched: Implement RR via FIFO
drm/sched: Consolidate entity run queue management
On 18/07/2025 10:41, Philipp Stanner wrote:
On Fri, 2025-07-18 at 10:35 +0100, Tvrtko Ursulin wrote:
On 18/07/2025 10:31, Philipp Stanner wrote:
On Fri, 2025-07-18 at 08:13 +0100, Tvrtko Ursulin wrote:
On 16/07/2025 21:44, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 11:46, Tvrtko Ursulin
On 18/07/2025 10:31, Philipp Stanner wrote:
On Fri, 2025-07-18 at 08:13 +0100, Tvrtko Ursulin wrote:
On 16/07/2025 21:44, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 11:46, Tvrtko Ursulin wrote:
On 16/07/2025 15:30, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 10:49, Tvrtko Ursulin wrote
On 16/07/2025 21:44, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 11:46, Tvrtko Ursulin wrote:
On 16/07/2025 15:30, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 10:49, Tvrtko Ursulin wrote:
On 16/07/2025 14:31, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 05:51, Tvrtko Ursulin wrote:
Currently
the code would rely on the TDR handler restarting
itself then it would fail to do that if the job arrived on the pending
list after the check.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Maíra Canal
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm
On 16/07/2025 15:30, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 10:49, Tvrtko Ursulin wrote:
On 16/07/2025 14:31, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 05:51, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock
first time
to see if there are any j
On 16/07/2025 14:00, Christian König wrote:
On 16.07.25 14:51, Tvrtko Ursulin wrote:
be disabled once GFX/SDMA is no longer active. In this particular
case there was a race condition somewhere in the internal handshaking
with SDMA which led to SDMA missing doorbells sometimes and not
On 16/07/2025 14:31, Maíra Canal wrote:
Hi Tvrtko,
On 16/07/25 05:51, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock first
time
to see if there are any jobs, free a single job, and then lock again to
decide whether to re-queue itself if there are m
On 11/07/2025 17:51, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 12:07 PM Tvrtko Ursulin
wrote:
On 11/07/2025 16:51, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 9:58 AM Tvrtko Ursulin
wrote:
On 11/07/2025 14:39, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin
we can simply add the signaled check and have it return the presence
of more jobs to be freed to the caller. That way the work item does not
have to lock the list again and repeat the signaled check.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Maíra Canal
Cc: Matt
On 12/07/2025 14:12, Maíra Canal wrote:
Hi Danilo,
On 7/11/25 16:22, Danilo Krummrich wrote:
On 7/11/25 9:08 PM, Maíra Canal wrote:
Hi Tvrtko,
On 11/07/25 12:09, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock
first time
to see if there are any j
On 11/07/2025 16:51, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 9:58 AM Tvrtko Ursulin
wrote:
On 11/07/2025 14:39, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin
wrote:
On 11/07/2025 13:45, Christian König wrote:
On 11.07.25 14:23, Tvrtko Ursulin wrote:
Commit
On 11/07/2025 14:04, Philipp Stanner wrote:
Late to the party; had overlooked that the discussion with Matt is
resolved. Some comments below
On Tue, 2025-07-08 at 13:20 +0100, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock first time
to see if there
we can simply add the signaled check and have it return the presence
of more jobs to be freed to the caller. That way the work item does not
have to lock the list again and repeat the signaled check.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
On 11/07/2025 14:39, Alex Deucher wrote:
On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin
wrote:
On 11/07/2025 13:45, Christian König wrote:
On 11.07.25 14:23, Tvrtko Ursulin wrote:
Commit
94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
added a workar
On 11/07/2025 13:45, Christian König wrote:
On 11.07.25 14:23, Tvrtko Ursulin wrote:
Commit
94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks")
added a workaround which disables GFXOFF for the duration of the job
submit stage (with a 100ms trailing hysteresis).
E
On 09/07/2025 18:22, Matthew Brost wrote:
On Wed, Jul 09, 2025 at 11:49:44AM +0100, Tvrtko Ursulin wrote:
On 09/07/2025 05:45, Matthew Brost wrote:
On Tue, Jul 08, 2025 at 01:20:32PM +0100, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock first time
 1402 1147 -255
jpeg_v1_0_decode_ring_emit_fence 1788 1507 -281
Total: Before=8949691, After=8929695, chg -0.22%
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 16 ++--
1 file changed, 10 insertions(+), 6
E_EWMA(latency, 6, 4)) -
Approximately 30 SDMA submissions per second, ewma average logged once
per second therefore significantly hides the worst case latency. Eg.
the real improvement in max submission latency is severely understated by
these numbers.
Signed-off-by: Tvrtko Ursulin
References
On 09/07/2025 05:45, Matthew Brost wrote:
On Tue, Jul 08, 2025 at 01:20:32PM +0100, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock first time
to see if there are any jobs, free a single job, and then lock again to
decide whether to re-queue itself
On 08/07/2025 14:02, Christian König wrote:
On 08.07.25 14:54, Tvrtko Ursulin wrote:
On 08/07/2025 13:37, Christian König wrote:
On 08.07.25 11:51, Tvrtko Ursulin wrote:
There is no reason to queue just a single job if scheduler can take more
and re-queue the worker to queue more.
That
d
Panthor
Reviewed-by: Christian Gmeiner # for Etnaviv
Reviewed-by: Frank Binns # for Imagination
Reviewed-by: Tvrtko Ursulin # for Sched
Reviewed-by: Maíra Canal # for v3d
---
Changes in v4:
- Add forgotten driver accel/amdxdna. (Me)
- Rephrase the "init to NULL" comments. (Tvrtk
On 08/07/2025 13:37, Christian König wrote:
On 08.07.25 11:51, Tvrtko Ursulin wrote:
There is no reason to queue just a single job if scheduler can take more
and re-queue the worker to queue more.
That's not correct. This was intentionally avoided.
If more than just the scheduler is
On 08/07/2025 13:19, Philipp Stanner wrote:
On Tue, 2025-07-08 at 10:51 +0100, Tvrtko Ursulin wrote:
There is no reason to queue just a single job if scheduler can take
more
and re-queue the worker to queue more. We can simply feed the
hardware
with as much as it can take in one go and
On 08/07/2025 12:31, Philipp Stanner wrote:
On Tue, 2025-07-08 at 10:51 +0100, Tvrtko Ursulin wrote:
Extract out two copies of the identical code to function epilogue to
make
it smaller and more readable.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew
On 08/07/2025 12:22, Philipp Stanner wrote:
On Tue, 2025-07-08 at 10:51 +0100, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock first
time
to see if there are any jobs, free a single job, and then lock again
to
decide whether to re-queue itself if th
Extract out two copies of the identical code to function epilogue to make
it smaller and more readable.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_main.c | 48 +++---
1
we can simply add the signaled check and have it return the presence
of more jobs to free to the caller. That way the work item does not have
to lock the list again and repeat the signaled check.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Phil
On 12/06/2025 11:44, Tvrtko Ursulin wrote:
Replace some allocate + copy_from_user patterns with dedicated helpers.
This shrinks the source code and is also good for security due to SLAB bucket
separation between the kernel and uapi.
Any takers for easy reviews?
Regards,
Tvrtko
Tvrtko
.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
---
drivers/gpu/drm/scheduler/sched_entity.c | 28 +
drivers/gpu/drm/scheduler/sched_internal.h | 70 +-
drivers/gpu/drm
should not be visible.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_entity.c | 46 --
drivers/gpu/drm/scheduler/sched_main.c | 76 ++--
include/drm
There is no reason to queue just a single job if the scheduler can take
more and re-queue the worker to queue more. We can simply feed the hardware
with as much as it can take in one go and hopefully win some latency.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc
There is no need to keep entities with no jobs in the tree, so let's remove
them once the last job is consumed. This keeps the tree smaller, which is
nicer and more efficient, as entities are removed and re-added on every
popped job.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo
the code would rely on the TDR handler restarting
itself then it would fail to do that if the job arrived on the pending
list after the check.
Also fix one stale comment while touching the function.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc
-lock on the job
free path". (Maira, Philipp)
Cc: Christian König
Cc: Danilo Krummrich
CC: Leo Liu
Cc: Matthew Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
Cc: Michel Dänzer
Tvrtko Ursulin (15):
drm/sched: Add some scheduling quality unit tests
drm/sched: Add some more
On 04/07/2025 15:18, Maíra Canal wrote:
Hi Tvrtko,
In general, LGTM, but I miss documentation for all the new structures
and functions that you implemented.
Okay, I added some kerneldoc locally.
Regards,
Tvrtko
On 23/06/25 09:27, Tvrtko Ursulin wrote:
To implement fair scheduling we
On 04/07/2025 14:51, Maíra Canal wrote:
Hi Tvrtko,
On 23/06/25 09:27, Tvrtko Ursulin wrote:
Move the code dealing with entities entering and exiting run queues to
helpers to logically separate it from jobs entering and exiting entities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc
On 04/07/2025 14:59, Philipp Stanner wrote:
On Fri, 2025-07-04 at 14:30 +0100, Tvrtko Ursulin wrote:
On 04/07/2025 13:56, Philipp Stanner wrote:
On Fri, 2025-07-04 at 09:29 -0300, Maíra Canal wrote:
Hi Tvrtko,
On 23/06/25 09:27, Tvrtko Ursulin wrote:
Currently the job free work item will
On 04/07/2025 14:32, Maíra Canal wrote:
Hi Tvrtko,
On 23/06/25 09:27, Tvrtko Ursulin wrote:
Extract out two copies of the identical code to function epilogue to make
it smaller and more readable.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc
On 04/07/2025 14:18, Maíra Canal wrote:
Hi Tvrtko,
On 23/06/25 09:27, Tvrtko Ursulin wrote:
Round-robin being the non-default policy, and it being unclear how much it
is used, we can notice that it can be implemented using the FIFO data
structures if we only invent a fake submit timestamp which is
On 04/07/2025 13:56, Philipp Stanner wrote:
On Fri, 2025-07-04 at 09:29 -0300, Maíra Canal wrote:
Hi Tvrtko,
On 23/06/25 09:27, Tvrtko Ursulin wrote:
Currently the job free work item will lock sched->job_list_lock
first time
to see if there are any jobs, free a single job, and then l
On 04/07/2025 13:59, Philipp Stanner wrote:
On Mon, 2025-06-23 at 13:27 +0100, Tvrtko Ursulin wrote:
Move work queue allocation into a helper for a more streamlined
function
body.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
On 24/06/2025 12:34, Sunil Khatri wrote:
Add support for a directory for each client-id, with root at the dri
level. Since the clients are unique and not related to just one single drm
device, it makes more sense to add all the client-based nodes with root as
dri.
Also create a symlink ba
)
Signed-off-by: Lin.Cao
Reviewed-by: Tvrtko Ursulin
Acked-by: Christian König
Looks good to me.
Regards,
Tvrtko
*From:* Koenig, Christian
*Sent:* Tuesday, June 24, 2025 21:11
*To:* Tvrtko Ursulin ; cao, lin ;
amd-g
ed and cause memleak.
Ouch I removed the wrong one. :( Probably misread kref_put as kref_read..
Reviewed-by: Tvrtko Ursulin
But is the SHA correct? I see it is dd64956685fa.
Which would mean adding:
Fixes: dd64956685fa ("drm/amdgpu: Remove duplicated "context still
alive" chec
On 23/06/2025 15:55, Christian König wrote:
On 23.06.25 15:07, Tvrtko Ursulin wrote:
On 23/06/2025 11:24, Khatri, Sunil wrote:
On 6/23/2025 2:58 PM, Tvrtko Ursulin wrote:
On 18/06/2025 14:47, Sunil Khatri wrote:
add support to add a directory for each client-id
with root at the dri
On 23/06/2025 11:24, Khatri, Sunil wrote:
On 6/23/2025 2:58 PM, Tvrtko Ursulin wrote:
On 18/06/2025 14:47, Sunil Khatri wrote:
add support to add a directory for each client-id
with root at the dri level. Since the clients are
unique and not just related to one single drm device,
so it
e has been calculated.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_entity.c | 29
drivers/gpu/drm/scheduler/sched_internal.h | 40 ++
drivers/gpu/drm/sche
should not be visible.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_entity.c | 45 --
drivers/gpu/drm/scheduler/sched_main.c | 76 ++--
include/drm
* Improve commit message justification for one patch. (Philipp)
* Add comment in drm_sched_alloc_wq. (Christian)
Cc: Christian König
Cc: Danilo Krummrich
CC: Leo Liu
Cc: Matthew Brost
Cc: Philipp Stanner
Cc: Pierre-Eric Pelloux-Prayer
Cc: Michel Dänzer
Tvrtko Ursulin (16):
drm/sched: Add some
the
tree only containing runnable entities.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_rq.c | 24 +---
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a
the submission side of things.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_main.c | 15 ++-
1 file changed, 2 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/scheduler
Move work queue allocation into a helper for a more streamlined function
body.
Signed-off-by: Tvrtko Ursulin
Cc: Christian König
Cc: Danilo Krummrich
Cc: Matthew Brost
Cc: Philipp Stanner
---
drivers/gpu/drm/scheduler/sched_main.c | 33 --
1 file changed, 20
On 18/06/2025 14:47, Sunil Khatri wrote:
Add support for a directory for each client-id, with root at the dri
level. Since the clients are unique and not related to just one single drm
device, it makes more sense to add all the client-based nodes with root as
dri.
Also create a symlink b
On 13/06/2025 08:15, Sunil Khatri wrote:
root@amd-X570-AORUS-ELITE:~# cat /sys/kernel/debug/dri/0/clients
command tgid dev master a uid magic name client-id
systemd-logind 1056 0 y y 0
and undo the
smarts from the tracpoints side of things. There is no functional change
since the rest is left in place. Later we can consider changing the
dma_fence_ops return types too, and handle all the individual drivers
which define them.
Signed-off-by: Tvrtko Ursulin
Fixes: 506aa8b02a8d (&quo
On 12/06/2025 18:49, Lucas De Marchi wrote:
On Tue, Jun 10, 2025 at 05:42:26PM +0100, Tvrtko Ursulin wrote:
Xe can free some of the data pointed to by the dma-fences it exports.
Most
notably the timeline name can get freed if userspace closes the
associated
submit queue. At the same time
Replace kvmalloc_array() + copy_from_user() with vmemdup_array_user() on
the fast path.
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 41 +
1 file
On 12/06/2025 08:21, Christian König wrote:
On 6/11/25 17:29, Tvrtko Ursulin wrote:
On 11/06/2025 15:21, Christian König wrote:
On 6/11/25 16:00, Tvrtko Ursulin wrote:
Running the Cyberpunk 2077 benchmark we can observe that the lookup helper
is relatively hot, but 97% of the calls are
Replace k(v)malloc_array() + copy_from_user() with (v)memdup_array_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 33 --
1 file changed, 10
Replace kmalloc_array() + copy_from_user() with memdup_array_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 19 +--
1 file changed, 5 insertions
Replace kzalloc() + copy_from_user() with memdup_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 14 +-
1 file changed, 5 insertions(+), 9 deletions
Replace some allocate + copy_from_user patterns with dedicated helpers.
This shrinks the source code and is also good for security due to SLAB bucket
separation between the kernel and uapi.
Tvrtko Ursulin (4):
drm/amdgpu: Use vmemdup_array_user in
amdgpu_bo_create_list_entry_array
drm
On 11/06/2025 15:21, Christian König wrote:
On 6/11/25 16:00, Tvrtko Ursulin wrote:
Running the Cyberpunk 2077 benchmark we can observe that the lookup helper
is relatively hot, but 97% of the calls are for a single object (~3%
for two points, and never more than three points). While a