On 09.04.25 at 12:28, Philipp Stanner wrote:
> On Fri, 2025-03-21 at 16:58 +0100, Christian König wrote:
>> Sometimes drivers need to be able to submit multiple jobs which depend on
>> each other to different schedulers at the same time, but using
>> drm_sched_job_add_dependency() can't fail any more after the first job is
>> initialized.
>>
>> This function preallocates memory for dependency slots so that no ENOMEM
>> can come later while adding dependencies.
>>
>> v2: rework implementation and documentation
>>
>> Signed-off-by: Christian König <christian.koe...@amd.com>
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 44 ++++++++++++++++++++++++--
>>  include/drm/gpu_scheduler.h            |  2 ++
>>  2 files changed, 43 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 4d4219fbe49d..ee3701f346b2 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -852,6 +852,39 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>>  }
>>  EXPORT_SYMBOL(drm_sched_job_arm);
>>
>> +/**
>> + * drm_sched_job_prealloc_dependency_slots - avoid ENOMEM on adding dependencies
>> + * @job: scheduler job where dependencies will be added
>> + * @num_deps: number of dependencies to preallocate slots for
>> + *
>> + * Sometimes drivers need to be able to submit multiple jobs which depend on
>> + * each other to different schedulers at the same time, but using
>> + * drm_sched_job_add_dependency() can't fail any more after the first job is
>> + * initialized.
>> + *
>> + * This function preallocates memory for dependency slots so that no ENOMEM can
>> + * come later while adding dependencies.
>> + *
>> + * Return:
>> + * 0 on success, or an error on failing to expand the array.
>> + */
>> +int drm_sched_job_prealloc_dependency_slots(struct drm_sched_job *job,
>> +					    unsigned int num_deps)
>> +{
>> +	u32 id = 0;
>> +	int ret;
>> +
>> +	while (num_deps--) {
>> +		ret = xa_alloc(&job->dependencies, &id, XA_ZERO_ENTRY,
>> +			       xa_limit_32b, GFP_KERNEL);
>
> I've had some time to re-read the xarray documentation and I think that
> this is what xa_reserve() was born for. The Book of
> Documentation/core-api/xarray.rst sayeth:
>
> "Sometimes you need to ensure that a subsequent call to xa_store()
> will not need to allocate memory. The xa_reserve() function
> will store a reserved entry at the indicated index. Users of the
> normal API will see this entry as containing ``NULL``."
>
> That's far better, this way we don't have to use that more or less
> xarray-internal flag.
Yeah, I have seen that as well.

The reason why I didn't follow this route was that I wasn't sure if I
then need to check for NULL entries while iterating over the XA. On top
of that, I couldn't figure out off hand how to determine the next free
index slot.

Have you found any example of how to use that? I mean, the documentation
could certainly be improved a bit.

Regards,
Christian.

>
>
>> +		if (ret != 0)
>> +			return ret;
>> +	}
>> +
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slots);
>> +
>>  /**
>>   * drm_sched_job_add_dependency - adds the fence as a job dependency
>>   * @job: scheduler job to add the dependencies to
>> @@ -878,10 +911,15 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>  	 * engines involved, rather than the number of BOs.
>>  	 */
>>  	xa_for_each(&job->dependencies, index, entry) {
>> -		if (entry->context != fence->context)
>> +		if (xa_is_zero(entry)) {
>> +			/*
>> +			 * Reserved entries must not alloc memory, but let's
>> +			 * use GFP_ATOMIC just to be on the defensive side.
>> +			 */
>> +			xa_store(&job->dependencies, index, fence, GFP_ATOMIC);
>
> And regarding this – it can actually never happen, but you provide
> ATOMIC just to be sure?
>
> I think it would be better if we'd just run into an obvious bug here
> instead, so like a deadlock with GFP_KERNEL.
>
> That's how we do it with pointers that cannot be NULL, too. If the
> impossible were to happen and it were NULL, we'd crash.
>
> P.
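[An illustrative aside on the open xa_reserve() question above. xa_reserve()
itself takes an explicit index, which is exactly the "next free index"
problem; for an allocating xarray, passing a NULL entry to xa_alloc() looks
like it does both jobs at once. This is a reading of __xa_alloc() in
lib/xarray.c (it appears to substitute XA_ZERO_ENTRY for a NULL entry), not
something established in this thread, and the sketch is untested:

	/*
	 * Hypothetical sketch, not part of the patch: preallocate dependency
	 * slots via the public API only. If __xa_alloc() really converts a
	 * NULL entry into XA_ZERO_ENTRY internally, this is equivalent to
	 * storing XA_ZERO_ENTRY explicitly as the patch above does.
	 */
	static int prealloc_slots_sketch(struct drm_sched_job *job,
					 unsigned int num_deps)
	{
		u32 id;
		int ret;

		while (num_deps--) {
			/* NULL entry: pick the next free index and reserve it */
			ret = xa_alloc(&job->dependencies, &id, NULL,
				       xa_limit_32b, GFP_KERNEL);
			if (ret)
				return ret;
		}

		return 0;
	}

Whether xa_for_each() later returns such reserved slots (and thus whether the
xa_is_zero() check in the patch is needed) would still have to be verified
against the xarray implementation.]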
>
>> +		} else if (entry->context != fence->context) {
>>  			continue;
>> -
>> -		if (dma_fence_is_later(fence, entry)) {
>> +		} else if (dma_fence_is_later(fence, entry)) {
>>  			dma_fence_put(entry);
>>  			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
>>  		} else {
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 1a7e377d4cbb..916e820b27ff 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -632,6 +632,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>  			    u32 credits, void *owner);
>>  void drm_sched_job_arm(struct drm_sched_job *job);
>>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>> +int drm_sched_job_prealloc_dependency_slots(struct drm_sched_job *job,
>> +					    unsigned int num_deps);
>>  int drm_sched_job_add_dependency(struct drm_sched_job *job,
>>  				 struct dma_fence *fence);
>>  int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
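[For context, a hypothetical driver-side use of the new helper, sketching the
two-phase submission the commit message describes: fail early while nothing is
committed, then add dependencies in the no-fail phase. foo_submit() and the
fences array are made up for illustration; only the drm_sched_job_* and
dma_fence_get() calls are existing API or introduced by this patch:

	static int foo_submit(struct drm_sched_job *job,
			      struct dma_fence **fences,
			      unsigned int num_fences)
	{
		unsigned int i;
		int ret;

		/* May fail with -ENOMEM, but nothing is committed yet. */
		ret = drm_sched_job_prealloc_dependency_slots(job, num_fences);
		if (ret)
			return ret;

		/*
		 * With the slots preallocated, adding the dependencies should
		 * no longer run into ENOMEM. add_dependency() consumes the
		 * fence reference, hence the dma_fence_get().
		 */
		for (i = 0; i < num_fences; i++) {
			ret = drm_sched_job_add_dependency(job,
						dma_fence_get(fences[i]));
			if (ret)
				return ret;
		}

		drm_sched_job_arm(job);
		drm_sched_entity_push_job(job);
		return 0;
	}
]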