On Fri, 2025-03-21 at 16:58 +0100, Christian König wrote:
> Sometimes drivers need to be able to submit multiple jobs which depend on
> each other to different schedulers at the same time, but using
> drm_sched_job_add_dependency() can't fail any more after the first job is
> initialized.
>
> This function preallocates memory for dependency slots so that no ENOMEM
> can come later while adding dependencies.
>
> v2: rework implementation and documentation
>
> Signed-off-by: Christian König <christian.koe...@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 44 ++++++++++++++++++++++++--
>  include/drm/gpu_scheduler.h            |  2 ++
>  2 files changed, 43 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 4d4219fbe49d..ee3701f346b2 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -852,6 +852,39 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>  }
>  EXPORT_SYMBOL(drm_sched_job_arm);
>
> +/**
> + * drm_sched_job_prealloc_dependency_slots - avoid ENOMEM on adding dependencies
> + * @job: scheduler job where dependencies will be added
> + * @num_deps: number of dependencies to preallocate slots for
> + *
> + * Sometimes drivers need to be able to submit multiple jobs which depend on
> + * each other to different schedulers at the same time, but using
> + * drm_sched_job_add_dependency() can't fail any more after the first job is
> + * initialized.
> + *
> + * This function preallocates memory for dependency slots so that no ENOMEM can
> + * come later while adding dependencies.
> + *
> + * Return:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_prealloc_dependency_slots(struct drm_sched_job *job,
> +					    unsigned int num_deps)
> +{
> +	u32 id = 0;
> +	int ret;
> +
> +	while (num_deps--) {
> +		ret = xa_alloc(&job->dependencies, &id, XA_ZERO_ENTRY,
> +			       xa_limit_32b, GFP_KERNEL);
I've had some time to re-read the xarray documentation, and I think this
is what xa_reserve() was born for. The Book of
Documentation/core-api/xarray.rst sayeth:

"Sometimes you need to ensure that a subsequent call to xa_store() will
not need to allocate memory. The xa_reserve() function will store a
reserved entry at the indicated index. Users of the normal API will see
this entry as containing ``NULL``."

That's far better; this way we don't have to use that more or less
xarray-internal flag.

> +		if (ret != 0)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slots);
> +
>  /**
>   * drm_sched_job_add_dependency - adds the fence as a job dependency
>   * @job: scheduler job to add the dependencies to
> @@ -878,10 +911,15 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
>  	 * engines involved, rather than the number of BOs.
>  	 */
>  	xa_for_each(&job->dependencies, index, entry) {
> -		if (entry->context != fence->context)
> +		if (xa_is_zero(entry)) {
> +			/*
> +			 * Reserved entries must not alloc memory, but let's
> +			 * use GFP_ATOMIC just to be on the defensive side.
> +			 */
> +			xa_store(&job->dependencies, index, fence, GFP_ATOMIC);

And regarding this: it can actually never happen, but you provide ATOMIC
just to be sure? I think it would be better if we'd just run into an
obvious bug here instead, e.g. a deadlock with GFP_KERNEL. That's how we
do it with pointers that cannot be NULL, too: if the impossible were to
happen and one were NULL, we'd crash.

P.
> +		} else if (entry->context != fence->context) {
>  			continue;
> -
> -		if (dma_fence_is_later(fence, entry)) {
> +		} else if (dma_fence_is_later(fence, entry)) {
>  			dma_fence_put(entry);
>  			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
>  		} else {
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 1a7e377d4cbb..916e820b27ff 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -632,6 +632,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  			u32 credits, void *owner);
>  void drm_sched_job_arm(struct drm_sched_job *job);
>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
> +int drm_sched_job_prealloc_dependency_slots(struct drm_sched_job *job,
> +					    unsigned int num_deps);
>  int drm_sched_job_add_dependency(struct drm_sched_job *job,
>  				 struct dma_fence *fence);
>  int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,