On Tue, 12 Sep 2023 15:34:41 +0200 Danilo Krummrich <d...@redhat.com> wrote:
> On Tue, Sep 12, 2023 at 03:27:05PM +0200, Boris Brezillon wrote: > > On Fri, 25 Aug 2023 15:45:49 +0200 > > Christian König <christian.koe...@amd.com> wrote: > > > > > >>>> I tried this patch with Nouveau and found a race condition: > > > >>>> > > > >>>> In drm_sched_run_job_work() the job is added to the pending_list via > > > >>>> drm_sched_job_begin(), then the run_job() callback is called and the > > > >>>> scheduled > > > >>>> fence is signaled. > > > >>>> > > > >>>> However, in parallel drm_sched_get_cleanup_job() might be called from > > > >>>> drm_sched_free_job_work(), which picks the first job from the > > > >>>> pending_list and > > > >>>> for the next job on the pending_list sets the scheduled fence' > > > >>>> timestamp field. > > > >> Well why can this happen in parallel? Either the work items are > > > >> scheduled to > > > >> a single threaded work queue or you have protected the pending list > > > >> with > > > >> some locks. > > > >> > > > > Xe uses a single-threaded work queue, Nouveau does not (desired > > > > behavior). > > > > > > > > The list of pending jobs is protected by a lock (safe), the race is: > > > > > > > > add job to pending list > > > > run_job > > > > signal scheduled fence > > > > > > > > dequeue from pending list > > > > free_job > > > > update timestamp > > > > > > > > Once a job is on the pending list its timestamp can be accessed which > > > > can blow up if scheduled fence isn't signaled or more specifically > > > > unless > > > > DMA_FENCE_FLAG_TIMESTAMP_BIT is set. > > > > I'm a bit lost. How can this lead to a NULL deref? Timestamp is a > > ktime_t embedded in dma_fence, and finished/scheduled are both > > dma_fence objects embedded in drm_sched_fence. So, unless > > {job,next_job}->s_fence is NULL, or {job,next_job} itself is NULL, I > > don't really see where the NULL deref is. If s_fence is NULL, that means > > drm_sched_job_init() wasn't called (unlikely to be detected that late), > > or ->free_job()/drm_sched_job_cleanup() was called while the job was > > still in the pending list. I don't really see a situation where job > > could NULL to be honest. > > I think the problem here was that a dma_fence' timestamp field is within a > union > together with it's cb_list list_head [1]. If a timestamp is set before the > fence > is actually signalled, dma_fence_signal_timestamp_locked() will access the > cb_list to run the particular callbacks registered to this dma_fence. However, > writing the timestap will overwrite this list_head since it's a union, hence > we'd try to dereference the timestamp while iterating the list. Ah, right. I didn't notice it was a union, thought it was a struct... > > [1] > https://elixir.bootlin.com/linux/latest/source/include/linux/dma-fence.h#L87 > > > > > While I agree that updating the timestamp before the fence has been > > flagged as signaled/timestamped is broken (timestamp will be > > overwritten when dma_fence_signal(scheduled) is called) I don't see a > > situation where it would cause a NULL/invalid pointer deref. So I > > suspect there's another race causing jobs to be cleaned up while > > they're still in the pending_list. > > > > > > > > Ah, that problem again. No that is actually quite harmless. > > > > > > You just need to double check if the DMA_FENCE_FLAG_TIMESTAMP_BIT is > > > already set and if it's not set don't do anything. > > >