Am 03.04.25 um 16:40 schrieb Philipp Stanner: >>>>> >>>> >>>> That looks like a really really awkward approach. The driver >>>> basically uses a the DMA fence infrastructure as middle layer and >>>> callbacks into itself to cleanup it's own structures. >>>> >>> >>> What else are callbacks good for, if not to do something >>> automatically >>> when the fence gets signaled? >>> >> >> Well if you add a callback for a signal you issued yourself then >> that's kind of awkward. >> >> E.g. you call into the DMA fence code, just for the DMA fence code >> to call yourself back again. > Now we're entering CS-Philosophy, because it depends on who "you" and > "yourself" are. In case of the driver, yes, naturally it registers a > callback because at some other place (e.g., in the driver's interrupt > handler) the fence will be signaled and the driver wants the callback > stuff to be done. > > If that's not dma_fences' callbacks' purpose, then I'd be interested in > knowing what their purpose is, because from my POV this discussion > seems to imply that we effectively must never use them for anything. > > How could it ever be different? Who, for example, registers dma_fence > callbacks while not signaling them "himself"?
Let me try to improve that explanation. First of all we have components, they can be drivers, frameworks, helpers like the DRM scheduler or generally any code which is more or less stand alone. The definition of components is a bit tricky, but in general people usually get what they mean. E.g. in this case here we have nouveau as single component. Now the DMA fence interface allows sending signals between different components in a standardized way, one component can send a signal to another one and they don't necessarily need to know anything from each other except that both are using the DMA fence framework in the documented manner. When a component is now both the provide and the consumer at the same time you actually need a reason for that. Could be for example that it wants to consume signals from both itself as well as others, but this doesn't apply for this use case here. Considering pool or billiard you can of course do a trick shot and try to hit the 8, but going straight for it just has a better chance to succeed. > > >> >> >>> >>>> >>>> Additional to that we don't guarantee any callback order for the >>>> DMA >>>> fence and so it can be that mix cleaning up the callback with >>>> other >>>> work which is certainly not good when you want to guarantee that >>>> the >>>> cleanup happens under the same lock. >>>> >>> >>> Isn't my perception correct that the primary issue you have with >>> this >>> approach is that dma_fence_put() is called from within the >>> callback? Or >>> do you also take issue with deleting from the list? >>> >> >> Well kind of both. The issue is that the caller of >> dma_fence_signal() or dma_fence_signal_locked() must hold the >> reference until the function returns. >> >> When you do the list cleanup and the drop inside the callback it is >> perfectly possible that the fence pointer becomes stale before you >> return and that's really not a good idea. > In other words, you would prefer if this patch would have a function > with my callback's code in a function, and that function would be > called at every place where the driver signals a fence? > > If that's your opinion, then, IOW, it would mean for us to go almost > back to status quo, with nouveau_fence_signal2.0, but with the > dma_fence_is_signaled() part fixed. Well it could potentially be cleaned up more, but as far as I can see only the two lines I pointed out in the other mail need to move at the right place, yes. I mean it's just two lines. If you open code that or if you make a nouveau_cleanup_list_ref() function (or similar) is perfectly up to you. Regards, Christian. > > > P. > >> >> >>> >>> >>> >>>> >>>> Instead the call to dma_fence_signal_locked() should probably be >>>> removed from nouveau_fence_signal() and into >>>> nouveau_fence_context_kill() and nouveau_fence_update(). >>>> >>>> This way nouveau_fence_is_signaled() can call this function as >>>> well. >>>> >>> >>> Which "this function"? dma_fence_signal_locked() >>> >> >> No the cleanup function for the list entry. Whatever you call that >> then, the name nouveau_fence_signal() is probably not appropriate any >> more. >> >> >>> >>> >>> >>>> >>>> BTW: nouveau_fence_no_signaling() looks completely broken as >>>> well. It >>>> calls nouveau_fence_is_signaled() and then list_del() on the >>>> fence >>>> head. >>>> >>> >>> I can assure you that a great many things in Nouveau look >>> completely >>> broken. >>> >>> The question for us is always the cost-benefit-ratio when fixing >>> bugs. >>> There are fixes that solve the bug with reasonable effort, and >>> there >>> are great reworks towards an ideal state. >>> >> >> I would just simply drop that function. As far as I can see it >> severs no purpose other than doing exactly what the common DMA fence >> code does anyway. >> >> Just one less thing which could fail. >> >> Christian. >> >> >>> >>> >>> P. >>> >>> >>> >>>> >>>> As far as I can see that is completely superfluous and should >>>> probably be dropped. IIRC I once had a patch to clean that up but >>>> it >>>> was dropped for some reason. >>>> >>>> Regards, >>>> Christian. >>>> >>>> >>>> >>>>> >>>>> + dma_fence_put(&fence->base); >>>>> + if (ret) >>>>> + return ret; >>>>> + >>>>> ret = fctx->emit(fence); >>>>> if (!ret) { >>>>> dma_fence_get(&fence->base); >>>>> @@ -246,8 +251,7 @@ nouveau_fence_emit(struct nouveau_fence >>>>> *fence) >>>>> return -ENODEV; >>>>> } >>>>> >>>>> - if (nouveau_fence_update(chan, fctx)) >>>>> - nvif_event_block(&fctx->event); >>>>> + nouveau_fence_update(chan, fctx); >>>>> >>>>> list_add_tail(&fence->head, &fctx->pending); >>>>> spin_unlock_irq(&fctx->lock); >>>>> @@ -270,8 +274,8 @@ nouveau_fence_done(struct nouveau_fence >>>>> *fence) >>>>> >>>>> spin_lock_irqsave(&fctx->lock, flags); >>>>> chan = rcu_dereference_protected(fence- >>>>>> channel, >>>>> lockdep_is_held(&fctx->lock)); >>>>> - if (chan && nouveau_fence_update(chan, fctx)) >>>>> - nvif_event_block(&fctx->event); >>>>> + if (chan) >>>>> + nouveau_fence_update(chan, fctx); >>>>> spin_unlock_irqrestore(&fctx->lock, flags); >>>>> } >>>>> return dma_fence_is_signaled(&fence->base); >>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h >>>>> b/drivers/gpu/drm/nouveau/nouveau_fence.h >>>>> index 8bc065acfe35..e6b2df7fdc42 100644 >>>>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.h >>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.h >>>>> @@ -10,6 +10,7 @@ struct nouveau_bo; >>>>> >>>>> struct nouveau_fence { >>>>> struct dma_fence base; >>>>> + struct dma_fence_cb cb; >>>>> >>>>> struct list_head head; >>>>> >>>>> >>>> >>> >> >>