On 19.08.25 11:04, Janusz Krzysztofik wrote:
> On Monday, 18 August 2025 16:42:56 CEST Christian König wrote:
>> On 18.08.25 16:30, Janusz Krzysztofik wrote:
>>> Hi Christian,
>>>
>>> On Thursday, 14 August 2025 14:24:29 CEST Christian König wrote:
>>>>
>>>> On 14.08.25 10:16, Janusz Krzysztofik wrote:
>>>>> When the first user starts waiting on a not yet signaled fence of a
>>>>> chain link, a dma_fence_chain callback is added to the user fence of
>>>>> that link.  When the user fence of that chain link is then signaled,
>>>>> the chain is traversed in search of the first unsignaled link, and the
>>>>> callback is rearmed on the user fence of that link.
>>>>>
>>>>> Since chain fences may be exposed to user space, e.g. over drm_syncobj
>>>>> IOCTLs, users may start waiting on any link of the chain.  Then, many
>>>>> links of the chain may have signaling enabled, with their callbacks
>>>>> added to their user fences.  Once an arbitrary user fence is signaled,
>>>>> all dma_fence_chain callbacks added to it so far must be rearmed to
>>>>> another user fence of the chain.  In extreme scenarios, when all N links
>>>>> of a chain are awaited and then signaled in reverse order, the
>>>>> dma_fence_chain callback may be called up to N * (N + 1) / 2 times (an
>>>>> arithmetic series).
>>>>>
>>>>> To avoid that potential excessive accumulation of dma_fence_chain
>>>>> callbacks, rearm a trimmed-down, signal-only version of the callback on
>>>>> the base fence of the previous link, if not yet signaled; otherwise,
>>>>> just signal the base fence of the current link, instead of traversing
>>>>> the chain in search of the first unsignaled link and moving all
>>>>> callbacks collected so far to the user fence of that link.
>>>>
>>>> Well clear NAK to that! You can easily overflow the kernel stack with that!
>>>
>>> I'll be happy to propose a better solution, but for that I need to
>>> understand your message better.  Could you please point out the exact
>>> piece of the proposed code, and/or describe a scenario, where you see
>>> the risk of stack overflow?
>>
>> The sentence "rearm .. to the base fence of a previous link" sounds like
>> you are trying to install a callback on the signaling of the previous
>> chain element.
>>
>> That is exactly what I pointed out previously: you need to be super
>> careful here, because when this chain signals, the callbacks will execute
>> recursively, which means that you can trivially overflow the kernel stack
>> if you have more than a handful of chain elements.
>>
>> In other words A waits for B, B waits for C, C waits for D etc.... when D 
>> finally signals it will call C which in turn calls B which in turn calls A.
> 
> OK, maybe my commit description was not precise enough; however, I
> intentionally didn't describe the implementation details (the how).
> When D signals, it doesn't call C directly; instead, it submits an irq
> work that calls C.  C in turn doesn't just call B, but submits another
> irq work that calls B, and so on.
> Doesn't that code pattern effectively break the recursion loop into
> separate work items, each with its own separate stack?

No, it's architecture dependent whether the irq_work executes on a separate
stack or not.

You would need a work_struct to really separate the two, and I would reject
that because it adds additional latency to the signaling path.

>>
>> Even if the chain is a recursive data structure, you absolutely can't
>> use recursion to handle it.
>>
>> Maybe I misunderstood your textual description, but reading a sentence
>> like this rings all alarm bells here.  Otherwise I can't see what the
>> patch is supposed to be optimizing.
> 
> OK, maybe I should start the commit description of this patch with a copy
> of the first sentence from the cover letter, and also from the patch 1/4
> description, which describes the problem as reported by CI.  Maybe I
> should also provide a comparison of measured signaling times from trybot
> executions [1][2][3].  Here are example numbers from CI machine
> fi-bsw-n3050:

Yeah and I've pointed out before that this is irrelevant.

The problem is *not* the dma_fence_chain implementation; that one is doing
exactly what it is expected to do.

The problem is that the test case is nonsense. I've pointed that out numerous 
times now!

Regards,
Christian.

> 
> With signaling time reports only added to selftests (patch 1 of 4):
> <6> [777.914451] dma-buf: Running dma_fence_chain/wait_forward
> <6> [778.123516] wait_forward: 4096 signals in 21373487 ns
> <6> [778.335709] dma-buf: Running dma_fence_chain/wait_backward
> <6> [795.791546] wait_backward: 4096 signals in 17249051192 ns
> <6> [795.859699] dma-buf: Running dma_fence_chain/wait_random
> <6> [796.161375] wait_random: 4096 signals in 97386256 ns
> 
> With dma_fence_enable_signaling() replaced in selftests with dma_fence_wait() 
> (patches 1-3 of 4):
> <6> [782.505692] dma-buf: Running dma_fence_chain/wait_forward
> <6> [784.609213] wait_forward: 4096 signals in 36513103 ns
> <3> [784.837226] Reported -4 for kthread_stop_put(0)!
> <6> [785.147643] dma-buf: Running dma_fence_chain/wait_backward
> <6> [806.367763] wait_backward: 4096 signals in 18428009499 ns
> <6> [807.175325] dma-buf: Running dma_fence_chain/wait_random
> <6> [809.453942] wait_random: 4096 signals in 119761950 ns
> 
> With the fix (patches 1-4 of 4):
> <6> [731.519020] dma-buf: Running dma_fence_chain/wait_forward
> <6> [733.623375] wait_forward: 4096 signals in 31890220 ns
> <6> [734.258972] dma-buf: Running dma_fence_chain/wait_backward
> <6> [736.267325] wait_backward: 4096 signals in 39007955 ns
> <6> [736.700221] dma-buf: Running dma_fence_chain/wait_random
> <6> [739.346706] wait_random: 4096 signals in 48384865 ns
> 
> Signaling time in wait_backward selftest has been reduced from 17s to 39ms.
> 
> [1] https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_152785v1/index.html?
> [2] https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_152828v2/index.html?
> [3] https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_152830v2/index.html?
> 
>>
>>>>
>>>> In addition to this, messing with the fence ops outside of the
>>>> dma_fence code is an absolute no-go.
>>>
>>> Could you please explain what piece of code you are referring to when
>>> you say "messing with the fence ops outside the dma_fence code"?  If not
>>> this patch, then which particular one of this series did you mean?  I'm
>>> assuming you didn't mean the drm_syncobj code that I mentioned in my
>>> commit descriptions.
>>
>> See below.
>>
>>>
>>> Thanks,
>>> Janusz
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
>>>>> Suggested-by: Chris Wilson <chris.p.wil...@linux.intel.com>
>>>>> Signed-off-by: Janusz Krzysztofik <janusz.krzyszto...@linux.intel.com>
>>>>> ---
>>>>>  drivers/dma-buf/dma-fence-chain.c | 101 +++++++++++++++++++++++++-----
>>>>>  1 file changed, 84 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
>>>>> index a8a90acf4f34d..90eff264ee05c 100644
>>>>> --- a/drivers/dma-buf/dma-fence-chain.c
>>>>> +++ b/drivers/dma-buf/dma-fence-chain.c
>>>>> @@ -119,46 +119,113 @@ static const char *dma_fence_chain_get_timeline_name(struct dma_fence *fence)
>>>>>          return "unbound";
>>>>>  }
>>>>>  
>>>>> -static void dma_fence_chain_irq_work(struct irq_work *work)
>>>>> +static void signal_irq_work(struct irq_work *work)
>>>>>  {
>>>>>   struct dma_fence_chain *chain;
>>>>>  
>>>>>   chain = container_of(work, typeof(*chain), work);
>>>>>  
>>>>> - /* Try to rearm the callback */
>>>>> - if (!dma_fence_chain_enable_signaling(&chain->base))
>>>>> -         /* Ok, we are done. No more unsignaled fences left */
>>>>> -         dma_fence_signal(&chain->base);
>>>>> + dma_fence_signal(&chain->base);
>>>>>   dma_fence_put(&chain->base);
>>>>>  }
>>>>>  
>>>>> -static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>>>>> +static void signal_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>>>>> +{
>>>>> + struct dma_fence_chain *chain;
>>>>> +
>>>>> + chain = container_of(cb, typeof(*chain), cb);
>>>>> + init_irq_work(&chain->work, signal_irq_work);
>>>>> + irq_work_queue(&chain->work);
>>>>> +}
>>>>> +
>>>>> +static void rearm_irq_work(struct irq_work *work)
>>>>> +{
>>>>> + struct dma_fence_chain *chain;
>>>>> + struct dma_fence *prev;
>>>>> +
>>>>> + chain = container_of(work, typeof(*chain), work);
>>>>> +
>>>>> + rcu_read_lock();
>>>>> + prev = rcu_dereference(chain->prev);
>>>>> + if (prev && dma_fence_add_callback(prev, &chain->cb, signal_cb))
>>>>> +         prev = NULL;
>>>>> + rcu_read_unlock();
>>>>> + if (prev)
>>>>> +         return;
>>>>> +
>>>>> + /* Ok, we are done. No more unsignaled fences left */
>>>>> + signal_irq_work(work);
>>>>> +}
>>>>> +
>>>>> +static inline bool fence_is_signaled__nested(struct dma_fence *fence)
>>>>> +{
>>>>> + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>>>>> +         return true;
>>>>> +
>>
>>>>> + if (fence->ops->signaled && fence->ops->signaled(fence)) {
>>
>> Calling this outside of dma-fence.[ch] is a clear no-go.
> 
> But this patch applies only to drivers/dma-buf/dma-fence-chain.c, not 
> outside of it.
> 
> Thanks,
> Janusz
> 
>>
>> Regards,
>> Christian.
>>
>>>>> +         unsigned long flags;
>>>>> +
>>>>> +         spin_lock_irqsave_nested(fence->lock, flags, SINGLE_DEPTH_NESTING);
>>>>> +         dma_fence_signal_locked(fence);
>>>>> +         spin_unlock_irqrestore(fence->lock, flags);
>>>>> +
>>>>> +         return true;
>>>>> + }
>>>>> +
>>>>> + return false;
>>>>> +}
>>>>> +
>>>>> +static bool prev_is_signaled(struct dma_fence_chain *chain)
>>>>> +{
>>>>> + struct dma_fence *prev;
>>>>> + bool result;
>>>>> +
>>>>> + rcu_read_lock();
>>>>> + prev = rcu_dereference(chain->prev);
>>>>> + result = !prev || fence_is_signaled__nested(prev);
>>>>> + rcu_read_unlock();
>>>>> +
>>>>> + return result;
>>>>> +}
>>>>> +
>>>>> +static void rearm_or_signal_cb(struct dma_fence *f, struct dma_fence_cb *cb)
>>>>>  {
>>>>>   struct dma_fence_chain *chain;
>>>>>  
>>>>>   chain = container_of(cb, typeof(*chain), cb);
>>>>> - init_irq_work(&chain->work, dma_fence_chain_irq_work);
>>>>> + if (prev_is_signaled(chain)) {
>>>>> +         /* Ok, we are done. No more unsignaled fences left */
>>>>> +         init_irq_work(&chain->work, signal_irq_work);
>>>>> + } else {
>>>>> +         /* Try to rearm the callback */
>>>>> +         init_irq_work(&chain->work, rearm_irq_work);
>>>>> + }
>>>>> +
>>>>>   irq_work_queue(&chain->work);
>>>>> - dma_fence_put(f);
>>>>>  }
>>>>>  
>>>>>  static bool dma_fence_chain_enable_signaling(struct dma_fence *fence)
>>>>>  {
>>>>>   struct dma_fence_chain *head = to_dma_fence_chain(fence);
>>>>> + int err = -ENOENT;
>>>>>  
>>>>> - dma_fence_get(&head->base);
>>>>> - dma_fence_chain_for_each(fence, &head->base) {
>>>>> -         struct dma_fence *f = dma_fence_chain_contained(fence);
>>>>> + if (WARN_ON(!head))
>>>>> +         return false;
>>>>>  
>>>>> -         dma_fence_get(f);
>>>>> -         if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) {
>>>>> + dma_fence_get(fence);
>>>>> + if (head->fence)
>>>>> +         err = dma_fence_add_callback(head->fence, &head->cb, rearm_or_signal_cb);
>>>>> + if (err) {
>>>>> +         if (prev_is_signaled(head)) {
>>>>>                   dma_fence_put(fence);
>>>>> -                 return true;
>>>>> +         } else {
>>>>> +                 init_irq_work(&head->work, rearm_irq_work);
>>>>> +                 irq_work_queue(&head->work);
>>>>> +                 err = 0;
>>>>>           }
>>>>> -         dma_fence_put(f);
>>>>>   }
>>>>> - dma_fence_put(&head->base);
>>>>> - return false;
>>>>> +
>>>>> + return !err;
>>>>>  }
>>>>>  
>>>>>  static bool dma_fence_chain_signaled(struct dma_fence *fence)
