On 6/3/25 14:48, Tvrtko Ursulin wrote:
> 
> On 03/06/2025 13:40, Christian König wrote:
>> On 6/3/25 13:30, Tvrtko Ursulin wrote:
>>>
>>> On 02/06/2025 19:00, Christian König wrote:
>>>> On 6/2/25 17:25, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 02/06/2025 15:42, Christian König wrote:
>>>>>> On 6/2/25 15:05, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 15/05/2025 14:15, Christian König wrote:
>>>>>>>> Hey drm-misc maintainers,
>>>>>>>>
>>>>>>>> can you guys please backmerge drm-next into drm-misc-next?
>>>>>>>>
>>>>>>>> I want to push this patch here but it depends on changes which are 
>>>>>>>> partially in drm-next and partially in drm-misc-next.
>>>>>>>
>>>>>>> Looks like the backmerge is still pending?
>>>>>>
>>>>>> Yes, @Maarten, @Maxime and @Thomas ping on this.
>>>>>>
>>>>>>> In the meantime, Christian, any chance you will have some bandwith to 
>>>>>>> think about the tail end of the series? Specifically patch 6 and how 
>>>>>>> that is used onward.
>>>>>>
>>>>>> Well the RCU grace period is quite a nifty hack. I wanted to go over it 
>>>>>> again after merging the first patches from this series.
>>>>>>
>>>>>> In general looks like a good idea to me, I just don't like that we 
>>>>>> explicitely need to expose dma_fence_access_begin() and 
>>>>>> dma_fence_access_end().
>>>>>>
>>>>>> Especially we can't do that while calling fence->ops->release.
>>>>>
>>>>> Hm why not? You think something will take offence of the rcu_read_lock()?
>>>>
>>>> Yes, especially it is perfectly legitimate to call synchronize_rcu() or 
>>>> lock semaphores/mutexes from that callback.
>>>>
>>>> Either keep the RCU critical section only for the trace or even better 
>>>> come up with some different approach, e.g. copying the string under the 
>>>> RCU lock or something like that.
>>>
>>> Hmm but the kerneldoc explicity says callback can be called from irq 
>>> context:
>>>
>>>      /**
>>>       * @release:
>>>       *
>>>       * Called on destruction of fence to release additional resources.
>>>       * Can be called from irq context.  This callback is optional. If it is
>>>       * NULL, then dma_fence_free() is instead called as the default
>>>       * implementation.
>>>       */
>>>      void (*release)(struct dma_fence *fence);
>>
>> Ah, right. I mixed that up with the dma-buf object.
>>
>> Yeah in that case that is probably harmless. We delegate the final free to a 
>> work item if necessary anyway.
>>
>> But I would still like to avoid having the RCU cover the release as well. Or 
>> why is there any reason why we would explicitely want to do this?
> 
> I can't remember there was a particular reason. Obviously the driver/timeline 
> name vfunc access I needed a dma_fence_access_begin/end() block so maybe I 
> was just sloppy and put the end at the end of the function instead of at the 
> end of the block which can dereference them.

Yeah that's the next topic I would rather like to improve. We are kind of 
hiding that the returned strings are using RCU protection.

In other words it would be nicer if we could add an __rcu tag to the 
get_driver_name/get_timeline_name callbacks and let the automated tools 
complain if somebody isn't doing the proper RCU handling.

The problem is that as far as I know that is not supported by the automated 
tools (would be cool if somebody could double check that).

+We would need to convert the get_timeline/get_timeline_name function to 
something like func(struct dma_fence *fence, const char __rcu **out) to make 
that work.

Regards,
Christian.

> 
> I will pull it earlier for the next respin, assuming no gotchas get 
> discovered in the process.
> 
> Regards,
> 
> Tvrtko
> 
>>
>> Regards,
>> Christian.
>>
>>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Tvrtko
>>>>>
>>>>>>>> On 5/15/25 11:49, Tvrtko Ursulin wrote:
>>>>>>>>> With the goal of reducing the need for drivers to touch (and 
>>>>>>>>> dereference)
>>>>>>>>> fence->ops, we move the 64-bit seqnos flag from struct dma_fence_ops 
>>>>>>>>> to
>>>>>>>>> the fence->flags.
>>>>>>>>>
>>>>>>>>> Drivers which were setting this flag are changed to use new
>>>>>>>>> dma_fence_init64() instead of dma_fence_init().
>>>>>>>>>
>>>>>>>>> v2:
>>>>>>>>>      * Streamlined init and added kerneldoc.
>>>>>>>>>      * Rebase for amdgpu userq which landed since.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@igalia.com>
>>>>>>>>> Reviewed-by: Christian König <christian.koe...@amd.com> # v1
>>>>>>>>> ---
>>>>>>>>>      drivers/dma-buf/dma-fence-chain.c             |  5 +-
>>>>>>>>>      drivers/dma-buf/dma-fence.c                   | 69 
>>>>>>>>> ++++++++++++++-----
>>>>>>>>>      .../drm/amd/amdgpu/amdgpu_eviction_fence.c    |  7 +-
>>>>>>>>>      .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c   |  5 +-
>>>>>>>>>      .../gpu/drm/amd/amdgpu/amdgpu_vm_tlb_fence.c  |  5 +-
>>>>>>>>>      include/linux/dma-fence.h                     | 14 ++--
>>>>>>>>>      6 files changed, 64 insertions(+), 41 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/dma-buf/dma-fence-chain.c 
>>>>>>>>> b/drivers/dma-buf/dma-fence-chain.c
>>>>>>>>> index 90424f23fd73..a8a90acf4f34 100644
>>>>>>>>> --- a/drivers/dma-buf/dma-fence-chain.c
>>>>>>>>> +++ b/drivers/dma-buf/dma-fence-chain.c
>>>>>>>>> @@ -218,7 +218,6 @@ static void dma_fence_chain_set_deadline(struct 
>>>>>>>>> dma_fence *fence,
>>>>>>>>>      }
>>>>>>>>>        const struct dma_fence_ops dma_fence_chain_ops = {
>>>>>>>>> -    .use_64bit_seqno = true,
>>>>>>>>>          .get_driver_name = dma_fence_chain_get_driver_name,
>>>>>>>>>          .get_timeline_name = dma_fence_chain_get_timeline_name,
>>>>>>>>>          .enable_signaling = dma_fence_chain_enable_signaling,
>>>>>>>>> @@ -262,8 +261,8 @@ void dma_fence_chain_init(struct dma_fence_chain 
>>>>>>>>> *chain,
>>>>>>>>>                  seqno = max(prev->seqno, seqno);
>>>>>>>>>          }
>>>>>>>>>      -    dma_fence_init(&chain->base, &dma_fence_chain_ops,
>>>>>>>>> -               &chain->lock, context, seqno);
>>>>>>>>> +    dma_fence_init64(&chain->base, &dma_fence_chain_ops, 
>>>>>>>>> &chain->lock,
>>>>>>>>> +             context, seqno);
>>>>>>>>>            /*
>>>>>>>>>           * Chaining dma_fence_chain container together is only 
>>>>>>>>> allowed through
>>>>>>>>> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
>>>>>>>>> index f0cdd3e99d36..705b59787731 100644
>>>>>>>>> --- a/drivers/dma-buf/dma-fence.c
>>>>>>>>> +++ b/drivers/dma-buf/dma-fence.c
>>>>>>>>> @@ -989,24 +989,9 @@ void dma_fence_describe(struct dma_fence *fence, 
>>>>>>>>> struct seq_file *seq)
>>>>>>>>>      }
>>>>>>>>>      EXPORT_SYMBOL(dma_fence_describe);
>>>>>>>>>      -/**
>>>>>>>>> - * dma_fence_init - Initialize a custom fence.
>>>>>>>>> - * @fence: the fence to initialize
>>>>>>>>> - * @ops: the dma_fence_ops for operations on this fence
>>>>>>>>> - * @lock: the irqsafe spinlock to use for locking this fence
>>>>>>>>> - * @context: the execution context this fence is run on
>>>>>>>>> - * @seqno: a linear increasing sequence number for this context
>>>>>>>>> - *
>>>>>>>>> - * Initializes an allocated fence, the caller doesn't have to keep 
>>>>>>>>> its
>>>>>>>>> - * refcount after committing with this fence, but it will need to 
>>>>>>>>> hold a
>>>>>>>>> - * refcount again if &dma_fence_ops.enable_signaling gets called.
>>>>>>>>> - *
>>>>>>>>> - * context and seqno are used for easy comparison between fences, 
>>>>>>>>> allowing
>>>>>>>>> - * to check which fence is later by simply using dma_fence_later().
>>>>>>>>> - */
>>>>>>>>> -void
>>>>>>>>> -dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops 
>>>>>>>>> *ops,
>>>>>>>>> -           spinlock_t *lock, u64 context, u64 seqno)
>>>>>>>>> +static void
>>>>>>>>> +__dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops 
>>>>>>>>> *ops,
>>>>>>>>> +             spinlock_t *lock, u64 context, u64 seqno, unsigned long 
>>>>>>>>> flags)
>>>>>>>>>      {
>>>>>>>>>          BUG_ON(!lock);
>>>>>>>>>          BUG_ON(!ops || !ops->get_driver_name || 
>>>>>>>>> !ops->get_timeline_name);
>>>>>>>>> @@ -1017,9 +1002,55 @@ dma_fence_init(struct dma_fence *fence, const 
>>>>>>>>> struct dma_fence_ops *ops,
>>>>>>>>>          fence->lock = lock;
>>>>>>>>>          fence->context = context;
>>>>>>>>>          fence->seqno = seqno;
>>>>>>>>> -    fence->flags = 0UL;
>>>>>>>>> +    fence->flags = flags;
>>>>>>>>>          fence->error = 0;
>>>>>>>>>            trace_dma_fence_init(fence);
>>>>>>>>>      }
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * dma_fence_init - Initialize a custom fence.
>>>>>>>>> + * @fence: the fence to initialize
>>>>>>>>> + * @ops: the dma_fence_ops for operations on this fence
>>>>>>>>> + * @lock: the irqsafe spinlock to use for locking this fence
>>>>>>>>> + * @context: the execution context this fence is run on
>>>>>>>>> + * @seqno: a linear increasing sequence number for this context
>>>>>>>>> + *
>>>>>>>>> + * Initializes an allocated fence, the caller doesn't have to keep 
>>>>>>>>> its
>>>>>>>>> + * refcount after committing with this fence, but it will need to 
>>>>>>>>> hold a
>>>>>>>>> + * refcount again if &dma_fence_ops.enable_signaling gets called.
>>>>>>>>> + *
>>>>>>>>> + * context and seqno are used for easy comparison between fences, 
>>>>>>>>> allowing
>>>>>>>>> + * to check which fence is later by simply using dma_fence_later().
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops 
>>>>>>>>> *ops,
>>>>>>>>> +           spinlock_t *lock, u64 context, u64 seqno)
>>>>>>>>> +{
>>>>>>>>> +    __dma_fence_init(fence, ops, lock, context, seqno, 0UL);
>>>>>>>>> +}
>>>>>>>>>      EXPORT_SYMBOL(dma_fence_init);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * dma_fence_init64 - Initialize a custom fence with 64-bit seqno 
>>>>>>>>> support.
>>>>>>>>> + * @fence: the fence to initialize
>>>>>>>>> + * @ops: the dma_fence_ops for operations on this fence
>>>>>>>>> + * @lock: the irqsafe spinlock to use for locking this fence
>>>>>>>>> + * @context: the execution context this fence is run on
>>>>>>>>> + * @seqno: a linear increasing sequence number for this context
>>>>>>>>> + *
>>>>>>>>> + * Initializes an allocated fence, the caller doesn't have to keep 
>>>>>>>>> its
>>>>>>>>> + * refcount after committing with this fence, but it will need to 
>>>>>>>>> hold a
>>>>>>>>> + * refcount again if &dma_fence_ops.enable_signaling gets called.
>>>>>>>>> + *
>>>>>>>>> + * Context and seqno are used for easy comparison between fences, 
>>>>>>>>> allowing
>>>>>>>>> + * to check which fence is later by simply using dma_fence_later().
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +dma_fence_init64(struct dma_fence *fence, const struct dma_fence_ops 
>>>>>>>>> *ops,
>>>>>>>>> +         spinlock_t *lock, u64 context, u64 seqno)
>>>>>>>>> +{
>>>>>>>>> +    __dma_fence_init(fence, ops, lock, context, seqno,
>>>>>>>>> +             BIT(DMA_FENCE_FLAG_SEQNO64_BIT));
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL(dma_fence_init64);
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c 
>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
>>>>>>>>> index 1a7469543db5..79713421bffe 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c
>>>>>>>>> @@ -134,7 +134,6 @@ static bool 
>>>>>>>>> amdgpu_eviction_fence_enable_signaling(struct dma_fence *f)
>>>>>>>>>      }
>>>>>>>>>        static const struct dma_fence_ops amdgpu_eviction_fence_ops = {
>>>>>>>>> -    .use_64bit_seqno = true,
>>>>>>>>>          .get_driver_name = amdgpu_eviction_fence_get_driver_name,
>>>>>>>>>          .get_timeline_name = amdgpu_eviction_fence_get_timeline_name,
>>>>>>>>>          .enable_signaling = amdgpu_eviction_fence_enable_signaling,
>>>>>>>>> @@ -160,9 +159,9 @@ amdgpu_eviction_fence_create(struct 
>>>>>>>>> amdgpu_eviction_fence_mgr *evf_mgr)
>>>>>>>>>          ev_fence->evf_mgr = evf_mgr;
>>>>>>>>>          get_task_comm(ev_fence->timeline_name, current);
>>>>>>>>>          spin_lock_init(&ev_fence->lock);
>>>>>>>>> -    dma_fence_init(&ev_fence->base, &amdgpu_eviction_fence_ops,
>>>>>>>>> -               &ev_fence->lock, evf_mgr->ev_fence_ctx,
>>>>>>>>> -               atomic_inc_return(&evf_mgr->ev_fence_seq));
>>>>>>>>> +    dma_fence_init64(&ev_fence->base, &amdgpu_eviction_fence_ops,
>>>>>>>>> +             &ev_fence->lock, evf_mgr->ev_fence_ctx,
>>>>>>>>> +             atomic_inc_return(&evf_mgr->ev_fence_seq));
>>>>>>>>>          return ev_fence;
>>>>>>>>>      }
>>>>>>>>>      diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c 
>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
>>>>>>>>> index 029cb24c28b3..5e92d00a591f 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
>>>>>>>>> @@ -239,8 +239,8 @@ static int amdgpu_userq_fence_create(struct 
>>>>>>>>> amdgpu_usermode_queue *userq,
>>>>>>>>>          fence = &userq_fence->base;
>>>>>>>>>          userq_fence->fence_drv = fence_drv;
>>>>>>>>>      -    dma_fence_init(fence, &amdgpu_userq_fence_ops, 
>>>>>>>>> &userq_fence->lock,
>>>>>>>>> -               fence_drv->context, seq);
>>>>>>>>> +    dma_fence_init64(fence, &amdgpu_userq_fence_ops, 
>>>>>>>>> &userq_fence->lock,
>>>>>>>>> +             fence_drv->context, seq);
>>>>>>>>>            amdgpu_userq_fence_driver_get(fence_drv);
>>>>>>>>>          dma_fence_get(fence);
>>>>>>>>> @@ -334,7 +334,6 @@ static void amdgpu_userq_fence_release(struct 
>>>>>>>>> dma_fence *f)
>>>>>>>>>      }
>>>>>>>>>        static const struct dma_fence_ops amdgpu_userq_fence_ops = {
>>>>>>>>> -    .use_64bit_seqno = true,
>>>>>>>>>          .get_driver_name = amdgpu_userq_fence_get_driver_name,
>>>>>>>>>          .get_timeline_name = amdgpu_userq_fence_get_timeline_name,
>>>>>>>>>          .signaled = amdgpu_userq_fence_signaled,
>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_tlb_fence.c 
>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_tlb_fence.c
>>>>>>>>> index 51cddfa3f1e8..5d26797356a3 100644
>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_tlb_fence.c
>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_tlb_fence.c
>>>>>>>>> @@ -71,7 +71,6 @@ static void amdgpu_tlb_fence_work(struct 
>>>>>>>>> work_struct *work)
>>>>>>>>>      }
>>>>>>>>>        static const struct dma_fence_ops amdgpu_tlb_fence_ops = {
>>>>>>>>> -    .use_64bit_seqno = true,
>>>>>>>>>          .get_driver_name = amdgpu_tlb_fence_get_driver_name,
>>>>>>>>>          .get_timeline_name = amdgpu_tlb_fence_get_timeline_name
>>>>>>>>>      };
>>>>>>>>> @@ -101,8 +100,8 @@ void amdgpu_vm_tlb_fence_create(struct 
>>>>>>>>> amdgpu_device *adev, struct amdgpu_vm *vm
>>>>>>>>>          INIT_WORK(&f->work, amdgpu_tlb_fence_work);
>>>>>>>>>          spin_lock_init(&f->lock);
>>>>>>>>>      -    dma_fence_init(&f->base, &amdgpu_tlb_fence_ops, &f->lock,
>>>>>>>>> -               vm->tlb_fence_context, atomic64_read(&vm->tlb_seq));
>>>>>>>>> +    dma_fence_init64(&f->base, &amdgpu_tlb_fence_ops, &f->lock,
>>>>>>>>> +             vm->tlb_fence_context, atomic64_read(&vm->tlb_seq));
>>>>>>>>>            /* TODO: We probably need a separate wq here */
>>>>>>>>>          dma_fence_get(&f->base);
>>>>>>>>> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
>>>>>>>>> index 48b5202c531d..a34a0dcdc446 100644
>>>>>>>>> --- a/include/linux/dma-fence.h
>>>>>>>>> +++ b/include/linux/dma-fence.h
>>>>>>>>> @@ -97,6 +97,7 @@ struct dma_fence {
>>>>>>>>>      };
>>>>>>>>>        enum dma_fence_flag_bits {
>>>>>>>>> +    DMA_FENCE_FLAG_SEQNO64_BIT,
>>>>>>>>>          DMA_FENCE_FLAG_SIGNALED_BIT,
>>>>>>>>>          DMA_FENCE_FLAG_TIMESTAMP_BIT,
>>>>>>>>>          DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
>>>>>>>>> @@ -124,14 +125,6 @@ struct dma_fence_cb {
>>>>>>>>>       *
>>>>>>>>>       */
>>>>>>>>>      struct dma_fence_ops {
>>>>>>>>> -    /**
>>>>>>>>> -     * @use_64bit_seqno:
>>>>>>>>> -     *
>>>>>>>>> -     * True if this dma_fence implementation uses 64bit seqno, false
>>>>>>>>> -     * otherwise.
>>>>>>>>> -     */
>>>>>>>>> -    bool use_64bit_seqno;
>>>>>>>>> -
>>>>>>>>>          /**
>>>>>>>>>           * @get_driver_name:
>>>>>>>>>           *
>>>>>>>>> @@ -262,6 +255,9 @@ struct dma_fence_ops {
>>>>>>>>>      void dma_fence_init(struct dma_fence *fence, const struct 
>>>>>>>>> dma_fence_ops *ops,
>>>>>>>>>                  spinlock_t *lock, u64 context, u64 seqno);
>>>>>>>>>      +void dma_fence_init64(struct dma_fence *fence, const struct 
>>>>>>>>> dma_fence_ops *ops,
>>>>>>>>> +              spinlock_t *lock, u64 context, u64 seqno);
>>>>>>>>> +
>>>>>>>>>      void dma_fence_release(struct kref *kref);
>>>>>>>>>      void dma_fence_free(struct dma_fence *fence);
>>>>>>>>>      void dma_fence_describe(struct dma_fence *fence, struct seq_file 
>>>>>>>>> *seq);
>>>>>>>>> @@ -454,7 +450,7 @@ static inline bool __dma_fence_is_later(struct 
>>>>>>>>> dma_fence *fence, u64 f1, u64 f2)
>>>>>>>>>           * 32bit sequence numbers. Use a 64bit compare when the 
>>>>>>>>> driver says to
>>>>>>>>>           * do so.
>>>>>>>>>           */
>>>>>>>>> -    if (fence->ops->use_64bit_seqno)
>>>>>>>>> +    if (test_bit(DMA_FENCE_FLAG_SEQNO64_BIT, &fence->flags))
>>>>>>>>>              return f1 > f2;
>>>>>>>>>            return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

Reply via email to