Re: [PATCH v2 2/2] rust: drm: Add GPUVM abstraction

Danilo Krummrich Wed, 23 Apr 2025 08:27:29 -0700

On Wed, Apr 23, 2025 at 10:29:01AM -0300, Daniel Almeida wrote:
> Hi Danilo,
> 
> FYI, most of this patch still retains the original code from the Asahi 
> project,
> 
> > On 22 Apr 2025, at 18:16, Danilo Krummrich <d...@kernel.org> wrote:
> > 
> > (Not a full review, but some things that took my attention from a brief 
> > look.)
> > 
> > On Tue, Apr 22, 2025 at 01:41:53PM -0300, Daniel Almeida wrote:
> 
> > 
> >> +        &mut self,
> >> +        gpuvm: &mut UpdatingGpuVm<'_, T>,
> >> +        gpuva: Pin<KBox<GpuVa<T>>>,
> >> +        gpuvmbo: &ARef<GpuVmBo<T>>,
> >> +    ) -> Result<(), Pin<KBox<GpuVa<T>>>> {
> >> +        // SAFETY: We are handing off the GpuVa ownership and it will not 
> >> be moved.
> >> +        let p = KBox::leak(unsafe { Pin::into_inner_unchecked(gpuva) });
> >> +        // SAFETY: These C functions are called with the correct 
> >> invariants
> >> +        unsafe {
> >> +            bindings::drm_gpuva_init_from_op(&mut p.gpuva, &mut self.0);
> >> +            if bindings::drm_gpuva_insert(gpuvm.0.gpuvm() as *mut _, &mut 
> >> p.gpuva) != 0 {
> >> +                // EEXIST, return the GpuVa to the caller as an error
> >> +                return Err(Pin::new_unchecked(KBox::from_raw(p)));
> >> +            };
> > 
> > How do you ensure that the VM lock is held when there's a list of 
> > operations?
> 
> Are you saying that this will not work for async binds, or that it’s already 
> broken in sync binds too?


I think here it's fine since sm_map() is implemented for LockedGpuVm, and that
seems to be the only place where you hand out a reference to an OpMap. But this
falls apart once we give out sets of operations, e.g. as list.

Even though, your patch doesn't do that yet, that's where we're heading to
eventually. Hence, the initial design must consider those things, otherwise
we only end up with something upstream that needs to be rewritten.

> 
> Perhaps we can get rid of this `UpdatingGpuVm` type and pass `LockedGpuVm` 
> instead?

At a first glance I didn't get what's the difference between the two anyways.
But I don't think it's really related to the problem above, is it?

> 
> > 
> >> +            // SAFETY: This takes a new reference to the gpuvmbo.
> >> +            bindings::drm_gpuva_link(&mut p.gpuva, &gpuvmbo.bo as *const 
> >> _ as *mut _);
> > 
> > How do you ensure that the dma_resv lock is held?
> 
> Right, I totally missed that.
> 
> > 
> >> +        }
> >> +        Ok(())
> >> +    }
> >> +}

<snip>

> >> +// SAFETY: DRM GpuVmBo objects are always reference counted and the 
> >> get/put functions
> >> +// satisfy the requirements.
> >> +unsafe impl<T: DriverGpuVm> AlwaysRefCounted for GpuVmBo<T> {
> >> +    fn inc_ref(&self) {
> >> +        // SAFETY: The drm_gpuvm_get function satisfies the requirements 
> >> for inc_ref().
> >> +        unsafe { bindings::drm_gpuvm_bo_get(&self.bo as *const _ as *mut 
> >> _) };
> >> +    }
> >> +
> >> +    unsafe fn dec_ref(mut obj: NonNull<Self>) {
> >> +        // SAFETY: drm_gpuvm_bo_put() requires holding the gpuva lock, 
> >> which is
> >> +        // the dma_resv lock by default.
> >> +        // The drm_gpuvm_put function satisfies the requirements for 
> >> dec_ref().
> >> +        // (We do not support custom locks yet.)
> >> +        unsafe {
> >> +            let resv = (*obj.as_mut().bo.obj).resv;
> >> +            bindings::dma_resv_lock(resv, core::ptr::null_mut());
> >> +            bindings::drm_gpuvm_bo_put(&mut obj.as_mut().bo);
> >> +            bindings::dma_resv_unlock(resv);
> > 
> > What if the resv_lock is held already? Please also make sure to put multiple
> > unsafe calls each in a separate unsafe block.
> 
> By whom?

The lock might be held already by the driver or by TTM when things are called
from TTM callbacks.

This is why GPUVM never takes locks by itself, but asserts that the correct lock
is held.

I think we really want to get proof by the driver by providing lock guard
references.

> See my comment about “[..] a new type for GEM objects which are known to be 
> locked"
> below.

<snip>

> >> +
> >> +    /// Obtains the [`GpuVmBo`] object that connects `obj` to this VM.
> >> +    ///
> >> +    /// This connection is unique, so an instane of [`GpuVmBo`] will be
> >> +    /// allocated for `obj` once, and that instance will be returned from 
> >> that
> >> +    /// point forward.
> >> +    pub fn obtain_bo(&mut self, obj: &DriverObject<T>) -> 
> >> Result<ARef<GpuVmBo<T>>> {
> >> +        // SAFETY: LockedGpuVm implies the right locks are held.
> > 
> > No, this must be locked by the dma-resv or the GEM gpuva lock, not by the
> > GPUVM lock that protects the interval tree.
> 
> By “GEM gpuva lock” you’re referring to the custom lock which we
> currently do not support, right?

Yes.

> This series currently rely on manual calls to dma_resv_{lock,unlock}, I wonder
> if we should ditch that in favor of something written in Rust directly. This
> would let us introduce a new type for GEM objects which are known to have
> `resv` locked. WDYT?

Not all functions that require the dma-resv lock to be held are called with a
GEM object parameter, it could also be a struct drm_gpuvm_bo, struct drm_gpuva
or struct drm_gpuvm, since they all carry GEM object pointers.

For reference, you can look for "_held" in drivers/gpu/drm/drm_gpuvm.c.

Re: [PATCH v2 2/2] rust: drm: Add GPUVM abstraction

Reply via email to