Oh, now it gets funny.

Adding Tvrtko as well since he recently looked into making dma_fences more 
robust regarding context lifetime.

On 18.09.25 14:30, Philipp Stanner wrote:
> dma_fence is a synchronization mechanism which is needed by virtually
> all GPU drivers.
> 
> A dma_fence offers many features, among which the most important ones
> are registering callbacks (for example to kick off a work item) which
> get executed once a fence gets signalled.
> 
> dma_fence has a number of callbacks. Only the two most basic ones
> (get_driver_name(), get_timeline_name() are abstracted since they are
> enough to enable the basic functionality.
> 
> Callbacks in Rust are registered by passing driver data which implements
> the Rust callback trait, whose function will be called by the C backend.
> 
> dma_fence's are always refcounted, so implement AlwaysRefcounted for
> them. Once a reference drops to zero, the C backend calls a release
> function, where we implement drop_in_place() to conveniently marry that
> C-cleanup mechanism with Rust's ownership concepts.
> 
> This patch provides basic functionality, but is still missing:
>   - An implementation of PinInit<T, Error> for all driver data.
>   - A clever implementation for working dma_fence_begin_signalling()
>     guards. See the corresponding TODO in the code.
>   - Additional useful helper functions such as dma_fence_is_signaled().
>     These _should_ be relatively trivial to implement, though.
> 
> Signed-off-by: Philipp Stanner <pha...@kernel.org>
> ---
> So. ¡Hola!
> 
> This is a highly WIP RFC. It's obviously at many places not yet
> conforming very well to Rust's standards.
> 
> Nevertheless, it has progressed enough that I want to request comments
> from the community.
> 
> There are a number of TODOs in the code to which I need input.
> 
> Notably, it seems (half-)illegal to use a shared static reference to an
> Atomic, which I currently use for the dma_fence unit test / docstring
> test. I'm willing to rework that if someone suggests how.
> (Still, shouldn't changing a global Atomic always be legal? It can race,
> of course. But that's kind of the point of an atomic)
> 
> What I want comments on the most is the design of the callbacks. I think
> it's a great opportunity to provide Rust drivers with rust-only
> callbacks, so that they don't have to bother about the C functions.
> 
> dma_fence wise, only the most basic callbacks currently get implemented.
> For Nova, AFAICS, we don't need much more than signalling fences and
> registering callbacks.
> 
> 
> Another, solvable, issue I'm having is designing the
> dma_fence_begin_signallin() abstractions. There are TODOs about that in
> the code. That should ideally be robust and not racy. So we might want
> some sort of synchronized (locked?) way for using that abstraction.
> 
> 
> Regarding the manually created spinlock of mine: I so far never need
> that spinlock anywhere in Rust and wasn't sure what's then the best way
> to pass a "raw" spinlock to C.

I was hoping to remove that anyway. Maybe it's time to do so now.

Anyway I will need some time to look into it deeper.

Regards,
Christian.

> 
> 
> So much from my side. Hope to hear from you.
> 
> (I've compiled and tested this with the unit test on the current -rc3)
> 
> Philipp
> ---
>  rust/bindings/bindings_helper.h |   1 +
>  rust/helpers/dma_fence.c        |  23 ++
>  rust/helpers/helpers.c          |   1 +
>  rust/helpers/spinlock.c         |   5 +
>  rust/kernel/sync.rs             |   2 +
>  rust/kernel/sync/dma_fence.rs   | 388 ++++++++++++++++++++++++++++++++
>  6 files changed, 420 insertions(+)
>  create mode 100644 rust/helpers/dma_fence.c
>  create mode 100644 rust/kernel/sync/dma_fence.rs
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 84d60635e8a9..107cb6b6f4a4 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -48,6 +48,7 @@
>  #include <linux/cred.h>
>  #include <linux/device/faux.h>
>  #include <linux/dma-mapping.h>
> +#include <linux/dma-fence.h>
>  #include <linux/errname.h>
>  #include <linux/ethtool.h>
>  #include <linux/file.h>
> diff --git a/rust/helpers/dma_fence.c b/rust/helpers/dma_fence.c
> new file mode 100644
> index 000000000000..a9fc4829bbae
> --- /dev/null
> +++ b/rust/helpers/dma_fence.c
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/dma-fence.h>
> +
> +void rust_helper_dma_fence_get(struct dma_fence *f)
> +{
> +     dma_fence_get(f);
> +}
> +
> +void rust_helper_dma_fence_put(struct dma_fence *f)
> +{
> +     dma_fence_put(f);
> +}
> +
> +bool rust_helper_dma_fence_begin_signalling(void)
> +{
> +     return dma_fence_begin_signalling();
> +}
> +
> +void rust_helper_dma_fence_end_signalling(bool cookie)
> +{
> +     dma_fence_end_signalling(cookie);
> +}
> diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
> index 7cf7fe95e41d..99a7f7834c03 100644
> --- a/rust/helpers/helpers.c
> +++ b/rust/helpers/helpers.c
> @@ -20,6 +20,7 @@
>  #include "cred.c"
>  #include "device.c"
>  #include "dma.c"
> +#include "dma_fence.c"
>  #include "drm.c"
>  #include "err.c"
>  #include "fs.c"
> diff --git a/rust/helpers/spinlock.c b/rust/helpers/spinlock.c
> index 42c4bf01a23e..017ac447ebbd 100644
> --- a/rust/helpers/spinlock.c
> +++ b/rust/helpers/spinlock.c
> @@ -16,6 +16,11 @@ void rust_helper___spin_lock_init(spinlock_t *lock, const 
> char *name,
>  #endif /* CONFIG_DEBUG_SPINLOCK */
>  }
>  
> +void rust_helper_spin_lock_init(spinlock_t *lock)
> +{
> +     spin_lock_init(lock);
> +}
> +
>  void rust_helper_spin_lock(spinlock_t *lock)
>  {
>       spin_lock(lock);
> diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs
> index 00f9b558a3ad..6e59526020bc 100644
> --- a/rust/kernel/sync.rs
> +++ b/rust/kernel/sync.rs
> @@ -12,6 +12,7 @@
>  mod arc;
>  pub mod aref;
>  pub mod completion;
> +pub mod dma_fence;
>  mod condvar;
>  pub mod lock;
>  mod locked_by;
> @@ -20,6 +21,7 @@
>  
>  pub use arc::{Arc, ArcBorrow, UniqueArc};
>  pub use completion::Completion;
> +pub use dma_fence::{DmaFence, DmaFenceCtx, DmaFenceNames, DmaFenceCb, 
> DmaFenceCbFunc};
>  pub use condvar::{new_condvar, CondVar, CondVarTimeoutResult};
>  pub use lock::global::{global_lock, GlobalGuard, GlobalLock, 
> GlobalLockBackend, GlobalLockedBy};
>  pub use lock::mutex::{new_mutex, Mutex, MutexGuard};
> diff --git a/rust/kernel/sync/dma_fence.rs b/rust/kernel/sync/dma_fence.rs
> new file mode 100644
> index 000000000000..a00bb10b2208
> --- /dev/null
> +++ b/rust/kernel/sync/dma_fence.rs
> @@ -0,0 +1,388 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! DmaFence support.
> +//!
> +//! Reference: <https://docs.kernel.org/driver-api/dma-buf.html#c.dma_fence>
> +//!
> +//! C header: 
> [`include/linux/dma-fence.h`](srctree/include/linux/dma-fence.h)
> +
> +use crate::{
> +    bindings,
> +    prelude::*,
> +    str::CStr,
> +    types::ForeignOwnable,
> +    types::{ARef, AlwaysRefCounted, Opaque},
> +};
> +
> +use core::{
> +    ptr::{drop_in_place, NonNull},
> +    sync::atomic::{AtomicU64, Ordering},
> +};
> +
> +use kernel::sync::{Arc, ArcBorrow};
> +
> +/// The C dma_fence backend functions ask for certain name parameters at 
> times. In order to avoid
> +/// storing those names in [`DmaFence`] (there can be millions of fences at 
> the same time, wasting
> +/// much memory), the driver must provide associated constants which the C 
> backend will access,
> +/// ultimately.
> +pub trait DmaFenceNames {
> +    /// The driver's name.
> +    const DRIVER_NAME: &CStr;
> +    /// The name of the timeline the fence is associated with.
> +    const TIMELINE_NAME: &CStr;
> +}
> +
> +/// Defines the callback function the dma-fence backend will call once the 
> fence gets signalled.
> +pub trait DmaFenceCbFunc {
> +    /// The callback function. `cb` is a container of the data which the 
> driver passed in
> +    /// [`DmaFence::register_callback`].
> +    fn callback(cb: Pin<KBox<DmaFenceCb<Self>>>)
> +    where
> +        Self: Sized;
> +}
> +
> +/// Container for driver data which the driver gets back in its callback 
> once the fence gets
> +/// signalled.
> +#[pin_data]
> +pub struct DmaFenceCb<T: DmaFenceCbFunc> {
> +    /// C struct needed for the backend.
> +    #[pin]
> +    inner: Opaque<bindings::dma_fence_cb>,
> +    /// Driver data.
> +    #[pin]
> +    pub data: T,
> +}
> +
> +impl<T: DmaFenceCbFunc + 'static> DmaFenceCb<T> {
> +    fn new(data: impl PinInit<T>) -> Result<Pin<KBox<Self>>> {
> +        let cb = try_pin_init!(Self {
> +            inner: Opaque::zeroed(), // This gets initialized by the C 
> backend.
> +            data <- data,
> +        });
> +
> +        KBox::pin_init(cb, GFP_KERNEL)
> +    }
> +
> +    /// Callback for the C dma_fence backend.
> +    ///
> +    /// # Safety
> +    /// All data used and cast in this function was validly created by
> +    /// [`DmaFence::register_callback`] and isn't modified by the C backend 
> until this callback
> +    /// here has run.
> +    unsafe extern "C" fn callback(
> +        _fence_ptr: *mut bindings::dma_fence,
> +        cb_ptr: *mut bindings::dma_fence_cb,
> +    ) {
> +        let cb_ptr = Opaque::cast_from(cb_ptr);
> +
> +        // SAFETY: The constructor guarantees that `cb_ptr` is always 
> `inner` of a DmaFenceCb.
> +        let cb_ptr = unsafe { crate::container_of!(cb_ptr, Self, inner) 
> }.cast_mut() as *mut c_void;
> +        // SAFETY: `cp_ptr` is the heap memory of a Pin<Kbox<Self>> because 
> it was created by
> +        // invoking ForeignOwnable::into_foreign() on such an instance.
> +        let cb = unsafe { <Pin<KBox<Self>> as 
> ForeignOwnable>::from_foreign(cb_ptr) };
> +
> +        // Pass ownership back over to the driver.
> +        T::callback(cb);
> +    }
> +}
> +
> +/// A dma-fence context. A fence context takes care of associating related 
> fences with each other,
> +/// providing each with raising sequence numbers and a common identifier.
> +#[pin_data]
> +pub struct DmaFenceCtx {
> +    /// An opaque spinlock. Only ever passed to the C backend, never used by 
> Rust.
> +    #[pin]
> +    lock: Opaque<bindings::spinlock_t>,
> +    /// The fence context number.
> +    nr: u64,
> +    /// The sequence number for the next fence created.
> +    seqno: AtomicU64,
> +}
> +
> +impl DmaFenceCtx {
> +    /// Create a new `DmaFenceCtx`.
> +    pub fn new() -> Result<Arc<Self>> {
> +        let ctx = pin_init!(Self {
> +            // Feed in a non-Rust spinlock for now, since the Rust side 
> never needs the lock.
> +            lock <- Opaque::ffi_init(|slot: *mut bindings::spinlock| {
> +                // SAFETY: `slot` is a valid pointer to an uninitialized 
> `struct spinlock_t`.
> +                unsafe { bindings::spin_lock_init(slot) };
> +            }),
> +            // SAFETY: `dma_fence_context_alloc()` merely works on a global 
> atomic. Parameter `1`
> +            // is the number of contexts we want to allocate.
> +            nr: unsafe { bindings::dma_fence_context_alloc(1) },
> +            seqno: AtomicU64::new(0),
> +        });
> +
> +        Arc::pin_init(ctx, GFP_KERNEL)
> +    }
> +
> +    fn get_new_fence_seqno(&self) -> u64 {
> +        self.seqno.fetch_add(1, Ordering::Relaxed)
> +    }
> +}
> +
> +impl ArcBorrow<'_, DmaFenceCtx> {
> +    /// Create a new fence, consuming `data`.
> +    ///
> +    /// The fence will increment the refcount of the fence context 
> associated with this
> +    /// [`DmaFenceCtx`].
> +    pub fn new_fence<T: DmaFenceNames>(
> +        &mut self,
> +        data: impl PinInit<T>,
> +    ) -> Result<ARef<DmaFence<T>>> {
> +        let fctx: Arc<DmaFenceCtx> = (*self).into();
> +        let seqno: u64 = fctx.get_new_fence_seqno();
> +
> +        // TODO: Should we reset seqno in case of failure?
> +        // Pass `fctx` by value so that the fence will hold a reference to 
> the DmaFenceCtx as long
> +        // as it lives.
> +        DmaFence::new(fctx, data, &self.lock, self.nr, seqno)
> +    }
> +}
> +
> +/// A synchronization primitive mainly for GPU drivers.
> +///
> +/// DmaFences are always reference counted. The typical use case is that one 
> side registers
> +/// callbacks on the fence which will perform a certain action (such as 
> queueing work) once the
> +/// other side signals the fence.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// use kernel::c_str;
> +/// use kernel::sync::{Arc, ArcBorrow, DmaFence, DmaFenceCtx, DmaFenceNames, 
> DmaFenceCb, DmaFenceCbFunc};
> +/// use core::sync::atomic::{self, AtomicBool};
> +///
> +/// static mut CHECKER: AtomicBool = AtomicBool::new(false);
> +///
> +/// struct CallbackData {
> +///     i: u32,
> +/// }
> +///
> +/// impl CallbackData {
> +///     fn new() -> Self {
> +///         Self { i: 9 }
> +///     }
> +/// }
> +///
> +/// impl DmaFenceCbFunc for CallbackData {
> +///     fn callback(cb: Pin<KBox<DmaFenceCb<Self>>>) where Self: Sized {
> +///         assert_eq!(cb.data.i, 9);
> +///         // SAFETY: Just to have an easy way for testing. This cannot 
> race with the checker
> +///         // because the fence signalling callbacks are executed 
> synchronously.
> +///         unsafe { CHECKER.store(true, atomic::Ordering::Relaxed); }
> +///     }
> +/// }
> +///
> +/// struct DriverData {
> +///     i: u32,
> +/// }
> +///
> +/// impl DriverData {
> +///     fn new() -> Self {
> +///         Self { i: 5 }
> +///     }
> +/// }
> +///
> +/// impl DmaFenceNames for DriverData {
> +///     const DRIVER_NAME: &CStr = c_str!("DUMMY_DRIVER");
> +///     const TIMELINE_NAME: &CStr = c_str!("DUMMY_TIMELINE");
> +/// }
> +///
> +/// let data = DriverData::new();
> +/// let fctx = DmaFenceCtx::new()?;
> +///
> +/// let mut fence = fctx.as_arc_borrow().new_fence(data)?;
> +///
> +/// let cb_data = CallbackData::new();
> +/// fence.register_callback(cb_data);
> +/// // fence.begin_signalling();
> +/// fence.signal()?;
> +/// // Now check wehether the callback was actually executed.
> +/// // SAFETY: `fence.signal()` above works sequentially. We just check here 
> whether the signalling
> +/// // actually did set the boolean correctly.
> +/// unsafe { assert_eq!(CHECKER.load(atomic::Ordering::Relaxed), true); }
> +///
> +/// Ok::<(), Error>(())
> +/// ```
> +#[pin_data]
> +pub struct DmaFence<T> {
> +    /// The actual dma_fence passed to C.
> +    #[pin]
> +    inner: Opaque<bindings::dma_fence>,
> +    /// User data.
> +    #[pin]
> +    data: T,
> +    /// Marks whether the fence is currently in the signalling critical 
> section.
> +    signalling: bool,
> +    /// A boolean needed for the C backend's lockdep guard.
> +    signalling_cookie: bool,
> +    /// A reference to the associated [`DmaFenceCtx`] so that it cannot be 
> dropped while there are
> +    /// still fences around.
> +    fctx: Arc<DmaFenceCtx>,
> +}
> +
> +// SAFETY: `DmaFence` is safe to be sent to any task.
> +unsafe impl<T> Send for DmaFence<T> {}
> +
> +// SAFETY: `DmaFence` is safe to be accessed concurrently.
> +unsafe impl<T> Sync for DmaFence<T> {}
> +
> +// SAFETY: These implement the C backends refcounting methods which are 
> proven to work correctly.
> +unsafe impl<T: DmaFenceNames> AlwaysRefCounted for DmaFence<T> {
> +    fn inc_ref(&self) {
> +        // SAFETY: `self.as_raw()` is a pointer to a valid `struct 
> dma_fence`.
> +        unsafe { bindings::dma_fence_get(self.as_raw()) }
> +    }
> +
> +    /// # Safety
> +    ///
> +    /// `ptr`must be a valid pointer to a [`DmaFence`].
> +    unsafe fn dec_ref(ptr: NonNull<Self>) {
> +        // SAFETY: `ptr` is never a NULL pointer; and when `dec_ref()` is 
> called
> +        // the fence is by definition still valid.
> +        let fence = unsafe { (*ptr.as_ptr()).inner.get() };
> +
> +        // SAFETY: Valid because `fence` was created validly above.
> +        unsafe { bindings::dma_fence_put(fence) }
> +    }
> +}
> +
> +impl<T: DmaFenceNames> DmaFence<T> {
> +    // TODO: There could be a subtle potential problem here? The LLVM 
> compiler backend can create
> +    // several versions of this constant. Their content would be identical, 
> but their addresses
> +    // different.
> +    const OPS: bindings::dma_fence_ops = Self::ops_create();
> +
> +    /// Create an initializer for a new [`DmaFence`].
> +    fn new(
> +        fctx: Arc<DmaFenceCtx>,
> +        data: impl PinInit<T>, // TODO: The driver data should implement 
> PinInit<T, Error>
> +        lock: &Opaque<bindings::spinlock_t>,
> +        context: u64,
> +        seqno: u64,
> +    ) -> Result<ARef<Self>> {
> +        let fence = pin_init!(Self {
> +            inner <- Opaque::ffi_init(|slot: *mut bindings::dma_fence| {
> +                let lock_ptr = &raw const (*lock);
> +                // SAFETY: `slot` is a valid pointer to an uninitialized 
> `struct dma_fence`.
> +                // `lock_ptr` is a pointer to the spinlock of the fence 
> context, which is shared
> +                // among all the fences. This can't become a UAF because 
> each fence takes a
> +                // reference of the fence context.
> +                unsafe { bindings::dma_fence_init(slot, &Self::OPS, 
> Opaque::cast_into(lock_ptr), context, seqno) };
> +            }),
> +            data <- data,
> +            signalling: false,
> +            signalling_cookie: false,
> +            fctx: fctx,
> +        });
> +
> +        let b = KBox::pin_init(fence, GFP_KERNEL)?;
> +
> +        // SAFETY: We don't move the contents of `b` anywhere here. After 
> unwrapping it, ARef will
> +        // take care of preventing memory moves.
> +        let rawptr = KBox::into_raw(unsafe { Pin::into_inner_unchecked(b) });
> +
> +        // SAFETY: `rawptr` was created validly above.
> +        let aref = unsafe { ARef::from_raw(NonNull::new_unchecked(rawptr)) };
> +
> +        Ok(aref)
> +    }
> +
> +    /// Mark the beginning of a DmaFence signalling critical section. Should 
> be called once a fence
> +    /// gets published.
> +    ///
> +    /// The signalling critical section is marked as finished automatically 
> once the fence signals.
> +    pub fn begin_signalling(&mut self) {
> +        // FIXME: this needs to be mutable, obviously, but we can't borrow 
> mutably. *sigh*
> +        self.signalling = true;
> +        // TODO: Should we warn if a user tries to do this several times for 
> the same
> +        // fence? And should we ignore the request if the fence is already 
> signalled?
> +
> +        // SAFETY: `dma_fence_begin_signalling()` works on global lockdep 
> data and calling it is
> +        // always safe.
> +        self.signalling_cookie = unsafe { 
> bindings::dma_fence_begin_signalling() };
> +    }
> +
> +    const fn ops_create() -> bindings::dma_fence_ops {
> +        // SAFETY: Zeroing out memory on the stack is always safe.
> +        let mut ops: bindings::dma_fence_ops = unsafe { core::mem::zeroed() 
> };
> +
> +        ops.get_driver_name = Some(Self::get_driver_name);
> +        ops.get_timeline_name = Some(Self::get_timeline_name);
> +        ops.release = Some(Self::release);
> +
> +        ops
> +    }
> +
> +    extern "C" fn get_driver_name(_ptr: *mut bindings::dma_fence) -> *const 
> c_char {
> +        T::DRIVER_NAME.as_ptr()
> +    }
> +
> +    extern "C" fn get_timeline_name(_ptr: *mut bindings::dma_fence) -> 
> *const c_char {
> +        T::TIMELINE_NAME.as_ptr()
> +    }
> +
> +    /// The release function called by the C backend once the refcount drops 
> to 0. We use this to
> +    /// drop the Rust dma-fence, too. Since [`DmaFence`] implements 
> [`AlwaysRefCounted`], this is
> +    /// perfectly safe and a convenient way to concile the two release 
> mechanisms of C and Rust.
> +    unsafe extern "C" fn release(ptr: *mut bindings::dma_fence) {
> +        let ptr = Opaque::cast_from(ptr);
> +
> +        // SAFETY: The constructor guarantees that `ptr` is always the inner 
> fence of a DmaFence.
> +        let fence = unsafe { crate::container_of!(ptr, Self, inner) 
> }.cast_mut();
> +
> +        // SAFETY: See above. Also, the release callback will only be called 
> once, when the
> +        // refcount drops to 0, and when that happens the fence is by 
> definition still valid.
> +        unsafe { drop_in_place(fence) };
> +    }
> +
> +    /// Signal the fence. This will invoke all registered callbacks.
> +    pub fn signal(&self) -> Result {
> +        // SAFETY: `self` is refcounted.
> +        let ret = unsafe { bindings::dma_fence_signal(self.as_raw()) };
> +        if ret != 0 {
> +            return Err(Error::from_errno(ret));
> +        }
> +
> +        if self.signalling {
> +            // SAFETY: `dma_fence_end_signalling()` works on global lockdep 
> data. The only
> +            // parameter is a boolean passed by value.
> +            unsafe { 
> bindings::dma_fence_end_signalling(self.signalling_cookie) };
> +        }
> +
> +        Ok(())
> +    }
> +
> +    /// Register a callback on a [`DmaFence`]. The callback will be invoked 
> in the fence's
> +    /// signalling path, i.e., critical section.
> +    ///
> +    /// Consumes `data`. `data` is passed back to the implemented callback 
> function when the fence
> +    /// gets signalled.
> +    pub fn register_callback<U: DmaFenceCbFunc + 'static>(&self, data: impl 
> PinInit<U>) -> Result {
> +        let cb = DmaFenceCb::new(data)?;
> +        let ptr = cb.into_foreign() as *mut DmaFenceCb<U>;
> +        // SAFETY: `ptr` was created validly directly above.
> +        let inner_cb = unsafe { (*ptr).inner.get() };
> +
> +        // SAFETY: `self.as_raw()` is valid because `self` is refcounted, 
> `inner_cb` was created
> +        // validly above and was turned into a ForeignOwnable, so it won't 
> be dropped. `callback`
> +        // has static life time.
> +        let ret = unsafe {
> +            bindings::dma_fence_add_callback(
> +                self.as_raw(),
> +                inner_cb,
> +                Some(DmaFenceCb::<U>::callback),
> +            )
> +        };
> +        if ret != 0 {
> +            return Err(Error::from_errno(ret));
> +        }
> +        Ok(())
> +    }
> +
> +    fn as_raw(&self) -> *mut bindings::dma_fence {
> +        self.inner.get()
> +    }
> +}

Reply via email to