On Thu, 2025-09-18 at 15:11 +0200, Christian König wrote:
> Oh, now it gets funny.
For the Rust compiler maybe ^^'

dma_fence is a tough nut to crack because of all its callbacks and
strict rules <.<

> 
> Adding Tvrtko as well since he recently looked into making dma_fences
> more robust regarding context lifetime.
> 
> On 18.09.25 14:30, Philipp Stanner wrote:
> > dma_fence is a synchronization mechanism which is needed by virtually
> > all GPU drivers.
> > 
> > A dma_fence offers many features, among which the most important ones
> > are registering callbacks (for example to kick off a work item) which
> > get executed once a fence gets signalled.
> > 
> > dma_fence has a number of callbacks. Only the two most basic ones
> > (get_driver_name(), get_timeline_name()) are abstracted, since they
> > are enough to enable the basic functionality.
> > 
> > Callbacks in Rust are registered by passing driver data which
> > implements the Rust callback trait, whose function will be called by
> > the C backend.
> > 
> > dma_fences are always refcounted, so implement AlwaysRefCounted for
> > them. Once the reference count drops to zero, the C backend calls a
> > release function, in which we call drop_in_place() to conveniently
> > marry that C cleanup mechanism with Rust's ownership concepts.
> > 
> > This patch provides basic functionality, but is still missing:
> >   - An implementation of PinInit<T, Error> for all driver data.
> >   - A clever implementation for working dma_fence_begin_signalling()
> >     guards. See the corresponding TODO in the code.
> >   - Additional useful helper functions such as dma_fence_is_signaled().
> > These _should_ be relatively trivial to implement, though.
> > 
> > Signed-off-by: Philipp Stanner <pha...@kernel.org>
> > ---
> > So. ¡Hola!
> > 
> > This is a highly WIP RFC. It's obviously at many places not yet
> > conforming very well to Rust's standards.
> > 
> > Nevertheless, it has progressed enough that I want to request comments
> > from the community.
> > 
> > There are a number of TODOs in the code for which I need input.
> > 
> > Notably, it seems (half-)illegal to use a shared static reference to
> > an Atomic, which I currently use for the dma_fence unit test /
> > docstring test. I'm willing to rework that if someone suggests how.
> > (Still, shouldn't changing a global Atomic always be legal? It can
> > race, of course. But that's kind of the point of an atomic.)
> > 
> > What I want comments on the most is the design of the callbacks. I
> > think it's a great opportunity to provide Rust drivers with Rust-only
> > callbacks, so that they don't have to bother with the C functions.
> > 
> > dma_fence-wise, only the most basic callbacks are implemented for now.
> > For Nova, AFAICS, we don't need much more than signalling fences and
> > registering callbacks.
> > 
> > 
> > Another, solvable, issue I'm having is designing the
> > dma_fence_begin_signalling() abstractions. There are TODOs about that
> > in the code. That should ideally be robust and not racy, so we might
> > want some sort of synchronized (locked?) way of using that
> > abstraction.
> > 
> > 
> > Regarding the manually created spinlock of mine: I so far never need
> > that spinlock anywhere in Rust and wasn't sure what's then the best
> > way to pass a "raw" spinlock to C.
> 
> I was hoping to remove that anyway. Maybe it's time to do so now.

Remove what, the spinlock pointer from struct dma_fence? Or were you
trying to quote the section above about dma_fence_begin_signalling()?

If the lock were to be removed, would the driver then be responsible
for locking through its fence context?

Two rough sketches (for the doctest Atomic and for a possible
begin_signalling() guard) follow below my sign-off.

P.
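
Regarding the static Atomic in the doctest: as far as I can tell, a
non-`mut` static Atomic is plain safe Rust, since store() and load()
only take `&self`; no `static mut` and no `unsafe` should be needed.
A minimal sketch of that idea (CHECKER is just the doctest helper name
from the patch, nothing official):

use core::sync::atomic::{AtomicBool, Ordering};

static CHECKER: AtomicBool = AtomicBool::new(false);

fn fence_callback_side() {
    // Atomics are interiorly mutable, so a shared static is enough.
    CHECKER.store(true, Ordering::Relaxed);
}

fn test_side() {
    // The doctest only has to read the flag back after signalling.
    assert!(CHECKER.load(Ordering::Relaxed));
}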
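
And for the signalling critical section, one possible shape would be an
RAII guard that owns the lockdep cookie, so the section cannot be left
open by accident. Just a sketch, the name SignallingGuard is made up:

/// Guard marking a dma_fence signalling critical section. The section
/// ends when the guard is dropped.
pub struct SignallingGuard {
    cookie: bool,
}

impl SignallingGuard {
    pub fn new() -> Self {
        // SAFETY: dma_fence_begin_signalling() only touches global
        // lockdep state and is always safe to call.
        Self { cookie: unsafe { bindings::dma_fence_begin_signalling() } }
    }
}

impl Drop for SignallingGuard {
    fn drop(&mut self) {
        // SAFETY: The cookie was handed out by
        // dma_fence_begin_signalling() in the constructor.
        unsafe { bindings::dma_fence_end_signalling(self.cookie) };
    }
}

A driver would take the guard when publishing the fence and let it go
out of scope after signalling; that would also sidestep the "can't
borrow mutably" FIXME in begin_signalling() below.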
> > Anyway I will need some time to look into it deeper. > > Regards, > Christian. > > > > > > > So much from my side. Hope to hear from you. > > > > (I've compiled and tested this with the unit test on the current -rc3) > > > > Philipp > > --- > > rust/bindings/bindings_helper.h | 1 + > > rust/helpers/dma_fence.c | 23 ++ > > rust/helpers/helpers.c | 1 + > > rust/helpers/spinlock.c | 5 + > > rust/kernel/sync.rs | 2 + > > rust/kernel/sync/dma_fence.rs | 388 ++++++++++++++++++++++++++++++++ > > 6 files changed, 420 insertions(+) > > create mode 100644 rust/helpers/dma_fence.c > > create mode 100644 rust/kernel/sync/dma_fence.rs > > > > diff --git a/rust/bindings/bindings_helper.h > > b/rust/bindings/bindings_helper.h > > index 84d60635e8a9..107cb6b6f4a4 100644 > > --- a/rust/bindings/bindings_helper.h > > +++ b/rust/bindings/bindings_helper.h > > @@ -48,6 +48,7 @@ > > #include <linux/cred.h> > > #include <linux/device/faux.h> > > #include <linux/dma-mapping.h> > > +#include <linux/dma-fence.h> > > #include <linux/errname.h> > > #include <linux/ethtool.h> > > #include <linux/file.h> > > diff --git a/rust/helpers/dma_fence.c b/rust/helpers/dma_fence.c > > new file mode 100644 > > index 000000000000..a9fc4829bbae > > --- /dev/null > > +++ b/rust/helpers/dma_fence.c > > @@ -0,0 +1,23 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > + > > +#include <linux/dma-fence.h> > > + > > +void rust_helper_dma_fence_get(struct dma_fence *f) > > +{ > > + dma_fence_get(f); > > +} > > + > > +void rust_helper_dma_fence_put(struct dma_fence *f) > > +{ > > + dma_fence_put(f); > > +} > > + > > +bool rust_helper_dma_fence_begin_signalling(void) > > +{ > > + return dma_fence_begin_signalling(); > > +} > > + > > +void rust_helper_dma_fence_end_signalling(bool cookie) > > +{ > > + dma_fence_end_signalling(cookie); > > +} > > diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c > > index 7cf7fe95e41d..99a7f7834c03 100644 > > --- a/rust/helpers/helpers.c > > +++ b/rust/helpers/helpers.c > > @@ -20,6 +20,7 @@ > > #include "cred.c" > > #include "device.c" > > #include "dma.c" > > +#include "dma_fence.c" > > #include "drm.c" > > #include "err.c" > > #include "fs.c" > > diff --git a/rust/helpers/spinlock.c b/rust/helpers/spinlock.c > > index 42c4bf01a23e..017ac447ebbd 100644 > > --- a/rust/helpers/spinlock.c > > +++ b/rust/helpers/spinlock.c > > @@ -16,6 +16,11 @@ void rust_helper___spin_lock_init(spinlock_t *lock, > > const char *name, > > #endif /* CONFIG_DEBUG_SPINLOCK */ > > } > > > > +void rust_helper_spin_lock_init(spinlock_t *lock) > > +{ > > + spin_lock_init(lock); > > +} > > + > > void rust_helper_spin_lock(spinlock_t *lock) > > { > > spin_lock(lock); > > diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs > > index 00f9b558a3ad..6e59526020bc 100644 > > --- a/rust/kernel/sync.rs > > +++ b/rust/kernel/sync.rs > > @@ -12,6 +12,7 @@ > > mod arc; > > pub mod aref; > > pub mod completion; > > +pub mod dma_fence; > > mod condvar; > > pub mod lock; > > mod locked_by; > > @@ -20,6 +21,7 @@ > > > > pub use arc::{Arc, ArcBorrow, UniqueArc}; > > pub use completion::Completion; > > +pub use dma_fence::{DmaFence, DmaFenceCtx, DmaFenceNames, DmaFenceCb, > > DmaFenceCbFunc}; > > pub use condvar::{new_condvar, CondVar, CondVarTimeoutResult}; > > pub use lock::global::{global_lock, GlobalGuard, GlobalLock, > > GlobalLockBackend, GlobalLockedBy}; > > pub use lock::mutex::{new_mutex, Mutex, MutexGuard}; > > diff --git a/rust/kernel/sync/dma_fence.rs b/rust/kernel/sync/dma_fence.rs > > new file mode 100644 > > index 
000000000000..a00bb10b2208 > > --- /dev/null > > +++ b/rust/kernel/sync/dma_fence.rs > > @@ -0,0 +1,388 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > + > > +//! DmaFence support. > > +//! > > +//! Reference: > > <https://docs.kernel.org/driver-api/dma-buf.html#c.dma_fence> > > +//! > > +//! C header: > > [`include/linux/dma-fence.h`](srctree/include/linux/dma-fence.h) > > + > > +use crate::{ > > + bindings, > > + prelude::*, > > + str::CStr, > > + types::ForeignOwnable, > > + types::{ARef, AlwaysRefCounted, Opaque}, > > +}; > > + > > +use core::{ > > + ptr::{drop_in_place, NonNull}, > > + sync::atomic::{AtomicU64, Ordering}, > > +}; > > + > > +use kernel::sync::{Arc, ArcBorrow}; > > + > > +/// The C dma_fence backend functions ask for certain name parameters at > > times. In order to avoid > > +/// storing those names in [`DmaFence`] (there can be millions of fences > > at the same time, wasting > > +/// much memory), the driver must provide associated constants which the C > > backend will access, > > +/// ultimately. > > +pub trait DmaFenceNames { > > + /// The driver's name. > > + const DRIVER_NAME: &CStr; > > + /// The name of the timeline the fence is associated with. > > + const TIMELINE_NAME: &CStr; > > +} > > + > > +/// Defines the callback function the dma-fence backend will call once the > > fence gets signalled. > > +pub trait DmaFenceCbFunc { > > + /// The callback function. `cb` is a container of the data which the > > driver passed in > > + /// [`DmaFence::register_callback`]. > > + fn callback(cb: Pin<KBox<DmaFenceCb<Self>>>) > > + where > > + Self: Sized; > > +} > > + > > +/// Container for driver data which the driver gets back in its callback > > once the fence gets > > +/// signalled. > > +#[pin_data] > > +pub struct DmaFenceCb<T: DmaFenceCbFunc> { > > + /// C struct needed for the backend. > > + #[pin] > > + inner: Opaque<bindings::dma_fence_cb>, > > + /// Driver data. > > + #[pin] > > + pub data: T, > > +} > > + > > +impl<T: DmaFenceCbFunc + 'static> DmaFenceCb<T> { > > + fn new(data: impl PinInit<T>) -> Result<Pin<KBox<Self>>> { > > + let cb = try_pin_init!(Self { > > + inner: Opaque::zeroed(), // This gets initialized by the C > > backend. > > + data <- data, > > + }); > > + > > + KBox::pin_init(cb, GFP_KERNEL) > > + } > > + > > + /// Callback for the C dma_fence backend. > > + /// > > + /// # Safety > > + /// All data used and cast in this function was validly created by > > + /// [`DmaFence::register_callback`] and isn't modified by the C > > backend until this callback > > + /// here has run. > > + unsafe extern "C" fn callback( > > + _fence_ptr: *mut bindings::dma_fence, > > + cb_ptr: *mut bindings::dma_fence_cb, > > + ) { > > + let cb_ptr = Opaque::cast_from(cb_ptr); > > + > > + // SAFETY: The constructor guarantees that `cb_ptr` is always > > `inner` of a DmaFenceCb. > > + let cb_ptr = unsafe { crate::container_of!(cb_ptr, Self, inner) > > }.cast_mut() as *mut c_void; > > + // SAFETY: `cp_ptr` is the heap memory of a Pin<Kbox<Self>> > > because it was created by > > + // invoking ForeignOwnable::into_foreign() on such an instance. > > + let cb = unsafe { <Pin<KBox<Self>> as > > ForeignOwnable>::from_foreign(cb_ptr) }; > > + > > + // Pass ownership back over to the driver. > > + T::callback(cb); > > + } > > +} > > + > > +/// A dma-fence context. A fence context takes care of associating related > > fences with each other, > > +/// providing each with raising sequence numbers and a common identifier. 
> > +#[pin_data] > > +pub struct DmaFenceCtx { > > + /// An opaque spinlock. Only ever passed to the C backend, never used > > by Rust. > > + #[pin] > > + lock: Opaque<bindings::spinlock_t>, > > + /// The fence context number. > > + nr: u64, > > + /// The sequence number for the next fence created. > > + seqno: AtomicU64, > > +} > > + > > +impl DmaFenceCtx { > > + /// Create a new `DmaFenceCtx`. > > + pub fn new() -> Result<Arc<Self>> { > > + let ctx = pin_init!(Self { > > + // Feed in a non-Rust spinlock for now, since the Rust side > > never needs the lock. > > + lock <- Opaque::ffi_init(|slot: *mut bindings::spinlock| { > > + // SAFETY: `slot` is a valid pointer to an uninitialized > > `struct spinlock_t`. > > + unsafe { bindings::spin_lock_init(slot) }; > > + }), > > + // SAFETY: `dma_fence_context_alloc()` merely works on a > > global atomic. Parameter `1` > > + // is the number of contexts we want to allocate. > > + nr: unsafe { bindings::dma_fence_context_alloc(1) }, > > + seqno: AtomicU64::new(0), > > + }); > > + > > + Arc::pin_init(ctx, GFP_KERNEL) > > + } > > + > > + fn get_new_fence_seqno(&self) -> u64 { > > + self.seqno.fetch_add(1, Ordering::Relaxed) > > + } > > +} > > + > > +impl ArcBorrow<'_, DmaFenceCtx> { > > + /// Create a new fence, consuming `data`. > > + /// > > + /// The fence will increment the refcount of the fence context > > associated with this > > + /// [`DmaFenceCtx`]. > > + pub fn new_fence<T: DmaFenceNames>( > > + &mut self, > > + data: impl PinInit<T>, > > + ) -> Result<ARef<DmaFence<T>>> { > > + let fctx: Arc<DmaFenceCtx> = (*self).into(); > > + let seqno: u64 = fctx.get_new_fence_seqno(); > > + > > + // TODO: Should we reset seqno in case of failure? > > + // Pass `fctx` by value so that the fence will hold a reference to > > the DmaFenceCtx as long > > + // as it lives. > > + DmaFence::new(fctx, data, &self.lock, self.nr, seqno) > > + } > > +} > > + > > +/// A synchronization primitive mainly for GPU drivers. > > +/// > > +/// DmaFences are always reference counted. The typical use case is that > > one side registers > > +/// callbacks on the fence which will perform a certain action (such as > > queueing work) once the > > +/// other side signals the fence. > > +/// > > +/// # Examples > > +/// > > +/// ``` > > +/// use kernel::c_str; > > +/// use kernel::sync::{Arc, ArcBorrow, DmaFence, DmaFenceCtx, > > DmaFenceNames, DmaFenceCb, DmaFenceCbFunc}; > > +/// use core::sync::atomic::{self, AtomicBool}; > > +/// > > +/// static mut CHECKER: AtomicBool = AtomicBool::new(false); > > +/// > > +/// struct CallbackData { > > +/// i: u32, > > +/// } > > +/// > > +/// impl CallbackData { > > +/// fn new() -> Self { > > +/// Self { i: 9 } > > +/// } > > +/// } > > +/// > > +/// impl DmaFenceCbFunc for CallbackData { > > +/// fn callback(cb: Pin<KBox<DmaFenceCb<Self>>>) where Self: Sized { > > +/// assert_eq!(cb.data.i, 9); > > +/// // SAFETY: Just to have an easy way for testing. This cannot > > race with the checker > > +/// // because the fence signalling callbacks are executed > > synchronously. 
> > +/// unsafe { CHECKER.store(true, atomic::Ordering::Relaxed); } > > +/// } > > +/// } > > +/// > > +/// struct DriverData { > > +/// i: u32, > > +/// } > > +/// > > +/// impl DriverData { > > +/// fn new() -> Self { > > +/// Self { i: 5 } > > +/// } > > +/// } > > +/// > > +/// impl DmaFenceNames for DriverData { > > +/// const DRIVER_NAME: &CStr = c_str!("DUMMY_DRIVER"); > > +/// const TIMELINE_NAME: &CStr = c_str!("DUMMY_TIMELINE"); > > +/// } > > +/// > > +/// let data = DriverData::new(); > > +/// let fctx = DmaFenceCtx::new()?; > > +/// > > +/// let mut fence = fctx.as_arc_borrow().new_fence(data)?; > > +/// > > +/// let cb_data = CallbackData::new(); > > +/// fence.register_callback(cb_data); > > +/// // fence.begin_signalling(); > > +/// fence.signal()?; > > +/// // Now check wehether the callback was actually executed. > > +/// // SAFETY: `fence.signal()` above works sequentially. We just check > > here whether the signalling > > +/// // actually did set the boolean correctly. > > +/// unsafe { assert_eq!(CHECKER.load(atomic::Ordering::Relaxed), true); } > > +/// > > +/// Ok::<(), Error>(()) > > +/// ``` > > +#[pin_data] > > +pub struct DmaFence<T> { > > + /// The actual dma_fence passed to C. > > + #[pin] > > + inner: Opaque<bindings::dma_fence>, > > + /// User data. > > + #[pin] > > + data: T, > > + /// Marks whether the fence is currently in the signalling critical > > section. > > + signalling: bool, > > + /// A boolean needed for the C backend's lockdep guard. > > + signalling_cookie: bool, > > + /// A reference to the associated [`DmaFenceCtx`] so that it cannot be > > dropped while there are > > + /// still fences around. > > + fctx: Arc<DmaFenceCtx>, > > +} > > + > > +// SAFETY: `DmaFence` is safe to be sent to any task. > > +unsafe impl<T> Send for DmaFence<T> {} > > + > > +// SAFETY: `DmaFence` is safe to be accessed concurrently. > > +unsafe impl<T> Sync for DmaFence<T> {} > > + > > +// SAFETY: These implement the C backends refcounting methods which are > > proven to work correctly. > > +unsafe impl<T: DmaFenceNames> AlwaysRefCounted for DmaFence<T> { > > + fn inc_ref(&self) { > > + // SAFETY: `self.as_raw()` is a pointer to a valid `struct > > dma_fence`. > > + unsafe { bindings::dma_fence_get(self.as_raw()) } > > + } > > + > > + /// # Safety > > + /// > > + /// `ptr`must be a valid pointer to a [`DmaFence`]. > > + unsafe fn dec_ref(ptr: NonNull<Self>) { > > + // SAFETY: `ptr` is never a NULL pointer; and when `dec_ref()` is > > called > > + // the fence is by definition still valid. > > + let fence = unsafe { (*ptr.as_ptr()).inner.get() }; > > + > > + // SAFETY: Valid because `fence` was created validly above. > > + unsafe { bindings::dma_fence_put(fence) } > > + } > > +} > > + > > +impl<T: DmaFenceNames> DmaFence<T> { > > + // TODO: There could be a subtle potential problem here? The LLVM > > compiler backend can create > > + // several versions of this constant. Their content would be > > identical, but their addresses > > + // different. > > + const OPS: bindings::dma_fence_ops = Self::ops_create(); > > + > > + /// Create an initializer for a new [`DmaFence`]. 
> > + fn new( > > + fctx: Arc<DmaFenceCtx>, > > + data: impl PinInit<T>, // TODO: The driver data should implement > > PinInit<T, Error> > > + lock: &Opaque<bindings::spinlock_t>, > > + context: u64, > > + seqno: u64, > > + ) -> Result<ARef<Self>> { > > + let fence = pin_init!(Self { > > + inner <- Opaque::ffi_init(|slot: *mut bindings::dma_fence| { > > + let lock_ptr = &raw const (*lock); > > + // SAFETY: `slot` is a valid pointer to an uninitialized > > `struct dma_fence`. > > + // `lock_ptr` is a pointer to the spinlock of the fence > > context, which is shared > > + // among all the fences. This can't become a UAF because > > each fence takes a > > + // reference of the fence context. > > + unsafe { bindings::dma_fence_init(slot, &Self::OPS, > > Opaque::cast_into(lock_ptr), context, seqno) }; > > + }), > > + data <- data, > > + signalling: false, > > + signalling_cookie: false, > > + fctx: fctx, > > + }); > > + > > + let b = KBox::pin_init(fence, GFP_KERNEL)?; > > + > > + // SAFETY: We don't move the contents of `b` anywhere here. After > > unwrapping it, ARef will > > + // take care of preventing memory moves. > > + let rawptr = KBox::into_raw(unsafe { Pin::into_inner_unchecked(b) > > }); > > + > > + // SAFETY: `rawptr` was created validly above. > > + let aref = unsafe { ARef::from_raw(NonNull::new_unchecked(rawptr)) > > }; > > + > > + Ok(aref) > > + } > > + > > + /// Mark the beginning of a DmaFence signalling critical section. > > Should be called once a fence > > + /// gets published. > > + /// > > + /// The signalling critical section is marked as finished > > automatically once the fence signals. > > + pub fn begin_signalling(&mut self) { > > + // FIXME: this needs to be mutable, obviously, but we can't borrow > > mutably. *sigh* > > + self.signalling = true; > > + // TODO: Should we warn if a user tries to do this several times > > for the same > > + // fence? And should we ignore the request if the fence is already > > signalled? > > + > > + // SAFETY: `dma_fence_begin_signalling()` works on global lockdep > > data and calling it is > > + // always safe. > > + self.signalling_cookie = unsafe { > > bindings::dma_fence_begin_signalling() }; > > + } > > + > > + const fn ops_create() -> bindings::dma_fence_ops { > > + // SAFETY: Zeroing out memory on the stack is always safe. > > + let mut ops: bindings::dma_fence_ops = unsafe { > > core::mem::zeroed() }; > > + > > + ops.get_driver_name = Some(Self::get_driver_name); > > + ops.get_timeline_name = Some(Self::get_timeline_name); > > + ops.release = Some(Self::release); > > + > > + ops > > + } > > + > > + extern "C" fn get_driver_name(_ptr: *mut bindings::dma_fence) -> > > *const c_char { > > + T::DRIVER_NAME.as_ptr() > > + } > > + > > + extern "C" fn get_timeline_name(_ptr: *mut bindings::dma_fence) -> > > *const c_char { > > + T::TIMELINE_NAME.as_ptr() > > + } > > + > > + /// The release function called by the C backend once the refcount > > drops to 0. We use this to > > + /// drop the Rust dma-fence, too. Since [`DmaFence`] implements > > [`AlwaysRefCounted`], this is > > + /// perfectly safe and a convenient way to concile the two release > > mechanisms of C and Rust. > > + unsafe extern "C" fn release(ptr: *mut bindings::dma_fence) { > > + let ptr = Opaque::cast_from(ptr); > > + > > + // SAFETY: The constructor guarantees that `ptr` is always the > > inner fence of a DmaFence. > > + let fence = unsafe { crate::container_of!(ptr, Self, inner) > > }.cast_mut(); > > + > > + // SAFETY: See above. 
Also, the release callback will only be > > called once, when the > > + // refcount drops to 0, and when that happens the fence is by > > definition still valid. > > + unsafe { drop_in_place(fence) }; > > + } > > + > > + /// Signal the fence. This will invoke all registered callbacks. > > + pub fn signal(&self) -> Result { > > + // SAFETY: `self` is refcounted. > > + let ret = unsafe { bindings::dma_fence_signal(self.as_raw()) }; > > + if ret != 0 { > > + return Err(Error::from_errno(ret)); > > + } > > + > > + if self.signalling { > > + // SAFETY: `dma_fence_end_signalling()` works on global > > lockdep data. The only > > + // parameter is a boolean passed by value. > > + unsafe { > > bindings::dma_fence_end_signalling(self.signalling_cookie) }; > > + } > > + > > + Ok(()) > > + } > > + > > + /// Register a callback on a [`DmaFence`]. The callback will be > > invoked in the fence's > > + /// signalling path, i.e., critical section. > > + /// > > + /// Consumes `data`. `data` is passed back to the implemented callback > > function when the fence > > + /// gets signalled. > > + pub fn register_callback<U: DmaFenceCbFunc + 'static>(&self, data: > > impl PinInit<U>) -> Result { > > + let cb = DmaFenceCb::new(data)?; > > + let ptr = cb.into_foreign() as *mut DmaFenceCb<U>; > > + // SAFETY: `ptr` was created validly directly above. > > + let inner_cb = unsafe { (*ptr).inner.get() }; > > + > > + // SAFETY: `self.as_raw()` is valid because `self` is refcounted, > > `inner_cb` was created > > + // validly above and was turned into a ForeignOwnable, so it won't > > be dropped. `callback` > > + // has static life time. > > + let ret = unsafe { > > + bindings::dma_fence_add_callback( > > + self.as_raw(), > > + inner_cb, > > + Some(DmaFenceCb::<U>::callback), > > + ) > > + }; > > + if ret != 0 { > > + return Err(Error::from_errno(ret)); > > + } > > + Ok(()) > > + } > > + > > + fn as_raw(&self) -> *mut bindings::dma_fence { > > + self.inner.get() > > + } > > +} >