On Sun, 03 May 2026 20:25:03 +0100
Gary Guo <[email protected]> wrote:

> On Sun May 3, 2026 at 4:39 AM BST, Onur Özkan wrote:
> >> > +/// Sleepable read-copy update primitive.
> >> > +///
> >> > +/// SRCU readers may sleep while holding the read-side guard.
> >> > +///
> >> > +/// The destructor may sleep.
> >> > +///
> >> > +/// # Invariants
> >> > +///
> >> > +/// This represents a valid `struct srcu_struct` initialized by the C 
> >> > SRCU API
> >> > +/// and it remains pinned and valid until the pinned destructor runs.
> >> > +#[repr(transparent)]
> >> > +#[pin_data(PinnedDrop)]
> >> > +pub struct Srcu {
> >> > +    #[pin]
> >> > +    inner: Opaque<bindings::srcu_struct>,
> >> > +}
> >> > +
> >> > +impl Srcu {
> >> > +    /// Creates a new SRCU instance.
> >> > +    #[inline]
> >> > +    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> 
> >> > impl PinInit<Self, Error> {
> >> > +        try_pin_init!(Self {
> >> > +            inner <- Opaque::try_ffi_init(|ptr: *mut 
> >> > bindings::srcu_struct| {
> >> > +                // SAFETY: `ptr` points to valid uninitialised memory 
> >> > for a `srcu_struct`.
> >> > +                to_result(unsafe {
> >> > +                    bindings::init_srcu_struct_with_key(ptr, 
> >> > name.as_char_ptr(), key.as_ptr())
> >> > +                })
> >> > +            }),
> >> > +        })
> >> > +    }
> >> > +
> >> > +    /// Enters an SRCU read-side critical section.
> >> > +    ///
> >> > +    /// # Safety
> >> > +    ///
> >> > +    /// The returned [`Guard`] must not be leaked. Leaking it with 
> >> > [`core::mem::forget`]
> >> > +    /// leaves the SRCU read-side critical section active.
> >> 
> >> I generally would prefer if we could use guard-like API instead of forcing 
> >> a
> >> callback.
> >
> > Me too and developers can still do that. I think the safety contract here is
> > very simple to handle. It's essentially this:
> >
> >     // SAFETY: Guard is not leaked.
> >     let _guard = unsafe { x.read_lock() };
> >
> > To me it's very simple and straightforward for both the developer and the
> > reviewer. It doesn't add any overhead to the implementation and it ensures
> > that the developer (and later the reviewer) is aware of the potential issue.
> >
> > Of course, there's also the safe option if the developer is happy with
> > closure-based API:
> >
> >     x.with_read_lock(|_guard| {
> >             ...
> >     });
> >
> > So it allows you to use the guard-based approach directly with the 
> > requirement
> > of a safety comment and also provides a safe API for developers who don't 
> > want
> > to deal with that. I am not sure if you fall into the third category, which 
> > is
> > "I don't like writing safety comments and I don't like the closure-based
> > approach" :)
> 
> We have been avoiding using callback-based API if there's an alternatively way
> to achieve this. There has been quite a very precedents with this:
> 
> - spin_lock_irqsave requires taking and releasing in correct order, which is
>   easy to solve with a callback approach. The same logic reasoning can be used
>   to provide an unsafe API + safe callback API, but Boqun & Lyude reworked the
>   spinlock IRQ design so we don't need that anymore.
> 
> - `Task::current` API could easily be replaced callback-based approach, but we
>   used a macro to achieve without unsafe.
> 
> The API here is not inherently impossible to use guard API. It's not safe 
> today
> because a very specific detail. The callback-API is the "path of least
> resistence" approach, but it's not the optimal one.
> 
> >
> >> 
> >> I suppose the only reason that this is unsafe is the "just leak it" 
> >> condition
> >> when cleaning up SRCU struct, which skips cleaning up delayed work, which 
> >> could
> >> call into `process_srcu`, which accesses `srcu_struct`. This however is 
> >> *not*
> >> leaked, because it's controlled by the user. Only the auxillary data 
> >> allocated
> >> by SRCU is leaked. So UAF is going to happen.
> >> 
> >> So in some aspect, the leak caused by `srcu_readers_active(ssp)` can cause 
> >> more
> >> damage compared to just continuing cleanup despite active users? I think 
> >> this
> >> could be changed in one of these ways:
> >> * Have SRCU allocate all memory instead, and user-side would just have a
> >>   `struct srcu_struct*`; then leaking would be safe. This is probably a bit
> >>   difficult to change because it affects many users.
> >
> > We could do the same on the Rust side only. Basically instead of embedding
> > srcu_struct in Rust srcu, allocate it separately and store its pointer. 
> > Then,
> > if cleanup hits the active reader case, we could leak that allocation so any
> > remaining srcu work does not hit UAF. I was aware of this option but I would
> > prefer to avoid it because it adds an extra allocation for every Rust srcu.
> >
> >> * Continue to flush delayed work and stop the timer, and then leak before 
> >> the
> >>   actual kfree happens.
> >
> > Can you say more? I didn't understand this particular solution.
> 
> I was thinking that doing this _might_ be sufficient. I have to admit that 
> I've
> not very familar with the internal implementation of SRCU to make it an
> assertion though.
> 
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0d01cd8c4b4a..5d75a4dbb6e5 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -717,8 +717,6 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
>       raw_spin_unlock_irq_rcu_node(ssp->srcu_sup);
>       if (WARN_ON(!delay))
>               return; /* Just leak it! */
> -     if (WARN_ON(srcu_readers_active(ssp)))
> -             return; /* Just leak it! */
>       /* Wait for irq_work to finish first as it may queue a new work. */
>       irq_work_sync(&sup->irq_work);
>       flush_delayed_work(&sup->work);
> 
> But after taking another look, I am not even sure if this is needed. A quick
> glance of the code it appears that __srcu_read_unlock doesn't do anything 
> apart
> from adjusting the counter, and the SRCU grace period and thus the timers 
> won't
> actually start unless there's a pending grace period, which won't start unless
> there's a call_srcu or sychronize_srcu. And we *know* that none of them would
> happen, as the lifetime guarantees that nothing accesses the `Srcu` struct 
> when
> `drop` starts, and inside drop we have already invoked `srcu_barrier()`.
> 
> So I think, even if we hit the "Just leak it" scenario, we can still safely
> deallocate the backing storage of `srcu_struct` and nothing should break?
> 
> >
> >> * Trigger a `BUG` when the leak condition is hit for Rust users.
> >
> > We need an atomic counter to detect the leak and I thought that would be too
> > much overhead for this abstraction. Basically each lock and drop will call 
> > an
> > atomic operation so.
> 
> You could just check if srcu_sup is NULL after calling `cleanup_srcu_struct`.
> 
> Best,
> Gary
> 
> >
> >> * Declare the `WARN_ON` to be a sufficient protection and say this can be
> >>   considered safe. Kinda similar to the strategy we take to the
> >>   sleep-inside-atomic context issue.
> >
> > I think this is a rather weak precaution.
> >
> 

Alright, here is the alternative approach I just figured and I think this makes
the most sense compared to the ones we discussed in this series:
        
        #[pinned_drop]
        impl PinnedDrop for Srcu {
                fn drop(self: Pin<&mut Self>) {
                        let ptr = self.inner.get();

                        // `cleanup_srcu_struct()` may return early if readers 
are still active. Because `Srcu`
                        // owns the embedded `srcu_struct`, returning from 
`drop` in that state could free memory
                        // that is still referenced by the C side.
                        //
                        // Wait for all readers to complete first. If any 
`Guard` was leaked, `synchronize_srcu()`
                        // will sleep forever.
                        //
                        // SAFETY: By the type invariants, `self` contains a 
valid and pinned `struct srcu_struct`.
                        unsafe { bindings::synchronize_srcu(ptr) };

                        // ...

(the code snippet above is copied from [1])

As explained in the doc comments, we avoid the potential UAFs by sleeping
forever which is considered non-unsafe, right? Alice said during today's
call that "sleeping is not good, but it is not unsafe". If that's the case,
I think this fixes the overall problem and we can make read_unlock safe again.

What do you think? I will hold the next version until we reach a common point.

[1]: https://github.com/onur-ozkan/linux/commit/28d9739f03

- Onur

Reply via email to