Danilo Krummrich wrote: > On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote: > > Jason Gunthorpe wrote: > >> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote: > >> > > The latter already have robust schemes to help the driver shutdown and > >> > > end the concurrent operations. ie cancel_work_sync(), > >> > > del_timer_sync(), free_irq(), and *notifier_unregister(). > >> > > >> > One a side note, devm_request_irq() is another of the devm_* helpers > >> > that cause race conditions, as interrupt handlers can run right after > >> > .remove() returns, which drivers will most likely not handle correctly. > >> > >> Yes! You *cannot* intermix devm and non-devm approaches without > >> creating very subtle bugs exactly like this. If your subsystem does > >> not provide a "devm register" helper its drivers shouldn't use devm. > > > > I wonder if we should have a proactive debug mode that checks for > > idiomatic devres usage and flags: > > > > - registering devres actions while the driver is detached > > In Rust we already enforce this through the type system, i.e. we even fail to > compile the code when this is violated. (I.e. for all scopes that guarantee > that > a device is bound (and will remain throughout) we give out a &Device<Bound>, > which serves as a cookie.) > > In C I don't really see how that would be possible without additional locking, > as the only thing we could check is dev->driver, which obviously is racy. > > > - registering devres actions for a device with a driver that has a > > .remove() method > > This is perfectly valid, drivers might still be performing teardown operations > on the device (if it did not get hot unplugged).
Yes, this one is a soft warning. It is perfectly valid, but the first thing I look for in a device that uses devm in ->probe() and also has a ->remove() method is dependencies of the devm object on the ->remove() managed object. That case is potentially mitigated if all resources are devm acquired and no ->remove() is needed. > > - passing a devres allocation to a kobject API > > Well, this is an ownership violation. Technically, devres owns this allocation > and devres releases the allocation when the device is unbound. Which makes > this > allocation only ever valid to access from a scope that is guaranteed that the > device is (and remains) bound. To be clear I am talking about: dev2 = devm_kzalloc(dev1...) init(dev2) device_register(dev2) ...where it must be valid past unbind of dev1. > > - invoking devres release actions from a kobject release API > > This is similar to "registering devres actions while the driver is detached" > and > falls into the boarder problem class of "messing with devres objects outside > of > a Device<Bound> scope". > > Again, I think in C we can't properly protect against this. While we could > also > give out a "Device<Bound>" token for scopes where we can guarantee that the > device is actually bound to a driver in C, we can't control the lifetime of > the > token as we can in Rust, which makes it pointless. This is why Rust remains on my, learn when I have time, pile. The goal would not be to "properly protect", but "sufficiently warn" the unsuspecting. If anything is going to get me to convert a subsystem to Rust it is to get help from the compiler for the review burden of the above abuses. > So, the best thing we can probably do is document that "This must only be > called > from a scope that guarantees that the device remains bound throughout." for > all > the devres accessors. Agree. > There may be one thing that comes to my mind that we could do though. While we > can't catch those at compile time, we might be able to catch those on runtime. > > For instance, we could "abuse" lockdep and register a fake lock for a > Device<Bound> scope and put a lockdep assert into all the devres accessors. > > However, fixing up all the violations that come up when introducing this > sounds > like a huge pain. :) Right, and as you said there are many valid uses that are not typically recommended that would make the warning not useful.
