On Tue, 2026-04-28 at 15:08 +0200, Daniel Wagner wrote:
> On Mon, Apr 27, 2026 at 12:55:20PM +0200, Florian Bezdeka wrote:
> > This topic reminds me of a discussion started by Tobias [1] some time
> > ago about IRQ spreading of network drivers. The problem was (and still
> > is) that network drivers ignore any CPU isolation when spreading out
> > device IRQs.
> > 
> > In general we have two different CPU isolation mechanisms:
> >   - The static one, via isolcpus= cmdline parameter
> >   - The dynamic one, via cgroups(v2) cpuset controller
> > 
> > This series is only taking the static "world" into account, right? Are
> > there any plans to honor the CPU isolations configured the dynamic
> > way?
> 
> Dynamic configuration would require every driver to fully support
> reconfiguration during runtime. Only a handful of drivers, such as
> nvme-pci, are currently able to handle this.
> 
> The first task, teaching a wide range of drivers to honor CPU isolation
> at boot time, is already going to be a significant amount of work.
> 
> > It has been a while since the last investigations on my end. Last time I
> > went through the code, the IRQ core was completely decoupled from the
> > dynamic configuration via cgroups. Are there any plans to fix that gap?
> 
> Which use case are you actually aiming to support? While dynamic
> reconfiguration would be ideal, the amount of work to get there is
> significant. I won't be signing up for it.

The use case at hand is an RT-enabled platform where the concrete RT
workload is not known at boot time. RT applications are deployed on the
fly, nowadays using the existing container runtimes with some extended
resource management on top.

Applications can request certain resources, such as isolated CPU cores,
special IRQ affinities, or PCI devices to pass through, so that the
resource management on the system can take care of proper system
configuration.

As you said, this requires drivers to honor the system configuration
at runtime. We have already identified the IRQ spreading of the network
subsystem as problematic. And yes, fixing that will take some effort.
We are still in the phase of searching for similar problems and for a
starting point for a discussion. Any input in that direction is highly
appreciated.
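
To make the IRQ spreading problem easier to reproduce, here is a rough
userspace sketch of the kind of check that can be used to spot IRQs
whose affinity still covers isolated CPUs. It is only an illustration,
not part of any existing tool: the helper names are made up, and it
assumes the isolated set is read from /sys/devices/system/cpu/isolated
(which reflects the static isolcpus= setup) and the per-IRQ masks from
/proc/irq/<n>/smp_affinity_list. For CPUs isolated via the cpuset
controller, the isolated set would have to come from the cgroup
hierarchy instead.

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a kernel cpulist string such as '0-3,8' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list (no CPUs) yields an empty set
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def irqs_on_isolated(irq_affinity, isolated):
    """Map each IRQ to the isolated CPUs its affinity still covers."""
    return {
        irq: cpus & isolated
        for irq, cpus in irq_affinity.items()
        if cpus & isolated
    }


def read_system_state(
    proc_irq=Path("/proc/irq"),
    isolated_file=Path("/sys/devices/system/cpu/isolated"),
):
    """Hypothetical helper: collect IRQ affinities and the isolated CPU set.

    The default paths are assumptions about where this state is exposed.
    """
    isolated = parse_cpulist(isolated_file.read_text())
    affinity = {}
    for entry in proc_irq.iterdir():
        mask = entry / "smp_affinity_list"
        if entry.name.isdigit() and mask.exists():
            affinity[int(entry.name)] = parse_cpulist(mask.read_text())
    return affinity, isolated
```

Calling read_system_state() and passing both results to
irqs_on_isolated() lists the IRQs that may still be delivered to an
isolated CPU. Keeping the parsing separate from the file access means
the same comparison works for whichever source the isolated set comes
from.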
