On Tue, Feb 18, 2025 at 05:23:52PM +0100, Salvatore Bonaccorso wrote:
> > > > Microsoft has observed that the 5.10.y kernels in bullseye are susceptible
> > > > to crashes due to race conditions in the NVME/PCI subsystem.  See below for
> > > > a representative kernel log.  The problem appears most frequently in larger
> > > > systems, e.g. with 4 or more NVME devices and >= 64 CPUs, but it could
> > > > potentially occur on smaller systems as well.
> > > > 
> > > > The issue was fixed with the 5.14 kernel upstream in e4b9852a0 ("nvme-pci:
> > > > fix multiple races in nvme_setup_io_queues"), so this only impacts
> > > > oldstable.  I have provided a backport of this commit upstream in
> > > > https://lore.kernel.org/stable/E1tj8vO-00471h-2H@lore/
> > > > 
> > > > I'm requesting that this commit be included in a bullseye kernel update.
> > > 
> > > AFAICS, this backport has not been accepted back then for 5.10.y. Can
> > > you re-ping upstream to make sure it gets included in the 5.10.y
> > > series? Once this has happened, as we follow the 5.10.y series, it will
> > > be included (or can be included in advance once it has been queued).
> > 
> > Yes, I forgot to reset the date on the commit that I sent upstream,
> > which is why it looks like it's been around since 2021.  I requested
> > that upstream apply the fix to 5.10.y last week, and will ping them in
> > another week or two if it hasn't been acknowledged either way...
> 
> Noah, thanks!

This has now been queued for acceptance upstream.

noah
