Hi,

On Tue, Feb 18, 2025 at 09:56:53AM -0500, Noah Meyerhans wrote:
> On Tue, Feb 18, 2025 at 03:11:08PM +0100, Salvatore Bonaccorso wrote:
> > > Microsoft has observed that the 5.10.y kernels in bullseye are susceptible
> > > to crashes due to race conditions in the NVME/PCI subsystem.  See below
> > > for a representative kernel log.  The problem appears most frequently in
> > > larger systems, e.g. with 4 or more NVME devices and >= 64 CPUs, but it
> > > could potentially occur on smaller systems as well.
> > > 
> > > The issue was fixed with the 5.14 kernel upstream in e4b9852a0 ("nvme-pci:
> > > fix multiple races in nvme_setup_io_queues"), so this only impacts
> > > oldstable.  I have provided a backport of this commit upstream in
> > > https://lore.kernel.org/stable/E1tj8vO-00471h-2H@lore/
> > > 
> > > I'm requesting that this commit be included in a bullseye kernel update.
> > 
> > AFAICS, this backport has not been accepted back then for 5.10.y. Can
> > you re-ping upstream to make sure it gets included in the 5.10.y
> > series? Once this has happened as we follow the 5.10.y series it will
> > be included (or can be included in advance once it has been queued).
> 
> Yes, I forgot to reset the date on the commit that I sent upstream,
> which is why it looks like it's been around since 2021.  I requested
> that upstream apply the fix to 5.10.y last week, and will ping them in
> another week or two if it hasn't been acknowledged either way...

Noah, thanks!

Regards,
Salvatore
