On Mon, Jun 08, 2026 at 09:48:34AM -0400, Michael S. Tsirkin wrote:
> On Mon, Jun 08, 2026 at 10:43:21AM +0100, Lorenzo Stoakes wrote:
> > On Mon, Jun 08, 2026 at 04:34:23AM -0400, Michael S. Tsirkin wrote:
> > > TestSetPageHWPoison() is called without zone->lock, so its atomic
> > > update to page->flags can race with non-atomic flag operations
> > > that run under zone->lock in the buddy allocator.
> > >
> > > In particular, __free_pages_prepare() does:
> > >
> > >     page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
> > >
> > > This non-atomic read-modify-write, while correctly excluding
> > > __PG_HWPOISON from the mask, can still lose a concurrent
> > > TestSetPageHWPoison if the read happens before the poison bit
> > > is set and the write happens after.  Follow-up patches in this
> > > series add similar non-atomic flag operations as well.
> > >
> > > Fix by acquiring zone->lock around TestSetPageHWPoison and
> > > around ClearPageHWPoison in the retry path.  This
> > > serializes with all buddy flag manipulation.  The cost is
> > > negligible: one lock/unlock in an extremely rare path
> > > (hardware memory errors).
> > >
> > > Note: SetPageHWPoison and TestClearPageHWPoison calls elsewhere
> > > in this file operate on pages already removed from the buddy
> > > allocator or on non-buddy pages (DAX, hugetlb), so they do not
> > > need zone->lock protection.
> > >
> > > Acked-by: Miaohe Lin <[email protected]>
> > > Signed-off-by: Michael S. Tsirkin <[email protected]>
> >
> > Can we have Fixes: and Cc: stable and also send this separately please?
> >
> > These patches seem like unrelated fixups that you've discovered along the 
> > way,
> > and don't belong as part of the already rather large series, unless I'm 
> > missing
> > something here.
> >
> > Thanks, Lorenzo
>
> I think you are mising that they are a dependency, not unrelated.

Then say so.

> For example, this issue gets worse with the patchset as there are more
> places that manipulate flags without atomics. No?

It's your job to make that case, not mine.

>
>
> You are welcome to send this to stable, but I think stable rules
> preclude theoretical bugfixes.

It's a dependency but also theoretical?

>
> As for Fixes: the issue has been there for decades. I wouldn't know
> what to attribute it for.

Again, your job.

>
>
> I guess I could send these separately, too, why not. Not sure
> what this accomplishes, but hey.  But is that an ack? You want
> this fix merged even before the feature?

I already made the case as to why, as have other maintainers.

If you need to review what an ack looks like please consult
https://docs.kernel.org/process/5.Posting.html

Thanks, Lorenzo

Reply via email to