On 2026/6/9 15:09, David Hildenbrand (Arm) wrote:
On 6/9/26 04:39, Miaohe Lin wrote:
On 2026/6/8 22:15, Breno Leitao wrote:
On Fri, Jun 05, 2026 at 11:42:53AM +0200, David Hildenbrand (Arm) wrote:
I mean, any such races can currently already happen one way or the other?
Really, the only way to not get races is to tryget the (compound) page and
revalidate that the page is still part of the compound page.
I'm not sure if that's really a good idea.
But my memory is a bit vague on the scenarios in which we already hold a page
reference here to prevent any concurrent freeing?
No, we don't hold one here in the case that matters.
HWPoisonKernelOwned() runs at the very top of get_any_page(), before
try_again: and before __get_hwpoison_page(). The first refcount taken in
the whole path is the folio_try_get() inside __get_hwpoison_page(), which
runs *after* the short-circuit.
So get_any_page() itself never holds a reference at the check -- the only way
one exists is if the caller passed MF_COUNT_INCREASED (count_increased ==
true).
So on the MCE/GHES path -- the one this panic option exists for -- no
reference is held when HWPoisonKernelOwned() does its compound_head() +
PageSlab()/PageTable()/PageLargeKmalloc() checks.
Given that, I'd rather keep it racy and take no refcount than add a
tryget + revalidate purely for this check. As I've said earlier, an operator
Would it be acceptable to add a simple recheck? Something like below:
retry:
	head = compound_head(page);
	PageSlab()/PageTable()/PageLargeKmalloc() checks
	if (head != compound_head(page))
		goto retry;
Sure. I guess it could still be racy in some weird scenarios where we
free+allocate+free in-between.
+1, sounds reasonable to me. Still racy, but acceptable here I guess :D
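
For illustration, the recheck discussed above can be modelled in userspace C.
This is only a sketch under assumptions: struct model_page, the MODEL_* bits,
model_compound_head() and model_kernel_owned() are hypothetical stand-ins, not
kernel APIs, and C11 atomics stand in for the kernel's own accessors. It shows
the "read head, check, re-read head" loop. Like the proposal, it takes no
reference, so a free+allocate+free between the two reads can still go
unnoticed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct page. */
struct model_page {
	_Atomic(struct model_page *) head;	/* models compound_head(page) */
	_Atomic unsigned int flags;		/* type bits, kept on the head */
};

/* Hypothetical type bits modelling PageSlab/PageTable/PageLargeKmalloc. */
enum {
	MODEL_SLAB		= 1u << 0,
	MODEL_TABLE		= 1u << 1,
	MODEL_LARGE_KMALLOC	= 1u << 2,
};

static struct model_page *model_compound_head(struct model_page *page)
{
	return atomic_load_explicit(&page->head, memory_order_acquire);
}

/*
 * Lockless, reference-free check with a recheck of the head pointer.
 * If the head changed while the flags were sampled, sample again.
 * This narrows the race window but does not close it.
 */
static bool model_kernel_owned(struct model_page *page)
{
	struct model_page *head;
	unsigned int flags;

	assert(page != NULL);

	do {
		head = model_compound_head(page);
		flags = atomic_load_explicit(&head->flags,
					     memory_order_acquire);
	} while (head != model_compound_head(page));

	return (flags & (MODEL_SLAB | MODEL_TABLE | MODEL_LARGE_KMALLOC)) != 0;
}
```

The loop only ends once the head read before the flag checks matches the head
read after them, which is the property the recheck is meant to give; anything
stronger would need the tryget + revalidate that was argued against above.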