On Tue, Jun 9, 2026 at 4:37 AM Lance Yang <[email protected]> wrote: > > > > On 2026/6/9 17:32, Nico Pache wrote: > > On Tue, Jun 9, 2026 at 3:26 AM David Hildenbrand (Arm) <[email protected]> > > wrote: > >> > >> On 6/9/26 11:06, Nico Pache wrote: > >>> On Mon, Jun 8, 2026 at 8:57 AM David Hildenbrand (Arm) <[email protected]> > >>> wrote: > >>>> > >>>> On 6/6/26 12:28, Lance Yang wrote: > >>>>> > >>>>> > >>>>> Looks broken for swap PTEs in PMD collapse ... > >>>>> > >>>>> collapse_scan_pmd() allows them up to max_ptes_swap and record them in > >>>>> unmapped, but they don't get a bit in mthp_present_ptes. And then > >>>>> mthp_collapse() does the check above: > >>>> > >>>> Right. I assumed this is implicitly handled by the optimization in > >>>> collapse_scan_pmd: > >>>> > >>>> if (enabled_orders != BIT(HPAGE_PMD_ORDER)) > >>>> max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT; > >>>> > >>>> But we perform the check a second time. > >>>> > >>>>> > >>>>> nr_occupied_ptes >= nr_ptes - max_ptes_none > >>>>> > >>>>> So max_ptes_none=0 + 511 present PTEs + one allowed swap PTE won't even > >>>>> call collapse_huge_page() for PMD order. > >>>>> > >>>>> Shouldn't we account for them in the PMD-order check? Something like: > >>>>> > >>>>> if (is_pmd_order(order)) > >>>>> nr_occupied_ptes += unmapped; > >>> > >>> This solution seems good for a temporary fixup. but longterm we may > >>> want something else. I'm still not sure how we plan on supporting > >>> swapin without causing creep. So I'd be ok with adding a fix for > >>> legacy PMD behavior until we know how to handle mTHP creep correctly. > >>> > >>>> As an alternative, we could either 1) skip the check there for > >>>> pmd order (as the check was already done); or 2) introduce+maintain > >>>> a bitmap that tracks non-present PTEs. > >>>> > >>>> @@ -1475,7 +1477,9 @@ static enum scan_result mthp_collapse(struct > >>>> mm_struct *mm, > >>>> nr_occupied_ptes = > >>>> bitmap_weight_from(cc->mthp_present_ptes, offset, > >>>> offset + nr_ptes); > >>>> > >>>> - if (nr_occupied_ptes >= nr_ptes - max_ptes_none) { > >>>> + /* Check was already done in the caller. */ > >>>> + if (is_pmd_order(order) || > >>>> + nr_occupied_ptes >= nr_ptes - max_ptes_none) { > >>>> enum scan_result ret; > >>>> > >>>> collapse_address = address + offset * PAGE_SIZE; > >>>> > >>>> 2) would probably be cleanest long-term. > >>> > >>> That would be best for future swapin support in mTHP, but I still > >>> don't think it solves the creep issue. > >> > >> It wouldn't, we'd simply maintain the state we collect + rely on in > >> separate > >> bitmaps. On swapin, we'd have to update/refresh bitmaps I guess. > > > > Yeah, I'm saying for the future, it obviously solves this issue here > > as well, but if we have positional tracking of the swapout, shared, > > and none PTEs, I think we can use this to determine whether the > > collapse would lead to creep. If we detect creep would happen it may > > be best to automatically collapse to the N+1 (or greater) candidate. > > Just thinking outloud here. > > > >> > >>> Perhaps we could combine the > >>> two bitmaps to determine if it would make the future collapse eligible > >>> again? Not sure but ill start thinking about it. > >>> > >>> Should I send a fixup for this using Lance's solution? Or does Lance > >>> want to send a patch out with the fixes tag? > >> > >> If Lance could send a fixup, explaining the situation, that would be nice. > > Sure, happy to send a fixup :P > > Should I send it as a fixup to be folded into this patch, or as a > separate patch with a Fixes: tag?
Id assume a seperate patch so you can keep credit for the discovery :) Thank you for all the review you provided on this series, its been really helpful! -- Nico > > Will get one out soon :) > > > OK, I'd appreciate that :) > > Cheers! >
