madvise_collapse() computes a THP-aligned window from the caller's range:

    hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK  /* round up   */
    hend   = end & HPAGE_PMD_MASK                        /* round down */

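For illustration only, the rounding can be reproduced in userspace. This is
a sketch, not kernel code: it assumes 4 KiB base pages and 2 MiB PMDs, and
the SKETCH_* macros, struct pmd_window, pmd_window() and naive_thp_count()
are hypothetical names standing in for the kernel's definitions.

```c
#include <stdint.h>

/* Assumed geometry: 4 KiB base pages, 2 MiB PMD-sized THPs. */
#define SKETCH_PAGE_SIZE  UINT64_C(4096)
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PMD_SIZE   (UINT64_C(1) << SKETCH_PMD_SHIFT)
#define SKETCH_PMD_MASK   (~(SKETCH_PMD_SIZE - 1))

/* Hypothetical container for the PMD-aligned window. */
struct pmd_window {
	uint64_t hstart;
	uint64_t hend;
};

/* Same rounding as in the cover letter: start rounds up, end rounds down. */
static inline struct pmd_window pmd_window(uint64_t start, uint64_t end)
{
	struct pmd_window w;

	w.hstart = (start + ~SKETCH_PMD_MASK) & SKETCH_PMD_MASK; /* round up */
	w.hend = end & SKETCH_PMD_MASK;                          /* round down */
	return w;
}

/*
 * Unguarded count of PMD-sized chunks in the window. The subtraction is
 * unsigned, so it wraps around whenever hstart > hend.
 */
static inline uint64_t naive_thp_count(struct pmd_window w)
{
	return (w.hend - w.hstart) >> SKETCH_PMD_SHIFT;
}
```

With a PMD-aligned address `base`, pmd_window(base, base + SKETCH_PAGE_SIZE)
gives hstart == hend and a count of 0, while
pmd_window(base + SKETCH_PAGE_SIZE, base + 2 * SKETCH_PAGE_SIZE) gives
hstart == base + SKETCH_PMD_SIZE, hend == base, and a non-zero count.
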
When the caller's range is smaller than one PMD (2 MiB) and/or not
PMD-aligned, hstart can end up greater than hend. In that case the
collapsing loop is correctly skipped, but the return value was computed
as ((hend - hstart) >> HPAGE_PMD_SHIFT): with hstart > hend the
subtraction wraps unsigned, producing a huge value, the comparison
"thps != 0" fires, and -EINVAL is returned instead of 0.

A concrete example:

    /* both cover less than one THP; both should return 0 */
    madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);             /* OK, returns 0   */
    madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); /* returns -EINVAL */

The fix moves the hstart/hend calculation before kmalloc_obj() and
returns 0 early when hstart >= hend. This also avoids the kmalloc,
mmgrab(), and lru_add_drain_all() calls for ranges that trivially
contain no PMD window.

The same effect could be achieved by only guarding the final return
expression, but early-return keeps the no-op path free of the allocator
and drain overhead.

Patch 1 fixes the kernel bug. Patch 2 adds a selftest with two cases
covering the hstart == hend (aligned, was already correct) and
hstart > hend (unaligned, was broken) scenarios.

Chen Wandun (2):
  mm/khugepaged: fix spurious -EINVAL from sub-PMD MADV_COLLAPSE range
  selftests/mm: add MADV_COLLAPSE sub-PMD range tests

 mm/khugepaged.c                               |   9 +-
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   2 +
 .../selftests/mm/ksft_madv_collapse.sh        |   4 +
 .../selftests/mm/madv_collapse_range.c        | 141 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh     |   5 +
 6 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100755 tools/testing/selftests/mm/ksft_madv_collapse.sh
 create mode 100644 tools/testing/selftests/mm/madv_collapse_range.c

-- 
2.43.0

