4.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kirill A. Shutemov <[email protected]>

commit 58ceeb6bec86d9140f9d91d71a710e963523d063 upstream.

Both MADV_DONTNEED and MADV_FREE handled with down_read(mmap_sem).

It's critical to not clear pmd intermittently while handling MADV_FREE
to avoid race with MADV_DONTNEED:

        CPU0:                           CPU1:
                                madvise_free_huge_pmd()
                                 pmdp_huge_get_and_clear_full()
madvise_dontneed()
 zap_pmd_range()
  pmd_trans_huge(*pmd) == 0 (without ptl)
  // skip the pmd
                                 set_pmd_at();
                                 // pmd is re-established

It results in MADV_DONTNEED skipping the pmd, leaving it not cleared.
It violates MADV_DONTNEED interface and can result is userspace
misbehaviour.

Basically it's the same race as with numa balancing in
change_huge_pmd(), but a bit simpler to mitigate: we don't need to
preserve dirty/young flags here due to MADV_FREE functionality.

[[email protected]: Urgh... Power is special again]
  Link: 
http://lkml.kernel.org/r/[email protected]
Link: 
http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hillf Danton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
 mm/huge_memory.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1393,8 +1393,7 @@ bool madvise_free_huge_pmd(struct mmu_ga
                deactivate_page(page);
 
        if (pmd_young(orig_pmd) || pmd_dirty(orig_pmd)) {
-               orig_pmd = pmdp_huge_get_and_clear_full(tlb->mm, addr, pmd,
-                       tlb->fullmm);
+               pmdp_invalidate(vma, addr, pmd);
                orig_pmd = pmd_mkold(orig_pmd);
                orig_pmd = pmd_mkclean(orig_pmd);
 


Reply via email to