Christoph, the below patch should fix the current leak of the pinned
pages. I hope the page-pin that should be dropped by the
invalidate_range op, is enough to prevent the "physical page" mapped
on that "mm+address" to change before invalidate_range returns. If
that would ever happen, there would be a coherency loss between the
guest VM writes and the writes coming from userland on the same
mm+address from a different thread (qemu, whatever). invalidate_page
before PT lock was obviously safe. Now we entirely relay on the pin to
prevent the page to change before invalidate_range returns. If the pte
is unmapped and the page is mapped back in with a minor fault that's
ok, as long as the physical page remains the same for that mm+address,
until all sptes are gone.

Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>

diff --git a/mm/fremap.c b/mm/fremap.c
--- a/mm/fremap.c
+++ b/mm/fremap.c
@@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns
                spin_unlock(&mapping->i_mmap_lock);
        }
 
+       err = populate_range(mm, vma, start, size, pgoff);
        mmu_notifier(invalidate_range, mm, start, start + size, 0);
-       err = populate_range(mm, vma, start, size, pgoff);
        if (!err && !(flags & MAP_NONBLOCK)) {
                if (unlikely(has_write_lock)) {
                        downgrade_write(&mm->mmap_sem);
diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1639,8 +1639,6 @@ gotten:
        /*
         * Re-check the pte - we dropped the lock
         */
-       mmu_notifier(invalidate_range, mm, address,
-                               address + PAGE_SIZE - 1, 0);
        page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
        if (likely(pte_same(*page_table, orig_pte))) {
                if (old_page) {
@@ -1676,6 +1674,8 @@ gotten:
                page_cache_release(old_page);
 unlock:
        pte_unmap_unlock(page_table, ptl);
+       mmu_notifier(invalidate_range, mm, address,
+                               address + PAGE_SIZE - 1, 0);
        if (dirty_page) {
                if (vma->vm_file)
                        file_update_time(vma->vm_file);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to