2017-04-05 15:07+0200, Filippo Sironi:
> cmpxchg_gpte() calls get_user_pages_fast() to retrieve the number of
> pages and the respective struct pages for mapping in the kernel virtual
> address space.
> This doesn't work if get_user_pages_fast() is invoked with a userspace
> virtual address that's backed by PFNs outside of kernel reach (e.g.,
> when limiting the kernel memory with mem= in the command line and using
> /dev/mem to map memory).
> 
> If get_user_pages_fast() fails, look up the VMA that backs the userspace
> virtual address, compute the PFN and the physical address, and map it in
> the kernel virtual address space with memremap().

What is the reason for a configuration that voluntarily restricts access
to memory that it needs?

> Signed-off-by: Filippo Sironi <sir...@amazon.de>
> Cc: Anthony Liguori <aligu...@amazon.com>
> Cc: k...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> @@ -147,15 +147,36 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>       struct page *page;
>  
>       npages = get_user_pages_fast((unsigned long)ptep_user, 1, 1, &page);
> -     /* Check if the user is doing something meaningless. */
> -     if (unlikely(npages != 1))
> -             return -EFAULT;
> -
> -     table = kmap_atomic(page);
> -     ret = CMPXCHG(&table[index], orig_pte, new_pte);
> -     kunmap_atomic(table);
> -
> -     kvm_release_page_dirty(page);
> +     if (likely(npages == 1)) {
> +             table = kmap_atomic(page);
> +             ret = CMPXCHG(&table[index], orig_pte, new_pte);
> +             kunmap_atomic(table);
> +
> +             kvm_release_page_dirty(page);
> +     } else {
> +             struct vm_area_struct *vma;
> +             unsigned long vaddr = (unsigned long)ptep_user & PAGE_MASK;
> +             unsigned long pfn;
> +             unsigned long paddr;
> +
> +             down_read(&current->mm->mmap_sem);
> +             vma = find_vma_intersection(current->mm, vaddr,
> +                                         vaddr + PAGE_SIZE);

Hm, with the argument order like this, we check that

  vaddr < vma->vm_end && vaddr + PAGE_SIZE > vma->vm_start

but shouldn't we actually check that the whole page is there, i.e.

  vaddr + PAGE_SIZE <= vma->vm_end && vaddr >= vma->vm_start

?

Thanks.
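
To make the difference between the two checks concrete, here is a minimal
userspace sketch, not kernel code.  struct range and both helpers are
hypothetical stand-ins for the two VMA fields involved, the 4 KiB page size
is an assumption, and vm_end is treated as exclusive (one past the last
byte), which is why full containment is written with >= and <=.

```c
#include <stdbool.h>

#define PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the VMA bounds; vm_end is exclusive. */
struct range {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* The condition that find_vma_intersection(mm, vaddr, vaddr + PAGE_SIZE)
 * guarantees for the VMA it returns: the page overlaps the VMA. */
static bool page_intersects(const struct range *vma, unsigned long vaddr)
{
	return vaddr < vma->vm_end && vaddr + PAGE_SIZE > vma->vm_start;
}

/* The stricter condition: the whole page lies inside the VMA. */
static bool page_contained(const struct range *vma, unsigned long vaddr)
{
	return vaddr >= vma->vm_start && vaddr + PAGE_SIZE <= vma->vm_end;
}
```

The two predicates only disagree for unaligned inputs, e.g. a range of
{ 0x1800, 0x3000 } with vaddr 0x1000 intersects but is not contained.  With
the page-aligned vaddr from the patch and page-aligned VMA bounds they
should agree, as far as I can tell.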

> +             if (!vma || !(vma->vm_flags & VM_PFNMAP)) {
> +                     up_read(&current->mm->mmap_sem);
> +                     return -EFAULT;
> +             }
> +             pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> +             paddr = pfn << PAGE_SHIFT;
> +             table = memremap(paddr, PAGE_SIZE, MEMREMAP_WB);

(I don't understand why there isn't a wrapper for this ...
 Looks like we're doing something unexpected.)
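
For reference, the address arithmetic that feeds memremap() can be sketched
in userspace as below.  vaddr_to_paddr() is a hypothetical helper, the
4 KiB page size is an assumption, and the sketch relies on the same premise
as the patch: that vm_pgoff of a VM_PFNMAP VMA holds the PFN backing
vm_start.  It uses a 64-bit type so that the final shift cannot truncate,
which is a choice of this sketch rather than something the patch does.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical helper mirroring the patch's computation.
 * vm_pgoff is expressed in pages, so it is added after the page offset
 * of vaddr within the VMA has been converted to a page count. */
static uint64_t vaddr_to_paddr(uint64_t vaddr, uint64_t vm_start,
			       uint64_t vm_pgoff)
{
	uint64_t pfn = ((vaddr - vm_start) >> PAGE_SHIFT) + vm_pgoff;

	return pfn << PAGE_SHIFT;
}
```

For example, with vm_start = 0x7f0000000000, vm_pgoff = 0x100000 and
vaddr = vm_start + 0x3000, the PFN is 0x100003 and the physical address is
0x100003000.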

> +             if (!table) {
> +                     up_read(&current->mm->mmap_sem);
> +                     return -EFAULT;
> +             }
> +             ret = CMPXCHG(&table[index], orig_pte, new_pte);
> +             memunmap(table);
> +             up_read(&current->mm->mmap_sem);
> +     }
>  
>       return (ret != orig_pte);
>  }
> -- 
> 2.7.4
> 
