On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.k...@hpe.com> wrote:
> The following BUG was observed while starting up KVM with nvdimm
> device as memory-backend-file to /dev/dax.
>
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>  IP: [<ffffffff811ac851>] get_zone_device_page+0x11/0x30
>  Call Trace:
>   follow_devmap_pmd+0x298/0x2c0
>   follow_page_mask+0x275/0x530
>   __get_user_pages+0xe3/0x750
>   __gfn_to_pfn_memslot+0x1b2/0x450 [kvm]
>   ? hrtimer_try_to_cancel+0x2c/0x120
>   ? kvm_read_l1_tsc+0x55/0x60 [kvm]
>   try_async_pf+0x66/0x230 [kvm]
>   ? kvm_host_page_size+0x90/0xa0 [kvm]
>   tdp_page_fault+0x130/0x280 [kvm]
>   kvm_mmu_page_fault+0x5f/0xf0 [kvm]
>   handle_ept_violation+0x94/0x180 [kvm_intel]
>   vmx_handle_exit+0x1d3/0x1440 [kvm_intel]
>   ? atomic_switch_perf_msrs+0x6f/0xa0 [kvm_intel]
>   ? vmx_vcpu_run+0x2d1/0x490 [kvm_intel]
>   kvm_arch_vcpu_ioctl_run+0x81d/0x16a0 [kvm]
>   ? wake_up_q+0x44/0x80
>   kvm_vcpu_ioctl+0x33c/0x620 [kvm]
>   ? __vfs_write+0x37/0x160
>   do_vfs_ioctl+0xa2/0x5d0
>   SyS_ioctl+0x79/0x90
>   entry_SYSCALL_64_fastpath+0x1a/0xa4
>
> devm_memremap_pages() calls for_each_device_pfn() to walk through
> all pfns in page_map. pfn_first(), however, returns a wrong pfn
> that leaves page->pgmap uninitialized.
>
> Since arch_add_memory() has set up direct mappings to the NVDIMM
> range with altmap, pfn_first() should not modify the start pfn.
> Change pfn_first() to simply return the pfn of res->start.
>
> Reported-and-tested-by: Abhilash Kumar Mulumudi <m.abhilash-ku...@hpe.com>
> Signed-off-by: Toshi Kani <toshi.k...@hpe.com>
> Cc: Dan Williams <dan.j.willi...@intel.com>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
> Cc: Brian Starkey <brian.star...@arm.com>
> ---
>  kernel/memremap.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index 251d16b..50ea577 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -210,15 +210,9 @@ static void pgmap_radix_release(struct resource *res)
>
>  static unsigned long pfn_first(struct page_map *page_map)
>  {
> -	struct dev_pagemap *pgmap = &page_map->pgmap;
>  	const struct resource *res = &page_map->res;
> -	struct vmem_altmap *altmap = pgmap->altmap;
> -	unsigned long pfn;
>
> -	pfn = res->start >> PAGE_SHIFT;
> -	if (altmap)
> -		pfn += vmem_altmap_offset(altmap);
> -	return pfn;
> +	return res->start >> PAGE_SHIFT;
>  }
I'm not sure about this fix. The point of honoring vmem_altmap_offset() is that a portion of the resource passed to devm_memremap_pages() also contains the metadata info block for the device. The offset says "use everything past this point for pages". This change may avoid the crash, but it may corrupt the info block metadata in the process.

Can you provide more information about the failing scenario, so we can be sure we are not triggering a fault on an address that is not meant to have a page mapping? I.e. what is the host physical address of the page that caused this fault, and is it valid?