On Tue, 2016-08-23 at 15:32 -0700, Dan Williams wrote:
> On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.k...@hpe.com>
> wrote:
 :
> I'm not sure about this fix.  The point of honoring
> vmem_altmap_offset() is because a portion of the resource that is
> passed to devm_memremap_pages() also contains the metadata info block
> for the device.  The offset says "use everything past this point for
> pages".  This may work for avoiding a crash, but it may corrupt info
> block metadata in the process.  Can you provide more information
> about the failing scenario to be sure that we are not triggering a
> fault on an address that is not meant to have a page mapping?  I.e.
> what is the host physical address of the page that caused this fault,
> and is it valid?

The fault address in question was the 2nd page of an NVDIMM range.  I
assumed this fault address was valid and needed to be handled.  Here is
some info about the base and patched cases.  Let me know if you need
more info.

Base
====

The following NVDIMM range was set to /dev/dax.

 /proc/iomem
   480000000-87fffffff : Persistent Memory

devm_memremap_pages() initialized struct page for pfns 0x490200-0x87ffff.
This left pfns 0x480000-0x4901ff with page->pgmap uninitialized.

 devm_memremap_pages: pgmap 0xffff88046d0453f0
  [0]  : pfn 0x490200, page ffffea0012408000, pgmap ffff88046d0453f0
  [1]  : pfn 0x490201, page ffffea0012408040, pgmap ffff88046d0453f0
  [2]  : pfn 0x490202, page ffffea0012408080, pgmap ffff88046d0453f0
  [3]  : pfn 0x490203, page ffffea00124080c0, pgmap ffff88046d0453f0
  [4]  : pfn 0x490204, page ffffea0012408100, pgmap ffff88046d0453f0
    :
  [E+1]: pfn 0x880000, page ffffea0021ffffc0, pgmap ffff88046d0453f0

The faulted page was pfn 0x480001, which was the 2nd page in the NVDIMM
range and did not have a valid pgmap.  This led to the BUG.

 pfn 0x480001
 page 0xffffea0012000040
 page->pgmap 0xffffea0012000060
 page->pgmap->ref           (null)

Patch
=====

With the patch, devm_memremap_pages() initializes as follows.

 devm_memremap_pages: pgmap ffff880462b3b4b0
  [0]  : pfn 0x480000, page ffffea0012000000, pgmap ffff880462b3b4b0
  [1]  : pfn 0x480001, page ffffea0012000040, pgmap ffff880462b3b4b0
  [2]  : pfn 0x480002, page ffffea0012000080, pgmap ffff880462b3b4b0
  [3]  : pfn 0x480003, page ffffea00120000c0, pgmap ffff880462b3b4b0
  [4]  : pfn 0x480004, page ffffea0012000100, pgmap ffff880462b3b4b0
    :
  [E+1]: pfn 0x880000, page ffffea0021ffffc0, pgmap ffff880462b3b4b0

A page fault on pfn 0x480001 is handled, as the page now has a valid
pgmap.

 pfn 0x480001
 page 0xffffea0012000040
 page->pgmap 0xffff880462b3b4b0
 page->pgmap->ref 0xffff880462b3b530

Its dev_pagemap and vmem_altmap are as follows.

crash> p {struct dev_pagemap} 0xffff880462b3b4b0
$2 = {
  altmap = 0xffff880462b3b4d0,
  res = 0xffff880462b3b468,
  ref = 0xffff880462b3b530,
  dev = 0xffff880463e37010
}
crash> p {struct vmem_altmap} 0xffff880462b3b4d0
$3 = {
  base_pfn = 0x480000,
  reserve = 0x2,
  free = 0x101fe,
  align = 0x1fe,
  alloc = 0x10000
}

This page entry is physically located at 0x480200040.

crash> vtop 0xffffea0012000040
VIRTUAL           PHYSICAL
ffffea0012000040  480200040

PML4 DIRECTORY: ffffffff81c06000
PAGE DIRECTORY: 47ffe6067
   PUD: 47ffe6000 => 47ffe5067
   PMD: 47ffe5480 => 80000004802001e3
  PAGE: 480200000  (2MB)

      PTE         PHYSICAL   FLAGS
80000004802001e3  480200000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)

      PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0012008000  480200000                0        0  1 4fffe000000400 reserved

Thanks,
-Toshi
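
For reference, the arithmetic behind the numbers above can be sketched in
plain userspace C.  This is only an illustration, not kernel code: the
struct and helper names (altmap_example, altmap_offset, pfn_to_page_va,
memmap_pages, memmap_entry_pa) are hypothetical, the vmemmap base of
0xffffea0000000000 and the 64-byte struct page size are assumptions
inferred from the 0x40 stride in the dumps, and the memmap placement
(base_pfn + reserve + align) is a guess that happens to agree with the
vtop output.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions inferred from the dumps, not taken from kernel headers. */
#define VMEMMAP_BASE     0xffffea0000000000ULL /* x86_64 vmemmap start */
#define STRUCT_PAGE_SIZE 0x40ULL               /* stride between page entries */
#define PAGE_SHIFT       12                    /* 4KB pages */

/* Hypothetical mirror of the vmem_altmap fields printed by crash. */
struct altmap_example {
	uint64_t base_pfn, reserve, free, align, alloc;
};

/* Number of pfns skipped at the start of the range: reserve + free. */
static uint64_t altmap_offset(const struct altmap_example *a)
{
	return a->reserve + a->free;
}

/* Virtual address of the struct page entry for a pfn. */
static uint64_t pfn_to_page_va(uint64_t pfn)
{
	return VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE;
}

/* Pages needed to hold struct page entries for pfns [start, end). */
static uint64_t memmap_pages(uint64_t start_pfn, uint64_t end_pfn)
{
	return ((end_pfn - start_pfn) * STRUCT_PAGE_SIZE) >> PAGE_SHIFT;
}

/* Guessed physical address of a pfn's struct page inside the altmap. */
static uint64_t memmap_entry_pa(const struct altmap_example *a, uint64_t pfn)
{
	uint64_t memmap_start = (a->base_pfn + a->reserve + a->align) << PAGE_SHIFT;

	return memmap_start + (pfn - a->base_pfn) * STRUCT_PAGE_SIZE;
}

/* Values copied from the vmem_altmap dump in the patched case. */
static const struct altmap_example dumped = {
	.base_pfn = 0x480000, .reserve = 0x2, .free = 0x101fe,
	.align = 0x1fe, .alloc = 0x10000,
};

/* Cross-check the helpers against the numbers quoted in this mail. */
static void verify_against_dump(void)
{
	/* Base case: first initialized pfn is base_pfn + offset. */
	assert(dumped.base_pfn + altmap_offset(&dumped) == 0x490200);
	/* Faulting pfn maps to the struct page address shown above. */
	assert(pfn_to_page_va(0x480001) == 0xffffea0012000040ULL);
	/* memmap for the whole range matches alloc. */
	assert(memmap_pages(0x480000, 0x880000) == dumped.alloc);
	/* vtop result for the faulting page entry. */
	assert(memmap_entry_pa(&dumped, 0x480001) == 0x480200040ULL);
}
```

Calling verify_against_dump() from a small main() passes with the values
quoted in this mail.  In particular, reserve + free = 0x10200 pfns is
exactly the gap between 0x480000 and 0x490200 that was left without
page->pgmap in the base case.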