From: Yasuaki Ishimatsu <[email protected]>
[ Upstream commit ddffe98d166f4a93d996d5aa628fd745311fc1e7 ]
To identify that pages of page table are allocated from bootmem
allocator, magic number sets to page->lru.next.
But page->lru list is initialized in reserve_bootmem_region(). So when
calling free_pagetable(), the function cannot find the magic number of
pages. And free_pagetable() frees the pages by free_reserved_page() not
put_page_bootmem().
But if the pages are allocated from bootmem allocator and used as page
table, the pages have private flag. So before freeing the pages, we
should clear the private flag by put_page_bootmem().
Before applying the commit 7bfec6f47bb0 ("mm, page_alloc: check multiple
page fields with a single branch"), we could find the following visible
issue:
BUG: Bad page state in process kworker/u1024:1
page:ffffea103cfd8040 count:0 mapcount:0 mappi
flags: 0x6fffff80000800(private)
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags: 0x800(private)
<snip>
Call Trace:
[...] dump_stack+0x63/0x87
[...] bad_page+0x114/0x130
[...] free_pages_prepare+0x299/0x2d0
[...] free_hot_cold_page+0x31/0x150
[...] __free_pages+0x25/0x30
[...] free_pagetable+0x6f/0xb4
[...] remove_pagetable+0x379/0x7ff
[...] vmemmap_free+0x10/0x20
[...] sparse_remove_one_section+0x149/0x180
[...] __remove_pages+0x2e9/0x4f0
[...] arch_remove_memory+0x63/0xc0
[...] remove_memory+0x8c/0xc0
[...] acpi_memory_device_remove+0x79/0xa5
[...] acpi_bus_trim+0x5a/0x8d
[...] acpi_bus_trim+0x38/0x8d
[...] acpi_device_hotplug+0x1b7/0x418
[...] acpi_hotplug_work_fn+0x1e/0x29
[...] process_one_work+0x152/0x400
[...] worker_thread+0x125/0x4b0
[...] kthread+0xd8/0xf0
[...] ret_from_fork+0x22/0x40
And the issue still silently occurs.
Until freeing the pages of page table allocated from bootmem allocator,
the page->freelist is never used. So the patch sets magic number to
page->freelist instead of page->lru.next.
[[email protected]: fix merge issue]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yasuaki Ishimatsu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Xishi Qiu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/mm/init_64.c | 2 +-
mm/memory_hotplug.c | 5 +++--
mm/sparse.c | 2 +-
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9a324fc8bed8..3e27ded6ac65 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -689,7 +689,7 @@ static void __meminit free_pagetable(struct page *page, int
order)
if (PageReserved(page)) {
__ClearPageReserved(page);
- magic = (unsigned long)page->lru.next;
+ magic = (unsigned long)page->freelist;
if (magic == SECTION_INFO || magic == MIX_SECTION_INFO) {
while (nr_pages--)
put_page_bootmem(page++);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ede137345a99..c9f715b2917f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -179,7 +179,7 @@ static void release_memory_resource(struct resource *res)
void get_page_bootmem(unsigned long info, struct page *page,
unsigned long type)
{
- page->lru.next = (struct list_head *) type;
+ page->freelist = (void *)type;
SetPagePrivate(page);
set_page_private(page, info);
page_ref_inc(page);
@@ -189,11 +189,12 @@ void put_page_bootmem(struct page *page)
{
unsigned long type;
- type = (unsigned long) page->lru.next;
+ type = (unsigned long) page->freelist;
BUG_ON(type < MEMORY_HOTPLUG_MIN_BOOTMEM_TYPE ||
type > MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE);
if (page_ref_dec_return(page) == 1) {
+ page->freelist = NULL;
ClearPagePrivate(page);
set_page_private(page, 0);
INIT_LIST_HEAD(&page->lru);
diff --git a/mm/sparse.c b/mm/sparse.c
index 1e168bf2779a..8c4c82e358e6 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -662,7 +662,7 @@ static void free_map_bootmem(struct page *memmap)
>> PAGE_SHIFT;
for (i = 0; i < nr_pages; i++, page++) {
- magic = (unsigned long) page->lru.next;
+ magic = (unsigned long) page->freelist;
BUG_ON(magic == NODE_INFO);
--
2.11.0