Public bug reported: Latest Ubuntu trusty with kernel 3.13.0-91-generic run in a KVM virtual machine with virtio_balloon hangs when the previously inflated balloon is deflated.
The problem is in the recently committed backport: commit 838478a8496ef9677256f53710144abe2ea49625 Author: Konstantin Khlebnikov <k.khlebni...@samsung.com> AuthorDate: Mon May 16 14:43:10 2016 +0800 Commit: Kamal Mostafa <ka...@canonical.com> CommitDate: Fri Jun 10 07:15:37 2016 -0700 mm/balloon_compaction: redesign ballooned pages management BugLink: http://bugs.launchpad.net/bugs/1572562 Sasha Levin reported KASAN splash inside isolate_migratepages_range(). Problem is in the function __is_movable_balloon_page() which tests AS_BALLOON_MAP in page->mapping->flags. This function has no protection against anonymous pages. As result it tried to check address space flags inside struct anon_vma. Further investigation shows more problems in current implementation: * Special branch in __unmap_and_move() never works: balloon_page_movable() checks page flags and page_count. In __unmap_and_move() page is locked, reference counter is elevated, thus balloon_page_movable() always fails. As a result execution goes to the normal migration path. virtballoon_migratepage() returns MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS, move_to_new_page() thinks this is an error code and assigns newpage->mapping to NULL. Newly migrated page lose connectivity with balloon an all ability for further migration. * lru_lock erroneously required in isolate_migratepages_range() for isolation ballooned page. This function releases lru_lock periodically, this makes migration mostly impossible for some pages. * balloon_page_dequeue have a tight race with balloon_page_isolate: balloon_page_isolate could be executed in parallel with dequeue between picking page from list and locking page_lock. Race is rare because they use trylock_page() for locking. This patch fixes all of them. Instead of fake mapping with special flag this patch uses special state of page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256. Buddy allocator uses PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose. Storing mark directly in struct page makes everything safer and easier. PagePrivate is used to mark pages present in page list (i.e. not isolated, like PageLRU for normal pages). It replaces special rules for reference counter and makes balloon migration similar to migration of normal pages. This flag is protected by page_lock together with link to the balloon device. Signed-off-by: Konstantin Khlebnikov <k.khlebni...@samsung.com> Reported-by: Sasha Levin <sasha.le...@oracle.com> Link: http://lkml.kernel.org/p/53e6ceaa.9020...@oracle.com Cc: Rafael Aquini <aqu...@redhat.com> Cc: Andrey Ryabinin <ryabinin....@gmail.com> Cc: <sta...@vger.kernel.org> [3.8+] Signed-off-by: Andrew Morton <a...@linux-foundation.org> Signed-off-by: Linus Torvalds <torva...@linux-foundation.org> (backported from commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2) Signed-off-by: Gavin Guo <gavin....@canonical.com> Conflicts: mm/balloon_compaction.c mm/migrate.c Acked-by: Stefan Bader <stefan.ba...@canonical.com> Signed-off-by: Kamal Mostafa <ka...@canonical.com> It was applied after another backport: commit 47618e32c2a729554bf56b8ee7b541b63aadad97 Author: Minchan Kim <minc...@kernel.org> AuthorDate: Mon Dec 28 08:35:13 2015 +0900 Commit: Luis Henriques <luis.henriq...@canonical.com> CommitDate: Mon Feb 22 19:31:53 2016 +0000 virtio_balloon: fix race between migration and ballooning BugLink: http://bugs.launchpad.net/bugs/1542497 commit 21ea9fb69e7c4b1b1559c3e410943d3ff248ffcb upstream. In balloon_page_dequeue, pages_lock should cover the loop (ie, list_for_each_entry_safe). Otherwise, the cursor page could be isolated by compaction and then list_del by isolation could poison the page->lru.{prev,next} so the loop finally could access wrong address like this. This patch fixes the bug. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1-access_bit+ #1906 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000 RIP: 0010:[<ffffffff8115e754>] [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130 RSP: 0018:ffff8800a7fefdc0 EFLAGS: 00010246 RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68 RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000 R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0 R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060 FS: 0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0 Stack: 0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060 0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020 ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060 Call Trace: [<ffffffff812c86d3>] leak_balloon+0x93/0x1a0 [<ffffffff812c8bc7>] balloon+0x217/0x2a0 [<ffffffff8143739e>] ? __schedule+0x31e/0x8b0 [<ffffffff81078160>] ? abort_exclusive_wait+0xb0/0xb0 [<ffffffff812c89b0>] ? update_balloon_stats+0xf0/0xf0 [<ffffffff8105b6e9>] kthread+0xc9/0xe0 [<ffffffff8105b620>] ? kthread_park+0x60/0x60 [<ffffffff8143b4af>] ret_from_fork+0x3f/0x70 [<ffffffff8105b620>] ? kthread_park+0x60/0x60 Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff <49> 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89 RIP [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130 RSP <ffff8800a7fefdc0> ---[ end trace 43cf28060d708d5f ]--- Kernel panic - not syncing: Fatal exception Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Signed-off-by: Minchan Kim <minc...@kernel.org> Signed-off-by: Michael S. Tsirkin <m...@redhat.com> Acked-by: Rafael Aquini <aqu...@redhat.com> Signed-off-by: Kamal Mostafa <ka...@canonical.com> This resulted in the following code: 82 struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info) 83 { 84 struct page *page, *tmp; 85 unsigned long flags; 86 bool dequeued_page; 87 88 dequeued_page = false; 89 spin_lock_irqsave(&b_dev_info->pages_lock, flags); 90 list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) { 91 /* 92 * Block others from accessing the 'page' while we get around 93 * establishing additional references and preparing the 'page' 94 * to be released by the balloon driver. 95 */ 96 if (trylock_page(page)) { 97 #ifdef CONFIG_BALLOON_COMPACTION 98 if (!PagePrivate(page)) { 99 /* raced with isolation */ 100 unlock_page(page); 101 continue; 102 } 103 #endif 104 spin_lock_irqsave(&b_dev_info->pages_lock, flags); 105 balloon_page_delete(page); 106 spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); 107 unlock_page(page); 108 dequeued_page = true; 109 break; 110 } 111 } 112 spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); Note the line 104 takes the spinlock already taken in line 89. ** Affects: linux-lts-trusty (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-lts-trusty in Ubuntu. https://bugs.launchpad.net/bugs/1598197 Title: deadlock on balloon deflation Status in linux-lts-trusty package in Ubuntu: New Bug description: Latest Ubuntu trusty with kernel 3.13.0-91-generic run in a KVM virtual machine with virtio_balloon hangs when the previously inflated balloon is deflated. The problem is in the recently committed backport: commit 838478a8496ef9677256f53710144abe2ea49625 Author: Konstantin Khlebnikov <k.khlebni...@samsung.com> AuthorDate: Mon May 16 14:43:10 2016 +0800 Commit: Kamal Mostafa <ka...@canonical.com> CommitDate: Fri Jun 10 07:15:37 2016 -0700 mm/balloon_compaction: redesign ballooned pages management BugLink: http://bugs.launchpad.net/bugs/1572562 Sasha Levin reported KASAN splash inside isolate_migratepages_range(). Problem is in the function __is_movable_balloon_page() which tests AS_BALLOON_MAP in page->mapping->flags. This function has no protection against anonymous pages. As result it tried to check address space flags inside struct anon_vma. Further investigation shows more problems in current implementation: * Special branch in __unmap_and_move() never works: balloon_page_movable() checks page flags and page_count. In __unmap_and_move() page is locked, reference counter is elevated, thus balloon_page_movable() always fails. As a result execution goes to the normal migration path. virtballoon_migratepage() returns MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS, move_to_new_page() thinks this is an error code and assigns newpage->mapping to NULL. Newly migrated page lose connectivity with balloon an all ability for further migration. * lru_lock erroneously required in isolate_migratepages_range() for isolation ballooned page. This function releases lru_lock periodically, this makes migration mostly impossible for some pages. * balloon_page_dequeue have a tight race with balloon_page_isolate: balloon_page_isolate could be executed in parallel with dequeue between picking page from list and locking page_lock. Race is rare because they use trylock_page() for locking. This patch fixes all of them. Instead of fake mapping with special flag this patch uses special state of page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256. Buddy allocator uses PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose. Storing mark directly in struct page makes everything safer and easier. PagePrivate is used to mark pages present in page list (i.e. not isolated, like PageLRU for normal pages). It replaces special rules for reference counter and makes balloon migration similar to migration of normal pages. This flag is protected by page_lock together with link to the balloon device. Signed-off-by: Konstantin Khlebnikov <k.khlebni...@samsung.com> Reported-by: Sasha Levin <sasha.le...@oracle.com> Link: http://lkml.kernel.org/p/53e6ceaa.9020...@oracle.com Cc: Rafael Aquini <aqu...@redhat.com> Cc: Andrey Ryabinin <ryabinin....@gmail.com> Cc: <sta...@vger.kernel.org> [3.8+] Signed-off-by: Andrew Morton <a...@linux-foundation.org> Signed-off-by: Linus Torvalds <torva...@linux-foundation.org> (backported from commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2) Signed-off-by: Gavin Guo <gavin....@canonical.com> Conflicts: mm/balloon_compaction.c mm/migrate.c Acked-by: Stefan Bader <stefan.ba...@canonical.com> Signed-off-by: Kamal Mostafa <ka...@canonical.com> It was applied after another backport: commit 47618e32c2a729554bf56b8ee7b541b63aadad97 Author: Minchan Kim <minc...@kernel.org> AuthorDate: Mon Dec 28 08:35:13 2015 +0900 Commit: Luis Henriques <luis.henriq...@canonical.com> CommitDate: Mon Feb 22 19:31:53 2016 +0000 virtio_balloon: fix race between migration and ballooning BugLink: http://bugs.launchpad.net/bugs/1542497 commit 21ea9fb69e7c4b1b1559c3e410943d3ff248ffcb upstream. In balloon_page_dequeue, pages_lock should cover the loop (ie, list_for_each_entry_safe). Otherwise, the cursor page could be isolated by compaction and then list_del by isolation could poison the page->lru.{prev,next} so the loop finally could access wrong address like this. This patch fixes the bug. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1-access_bit+ #1906 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000 RIP: 0010:[<ffffffff8115e754>] [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130 RSP: 0018:ffff8800a7fefdc0 EFLAGS: 00010246 RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68 RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000 R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0 R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060 FS: 0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0 Stack: 0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060 0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020 ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060 Call Trace: [<ffffffff812c86d3>] leak_balloon+0x93/0x1a0 [<ffffffff812c8bc7>] balloon+0x217/0x2a0 [<ffffffff8143739e>] ? __schedule+0x31e/0x8b0 [<ffffffff81078160>] ? abort_exclusive_wait+0xb0/0xb0 [<ffffffff812c89b0>] ? update_balloon_stats+0xf0/0xf0 [<ffffffff8105b6e9>] kthread+0xc9/0xe0 [<ffffffff8105b620>] ? kthread_park+0x60/0x60 [<ffffffff8143b4af>] ret_from_fork+0x3f/0x70 [<ffffffff8105b620>] ? kthread_park+0x60/0x60 Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff <49> 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89 RIP [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130 RSP <ffff8800a7fefdc0> ---[ end trace 43cf28060d708d5f ]--- Kernel panic - not syncing: Fatal exception Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Signed-off-by: Minchan Kim <minc...@kernel.org> Signed-off-by: Michael S. Tsirkin <m...@redhat.com> Acked-by: Rafael Aquini <aqu...@redhat.com> Signed-off-by: Kamal Mostafa <ka...@canonical.com> This resulted in the following code: 82 struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info) 83 { 84 struct page *page, *tmp; 85 unsigned long flags; 86 bool dequeued_page; 87 88 dequeued_page = false; 89 spin_lock_irqsave(&b_dev_info->pages_lock, flags); 90 list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) { 91 /* 92 * Block others from accessing the 'page' while we get around 93 * establishing additional references and preparing the 'page' 94 * to be released by the balloon driver. 95 */ 96 if (trylock_page(page)) { 97 #ifdef CONFIG_BALLOON_COMPACTION 98 if (!PagePrivate(page)) { 99 /* raced with isolation */ 100 unlock_page(page); 101 continue; 102 } 103 #endif 104 spin_lock_irqsave(&b_dev_info->pages_lock, flags); 105 balloon_page_delete(page); 106 spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); 107 unlock_page(page); 108 dequeued_page = true; 109 break; 110 } 111 } 112 spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); Note the line 104 takes the spinlock already taken in line 89. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1598197/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp