This patchset is the latest version of the soft offline rework patchset
targeted for v5.9.
The main focus of this series is to stabilize soft offline. Historically, soft
offlined pages have suffered from race conditions because PageHWPoison is
used a little too aggressively, which (directly or indirec
From: Naoya Horiguchi
The call to get_user_pages_fast() is only there to get the pointer to the
struct page of a given address; pinning it is the memory-poisoning handler's job,
so drop the refcount grabbed by get_user_pages_fast().
Note that the target page is still pinned after this put_page() because
the c
From: Naoya Horiguchi
Drop the PageHuge check, which is dead code since memory_failure() forks
into memory_failure_hugetlb() for hugetlb pages.
memory_failure() and memory_failure_hugetlb() share some functions like
hwpoison_user_mappings() and identify_page_state(), so they should properly
han
From: Oscar Salvador
Place the THP's page handling in a helper and use it
from both hard and soft-offline machinery, so we get rid
of some duplicated code.
Signed-off-by: Oscar Salvador
Signed-off-by: Naoya Horiguchi
---
mm/memory-failure.c | 48 +
From: Naoya Horiguchi
Another memory error injection interface debugfs:hwpoison/corrupt-pfn
also takes a bogus refcount for hwpoison_filter(). It's justified
because this does a coarse filter, expecting that memory_failure()
redoes the check for sure.
Signed-off-by: Naoya Horiguchi
Signed-off-by:
From: Oscar Salvador
After commit 4e41a30c6d50 ("mm: hwpoison: adjust for new thp refcounting"),
put_hwpoison_page got reduced to a put_page.
Let us just use put_page instead.
Signed-off-by: Oscar Salvador
Signed-off-by: Naoya Horiguchi
---
include/linux/mm.h | 1 -
mm/memory-failure.c | 30
From: Oscar Salvador
Make a proper if-else condition for {hard,soft}-offline.
Signed-off-by: Oscar Salvador
Signed-off-by: Naoya Horiguchi
---
mm/madvise.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git v5.8-rc7-mmotm-2020-07-27-18-18/mm/madvise.c
v5.8-rc
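As a rough illustration of what "a proper if-else condition for {hard,soft}-offline" can look like, here is a minimal userspace sketch. It is not the code from the patch: every toy_* name and the enum values are hypothetical stand-ins, and the handlers only print and return a marker value instead of touching any memory.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two madvise behaviors; these values
 * are illustrative and are not the kernel's MADV_* constants. */
enum toy_behavior { TOY_HWPOISON, TOY_SOFT_OFFLINE };

/* Toy handlers: each returns a distinct marker so the caller can tell
 * which path ran. They do not touch any real memory. */
static int toy_soft_offline(unsigned long pfn)
{
    printf("soft offlining pfn %#lx\n", pfn);
    return 1;
}

static int toy_hard_offline(unsigned long pfn)
{
    printf("hard offlining pfn %#lx\n", pfn);
    return 2;
}

/* Exactly one branch runs per call, and both branches feed the same
 * return value, so there is a single exit path to reason about. */
int toy_inject_error(enum toy_behavior behavior, unsigned long pfn)
{
    int ret;

    if (behavior == TOY_SOFT_OFFLINE)
        ret = toy_soft_offline(pfn);
    else
        ret = toy_hard_offline(pfn);

    return ret;
}
```

Calling toy_inject_error(TOY_SOFT_OFFLINE, pfn) takes only the soft path, and any other behavior takes only the hard path.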
From: Oscar Salvador
This patch changes the way we set and handle in-use poisoned pages.
Until now, poisoned pages were released to the buddy allocator, trusting
that the checks that take place prior to handing out the page would act as a
safety net and would skip that page.
This has proved to be wrong,
From: Oscar Salvador
Merging soft_offline_huge_page and __soft_offline_page lets us get rid of
quite a bit of duplicated code, and makes the code much easier to follow.
Now, __soft_offline_page will handle both normal and hugetlb pages.
Note that move put_page() block to the beginning of page_handle
From: Oscar Salvador
Currently, there is an inconsistency when calling soft-offline from
different paths on a page that is already poisoned.
1) madvise:
madvise_inject_error skips any poisoned page and continues
the loop.
If that was the only page to madvise, it returns
From: Naoya Horiguchi
memory_failure() is supposed to call action_result() when it handles
a memory error event, but there's one missing case. So let's add it.
I find that include/ras/ras_event.h has some other MF_MSG_* undefined,
so this patch also adds them.
Signed-off-by: Naoya Horiguchi
Si
From: Naoya Horiguchi
hpage is never used after try_to_split_thp_page() in memory_failure(),
so we don't have to update hpage. So let's not recalculate/use hpage.
Suggested-by: "Aneesh Kumar K.V"
Signed-off-by: Naoya Horiguchi
Signed-off-by: Oscar Salvador
Reviewed-by: Mike Kravetz
---
mm/
From: Oscar Salvador
When trying to soft-offline a free page, we need to first take it off
the buddy allocator.
Once we know it is out of reach, we can safely flag it as poisoned.
take_page_off_buddy will be used to take a page meant to be poisoned
off the buddy allocator.
take_page_off_buddy calls
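The ordering described in this snippet (first take the free page off the allocator, and only then flag it as poisoned) can be sketched in userspace as follows. This is a toy model under loud assumptions: the free list is a plain singly linked list, all toy_* names are hypothetical and are not kernel APIs, and the real buddy allocator is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy page descriptor; not the kernel's struct page. */
struct toy_page {
    bool on_free_list;
    bool poisoned;
    struct toy_page *next;
};

/* Toy allocator state: a single LIFO free list. */
struct toy_free_list {
    struct toy_page *head;
};

void toy_free_list_add(struct toy_free_list *fl, struct toy_page *p)
{
    p->next = fl->head;
    p->on_free_list = true;
    fl->head = p;
}

/* Unlink p from the free list. Returns false if p is not on the list. */
bool toy_take_off_free_list(struct toy_free_list *fl, struct toy_page *p)
{
    struct toy_page **link = &fl->head;

    while (*link) {
        if (*link == p) {
            *link = p->next;
            p->next = NULL;
            p->on_free_list = false;
            return true;
        }
        link = &(*link)->next;
    }
    return false;
}

/* Flag the page only after the allocator can no longer hand it out. */
bool toy_soft_offline_free_page(struct toy_free_list *fl, struct toy_page *p)
{
    if (!toy_take_off_free_list(fl, p))
        return false;
    p->poisoned = true;
    return true;
}

/* Pop the first free page, or NULL if the list is empty. */
struct toy_page *toy_alloc(struct toy_free_list *fl)
{
    struct toy_page *p = fl->head;

    if (p) {
        fl->head = p->next;
        p->next = NULL;
        p->on_free_list = false;
    }
    return p;
}
```

In this model a page that went through toy_soft_offline_free_page() can never be returned by toy_alloc(), because it was unlinked before the poisoned flag was set.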
From: Naoya Horiguchi
Soft offlining could fail with EIO due to a race condition with
hugepage migration. This issue became visible due to the change in the
previous patch that makes the soft offline handler take the page refcount
on its own. We have no way to directly pin a zero-refcount page, and
the pag
From: Oscar Salvador
Since get_hwpoison_page is only used in memory-failure code now,
let us un-export it and make it private to that code.
Signed-off-by: Oscar Salvador
Signed-off-by: Naoya Horiguchi
---
include/linux/mm.h | 1 -
mm/memory-failure.c | 3 +--
2 files changed, 1 insertion(+),
From: Naoya Horiguchi
Now there's no user of MF_COUNT_INCREASED, so we can safely remove
it from all calling points.
Signed-off-by: Naoya Horiguchi
Signed-off-by: Oscar Salvador
---
include/linux/mm.h | 7 +++
mm/memory-failure.c | 14 +++---
2 files changed, 6 insertions(+), 15
From: Naoya Horiguchi
The argument @flag no longer affects the behavior of soft_offline_page()
and its variants, so let's remove it.
Signed-off-by: Naoya Horiguchi
Signed-off-by: Oscar Salvador
---
drivers/base/memory.c | 2 +-
include/linux/mm.h | 2 +-
mm/madvise.c | 2 +-
Hi,
This patchset is the latest version of the soft offline rework patchset
targeted for v5.9.
Since v5, I dropped some patches which tweak refcount handling in
madvise_inject_error() to avoid the "unknown refcount page" error.
I haven't confirmed the fix (that didn't reproduce with v5 in my environment
From: Oscar Salvador
Make a proper if-else condition for {hard,soft}-offline.
Signed-off-by: Oscar Salvador
Signed-off-by: Naoya Horiguchi
---
mm/madvise.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git v5.8-rc1-mmots-2020-06-20-21-44/mm/madvise.c
v5.8-rc
From: Naoya Horiguchi
hpage is never used after try_to_split_thp_page() in memory_failure(),
so we don't have to update hpage. So let's not recalculate/use hpage.
Suggested-by: "Aneesh Kumar K.V"
Signed-off-by: Naoya Horiguchi
---
mm/memory-failure.c | 6 +-
1 file changed, 1 insertion(+
I rebased the soft-offline rework patchset [1][2] onto the latest mmotm. The
rebase required some non-trivial adjustments, but was mostly
straightforward. I confirmed that the reported problem doesn't reproduce on
compaction after soft offline. For a more precise description of the proble
From: Naoya Horiguchi
memory_failure() is supposed to call action_result() when it handles
a memory error event, but there's one missing case. So let's add it.
I find that include/ras/ras_event.h has some other MF_MSG_* undefined,
so this patch also adds them.
Signed-off-by: Naoya Horiguchi
--