On Tue, Sep 22, 2020 at 03:56:50PM +0200, Oscar Salvador wrote:
> Aristeu Rozanski reported that a customer test case started
> to report -EBUSY after the hwpoison rework patchset.
> 
> There is a race window between spotting a free page and taking it off
> its buddy freelist, so by the time we try to take it off, the page may
> already have been allocated.
> 
> This patch handles that race window by retrying once: if the page was
> allocated from under us, check its type again and handle it as whatever
> type of page it has become.
> 
> Signed-off-by: Oscar Salvador <osalva...@suse.de>
> Reported-by: Aristeu Rozanski <a...@ruivo.org>
> Tested-by: Aristeu Rozanski <a...@ruivo.org>

Acked-by: Naoya Horiguchi <naoya.horigu...@nec.com>

> ---
>  mm/memory-failure.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 46b1821d2817..8f23d3c7a0a2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1903,6 +1903,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>  {
>       int ret;
>       struct page *page;
> +     bool try_again = true;
>  
>       if (!pfn_valid(pfn))
>               return -ENXIO;
> @@ -1918,6 +1919,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>               return 0;
>       }
>  
> +retry:
>       get_online_mems();
>       ret = get_any_page(page, pfn, flags);
>       put_online_mems();
> @@ -1925,7 +1927,10 @@ int soft_offline_page(unsigned long pfn, int flags)
>       if (ret > 0)
>               ret = soft_offline_in_use_page(page);
>       else if (ret == 0)
> -             ret = soft_offline_free_page(page);
> +             if (soft_offline_free_page(page) && try_again) {
> +                     try_again = false;
> +                     goto retry;
> +             }
>  
>       return ret;
>  }
> -- 
> 2.26.2
> 
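For readers following the control flow, here is a minimal user-space sketch
of the retry-once pattern the hunk above introduces. It is not kernel code:
struct fake_page, classify_page(), offline_free_page() and
offline_in_use_page() are hypothetical stand-ins for the real helpers, and
the alloc_races_once flag only simulates an allocation slipping in between
the check and the freelist removal. Unlike the quoted hunk, the sketch
stores the result of the free-page path in ret so that a second failure
would still be reported; that is a choice made for the sketch, not
something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a page; not the kernel's struct page. */
enum page_state { PAGE_FREE, PAGE_IN_USE };

struct fake_page {
	enum page_state state;   /* what the classifier currently sees */
	bool alloc_races_once;   /* simulate one allocation racing with us */
};

/* Same return convention as in the hunk: 0 = free page, > 0 = in use. */
static int classify_page(const struct fake_page *p)
{
	return p->state == PAGE_IN_USE ? 1 : 0;
}

/* Try to take a free page off its (imaginary) freelist. */
static int offline_free_page(struct fake_page *p)
{
	if (p->alloc_races_once) {
		/* Someone allocated the page after we classified it. */
		p->alloc_races_once = false;
		p->state = PAGE_IN_USE;
		return -EBUSY;
	}
	return 0;
}

/* Handle a page that is in use; always succeeds in this sketch. */
static int offline_in_use_page(struct fake_page *p)
{
	assert(p->state == PAGE_IN_USE);
	return 0;
}

static int soft_offline_sketch(struct fake_page *p)
{
	bool try_again = true;
	int ret;

retry:
	ret = classify_page(p);
	if (ret > 0) {
		ret = offline_in_use_page(p);
	} else if (ret == 0) {
		ret = offline_free_page(p);
		if (ret && try_again) {
			/* Page changed type under us: classify it once more. */
			try_again = false;
			goto retry;
		}
	}
	return ret;
}
```

With alloc_races_once set, the first pass fails with -EBUSY, the second
pass sees the page as in use and takes the in-use path. The try_again flag
bounds the loop to a single retry.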
