On Fri, Nov 10, 2017 at 10:13:40AM -0500, Josef Bacik wrote:
> From: Josef Bacik <[email protected]>
> 
> Since we're allocating under atomic we could very easily hit ENOMEM, so if
> that's the case and we can block then loop around and try to allocate
> the prealloc not under a lock.
> 
> We also saw this happen during try_to_release_page in production, in
> which case it's completely valid to return ENOMEM so we can tell
> try_to_release_page that we can't release this page.

Have you audited that all direct and indirect callers of
__clear_extent_bit handle the errors? Because they do not. The only case
that seems to understand ENOMEM failures of __clear_extent_bit is
try_release_extent_state.

Almost everything else that calls __clear_extent_bit, clear_extent_bit, or
the other simple wrappers does not check the return value and would
happily continue.
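
To illustrate the concern, here is a minimal userspace sketch, not btrfs
code. Every name in it (model_tree, model_clear_bit, careless_caller,
checked_caller) is hypothetical and exists only for this example. It models
a clear operation that can fail with -ENOMEM, one caller that drops the
result, and one that propagates it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in btrfs. */
struct model_tree {
	unsigned int bits;	/* stands in for the extent state bits */
	bool fail_alloc;	/* force the simulated allocation to fail */
};

/* Models a clear operation that may fail with -ENOMEM before doing any work. */
static int model_clear_bit(struct model_tree *tree, unsigned int bit)
{
	if (tree->fail_alloc)
		return -ENOMEM;
	tree->bits &= ~bit;
	return 0;
}

/* Ignores the result: the pattern the review warns about. */
static int careless_caller(struct model_tree *tree, unsigned int bit)
{
	model_clear_bit(tree, bit);
	return 0;		/* reports success even if the bit is still set */
}

/* Propagates the error so the next caller up can react to it. */
static int checked_caller(struct model_tree *tree, unsigned int bit)
{
	int ret = model_clear_bit(tree, bit);

	if (ret)
		return ret;
	return 0;
}
```

In this model the careless caller returns 0 while the bit is still set, so
whatever runs next operates on state it believes was cleared.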

> @@ -673,7 +677,15 @@ static int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
>  
>       if (state->start < start) {
>               prealloc = alloc_extent_state_atomic(prealloc);
> -             BUG_ON(!prealloc);
> +             if (!prealloc) {
> +                     if (gfpflags_allow_blocking(mask)) {
> +                             need_prealloc = true;
> +                             spin_unlock(&tree->lock);
> +                             goto again;
> +                     }
> +                     err = -ENOMEM;

The retry logic is good, but until ENOMEM is properly handled
everywhere, the safest thing is to move the BUG_ON here.
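
For reference, the shape of the retry can be sketched in plain userspace C.
This is only a model under stated assumptions: retry_ctx, alloc_atomic_model
and clear_with_retry are hypothetical names, malloc stands in for both the
atomic and the blocking allocation, and a boolean stands in for the tree
lock. It is not the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model of the retry; names are illustrative only. */
struct retry_ctx {
	int atomic_failures_left;	/* how many atomic attempts will fail */
	int retries;			/* times we dropped the "lock" and looped */
	bool locked;			/* stands in for tree->lock */
};

/* Simulated atomic allocation: reuses a preallocated object if present. */
static void *alloc_atomic_model(struct retry_ctx *ctx, void *prealloc)
{
	if (prealloc)
		return prealloc;
	if (ctx->atomic_failures_left > 0) {
		ctx->atomic_failures_left--;
		return NULL;
	}
	return malloc(1);
}

static int clear_with_retry(struct retry_ctx *ctx, bool can_block)
{
	void *prealloc = NULL;
	bool need_prealloc = false;

again:
	if (need_prealloc) {
		/* Stands in for a blocking allocation; the lock is not held. */
		prealloc = malloc(1);
		if (!prealloc)
			return -ENOMEM;
		need_prealloc = false;
	}

	ctx->locked = true;
	prealloc = alloc_atomic_model(ctx, prealloc);
	if (!prealloc) {
		ctx->locked = false;
		if (can_block) {
			need_prealloc = true;
			ctx->retries++;
			goto again;
		}
		/* The review suggests a BUG_ON at this point until callers are audited. */
		return -ENOMEM;
	}

	free(prealloc);		/* the real code would consume the state here */
	ctx->locked = false;
	return 0;
}
```

With one forced atomic failure, the blocking variant loops once and
succeeds, while the non-blocking variant reaches the failure path that the
comment above marks.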