We hit a regression while rolling out 5.2 internally and started seeing
the following panic:

kernel BUG at mm/page-writeback.c:2659!
RIP: 0010:clear_page_dirty_for_io+0xe6/0x1f0
Call Trace:
 __process_pages_contig+0x25a/0x350
 ? extent_clear_unlock_delalloc+0x43/0x70
 submit_compressed_extents+0x359/0x4d0
 normal_work_helper+0x15a/0x330
 process_one_work+0x1f5/0x3f0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x111/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x1f/0x30

This is happening because the page is not locked when we call
clear_page_dirty_for_io().  Looking at the core dump, it was because our
async_extent had a ram_size of 24576 but our async_chunk range only
spanned 20480, so we had a whole extra page in our ram_size for our
async_extent.

This happened because we try not to compress pages outside of our
i_size; however, a cleanup patch changed us to do

actual_end = min_t(u64, i_size_read(inode), end + 1);

which is problematic because i_size_read() can evaluate to different
values in between checking and assigning.  So either an expanding
truncate or a fallocate could increase our i_size while we're doing
writeout, and actual_end would end up being past the range we have
locked.

I confirmed this was what was happening by installing a debug kernel
that had

actual_end = min_t(u64, i_size_read(inode), end + 1);
if (actual_end > end + 1) {
        printk(KERN_ERR "WE GOT FUCKED\n");
        actual_end = end + 1;
}

and deploying it onto 500 boxes of the tier that had been seeing the
problem regularly.  Last night I got my debug message and no panic,
confirming what I expected.

Fixes: 62b37622718c ("btrfs: Remove isize local variable in compress_file_range")
cc: [email protected]
Signed-off-by: Josef Bacik <[email protected]>
---
 fs/btrfs/inode.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2eb1d7249f83..9a483d1f61f8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -474,6 +474,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
        u64 start = async_chunk->start;
        u64 end = async_chunk->end;
        u64 actual_end;
+       loff_t i_size = i_size_read(inode);
        int ret = 0;
        struct page **pages = NULL;
        unsigned long nr_pages;
@@ -488,7 +489,13 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
        inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
                        SZ_16K);
 
-       actual_end = min_t(u64, i_size_read(inode), end + 1);
+       /*
+        * We need to save i_size before now because it could change in between
+        * us evaluating the size and assigning it.  This is because we lock and
+        * unlock the page in truncate and fallocate, and then modify the i_size
+        * later on.
+        */
+       actual_end = min_t(u64, i_size, end + 1);
 again:
        will_compress = 0;
        nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
-- 
2.21.0
