This is the fix that will need SRU to jammy/focal, root@nikhil-Lenovo-Legion-Y540-15IRH-PG0:/home/nikhil/Downloads/upstream_kernel/linux/fs/fuse# git show 7d6b902ea0e02b2a25c480edf471cbaa4ebe6b3c commit 7d6b902ea0e02b2a25c480edf471cbaa4ebe6b3c Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800
bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() The local variables check_state (in bch_btree_check()) and state (in bch_sectors_dirty_init()) should be fully filled by 0, because before allocating them on stack, they were dynamically allocated by kzalloc(). Signed-off-by: Coly Li <col...@suse.de> Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de Signed-off-by: Jens Axboe <ax...@kernel.dk> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 2362bb8ef6d1..e136d6edc1ed 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -2017,6 +2017,7 @@ int bch_btree_check(struct cache_set *c) if (c->root->level == 0) return 0; + memset(&check_state, 0, sizeof(struct btree_check_state)); check_state.c = c; check_state.total_threads = bch_btree_chkthread_nr(); check_state.key_idx = 0; diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index 75b71199800d..d138a2d73240 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -950,6 +950,7 @@ void bch_sectors_dirty_init(struct bcache_device *d) return; } + memset(&state, 0, sizeof(struct bch_dirty_init_state)); state.c = c; state.d = d; state.total_threads = bch_btre_dirty_init_thread_nr(); ** Description changed: SRU Justification: [Impact] - This bug in bcache affects focal and jammy releases. + This bug in bcache affects (at least) focal and jammy releases. (not tested bionic as yet) When Random Read I/O is started with a test like - fio --name=read_iops --directory=/home/ubuntu/bcache_mount/ --size=16G --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128 --rw=randread --randrepeat=0 or random read-writes with a test like, fio --filename=/home/ubuntu/bcache_mount/cachedfile --size=15GB --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128 --name=iops-test-job --randrepeat=0 traces are seen in the kernel log, [ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than 120 seconds. [ 4474.050921] Not tainted 5.15.50-051550-generic #202206251445 [ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4474.731391] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4474.731408] Call Trace: [ 4474.731411] <TASK> [ 4474.731413] __schedule+0x23d/0x5a0 [ 4474.731433] schedule+0x4e/0xb0 [ 4474.731436] rwsem_down_write_slowpath+0x220/0x3d0 [ 4474.731441] down_write+0x43/0x50 [ 4474.731446] bch_writeback_thread+0x78/0x320 [bcache] [ 4474.731471] ? read_dirty_submit+0x70/0x70 [bcache] [ 4474.731487] kthread+0x12a/0x150 [ 4474.731491] ? set_kthread_struct+0x50/0x50 [ 4474.731494] ret_from_fork+0x22/0x30 [ 4474.731499] </TASK> The bug exists till kernel 5.15.50-051550-generic, and is fixed through upstream commit commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 - bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() - - The local variables check_state (in bch_btree_check()) and state (in - bch_sectors_dirty_init()) should be fully filled by 0, because before - allocating them on stack, they were dynamically allocated by kzalloc(). - - Signed-off-by: Coly Li <col...@suse.de> - Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de - Signed-off-by: Jens Axboe <ax...@kernel.dk> - + bcache: memset on stack variables in bch_btree_check() and + bch_sectors_dirty_init() + + The local variables check_state (in bch_btree_check()) and state (in + bch_sectors_dirty_init()) should be fully filled by 0, because before + allocating them on stack, they were dynamically allocated by kzalloc(). + + Signed-off-by: Coly Li <col...@suse.de> + Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de + Signed-off-by: Jens Axboe <ax...@kernel.dk> The reproducer is pasted below: # uname -a Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux - NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS - sdd 8:48 0 279.4G 0 disk - └─sdd1 8:49 0 60G 0 part - └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount - nvme0n1 259:0 0 372.6G 0 disk - └─nvme0n1p1 259:2 0 15G 0 part - └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount - + sdd 8:48 0 279.4G 0 disk + └─sdd1 8:49 0 60G 0 part + └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount + nvme0n1 259:0 0 372.6G 0 disk + └─nvme0n1p1 259:2 0 15G 0 part + └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount + fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128 --rw=randread --group_reporting=1 read_iops: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128 fio-3.28 Starting 1 process read_iops: Laying out IO file (1 file / 12288MiB) - - The test does not progress beyond a few minutes, and this trace is then seen in the kernel log, + + The test does not progress beyond a few minutes, and this trace is then + seen in the kernel log, [ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than 120 seconds. [ 4474.050921] Not tainted 5.15.50-051550-generic #202206251445 [ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4474.731391] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4474.731408] Call Trace: [ 4474.731411] <TASK> [ 4474.731413] __schedule+0x23d/0x5a0 [ 4474.731433] schedule+0x4e/0xb0 [ 4474.731436] rwsem_down_write_slowpath+0x220/0x3d0 [ 4474.731441] down_write+0x43/0x50 [ 4474.731446] bch_writeback_thread+0x78/0x320 [bcache] [ 4474.731471] ? read_dirty_submit+0x70/0x70 [bcache] [ 4474.731487] kthread+0x12a/0x150 [ 4474.731491] ? set_kthread_struct+0x50/0x50 [ 4474.731494] ret_from_fork+0x22/0x30 [ 4474.731499] </TASK> [ 4474.731735] INFO: task fio:16626 blocked for more than 121 seconds. [ 4475.035858] Not tainted 5.15.50-051550-generic #202206251445 [ 4475.335859] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4475.716526] task:fio state:D stack: 0 pid:16626 ppid: 1715 flags:0x00004002 [ 4475.716531] Call Trace: [ 4475.716546] <TASK> [ 4475.716549] __schedule+0x23d/0x5a0 [ 4475.716555] ? sysvec_apic_timer_interrupt+0x4e/0x90 [ 4475.716560] schedule+0x4e/0xb0 [ 4475.716563] rwsem_down_read_slowpath+0x32e/0x380 [ 4475.716567] down_read+0x43/0x90 [ 4475.716571] cached_dev_write+0x7e/0x480 [bcache] [ 4475.716604] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4475.716618] __submit_bio+0x1a1/0x220 [ 4475.716623] __submit_bio_noacct+0x85/0x1f0 [ 4475.716626] submit_bio_noacct+0x4e/0x120 [ 4475.716628] submit_bio+0x4a/0x130 [ 4475.716631] iomap_submit_ioend+0x53/0x80 [ 4475.716634] iomap_writepages+0x35/0x40 [ 4475.716636] xfs_vm_writepages+0x84/0xb0 [xfs] [ 4475.716730] do_writepages+0xda/0x200 [ 4475.716736] filemap_fdatawrite_wbc+0x81/0xd0 [ 4475.716739] file_write_and_wait_range+0xac/0xf0 [ 4475.716742] xfs_file_fsync+0x5b/0x250 [xfs] [ 4475.716821] vfs_fsync_range+0x49/0x80 [ 4475.716826] ? __fget_light+0x32/0x80 [ 4475.716829] __x64_sys_fsync+0x38/0x60 [ 4475.716832] do_syscall_64+0x5c/0xc0 [ 4475.716836] ? ksys_write+0xce/0xe0 [ 4475.716838] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4475.716843] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716847] ? __x64_sys_write+0x19/0x20 [ 4475.716849] ? do_syscall_64+0x69/0xc0 [ 4475.716851] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716855] ? __x64_sys_write+0x19/0x20 [ 4475.716857] ? do_syscall_64+0x69/0xc0 [ 4475.716859] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4475.716862] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716865] ? __x64_sys_write+0x19/0x20 [ 4475.716867] ? do_syscall_64+0x69/0xc0 [ 4475.716869] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716873] ? __x64_sys_write+0x19/0x20 [ 4475.716875] ? do_syscall_64+0x69/0xc0 [ 4475.716877] ? do_syscall_64+0x69/0xc0 [ 4475.716879] ? asm_common_interrupt+0x8/0x40 [ 4475.716883] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 4475.716885] RIP: 0033:0x7fd3c498aa5b [ 4475.716888] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 4475.716890] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b [ 4475.716892] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006 [ 4475.716893] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40 [ 4475.716895] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000 [ 4475.716896] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000 [ 4475.716898] </TASK> [ 4475.716912] INFO: task kworker/26:87:16682 blocked for more than 122 seconds. [ 4476.063391] Not tainted 5.15.50-051550-generic #202206251445 [ 4476.363476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4476.743988] task:kworker/26:87 state:D stack: 0 pid:16682 ppid: 2 flags:0x00004000 [ 4476.743994] Workqueue: events update_writeback_rate [bcache] [ 4476.744022] Call Trace: [ 4476.744024] <TASK> [ 4476.744026] __schedule+0x23d/0x5a0 [ 4476.744032] schedule+0x4e/0xb0 [ 4476.744035] rwsem_down_read_slowpath+0x32e/0x380 [ 4476.744039] down_read+0x43/0x90 [ 4476.744042] update_writeback_rate+0x134/0x190 [bcache] [ 4476.744057] process_one_work+0x22b/0x3d0 [ 4476.744063] worker_thread+0x53/0x410 [ 4476.744066] ? process_one_work+0x3d0/0x3d0 [ 4476.744070] kthread+0x12a/0x150 [ 4476.744074] ? set_kthread_struct+0x50/0x50 [ 4476.744077] ret_from_fork+0x22/0x30 [ 4476.744082] </TASK> [ 4598.626220] INFO: task bcache_writebac:1835 blocked for more than 245 seconds. [ 4598.978432] Not tainted 5.15.50-051550-generic #202206251445 [ 4599.279900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4599.657293] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4599.657298] Call Trace: [ 4599.657300] <TASK> [ 4599.657302] __schedule+0x23d/0x5a0 [ 4599.657325] schedule+0x4e/0xb0 [ 4599.657328] rwsem_down_write_slowpath+0x220/0x3d0 [ 4599.657333] down_write+0x43/0x50 [ 4599.657338] bch_writeback_thread+0x78/0x320 [bcache] [ 4599.657374] ? read_dirty_submit+0x70/0x70 [bcache] [ 4599.657390] kthread+0x12a/0x150 [ 4599.657393] ? set_kthread_struct+0x50/0x50 [ 4599.657396] ret_from_fork+0x22/0x30 [ 4599.657401] </TASK> [ 4599.657434] INFO: task kworker/2:63:14895 blocked for more than 124 seconds. [ 4600.000498] Not tainted 5.15.50-051550-generic #202206251445 [ 4600.301188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4600.681984] task:kworker/2:63 state:D stack: 0 pid:14895 ppid: 2 flags:0x00004000 [ 4600.682005] Workqueue: bcache bch_data_insert_start [bcache] [ 4600.682031] Call Trace: [ 4600.682033] <TASK> [ 4600.682035] __schedule+0x23d/0x5a0 [ 4600.682040] schedule+0x4e/0xb0 [ 4600.682043] bch_bucket_alloc+0x21a/0x5b0 [bcache] [ 4600.682066] ? wait_woken+0x70/0x70 [ 4600.682071] __bch_bucket_alloc_set+0x52/0xf0 [bcache] [ 4600.682082] bch_alloc_sectors+0x1c9/0x4c0 [bcache] [ 4600.682098] ? krealloc+0x9d/0xd0 [ 4600.682102] ? __bch_keylist_realloc+0xb7/0x100 [bcache] [ 4600.682113] ? __bch_submit_bbio+0x97/0xb0 [bcache] [ 4600.682126] bch_data_insert_start+0x15e/0x3a0 [bcache] [ 4600.682139] ? closure_sub+0x94/0xb0 [bcache] [ 4600.682152] process_one_work+0x22b/0x3d0 [ 4600.682157] worker_thread+0x53/0x410 [ 4600.682161] ? process_one_work+0x3d0/0x3d0 [ 4600.682164] kthread+0x12a/0x150 [ 4600.682168] ? set_kthread_struct+0x50/0x50 [ 4600.682171] ret_from_fork+0x22/0x30 [ 4600.682176] </TASK> [ 4600.682240] INFO: task kworker/25:123:16464 blocked for more than 125 seconds. [ 4601.033116] Not tainted 5.15.50-051550-generic #202206251445 [ 4601.333405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4601.713829] task:kworker/25:123 state:D stack: 0 pid:16464 ppid: 2 flags:0x00004000 [ 4601.713835] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs] [ 4601.713975] Call Trace: [ 4601.713977] <TASK> [ 4601.713979] __schedule+0x23d/0x5a0 [ 4601.713986] schedule+0x4e/0xb0 [ 4601.713988] rwsem_down_read_slowpath+0x32e/0x380 [ 4601.713993] down_read+0x43/0x90 [ 4601.713996] cached_dev_write+0x7e/0x480 [bcache] [ 4601.714018] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4601.714031] __submit_bio+0x1a1/0x220 [ 4601.714035] ? ttwu_do_activate+0x72/0xf0 [ 4601.714039] __submit_bio_noacct+0x85/0x1f0 [ 4601.714042] ? mutex_lock+0x13/0x40 [ 4601.714045] submit_bio_noacct+0x4e/0x120 [ 4601.714047] submit_bio+0x4a/0x130 [ 4601.714050] xlog_write_iclog+0x254/0x300 [xfs] [ 4601.714142] xlog_sync+0x1ab/0x2c0 [xfs] [ 4601.714225] xlog_state_release_iclog+0x123/0x1d0 [xfs] [ 4601.714306] xfs_log_force+0x186/0x250 [xfs] [ 4601.714386] xfs_log_worker+0x39/0x90 [xfs] [ 4601.714466] process_one_work+0x22b/0x3d0 [ 4601.714471] worker_thread+0x53/0x410 [ 4601.714474] ? process_one_work+0x3d0/0x3d0 [ 4601.714477] kthread+0x12a/0x150 [ 4601.714481] ? set_kthread_struct+0x50/0x50 [ 4601.714484] ret_from_fork+0x22/0x30 [ 4601.714489] </TASK> [ 4601.714526] INFO: task kworker/22:89:16557 blocked for more than 126 seconds. [ 4602.056844] Not tainted 5.15.50-051550-generic #202206251445 [ 4602.357148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4602.733923] task:kworker/22:89 state:D stack: 0 pid:16557 ppid: 2 flags:0x00004000 [ 4602.733940] Workqueue: bcache bch_data_insert_start [bcache] [ 4602.733966] Call Trace: [ 4602.733968] <TASK> [ 4602.733970] __schedule+0x23d/0x5a0 [ 4602.733988] schedule+0x4e/0xb0 [ 4602.733991] bch_bucket_alloc+0x21a/0x5b0 [bcache] [ 4602.734002] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache] [ 4602.734014] ? wait_woken+0x70/0x70 [ 4602.734019] __bch_bucket_alloc_set+0x52/0xf0 [bcache] [ 4602.734031] bch_alloc_sectors+0x1c9/0x4c0 [bcache] [ 4602.734043] ? __wake_up_common_lock+0x8a/0xc0 [ 4602.734046] bch_data_insert_start+0x15e/0x3a0 [bcache] [ 4602.734062] ? closure_sub+0x94/0xb0 [bcache] [ 4602.734083] process_one_work+0x22b/0x3d0 [ 4602.734088] worker_thread+0x53/0x410 [ 4602.734092] ? process_one_work+0x3d0/0x3d0 [ 4602.734095] kthread+0x12a/0x150 [ 4602.734099] ? set_kthread_struct+0x50/0x50 [ 4602.734102] ret_from_fork+0x22/0x30 [ 4602.734108] </TASK> [ 4602.734134] INFO: task fio:16626 blocked for more than 249 seconds. [ 4603.034541] Not tainted 5.15.50-051550-generic #202206251445 [ 4603.330400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4603.711013] task:fio state:D stack: 0 pid:16626 ppid: 1715 flags:0x00004002 [ 4603.711030] Call Trace: [ 4603.711032] <TASK> [ 4603.711046] __schedule+0x23d/0x5a0 [ 4603.711054] ? sysvec_apic_timer_interrupt+0x4e/0x90 [ 4603.711060] schedule+0x4e/0xb0 [ 4603.711063] rwsem_down_read_slowpath+0x32e/0x380 [ 4603.711067] down_read+0x43/0x90 [ 4603.711072] cached_dev_write+0x7e/0x480 [bcache] [ 4603.711095] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4603.711108] __submit_bio+0x1a1/0x220 [ 4603.711113] __submit_bio_noacct+0x85/0x1f0 [ 4603.711116] submit_bio_noacct+0x4e/0x120 [ 4603.711118] submit_bio+0x4a/0x130 [ 4603.711121] iomap_submit_ioend+0x53/0x80 [ 4603.711124] iomap_writepages+0x35/0x40 [ 4603.711126] xfs_vm_writepages+0x84/0xb0 [xfs] [ 4603.711223] do_writepages+0xda/0x200 [ 4603.711229] filemap_fdatawrite_wbc+0x81/0xd0 [ 4603.711232] file_write_and_wait_range+0xac/0xf0 [ 4603.711235] xfs_file_fsync+0x5b/0x250 [xfs] [ 4603.711312] vfs_fsync_range+0x49/0x80 [ 4603.711317] ? __fget_light+0x32/0x80 [ 4603.711321] __x64_sys_fsync+0x38/0x60 [ 4603.711324] do_syscall_64+0x5c/0xc0 [ 4603.711327] ? ksys_write+0xce/0xe0 [ 4603.711330] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4603.711335] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711338] ? __x64_sys_write+0x19/0x20 [ 4603.711340] ? do_syscall_64+0x69/0xc0 [ 4603.711342] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711345] ? __x64_sys_write+0x19/0x20 [ 4603.711347] ? do_syscall_64+0x69/0xc0 [ 4603.711349] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4603.711352] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711355] ? __x64_sys_write+0x19/0x20 [ 4603.711357] ? do_syscall_64+0x69/0xc0 [ 4603.711359] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711362] ? __x64_sys_write+0x19/0x20 [ 4603.711364] ? do_syscall_64+0x69/0xc0 [ 4603.711366] ? do_syscall_64+0x69/0xc0 [ 4603.711368] ? asm_common_interrupt+0x8/0x40 [ 4603.711371] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 4603.711374] RIP: 0033:0x7fd3c498aa5b [ 4603.711376] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 4603.711379] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b [ 4603.711381] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006 [ 4603.711382] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40 [ 4603.711383] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000 [ 4603.711384] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000 [ 4603.711387] </TASK> [ 4603.711421] INFO: task kworker/26:87:16682 blocked for more than 250 seconds. [ 4604.058441] Not tainted 5.15.50-051550-generic #202206251445 [ 4604.359128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4604.739950] task:kworker/26:87 state:D stack: 0 pid:16682 ppid: 2 flags:0x00004000 [ 4604.739968] Workqueue: events update_writeback_rate [bcache] [ 4604.739995] Call Trace: [ 4604.739997] <TASK> [ 4604.740012] __schedule+0x23d/0x5a0 [ 4604.740017] schedule+0x4e/0xb0 [ 4604.740020] rwsem_down_read_slowpath+0x32e/0x380 [ 4604.740024] down_read+0x43/0x90 [ 4604.740027] update_writeback_rate+0x134/0x190 [bcache] [ 4604.740043] process_one_work+0x22b/0x3d0 [ 4604.740049] worker_thread+0x53/0x410 [ 4604.740052] ? process_one_work+0x3d0/0x3d0 [ 4604.740055] kthread+0x12a/0x150 [ 4604.740059] ? set_kthread_struct+0x50/0x50 [ 4604.740062] ret_from_fork+0x22/0x30 [ 4604.740067] </TASK> [ 4725.600602] INFO: task bcache_writebac:1835 blocked for more than 372 seconds. [ 4725.952827] Not tainted 5.15.50-051550-generic #202206251445 [ 4726.253763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4726.630937] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4726.630954] Call Trace: [ 4726.630956] <TASK> [ 4726.630959] __schedule+0x23d/0x5a0 [ 4726.630966] schedule+0x4e/0xb0 [ 4726.630970] rwsem_down_write_slowpath+0x220/0x3d0 [ 4726.630976] down_write+0x43/0x50 [ 4726.630980] bch_writeback_thread+0x78/0x320 [bcache] [ 4726.631005] ? read_dirty_submit+0x70/0x70 [bcache] [ 4726.631021] kthread+0x12a/0x150 [ 4726.631025] ? set_kthread_struct+0x50/0x50 [ 4726.631028] ret_from_fork+0x22/0x30 [ 4726.631033] </TASK> [Fix] commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 - bcache: memset on stack variables in bch_btree_check() and + bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() I have built this fix into kernel 5.15.50 (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15/ ) and tested to verify the problem is fixed. root@nikhilbionic:/home/ubuntu/nikhil/5_15_50/cod/mainline/v5.15.50# git log commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 - bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() - - The local variables check_state (in bch_btree_check()) and state (in - bch_sectors_dirty_init()) should be fully filled by 0, because before - allocating them on stack, they were dynamically allocated by kzalloc(). - - Signed-off-by: Coly Li <col...@suse.de> - Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de - Signed-off-by: Jens Axboe <ax...@kernel.dk> + bcache: memset on stack variables in bch_btree_check() and + bch_sectors_dirty_init() + + The local variables check_state (in bch_btree_check()) and state (in + bch_sectors_dirty_init()) should be fully filled by 0, because before + allocating them on stack, they were dynamically allocated by kzalloc(). + + Signed-off-by: Coly Li <col...@suse.de> + Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de + Signed-off-by: Jens Axboe <ax...@kernel.dk> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50) Author: Greg Kroah-Hartman <gre...@linuxfoundation.org> Date: Sat Jun 25 15:18:40 2022 +0200 - Linux 5.15.50 - - Link: https://lore.kernel.org/r/20220623164322.288837...@linuxfoundation.org - Tested-by: Florian Fainelli <f.faine...@gmail.com> - Tested-by: Shuah Khan <sk...@linuxfoundation.org> - Tested-by: Bagas Sanjaya <bagasdo...@gmail.com> - Tested-by: Jon Hunter <jonath...@nvidia.com> - Tested-by: Sudip Mukherjee <sudip.mukher...@codethink.co.uk> - Tested-by: Ron Economos <r...@w6rz.net> - Tested-by: Guenter Roeck <li...@roeck-us.net> - Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> - + Linux 5.15.50 + + Link: https://lore.kernel.org/r/20220623164322.288837...@linuxfoundation.org + Tested-by: Florian Fainelli <f.faine...@gmail.com> + Tested-by: Shuah Khan <sk...@linuxfoundation.org> + Tested-by: Bagas Sanjaya <bagasdo...@gmail.com> + Tested-by: Jon Hunter <jonath...@nvidia.com> + Tested-by: Sudip Mukherjee <sudip.mukher...@codethink.co.uk> + Tested-by: Ron Economos <r...@w6rz.net> + Tested-by: Guenter Roeck <li...@roeck-us.net> + Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> [Regression Potential] I have not seen any potential drawbacks or harmful effects of this fix in my testing. In fact it is required, without which the deadlock is easily reproduced both on focal as well as jammy GA. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1980925 Title: [SRU] bcache deadlock during read IO in writeback mode Status in linux package in Ubuntu: Confirmed Status in linux source package in Focal: Confirmed Status in linux source package in Jammy: Confirmed Bug description: SRU Justification: [Impact] This bug in bcache affects (at least) focal and jammy releases. (not tested bionic as yet) When Random Read I/O is started with a test like - fio --name=read_iops --directory=/home/ubuntu/bcache_mount/ --size=16G --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128 --rw=randread --randrepeat=0 or random read-writes with a test like, fio --filename=/home/ubuntu/bcache_mount/cachedfile --size=15GB --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=128 --name=iops-test-job --randrepeat=0 traces are seen in the kernel log, [ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than 120 seconds. [ 4474.050921] Not tainted 5.15.50-051550-generic #202206251445 [ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4474.731391] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4474.731408] Call Trace: [ 4474.731411] <TASK> [ 4474.731413] __schedule+0x23d/0x5a0 [ 4474.731433] schedule+0x4e/0xb0 [ 4474.731436] rwsem_down_write_slowpath+0x220/0x3d0 [ 4474.731441] down_write+0x43/0x50 [ 4474.731446] bch_writeback_thread+0x78/0x320 [bcache] [ 4474.731471] ? read_dirty_submit+0x70/0x70 [bcache] [ 4474.731487] kthread+0x12a/0x150 [ 4474.731491] ? set_kthread_struct+0x50/0x50 [ 4474.731494] ret_from_fork+0x22/0x30 [ 4474.731499] </TASK> The bug exists till kernel 5.15.50-051550-generic, and is fixed through upstream commit commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() The local variables check_state (in bch_btree_check()) and state (in bch_sectors_dirty_init()) should be fully filled by 0, because before allocating them on stack, they were dynamically allocated by kzalloc(). Signed-off-by: Coly Li <col...@suse.de> Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de Signed-off-by: Jens Axboe <ax...@kernel.dk> The reproducer is pasted below: # uname -a Linux bronzor 5.15.50-051550-generic #202206251445 SMP Sat Jun 25 14:51:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS sdd 8:48 0 279.4G 0 disk └─sdd1 8:49 0 60G 0 part └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount nvme0n1 259:0 0 372.6G 0 disk └─nvme0n1p1 259:2 0 15G 0 part └─bcache0 252:0 0 60G 0 disk /home/ubuntu/bcache_mount fio --name=read_iops --directory=/home/ubuntu/bcache_mount --size=12G --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=128 --rw=randread --group_reporting=1 read_iops: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128 fio-3.28 Starting 1 process read_iops: Laying out IO file (1 file / 12288MiB) The test does not progress beyond a few minutes, and this trace is then seen in the kernel log, [ 4473.699902] INFO: task bcache_writebac:1835 blocked for more than 120 seconds. [ 4474.050921] Not tainted 5.15.50-051550-generic #202206251445 [ 4474.350883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4474.731391] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4474.731408] Call Trace: [ 4474.731411] <TASK> [ 4474.731413] __schedule+0x23d/0x5a0 [ 4474.731433] schedule+0x4e/0xb0 [ 4474.731436] rwsem_down_write_slowpath+0x220/0x3d0 [ 4474.731441] down_write+0x43/0x50 [ 4474.731446] bch_writeback_thread+0x78/0x320 [bcache] [ 4474.731471] ? read_dirty_submit+0x70/0x70 [bcache] [ 4474.731487] kthread+0x12a/0x150 [ 4474.731491] ? set_kthread_struct+0x50/0x50 [ 4474.731494] ret_from_fork+0x22/0x30 [ 4474.731499] </TASK> [ 4474.731735] INFO: task fio:16626 blocked for more than 121 seconds. [ 4475.035858] Not tainted 5.15.50-051550-generic #202206251445 [ 4475.335859] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4475.716526] task:fio state:D stack: 0 pid:16626 ppid: 1715 flags:0x00004002 [ 4475.716531] Call Trace: [ 4475.716546] <TASK> [ 4475.716549] __schedule+0x23d/0x5a0 [ 4475.716555] ? sysvec_apic_timer_interrupt+0x4e/0x90 [ 4475.716560] schedule+0x4e/0xb0 [ 4475.716563] rwsem_down_read_slowpath+0x32e/0x380 [ 4475.716567] down_read+0x43/0x90 [ 4475.716571] cached_dev_write+0x7e/0x480 [bcache] [ 4475.716604] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4475.716618] __submit_bio+0x1a1/0x220 [ 4475.716623] __submit_bio_noacct+0x85/0x1f0 [ 4475.716626] submit_bio_noacct+0x4e/0x120 [ 4475.716628] submit_bio+0x4a/0x130 [ 4475.716631] iomap_submit_ioend+0x53/0x80 [ 4475.716634] iomap_writepages+0x35/0x40 [ 4475.716636] xfs_vm_writepages+0x84/0xb0 [xfs] [ 4475.716730] do_writepages+0xda/0x200 [ 4475.716736] filemap_fdatawrite_wbc+0x81/0xd0 [ 4475.716739] file_write_and_wait_range+0xac/0xf0 [ 4475.716742] xfs_file_fsync+0x5b/0x250 [xfs] [ 4475.716821] vfs_fsync_range+0x49/0x80 [ 4475.716826] ? __fget_light+0x32/0x80 [ 4475.716829] __x64_sys_fsync+0x38/0x60 [ 4475.716832] do_syscall_64+0x5c/0xc0 [ 4475.716836] ? ksys_write+0xce/0xe0 [ 4475.716838] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4475.716843] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716847] ? __x64_sys_write+0x19/0x20 [ 4475.716849] ? do_syscall_64+0x69/0xc0 [ 4475.716851] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716855] ? __x64_sys_write+0x19/0x20 [ 4475.716857] ? do_syscall_64+0x69/0xc0 [ 4475.716859] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4475.716862] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716865] ? __x64_sys_write+0x19/0x20 [ 4475.716867] ? do_syscall_64+0x69/0xc0 [ 4475.716869] ? syscall_exit_to_user_mode+0x27/0x50 [ 4475.716873] ? __x64_sys_write+0x19/0x20 [ 4475.716875] ? do_syscall_64+0x69/0xc0 [ 4475.716877] ? do_syscall_64+0x69/0xc0 [ 4475.716879] ? asm_common_interrupt+0x8/0x40 [ 4475.716883] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 4475.716885] RIP: 0033:0x7fd3c498aa5b [ 4475.716888] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 4475.716890] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b [ 4475.716892] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006 [ 4475.716893] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40 [ 4475.716895] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000 [ 4475.716896] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000 [ 4475.716898] </TASK> [ 4475.716912] INFO: task kworker/26:87:16682 blocked for more than 122 seconds. [ 4476.063391] Not tainted 5.15.50-051550-generic #202206251445 [ 4476.363476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4476.743988] task:kworker/26:87 state:D stack: 0 pid:16682 ppid: 2 flags:0x00004000 [ 4476.743994] Workqueue: events update_writeback_rate [bcache] [ 4476.744022] Call Trace: [ 4476.744024] <TASK> [ 4476.744026] __schedule+0x23d/0x5a0 [ 4476.744032] schedule+0x4e/0xb0 [ 4476.744035] rwsem_down_read_slowpath+0x32e/0x380 [ 4476.744039] down_read+0x43/0x90 [ 4476.744042] update_writeback_rate+0x134/0x190 [bcache] [ 4476.744057] process_one_work+0x22b/0x3d0 [ 4476.744063] worker_thread+0x53/0x410 [ 4476.744066] ? process_one_work+0x3d0/0x3d0 [ 4476.744070] kthread+0x12a/0x150 [ 4476.744074] ? set_kthread_struct+0x50/0x50 [ 4476.744077] ret_from_fork+0x22/0x30 [ 4476.744082] </TASK> [ 4598.626220] INFO: task bcache_writebac:1835 blocked for more than 245 seconds. [ 4598.978432] Not tainted 5.15.50-051550-generic #202206251445 [ 4599.279900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4599.657293] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4599.657298] Call Trace: [ 4599.657300] <TASK> [ 4599.657302] __schedule+0x23d/0x5a0 [ 4599.657325] schedule+0x4e/0xb0 [ 4599.657328] rwsem_down_write_slowpath+0x220/0x3d0 [ 4599.657333] down_write+0x43/0x50 [ 4599.657338] bch_writeback_thread+0x78/0x320 [bcache] [ 4599.657374] ? read_dirty_submit+0x70/0x70 [bcache] [ 4599.657390] kthread+0x12a/0x150 [ 4599.657393] ? set_kthread_struct+0x50/0x50 [ 4599.657396] ret_from_fork+0x22/0x30 [ 4599.657401] </TASK> [ 4599.657434] INFO: task kworker/2:63:14895 blocked for more than 124 seconds. [ 4600.000498] Not tainted 5.15.50-051550-generic #202206251445 [ 4600.301188] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4600.681984] task:kworker/2:63 state:D stack: 0 pid:14895 ppid: 2 flags:0x00004000 [ 4600.682005] Workqueue: bcache bch_data_insert_start [bcache] [ 4600.682031] Call Trace: [ 4600.682033] <TASK> [ 4600.682035] __schedule+0x23d/0x5a0 [ 4600.682040] schedule+0x4e/0xb0 [ 4600.682043] bch_bucket_alloc+0x21a/0x5b0 [bcache] [ 4600.682066] ? wait_woken+0x70/0x70 [ 4600.682071] __bch_bucket_alloc_set+0x52/0xf0 [bcache] [ 4600.682082] bch_alloc_sectors+0x1c9/0x4c0 [bcache] [ 4600.682098] ? krealloc+0x9d/0xd0 [ 4600.682102] ? __bch_keylist_realloc+0xb7/0x100 [bcache] [ 4600.682113] ? __bch_submit_bbio+0x97/0xb0 [bcache] [ 4600.682126] bch_data_insert_start+0x15e/0x3a0 [bcache] [ 4600.682139] ? closure_sub+0x94/0xb0 [bcache] [ 4600.682152] process_one_work+0x22b/0x3d0 [ 4600.682157] worker_thread+0x53/0x410 [ 4600.682161] ? process_one_work+0x3d0/0x3d0 [ 4600.682164] kthread+0x12a/0x150 [ 4600.682168] ? set_kthread_struct+0x50/0x50 [ 4600.682171] ret_from_fork+0x22/0x30 [ 4600.682176] </TASK> [ 4600.682240] INFO: task kworker/25:123:16464 blocked for more than 125 seconds. [ 4601.033116] Not tainted 5.15.50-051550-generic #202206251445 [ 4601.333405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4601.713829] task:kworker/25:123 state:D stack: 0 pid:16464 ppid: 2 flags:0x00004000 [ 4601.713835] Workqueue: xfs-sync/bcache0 xfs_log_worker [xfs] [ 4601.713975] Call Trace: [ 4601.713977] <TASK> [ 4601.713979] __schedule+0x23d/0x5a0 [ 4601.713986] schedule+0x4e/0xb0 [ 4601.713988] rwsem_down_read_slowpath+0x32e/0x380 [ 4601.713993] down_read+0x43/0x90 [ 4601.713996] cached_dev_write+0x7e/0x480 [bcache] [ 4601.714018] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4601.714031] __submit_bio+0x1a1/0x220 [ 4601.714035] ? ttwu_do_activate+0x72/0xf0 [ 4601.714039] __submit_bio_noacct+0x85/0x1f0 [ 4601.714042] ? mutex_lock+0x13/0x40 [ 4601.714045] submit_bio_noacct+0x4e/0x120 [ 4601.714047] submit_bio+0x4a/0x130 [ 4601.714050] xlog_write_iclog+0x254/0x300 [xfs] [ 4601.714142] xlog_sync+0x1ab/0x2c0 [xfs] [ 4601.714225] xlog_state_release_iclog+0x123/0x1d0 [xfs] [ 4601.714306] xfs_log_force+0x186/0x250 [xfs] [ 4601.714386] xfs_log_worker+0x39/0x90 [xfs] [ 4601.714466] process_one_work+0x22b/0x3d0 [ 4601.714471] worker_thread+0x53/0x410 [ 4601.714474] ? process_one_work+0x3d0/0x3d0 [ 4601.714477] kthread+0x12a/0x150 [ 4601.714481] ? set_kthread_struct+0x50/0x50 [ 4601.714484] ret_from_fork+0x22/0x30 [ 4601.714489] </TASK> [ 4601.714526] INFO: task kworker/22:89:16557 blocked for more than 126 seconds. [ 4602.056844] Not tainted 5.15.50-051550-generic #202206251445 [ 4602.357148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4602.733923] task:kworker/22:89 state:D stack: 0 pid:16557 ppid: 2 flags:0x00004000 [ 4602.733940] Workqueue: bcache bch_data_insert_start [bcache] [ 4602.733966] Call Trace: [ 4602.733968] <TASK> [ 4602.733970] __schedule+0x23d/0x5a0 [ 4602.733988] schedule+0x4e/0xb0 [ 4602.733991] bch_bucket_alloc+0x21a/0x5b0 [bcache] [ 4602.734002] ? bch_btree_insert_check_key+0x1e0/0x1e0 [bcache] [ 4602.734014] ? wait_woken+0x70/0x70 [ 4602.734019] __bch_bucket_alloc_set+0x52/0xf0 [bcache] [ 4602.734031] bch_alloc_sectors+0x1c9/0x4c0 [bcache] [ 4602.734043] ? __wake_up_common_lock+0x8a/0xc0 [ 4602.734046] bch_data_insert_start+0x15e/0x3a0 [bcache] [ 4602.734062] ? closure_sub+0x94/0xb0 [bcache] [ 4602.734083] process_one_work+0x22b/0x3d0 [ 4602.734088] worker_thread+0x53/0x410 [ 4602.734092] ? process_one_work+0x3d0/0x3d0 [ 4602.734095] kthread+0x12a/0x150 [ 4602.734099] ? set_kthread_struct+0x50/0x50 [ 4602.734102] ret_from_fork+0x22/0x30 [ 4602.734108] </TASK> [ 4602.734134] INFO: task fio:16626 blocked for more than 249 seconds. [ 4603.034541] Not tainted 5.15.50-051550-generic #202206251445 [ 4603.330400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4603.711013] task:fio state:D stack: 0 pid:16626 ppid: 1715 flags:0x00004002 [ 4603.711030] Call Trace: [ 4603.711032] <TASK> [ 4603.711046] __schedule+0x23d/0x5a0 [ 4603.711054] ? sysvec_apic_timer_interrupt+0x4e/0x90 [ 4603.711060] schedule+0x4e/0xb0 [ 4603.711063] rwsem_down_read_slowpath+0x32e/0x380 [ 4603.711067] down_read+0x43/0x90 [ 4603.711072] cached_dev_write+0x7e/0x480 [bcache] [ 4603.711095] cached_dev_submit_bio+0x502/0x550 [bcache] [ 4603.711108] __submit_bio+0x1a1/0x220 [ 4603.711113] __submit_bio_noacct+0x85/0x1f0 [ 4603.711116] submit_bio_noacct+0x4e/0x120 [ 4603.711118] submit_bio+0x4a/0x130 [ 4603.711121] iomap_submit_ioend+0x53/0x80 [ 4603.711124] iomap_writepages+0x35/0x40 [ 4603.711126] xfs_vm_writepages+0x84/0xb0 [xfs] [ 4603.711223] do_writepages+0xda/0x200 [ 4603.711229] filemap_fdatawrite_wbc+0x81/0xd0 [ 4603.711232] file_write_and_wait_range+0xac/0xf0 [ 4603.711235] xfs_file_fsync+0x5b/0x250 [xfs] [ 4603.711312] vfs_fsync_range+0x49/0x80 [ 4603.711317] ? __fget_light+0x32/0x80 [ 4603.711321] __x64_sys_fsync+0x38/0x60 [ 4603.711324] do_syscall_64+0x5c/0xc0 [ 4603.711327] ? ksys_write+0xce/0xe0 [ 4603.711330] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4603.711335] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711338] ? __x64_sys_write+0x19/0x20 [ 4603.711340] ? do_syscall_64+0x69/0xc0 [ 4603.711342] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711345] ? __x64_sys_write+0x19/0x20 [ 4603.711347] ? do_syscall_64+0x69/0xc0 [ 4603.711349] ? exit_to_user_mode_prepare+0x37/0xb0 [ 4603.711352] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711355] ? __x64_sys_write+0x19/0x20 [ 4603.711357] ? do_syscall_64+0x69/0xc0 [ 4603.711359] ? syscall_exit_to_user_mode+0x27/0x50 [ 4603.711362] ? __x64_sys_write+0x19/0x20 [ 4603.711364] ? do_syscall_64+0x69/0xc0 [ 4603.711366] ? do_syscall_64+0x69/0xc0 [ 4603.711368] ? asm_common_interrupt+0x8/0x40 [ 4603.711371] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 4603.711374] RIP: 0033:0x7fd3c498aa5b [ 4603.711376] RSP: 002b:00007ffc81602ee0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 4603.711379] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd3c498aa5b [ 4603.711381] RDX: 0000000000000000 RSI: 000055ededf41f40 RDI: 0000000000000006 [ 4603.711382] RBP: 0000000000001000 R08: 0000000000000000 R09: 000055ededf41f40 [ 4603.711383] R10: 00000000e8746551 R11: 0000000000000293 R12: 00007fd3ba533000 [ 4603.711384] R13: 0000000000000000 R14: 00007fd3c1cde110 R15: 0000000300000000 [ 4603.711387] </TASK> [ 4603.711421] INFO: task kworker/26:87:16682 blocked for more than 250 seconds. [ 4604.058441] Not tainted 5.15.50-051550-generic #202206251445 [ 4604.359128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4604.739950] task:kworker/26:87 state:D stack: 0 pid:16682 ppid: 2 flags:0x00004000 [ 4604.739968] Workqueue: events update_writeback_rate [bcache] [ 4604.739995] Call Trace: [ 4604.739997] <TASK> [ 4604.740012] __schedule+0x23d/0x5a0 [ 4604.740017] schedule+0x4e/0xb0 [ 4604.740020] rwsem_down_read_slowpath+0x32e/0x380 [ 4604.740024] down_read+0x43/0x90 [ 4604.740027] update_writeback_rate+0x134/0x190 [bcache] [ 4604.740043] process_one_work+0x22b/0x3d0 [ 4604.740049] worker_thread+0x53/0x410 [ 4604.740052] ? process_one_work+0x3d0/0x3d0 [ 4604.740055] kthread+0x12a/0x150 [ 4604.740059] ? set_kthread_struct+0x50/0x50 [ 4604.740062] ret_from_fork+0x22/0x30 [ 4604.740067] </TASK> [ 4725.600602] INFO: task bcache_writebac:1835 blocked for more than 372 seconds. [ 4725.952827] Not tainted 5.15.50-051550-generic #202206251445 [ 4726.253763] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4726.630937] task:bcache_writebac state:D stack: 0 pid: 1835 ppid: 2 flags:0x00004000 [ 4726.630954] Call Trace: [ 4726.630956] <TASK> [ 4726.630959] __schedule+0x23d/0x5a0 [ 4726.630966] schedule+0x4e/0xb0 [ 4726.630970] rwsem_down_write_slowpath+0x220/0x3d0 [ 4726.630976] down_write+0x43/0x50 [ 4726.630980] bch_writeback_thread+0x78/0x320 [bcache] [ 4726.631005] ? read_dirty_submit+0x70/0x70 [bcache] [ 4726.631021] kthread+0x12a/0x150 [ 4726.631025] ? set_kthread_struct+0x50/0x50 [ 4726.631028] ret_from_fork+0x22/0x30 [ 4726.631033] </TASK> [Fix] commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() I have built this fix into kernel 5.15.50 (https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15/ ) and tested to verify the problem is fixed. root@nikhilbionic:/home/ubuntu/nikhil/5_15_50/cod/mainline/v5.15.50# git log commit b57a534df246999bc09a510dc0e0125cacd1724c (HEAD) Author: Coly Li <col...@suse.de> Date: Fri May 27 23:28:16 2022 +0800 bcache: memset on stack variables in bch_btree_check() and bch_sectors_dirty_init() The local variables check_state (in bch_btree_check()) and state (in bch_sectors_dirty_init()) should be fully filled by 0, because before allocating them on stack, they were dynamically allocated by kzalloc(). Signed-off-by: Coly Li <col...@suse.de> Link: https://lore.kernel.org/r/20220527152818.27545-2-col...@suse.de Signed-off-by: Jens Axboe <ax...@kernel.dk> commit 18a33c8dabb88b50b860e0177a73933f2c0ddf68 (tag: v5.15.50) Author: Greg Kroah-Hartman <gre...@linuxfoundation.org> Date: Sat Jun 25 15:18:40 2022 +0200 Linux 5.15.50 Link: https://lore.kernel.org/r/20220623164322.288837...@linuxfoundation.org Tested-by: Florian Fainelli <f.faine...@gmail.com> Tested-by: Shuah Khan <sk...@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdo...@gmail.com> Tested-by: Jon Hunter <jonath...@nvidia.com> Tested-by: Sudip Mukherjee <sudip.mukher...@codethink.co.uk> Tested-by: Ron Economos <r...@w6rz.net> Tested-by: Guenter Roeck <li...@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> [Regression Potential] I have not seen any potential drawbacks or harmful effects of this fix in my testing. In fact it is required, without which the deadlock is easily reproduced both on focal as well as jammy GA. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1980925/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp