I/O Error Test 4
================

commit "bcache: add io_disable to struct cached_dev"

Problem: if the backing device hits I/O errors or is disconnected, the bcache device can still accept I/O.

Original kernel: dd writes in writeback mode to a failed backing device complete.
Modified kernel: the bcache0 device is removed after some I/O errors on the backing device.

Original
--------

# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019

# ./setup.sh >/dev/null 2>&1
[ 24.820401] bcache: register_bdev() registered backing device dm-0
[ 24.833268] bcache: run_cache_set() invalidating existing data
[ 24.848314] bcache: register_cache() registered cache device dm-1
[ 26.824645] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set dd465fa7-4e85-484b-89dd-353c24c6b041

# echo writeback > /sys/block/bcache0/bcache/cache_mode
# cat /sys/block/bcache0/bcache/cache_mode
writethrough [writeback] writearound none

# ./dm_fake_dev.sh /dev/loop0 bad
[ 41.439684] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 41.445284] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 41.451846] bcache: register_bcache() error /dev/dm-0: device already registered (emitting change event)
[ 41.454704] Buffer I/O error on dev bcache0, logical block 262112, async page read
[ 41.457685] Buffer I/O error on dev bcache0, logical block 262112, async page read
[ 41.457743] Buffer I/O error on dev bcache0, logical block 1, async page read

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 49.048036] Buffer I/O error on dev bcache0, logical block 0, lost async page write
[ 49.051702] Buffer I/O error on dev bcache0, logical block 1, lost async page write
[ 49.054062] Buffer I/O error on dev bcache0, logical block 2, lost async page write
[ 49.056466] Buffer I/O error on dev bcache0, logical block 3, lost async page write
[ 49.058867] Buffer I/O error on dev bcache0, logical block 4, lost async page write
[ 49.072020] Buffer I/O error on dev bcache0, logical block 5, lost async page write
[ 49.074440] Buffer I/O error on dev bcache0, logical block 6, lost async page write
[ 49.078658] Buffer I/O error on dev bcache0, logical block 7, lost async page write
[ 49.079008] Buffer I/O error on dev bcache0, logical block 6834, lost async page write
[ 49.079022] Buffer I/O error on dev bcache0, logical block 6835, lost async page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.58342 s, 416 MB/s

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 62.696034] buffer_io_error: 260992 callbacks suppressed
[ 62.696037] Buffer I/O error on dev bcache0, logical block 0, lost async page write
[ 62.701996] Buffer I/O error on dev bcache0, logical block 1, lost async page write
[ 62.704394] Buffer I/O error on dev bcache0, logical block 2, lost async page write
[ 62.706763] Buffer I/O error on dev bcache0, logical block 3, lost async page write
[ 62.716025] Buffer I/O error on dev bcache0, logical block 4, lost async page write
[ 62.718421] Buffer I/O error on dev bcache0, logical block 5, lost async page write
[ 62.720821] Buffer I/O error on dev bcache0, logical block 6, lost async page write
[ 62.723193] Buffer I/O error on dev bcache0, logical block 7, lost async page write
[ 62.725584] Buffer I/O error on dev bcache0, logical block 8, lost async page write
[ 62.725763] Buffer I/O error on dev bcache0, logical block 5405, lost async page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.88915 s, 372 MB/s

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 67.700114] buffer_io_error: 290043 callbacks suppressed
[ 67.700117] Buffer I/O error on dev bcache0, logical block 40750, lost async page write
[ 67.706230] Buffer I/O error on dev bcache0, logical block 40751, lost async page write
[ 67.709846] Buffer I/O error on dev bcache0, logical block 40752, lost async page write
[ 67.713503] Buffer I/O error on dev bcache0, logical block 40753, lost async page write
[ 67.717241] Buffer I/O error on dev bcache0, logical block 40754, lost async page write
[ 67.720938] Buffer I/O error on dev bcache0, logical block 40755, lost async page write
[ 67.741395] Buffer I/O error on dev bcache0, logical block 40756, lost async page write
dd: error writing '/dev/bcache0': No space left on device
[ 67.748145] Buffer I/O error on dev bcache0, logical block 40757, lost async page write
[ 67.752352] Buffer I/O error on dev bcache0, logical block 41038, lost async page write
[ 67.756642] Buffer I/O error on dev bcache0, logical block 41313, lost async page write
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.99741 s, 358 MB/s

# lsblk -e 252
NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0            7:0    0    1G  0 loop
loop1            7:1    0    1G  0 loop
└─fake-loop1   253:1    0 1024M  0 dm
  └─bcache0    251:0    0 1024M  0 disk
fake-loop0     253:0    0    1G  0 dm
└─bcache0      251:0    0 1024M  0 disk

Modified
--------

# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37 UTC

# ./setup.sh >/dev/null 2>&1
[ 22.202972] bcache: run_cache_set() invalidating existing data
[ 22.213346] bcache: register_cache() registered cache device dm-1
[ 22.226165] bcache: register_bdev() registered backing device dm-0
[ 24.198940] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set cabfdb60-4301-46e0-940a-eb96e801c816

# echo writeback > /sys/block/bcache0/bcache/cache_mode
# cat /sys/block/bcache0/bcache/cache_mode
writethrough [writeback] writearound none

# ./dm_fake_dev.sh /dev/loop0 bad
[ 40.025536] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 40.030156] Buffer I/O error on dev dm-0, logical block 262128, async page read
[ 40.035808] bcache: register_bcache() error /dev/dm-0: device already registered (emitting change event)
[ 40.038534] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 40.038567] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 40.038574] Buffer I/O error on dev bcache0, logical block 262112, async page read
[ 40.041268] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 40.041284] Buffer I/O error on dev bcache0, logical block 262112, async page read
[ 40.041319] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 40.041341] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 40.041346] Buffer I/O error on dev bcache0, logical block 1, async page read

# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 48.178854] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.181495] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.183988] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.186469] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.188962] Buffer I/O error on dev bcache0, logical block 0, lost async page write
[ 48.191117] Buffer I/O error on dev bcache0, logical block 1, lost async page write
[ 48.193295] Buffer I/O error on dev bcache0, logical block 2, lost async page write
[ 48.195457] Buffer I/O error on dev bcache0, logical block 3, lost async page write
[ 48.197607] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.200116] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.202597] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.205087] Buffer I/O error on dev bcache0, logical block 4, lost async page write
[ 48.207229] Buffer I/O error on dev bcache0, logical block 5, lost async page write
[ 48.209377] Buffer I/O error on dev bcache0, logical block 6, lost async page write
...
[ 48.362824] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.365085] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.367294] bcache: bch_count_backing_io_errors() dm-0: IO error on backing device, unrecoverable
[ 48.369457] bcache: bch_cached_dev_error() stop bcache0: too many IO errors on backing device dm-0
[ 48.369457] dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
[ 48.866726] bcache: bcache_device_free() bcache0 stopped
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.27785 s, 471 MB/s

# lsblk -e 252
NAME           MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0            7:0    0    1G  0 loop
loop1            7:1    0    1G  0 loop
└─fake-loop1   253:1    0 1024M  0 dm
fake-loop0     253:0    0    1G  0 dm

# ls /dev/bcache0
ls: cannot access '/dev/bcache0': No such file or directory

-- 
https://bugs.launchpad.net/bugs/1829563

Title:
  bcache: risk of data loss on I/O errors in backing or caching devices

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  In Progress

Bug description:
  [Impact]

   * The bcache code in Bionic lacks several fixes to handle I/O errors in both backing devices and caching devices.

   * Partial or permanent errors in backing or caching devices, especially in writeback mode, can lead to data loss and/or to the application not being notified about failed I/O requests.

   * The bcache device might remain available for I/O requests even if the backing device is offline, so the result of writes is undefined.

  [Test Case]

   * Detailed test cases/steps for the behavior of almost every patch with code logic changes are provided in bug comments.

   * The patchset has been tested for regressions on each cache mode (writethrough, writeback, writearound, none) with the xfstests test suite (on ext4), fio (random read-write) and iozone (several read/write tests).

  [Regression Potential]

   * The patchset is relatively large and touches several areas in the bcache code; however, synthetic testing of the patches has been performed, and extensive regression/stress tests were run (as mentioned in the Test Case section).

   * Many patches in the patchset are 'Fixes' patches to other patches, and no further 'Fixes' currently exist upstream.

  [Other Info]

   * Canonical Field Eng. deploys bcache+writeback extensively (e.g., BootStack, UA cloud, except rare all-flash cases).

  [Original Bug Description]

  This is a request for a backport of the following upstream patch from 4.18:

  "bcache: stop bcache device when backing device is offline"
  https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee

  Field engineering uses bcache quite extensively and it would be good to have this in the GA/bionic kernel.