On Thu, Nov 22, 2018 at 6:07 AM Tomasz Chmielewski <[email protected]> wrote:
>
> On 2018-11-22 21:46, Nikolay Borisov wrote:
>
> >> # echo w > /proc/sysrq-trigger
> >>
> >> # dmesg -c
> >> [  931.585611] sysrq: SysRq : Show Blocked State
> >> [  931.585715]   task                        PC stack   pid father
> >> [  931.590168] btrfs-cleaner   D    0  1340      2 0x80000000
> >> [  931.590175] Call Trace:
> >> [  931.590190]  __schedule+0x29e/0x840
> >> [  931.590195]  schedule+0x2c/0x80
> >> [  931.590199]  schedule_timeout+0x258/0x360
> >> [  931.590204]  io_schedule_timeout+0x1e/0x50
> >> [  931.590208]  wait_for_completion_io+0xb7/0x140
> >> [  931.590214]  ? wake_up_q+0x80/0x80
> >> [  931.590219]  submit_bio_wait+0x61/0x90
> >> [  931.590225]  blkdev_issue_discard+0x7a/0xd0
> >> [  931.590266]  btrfs_issue_discard+0x123/0x160 [btrfs]
> >> [  931.590299]  btrfs_discard_extent+0xd8/0x160 [btrfs]
> >> [  931.590335]  btrfs_finish_extent_commit+0xe2/0x240 [btrfs]
> >> [  931.590382]  btrfs_commit_transaction+0x573/0x840 [btrfs]
> >> [  931.590415]  ? btrfs_block_rsv_check+0x25/0x70 [btrfs]
> >> [  931.590456]  __btrfs_end_transaction+0x2be/0x2d0 [btrfs]
> >> [  931.590493]  btrfs_end_transaction_throttle+0x13/0x20 [btrfs]
> >> [  931.590530]  btrfs_drop_snapshot+0x489/0x800 [btrfs]
> >> [  931.590567]  btrfs_clean_one_deleted_snapshot+0xbb/0xf0 [btrfs]
> >> [  931.590607]  cleaner_kthread+0x136/0x160 [btrfs]
> >> [  931.590612]  kthread+0x120/0x140
> >> [  931.590646]  ? btree_submit_bio_start+0x20/0x20 [btrfs]
> >> [  931.590658]  ? kthread_bind+0x40/0x40
> >> [  931.590661]  ret_from_fork+0x22/0x40
> >>
> >
> > It seems your filesystem is mounted with the DISCARD option, meaning
> > every delete will result in a discard; this is highly suboptimal for
> > SSDs. Try remounting the fs without the discard option and see if it
> > helps. Generally for discard you want to submit it in big batches
> > (what fstrim does) so that the FTL on the SSD can apply any
> > optimisations it might have up its sleeve.
>
> Spot on!
>
> Removed "discard" from fstab and added "ssd", rebooted - no more
> btrfs-cleaner running.
>
> Do you know if the issue you described ("discard this is highly
> suboptimal for ssd") affects other filesystems as well to a similar
> extent? I.e. if using ext4 on ssd?
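For reference, the change Tomasz describes can be sketched roughly as follows (the mount point and fstab entry are examples only; adjust device, UUID and path for the actual system):

```shell
# Check whether the filesystem is currently mounted with discard
# (mount point /mnt/data is a placeholder)
findmnt -no OPTIONS /mnt/data | tr ',' '\n' | grep -x discard

# Drop synchronous discard at runtime, without waiting for a reboot
mount -o remount,nodiscard /mnt/data

# Then make it permanent by removing "discard" (and optionally adding
# "ssd") in /etc/fstab, e.g.:
#   UUID=xxxx-xxxx  /mnt/data  btrfs  defaults,ssd  0  0
```

Note this only stops new synchronous discards; already-discarded blocks are unaffected.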
Quite a lot of the activity on ext4 and XFS is overwrites, so discard
isn't needed as much there, and it may be that discards are subject to
delays. On Btrfs, discard is almost immediate: to the degree that on a
couple of SSDs I've tested, stale trees referenced exclusively by the
most recent backup tree entries in the superblock are already zeros.
That functionally means no automatic recovery at mount time if there's
a problem with any of the current trees.

I was using it for about a year with no ill effect, BUT not a lot of
file deletions either. I wouldn't recommend it, and instead suggest
enabling fstrim.timer, which by default runs fstrim.service once a week
(which in turn issues fstrim, I think on all mounted volumes).

I am a bit more concerned about the read errors you had that were being
corrected automatically. The corruption suggests a firmware bug related
to trim. I'd check the affected SSD's firmware revision and consider
updating it (only after a backup; it's plausible the firmware update is
not guaranteed to be data safe).

Does the volume use DUP or raid1 metadata? I'm not sure how it's
correcting these problems otherwise.

--
Chris Murphy
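As a quick sketch, the two checks suggested above (metadata profile, and switching to batched trim) might look like this on a systemd distribution; the mount point is a placeholder:

```shell
# Show the data/metadata block group profiles; look for a line such as
# "Metadata, DUP: ..." or "Metadata, RAID1: ..." in the output
btrfs filesystem df /mnt/data

# Enable weekly batched trim in place of the discard mount option
systemctl enable --now fstrim.timer

# Confirm the timer is scheduled
systemctl list-timers fstrim.timer
```

With a DUP or raid1 metadata profile, btrfs keeps two copies of metadata, which is what lets it repair a bad copy from the good one during reads or scrub.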
