On 02/22/2015 07:28 PM, Jim Murphy wrote:
> [...]
> Part of the discussion:
>
>>> btrfs checksumming theoretically allows you to transparently recover
>>> after media corruption if filesystem has redundancy (more than one
>>> copy of data). Journald checksum will probably detect corruption, but
>>> can it repair it?
>>>
>> No it cannot.
>> But btrfs checksumming cannot fix things for you either if you lose
>> non-trivial amounts of data. It might be able to fix a few bits of
>> errors, but not non-trivial amounts. I mean, that's a simple property
>> of error correction codes: the more you want to be able to correct the
>> longer must your checksum be. Neither btrfs' nor journald's are
>> substantial enough to correct even a sector...
>>
>> Lennart
>

This is pure ignorance. Btrfs does not need the CRC to carry enough
redundancy to reconstruct the data; it uses the checksum only to find
out which copy is good, and uses the redundancy provided by raid to
repair the bad one. (That is simply what Lennart's victim already said
by adding the context "if filesystem has redundancy" and "more than one
copy of data", neither of which refers to the CRC.) The checksum doesn't
need to be longer to repair more data, only to make collisions less
likely. With a 32-bit checksum, the chance of a corrupted block passing
as good is something like one in 2^32, about 4 billion. (< 1 in 512 :P)

Test this out simply by making a raid1, filling it with data, then
running 2 things in infinite loops: one to repeat scrubs, and one to
write random data to the disks, not just a few bits.

Here's 30 minutes of the test script (kernel 3.18.x, btrfs tools
version 3.18.2):

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:07:34 2015 and finished after 159 seconds
    total bytes scrubbed: 13.20GiB with 120 errors
    error details: csum=120
    corrected errors: 120, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14152)

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:10:14 2015 and finished after 144 seconds
    total bytes scrubbed: 13.20GiB with 14 errors
    error details: csum=14
    corrected errors: 14, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14275)

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:12:44 2015 and finished after 139 seconds
    total bytes scrubbed: 13.20GiB with 80 errors
    error details: csum=80
    corrected errors: 80, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14377)

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:15:04 2015 and finished after 168 seconds
    total bytes scrubbed: 13.20GiB with 14 errors
    error details: csum=14
    corrected errors: 14, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14505)

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:17:54 2015 and finished after 163 seconds
    total bytes scrubbed: 13.20GiB with 110 errors
    error details: csum=110
    corrected errors: 110, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14595)

WARNING: errors detected during scrubbing, corrected.
scrub status for af936534-6c3f-4136-809a-740a32a65591
    scrub started at Fri Feb 27 15:20:44 2015 and finished after 173 seconds
    total bytes scrubbed: 13.20GiB with 53 errors
    error details: csum=53
    corrected errors: 53, uncorrectable errors: 0, unverified errors: 0
scrub started on /mnt/test, fsid af936534-6c3f-4136-809a-740a32a65591 (pid=14737)

Obviously there is a chance for both copies of a block to be destroyed
at the same time... but it isn't all that likely in 20 minutes, even
with such a high destruction rate. But clearly this disproves Lennart's
unfounded statement that not even a single sector can be repaired.
That's 391 corrected blocks so far (120 + 14 + 80 + 14 + 110 + 53),
which I assume is more than 391 sectors. Clearing the cache and then
doing a diff of the test files against the original copy shows that
they are undamaged.
(This means you can cp the files away without any loss, but maybe there
are bugs that will make btrfs die later :P it's not exactly fully
production ready.)

So change "theoretically" in the above email to "in practice".

And the test script:

################
# variables used in many parts of the script
################

disk1=/dev/data/btrfs1
disk2=/dev/data/btrfs2
testuser=peter

################
# Set up some disks
################

lvcreate -n btrfs1 -L 10g data
lvcreate -n btrfs2 -L 10g data
chown "$testuser" "$disk1" "$disk2"

mkfs.btrfs -d raid1 -m raid1 "$disk1" "$disk2"
mount "$disk1" /mnt/test

cp -a ~peter/archive/software/manjaro/ /mnt/test

# make sure there is enough data to test; output on my system:
#
# Filesystem               Size  Used Avail Use% Mounted on
# /dev/mapper/data-btrfs1   10G  5.7G  3.3G  64% /mnt/test
df -h /mnt/test

# make sure the files match so we can compare properly later
diff -qr /mnt/test/manjaro/ ~peter/archive/software/manjaro/

################
# The scrub script
################

while true; do
    # start a new scrub whenever the previous one is no longer running
    if ! btrfs scrub status /mnt/test | grep -q "running for"; then
        btrfs scrub status /mnt/test
        btrfs scrub start /mnt/test
        echo
    fi
    sleep 10
done

################
# The disk mutilation script
################

# run as a non-root user, in bash, with disk1 and disk2 set as above

mutilate() {
    # Pick a disk
    if [ $((RANDOM % 2)) -eq 1 ]; then
        target=${disk1}
    else
        target=${disk2}
    fi
    echo "Disk $target selected"

    # Pick a sector. The two 15-bit $RANDOM values are combined
    # arithmetically; pasting them together as digits can produce a
    # leading zero, which bash would then read as octal.
    sz=$(blockdev --getsz "${target}")
    sector=$(( (RANDOM * 32768 + RANDOM) % sz ))
    echo "sector $sector selected"

    # just a paranoid safety check
    if [ -z "$disk1" ] || [ -z "$disk2" ] ||
       { [ "$target" != "$disk1" ] && [ "$target" != "$disk2" ]; } ||
       [ "${target:0:6}" = "/dev/s" ]; then
        echo "ERROR: safety check failed..."
        return 1
    fi
    if [ "$(id -u)" = "0" ]; then
        echo "ERROR: don't run as root..."
        return 1
    fi

    # damage the disk
    dd if=/dev/urandom of="${target}" bs=512 count=100 seek="$sector"
}

while true; do
    # destroy 10 random places x 100 blocks x 512 bytes per block (512 kB)
    for n in {1..10}; do
        mutilate
    done
    sleep 300 # scrub takes about 5min
done
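
For anyone who wants to see the principle without touching real disks,
here is a minimal sketch of checksum-guided mirror repair on two
ordinary files. It is only a toy model and not how btrfs is implemented:
the file names, the 4096-byte block size, and the use of cksum (a
different CRC than btrfs's crc32c) are all assumptions made for
illustration. The point it shows is that the checksum only decides which
copy is good, and the repair itself is a plain copy from the mirror.

```shell
#!/bin/sh
# Toy model of checksum-guided mirror repair (NOT btrfs internals).
# Assumptions for illustration: two plain files stand in for the raid1
# devices, blocks are 4096 bytes, and cksum stands in for crc32c.
set -eu

bs=4096      # block size of the toy "filesystem"
blocks=64    # number of blocks in each copy

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
copy_a=$workdir/copy_a
copy_b=$workdir/copy_b
sums=$workdir/sums

# Print the checksum of block $2 of file $1.
block_sum() {
    dd if="$1" bs="$bs" skip="$2" count=1 2>/dev/null | cksum
}

# Overwrite block $3 of file $2 with block $3 of file $1.
copy_block() {
    dd if="$1" of="$2" bs="$bs" skip="$3" seek="$3" count=1 \
        conv=notrunc 2>/dev/null
}

# Write the data twice and record one checksum per block.
dd if=/dev/urandom of="$copy_a" bs="$bs" count="$blocks" 2>/dev/null
cp "$copy_a" "$copy_b"
i=0
while [ "$i" -lt "$blocks" ]; do
    block_sum "$copy_a" "$i"
    i=$((i + 1))
done > "$sums"

# Damage 10 whole blocks of one copy, far more than a few bits.
dd if=/dev/urandom of="$copy_b" bs="$bs" seek=20 count=10 \
    conv=notrunc 2>/dev/null

# "Scrub": compare every block of both copies against the stored
# checksum and rewrite a bad block from the copy that still matches.
corrected=0
uncorrectable=0
i=0
while read -r want; do
    if [ "$(block_sum "$copy_a" "$i")" = "$want" ]; then a_ok=1; else a_ok=0; fi
    if [ "$(block_sum "$copy_b" "$i")" = "$want" ]; then b_ok=1; else b_ok=0; fi
    if [ "$a_ok" -eq 1 ] && [ "$b_ok" -eq 0 ]; then
        copy_block "$copy_a" "$copy_b" "$i"
        corrected=$((corrected + 1))
    elif [ "$a_ok" -eq 0 ] && [ "$b_ok" -eq 1 ]; then
        copy_block "$copy_b" "$copy_a" "$i"
        corrected=$((corrected + 1))
    elif [ "$a_ok" -eq 0 ] && [ "$b_ok" -eq 0 ]; then
        # both copies are bad: the checksum alone cannot bring data back
        uncorrectable=$((uncorrectable + 1))
    fi
    i=$((i + 1))
done < "$sums"

echo "corrected errors: $corrected, uncorrectable errors: $uncorrectable"
```

Run as an ordinary user, it should report the 10 damaged blocks as
corrected and leave both files identical again. Damaging the same block
in both files instead shows up under uncorrectable, which is the "both
copies destroyed at the same time" case.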
_______________________________________________
Dng mailing list
Dng@lists.dyne.org
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng