Re: BTRFS partition corrupted after deleting files in /home

Andrej Podzimek via users Sun, 03 Jan 2021 09:36:18 -0800

I don't know what happened, but I just logged into my system and
deleted some unused files in my home directory. Just some directories.


Suddenly everything on my system became read only. While rebooting I
think I saw messages like:

"BTRFS Error"

I booted into a live environment to restore my snapshots, then while
mounting the BTRFS root partition I got:

mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/mapper/dm_crypt, missing codepage or helper program, or other
error.


Are you sure you are opening the right LUKS device in the live environment? Is 
the LUKS device readable (e.g. just using "cat /dev/mapper/dm_crypt > /dev/null")? 
Does its size look right, e.g. in "lsblk -p"? Do you get any read errors in dmesg 
(for NVMe / SAS / SATA)? If you pipe a direct read of the device through "pv -arb" 
("pv -arb /dev/mapper/dm_crypt > /dev/null"), or another cat-like tool that shows 
the data rate, does the rate look reasonable?
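
A minimal sketch of these readability checks, assuming the mapping is called 
/dev/mapper/dm_crypt as in the error message. The helper name read_test is 
made up for illustration; it only wraps the "cat" read mentioned above and 
reports whether the read got through.

```shell
#!/usr/bin/env bash
# Sketch only. read_test is a hypothetical helper name, not an existing tool.

# Read a device (or image file) from start to end and discard the data.
# Returns non-zero if the read fails at any point.
read_test() {
    local dev="$1"
    if cat "$dev" > /dev/null; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev" >&2
        return 1
    fi
}

# Typical use as root in the live environment:
#   lsblk -p                          # does the size of the mapping look right?
#   read_test /dev/mapper/dm_crypt    # or: pv -arb /dev/mapper/dm_crypt > /dev/null
#   dmesg | grep -i error             # anything logged by NVMe / SAS / SATA?
```

A failed read here points at the drive or the LUKS layer rather than at the 
filesystem.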



Saving a binary image of your device would be a good first step — if the device 
is still readable.
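
One way to take such an image is sketched below, assuming the device reads 
without errors and that a separate, healthy disk with enough free space is 
mounted somewhere. The helper name save_image and the destination path are 
assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. save_image is a hypothetical helper name.

# Copy SRC (device or file) to DST, then verify that the copy is
# byte-for-byte identical. DST must live on a different, healthy disk.
save_image() {
    local src="$1" dst="$2"
    dd if="$src" of="$dst" bs=1M || return 1
    cmp -s "$src" "$dst"    # exits non-zero if the two differ
}

# Typical use as root (the destination path is an assumption):
#   save_image /dev/mapper/dm_crypt /run/media/backup/dm_crypt.img
```

Plain dd stops at the first read error. If the device does return read 
errors, a tool built for that case, such as GNU ddrescue, is the better choice.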

I don't know what to do, my system won't boot.

Why is BTRFS so unstable ?


What makes you so sure that this is a Btrfs problem, as opposed to an SSD or 
hard drive failure or a RAM failure causing data corruption? (Were there no 
other errors before the Btrfs errors in "dmesg"?)


While data loss of any kind is (understandably) frustrating, claiming that 
Btrfs is “unstable” is plain wrong and unhelpful, and it is unlikely to 
motivate Btrfs experts to chime in and help. :-/


(For example, I have been using Btrfs on all my systems since ~2010 and Btrfs 
RAID5/6 setups since ~2016. All I can say is that it has been perfectly stable 
in my case, saving my data many times over as my poor choice of SMR disks kept 
backfiring on me.)

Please don't tell me to erase everything and start again. that is not feasible.

Is there any other way?


A few suggestions:
0. Take a binary backup of your Btrfs device, if it’s still readable.

1. Check your RAM. Does the machine have ECC? You may want to give it a few 
hours of memtest, no matter what.

2. Check whether your SSD / disk is reading at a reasonable pace and shows 
nothing suspicious in "smartctl -A" and "dmesg".

3. Then there are a few tools (see man btrfs-check, man btrfs-rescue, man 
btrfs-restore) you might want to try, depending on the situation. Some of them 
require help from Btrfs experts (at which point you may want to ask on their 
kernel mailing lists).
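
The non-destructive part of suggestions 2 and 3 can be sketched as follows. 
The helper names run and inspect_btrfs are made up for illustration, and 
/dev/sda and /mnt/rescue in the usage lines are assumptions. By default the 
script only prints the commands, so they can be reviewed before anything is 
run.

```shell
#!/usr/bin/env bash
# Sketch only. run and inspect_btrfs are hypothetical helper names.

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# DISK: physical drive, DEV: Btrfs device (here the opened LUKS mapping),
# DEST: existing directory on another filesystem for recovered files.
inspect_btrfs() {
    local disk="$1" dev="$2" dest="$3"
    run smartctl -A "$disk"                 # SMART attributes of the drive
    run btrfs check --readonly "$dev"       # consistency check, no writes
    run btrfs restore -D -v "$dev" "$dest"  # -D: dry run, only list files
}

# Review the commands first, then run them for real as root:
#   inspect_btrfs /dev/sda /dev/mapper/dm_crypt /mnt/rescue
#   DRY_RUN=0 inspect_btrfs /dev/sda /dev/mapper/dm_crypt /mnt/rescue
```

"btrfs check --repair" and the "btrfs rescue" subcommands write to the 
device, so they are left out here. They are better tried on a copy of the 
image, or after asking the Btrfs developers.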


No filesystem can cope with failing hardware or human error (such as 
unintentional low-level writes into a partition). It’s always good to rule out 
these cases first, before suspecting the filesystem. If there were a stability 
issue, large production setups (https://lwn.net/Articles/824855/) would most 
likely spot it first.


Andrej
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
