I don't know what happened, but I just logged into my system and
deleted some unused files in my home directory. Just some directories.
Suddenly everything on my system became read only. While rebooting I
think I saw messages like:
"BTRFS Error"
I booted into a live environment to restore my snapshots, then while
mounting the BTRFS root partition I got:
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/mapper/dm_crypt, missing codepage or helper program, or other
error.
Are you sure you are opening the right LUKS device in the live environment?
Does its size look right (e.g. in "lsblk -p")? Is the LUKS device readable
(e.g. with "cat /dev/mapper/dm_crypt > /dev/null")? Do you get any read errors
in dmesg (for NVMe / SAS / SATA)? If you read the device through "pv -arb"
("pv -arb /dev/mapper/dm_crypt > /dev/null"), or another cat-like tool that
shows the data rate, does the throughput look reasonable?
Saving a binary image of your device would be a good first step — if the device
is still readable.
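
For the imaging step, a minimal sketch could look like the function below. It
assumes "dd" and "sha256sum" (GNU coreutils) are available in the live
environment and that the destination has enough free space. The function name
"image_device" and the destination path in the usage line are made up for
illustration.

```shell
#!/bin/sh
# image_device SRC DEST
# Copy SRC (a device or file) to DEST and verify the copy by checksum.
# Hypothetical helper name; adjust paths to your setup.
image_device() {
    src=$1
    dest=$2

    # Never overwrite an existing image by accident.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi

    # Plain sequential copy; dd exits non-zero if a read fails.
    dd if="$src" of="$dest" bs=1M || return 1

    # Read both sides again and compare their SHA-256 checksums.
    src_sum=$(sha256sum < "$src" | cut -d ' ' -f 1)
    dest_sum=$(sha256sum < "$dest" | cut -d ' ' -f 1)
    [ "$src_sum" = "$dest_sum" ]
}
```

Run as root it would be called like
"image_device /dev/mapper/dm_crypt /path/to/backup/dm_crypt.img". If dd stops
with a read error, that already tells you the device is failing, and a
recovery-oriented copier such as GNU ddrescue is the more usual choice.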
I don't know what to do, my system won't boot.
Why is BTRFS so unstable?
What makes you so sure that this is a Btrfs problem, as opposed to an SSD or
hard drive failure, or a RAM failure causing data corruption?
(Were there no other errors before the Btrfs errors in "dmesg"?)
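
To see what came first in the kernel log, a filter along these lines may help.
The patterns are illustrative guesses at common message fragments, not an
exhaustive list, and "scan_kernel_log" is just a name picked for this sketch.

```shell
#!/bin/sh
# scan_kernel_log
# Read kernel log text on stdin and print, with line numbers, the lines that
# look like storage I/O errors or Btrfs complaints. The order of the output
# shows which kind of message appeared first.
# The patterns are assumptions; extend them for your hardware.
scan_kernel_log() {
    grep -n -i -E 'I/O error|medium error|BTRFS (error|critical|warning)'
}
```

Typical use would be "dmesg | scan_kernel_log" (dmesg may need root). If a
block-layer I/O error shows up above the first Btrfs line, the hardware is the
more likely suspect.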
While data loss of any kind is (understandably) frustrating, claiming that
Btrfs is “unstable” is plain wrong and unhelpful, and it is unlikely to
motivate Btrfs experts to chime in and help.
:-/
(For example, I have been using Btrfs on all my systems since ~2010 and Btrfs
RAID5/6 setups since ~2016. All I can say is that it has been perfectly stable
in my case, saving my data many times over as my poor choice of SMR disks kept
backfiring on me.)
Please don't tell me to erase everything and start again. That is not feasible.
Is there any other way?
A few suggestions:
0. Take a binary backup of your Btrfs device, if it’s still readable.
1. Check your RAM. Does the machine have ECC? You may want to give it a few
hours of memtest, no matter what.
2. Check whether your SSD / disk reads at a reasonable pace and shows nothing
suspicious in "smartctl -A" and "dmesg".
3. Then there are a few tools (see man btrfs-check, man btrfs-rescue, man
btrfs-restore) you might want to try, depending on the situation. Some of them
require help from Btrfs experts (at which point you may want to ask on their
kernel mailing lists).
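
As a rough idea of the order in which one might try the Btrfs tools, the sketch
below only prints the commands so that each can be reviewed before running it.
It assumes you work on the binary image from step 0 rather than on the original
device. The function name "print_recovery_plan" and both paths are hypothetical,
and paths containing spaces would need quoting.

```shell
#!/bin/sh
# print_recovery_plan IMAGE OUTDIR
# Print (do not run) a least-invasive-first sequence of btrfs commands.
# IMAGE is the copy of the filesystem, OUTDIR an empty directory on
# another disk where recovered files would be written.
print_recovery_plan() {
    img=$1
    out=$2

    # 1. Consistency check that does not write to the filesystem.
    printf 'btrfs check --readonly %s\n' "$img"
    # 2. Dry run: list the files that restore would recover.
    printf 'btrfs restore -D -v %s %s\n' "$img" "$out"
    # 3. Actual copy of the files into OUTDIR.
    printf 'btrfs restore -v %s %s\n' "$img" "$out"
}
```

For example, "print_recovery_plan /path/to/dm_crypt.img /path/to/restore"
prints three lines to copy and paste one at a time. Anything that writes to the
filesystem (repair options, "btrfs rescue" subcommands) is left out on purpose;
that is the point where asking the experts first is the safer route.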
No filesystem can cope with failing hardware or human error (such as
unintentional low-level writes into a partition). It’s always good to rule out
these cases first, before suspecting the filesystem. If there were a stability
issue, large production setups (https://lwn.net/Articles/824855/) would most
likely spot it first.
Andrej
_______________________________________________
users mailing list -- users@lists.fedoraproject.org