Re: BTRFS partition corrupted after deleting files in /home

Sreyan Chakravarty Mon, 04 Jan 2021 05:43:17 -0800

On Sun, Jan 3, 2021 at 11:06 PM Andrej Podzimek via users
<users@lists.fedoraproject.org> wrote:
>
> Are you sure you are opening the right LUKS device in the live environment? 
> Is the LUKS device readable (e.g. just using "cat /dev/mapper/dm_crypt > 
> /dev/null")? (Does its size look right, e.g. in "lsblk -p"?) Do you get any 
> read errors in dmesg (for NVME / SAS / SATA)? If you pipe your direct 
> partition read through "pv -arb" ("pv -arb /dev/mapper/dm_crypt > /dev/null") 
> (or another cat-like tool that shows the data rate), does it look reasonable?


Yes it is fully readable.

I just got a full ddrescue image with 0 bad sectors, so nothing
appears to be wrong with my disk.

This is the ddrescue output:

GNU ddrescue 1.25
Press Ctrl-C to interrupt
Initial status (read from mapfile)
rescued: 998575 MB, tried: 0 B, bad-sector: 0 B, bad areas: 0

Current status
    ipos:        0 B, non-trimmed:        0 B,  current rate:       0 B/s
    opos:        0 B, non-scraped:        0 B,  average rate:       0 B/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
 rescued:  998575 MB,   bad areas:        0,        run time:          0s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                             time since last successful read:         n/a
Finished

As you can see there are no bad sectors.
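
For anyone who would rather check a saved copy of that output with a script
than by eye, here is a minimal sketch. The function name ddrescue_is_clean is
made up (it is not part of ddrescue), and it assumes the status block was
saved to a text file in the same layout as the ddrescue 1.25 output above;
other versions may print it differently.

```shell
#!/usr/bin/env bash
# ddrescue_is_clean: hypothetical helper, NOT part of GNU ddrescue.
# Reads saved ddrescue screen output on stdin. Succeeds (exit 0) only if the
# last values seen are "pct rescued: 100", "read errors: 0", "bad areas: 0".
ddrescue_is_clean() {
  awk '
    # Return the number that follows "<label>:" on the current line, or "".
    function num_after(label,   s) {
      if (match($0, label ": *[0-9.]+")) {
        s = substr($0, RSTART, RLENGTH)
        sub(/^[^:]*: */, "", s)
        return s
      }
      return ""
    }
    {
      v = num_after("pct rescued"); if (v != "") pct  = v
      v = num_after("read errors"); if (v != "") errs = v
      v = num_after("bad areas");   if (v != "") bad  = v
    }
    END {
      ok = (pct != "" && errs != "" && bad != "" &&
            pct + 0 == 100 && errs + 0 == 0 && bad + 0 == 0)
      exit !ok
    }
  '
}
```

Usage would be something like "ddrescue_is_clean < ddrescue.log && echo clean",
where ddrescue.log is a placeholder file name. The mapfile written by ddrescue
remains the authoritative record; this only reads the human-readable summary.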

$ pv -arb /dev/mapper/dm_crypt > /dev/null
452GiB [92.0MiB/s] [97.5MiB/s]

The data rate is also reasonable.

>
> Saving a binary image of your device would be a good first step — if the 
> device is still readable.
>
Yes, I did that; that's why you are getting a late reply.

> What makes you so sure that this is a Btrfs problem, as opposed to a SSD or 
> hard drive failure or a RAM failure causing data corruption?
>   (Were there no other errors before the Btrfs errors in "dmesg"?)

I think it is BTRFS because I recently had to do a lot of snapshot
creation and restoration.

Also, I don't think my RAM is to blame, since I have never had a
problem with it. Even now, I have been on my live system for about 14
hours, since I had to get all my work done from there.

>
>
> While data loss of any kind is (understandably) frustrating, claiming that 
> Btrfs is “unstable” is plain wrong and unhelpful and it is unlikely to 
> motivate Btrfs experts to chime in and help.
>   :-/
>
I believe it's better to call this out, rather than worry about
hurting people's feelings.

> A few suggestions:
> 0. Take a binary backup of your Btrfs device, if it’s still readable.

Done.

> 1. Check your RAM. Does the machine have ECC? You may want to give it a few 
> hours of memtest, no matter what.
>

I don't think my RAM is at fault. What is ECC?
I will give it a memtest regardless and get back to you, but I think
it will be a waste of time.
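
For readers with the same question: ECC is error-correcting memory, which can
detect and correct single-bit errors in RAM. One way to see whether a machine
reports it is the "Error Correction Type" line that "dmidecode --type memory"
prints. A small sketch follows; the function name ecc_type is made up, and it
only filters text that dmidecode has already produced.

```shell
#!/usr/bin/env bash
# ecc_type: hypothetical helper that reads "dmidecode --type memory" output
# on stdin and prints the value of the first "Error Correction Type:" line
# (for example "None" or "Single-bit ECC").
ecc_type() {
  sed -n 's/^[[:space:]]*Error Correction Type:[[:space:]]*//p' | head -n 1
}
```

Typical use would be "sudo dmidecode --type memory | ecc_type". A value of
"None" means the firmware reports no ECC, in which case a memtest run is the
only way to check the RAM.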

> 2. Check your SSD / disk whether it’s reading at a reasonable pace and 
> showing nothing suspicious in "smartctl -A" and "dmesg".
>
smartctl output:
https://pastebin.com/raw/B6AdLZXt

I ran the smartctl test a month ago, since I thought there was
something wrong with my HDD, but the guys on the mailing list told me I
did not have to worry.

https://listi.jpberlin.de/pipermail/smartmontools-support/2020-November/000560.html
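
To pull the usual suspects out of a "smartctl -A" attribute table without
reading the whole thing, something like the sketch below could be used. It
assumes the ATA attribute table layout, where the attribute name is the second
column and the raw value starts in the tenth; NVMe drives print a different
format. The function name smart_raw is made up.

```shell
#!/usr/bin/env bash
# smart_raw: hypothetical helper that reads "smartctl -A" output on stdin and
# prints the first token of the RAW_VALUE column for one named attribute.
#   $1 = attribute name exactly as smartctl prints it
smart_raw() {
  awk -v name="$1" '$2 == name { print $10 }'
}

# Print the three sector-health counters from a saved attribute table.
#   $1 = file containing "smartctl -A" output
smart_sector_summary() {
  for attr in Reallocated_Sector_Ct Current_Pending_Sector Offline_Uncorrectable; do
    printf '%s=%s\n' "$attr" "$(smart_raw "$attr" < "$1")"
  done
}
```

For example, "sudo smartctl -A /dev/sdX > smart.txt" followed by
"smart_sector_summary smart.txt" (device and file names are placeholders).
Non-zero raw values for these attributes are generally the ones worth asking
about; an attribute the drive does not report shows up with an empty value.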

> 3. Then there are a few tools (see man btrfs-check, man btrfs-rescue, man 
> btrfs-restore) you might want to try, depending on the situation. Some of 
> them require help from Btrfs experts (at which point you may want to ask on 
> their kernel mailing lists).
>

Yeah, that's the only option I have left.
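
A cautious way to start with those tools is to run only the non-writing ones,
and only against a copy of the image. The sketch below is an assumption about
how that could be wrapped, not a recipe from the Btrfs developers: the helper
names and both arguments are placeholders, and by default it only prints the
commands instead of running them.

```shell
#!/usr/bin/env bash
# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# inspect_btrfs_copy: hypothetical wrapper around two non-writing steps.
#   $1 = block device backed by a COPY of the image (for example a loop device)
#   $2 = empty directory on a different, healthy filesystem
inspect_btrfs_copy() {
  # Check filesystem metadata without modifying the device.
  run btrfs check --readonly "$1"
  # -D is a dry run: list what restore could recover, write nothing.
  run btrfs restore -D "$1" "$2"
}
```

Calling "inspect_btrfs_copy /dev/loopN /mnt/recovered" prints the two commands;
setting DRY_RUN=0 runs them for real. Anything that writes to the filesystem
(repair modes of btrfs check, btrfs rescue subcommands) is deliberately left
out and is better discussed with the Btrfs list first.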

-- 
Regards,
Sreyan Chakravarty
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
