Short story:
$ sudo debugfs -R "ncheck 23666852" /dev/md127
fails with
/dev/md127: Block bitmap checksum does not match bitmap while reading
allocation bitmaps
ncheck: Filesystem not open
even after a (clean) fsck. 23666852 is a known good inode.
$ ls -li /data1/tmp/zzz
23666852 -rw-rw-r-- 1 eyal eyal 544 Aug 12 2020 /data1/tmp/zzz
Same issue with other inodes.
Full story:
Recently one of my raid-6 disks started logging errors in its SMART report:
197 Current_Pending_Sector -O--C- 100 100 000 - 8
198 Offline_Uncorrectable ----C- 100 100 000 - 8
Then a few hours later:
197 Current_Pending_Sector -O--C- 100 100 000 - 16
198 Offline_Uncorrectable ----C- 100 100 000 - 16
By now the report also included:
Pending Defects log (GP Log 0x0c)
Index LBA Hours
0 23240269256 53439
1 23240269257 53439
2 23240269258 53439
3 23240269259 53439
4 23240269260 53439
5 23240269261 53439
6 23240269262 53439
7 23240269263 53439
8 23387031568 53376
9 23387031569 53376
10 23387031570 53376
11 23387031571 53376
12 23387031572 53376
13 23387031573 53376
14 23387031574 53376
15 23387031575 53376
This disk was in the array for over 6 years so not a big surprise.
As I was trying to identify the files (if any) using the above LBAs I used
debugfs which gave the error above.
$ df /data1
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md127 58574076816 48925332280 9648728152 84% /data1
A search suggested I fsck the disk, which I did. No issues logged. The debugfs
problem remained.
I then thought that maybe the raid would have something to say, so I ran
$ sudo raid6check /dev/md127 $((22695000)) 1024
followed by
$ sudo raid6check /dev/md127 $((22838000)) 1024
which I figured covered the reported LBAs.
Surprisingly it found no errors, but the SMART pending errors disappeared.
raid6check was run in check (no write) mode.
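
For reference, a rough sketch of how a member-disk LBA can be turned into an
approximate stripe number for raid6check. The helper name lba_to_stripe is
hypothetical, and the 512 KiB chunk size (1024 sectors of 512 bytes) is an
assumption that is not stated above; the md data offset is ignored, so treat
the result as a ballpark only.

```shell
#!/bin/bash
# Hypothetical helper: approximate stripe index for a member-disk LBA.
# Assumptions (not confirmed above): 512-byte sectors, 512 KiB chunks
# (1024 sectors per chunk), md data offset ignored.
lba_to_stripe() {
    local lba=$1
    local part_start=${2:-2048}      # start sector of the partition (from fdisk)
    local chunk_sectors=${3:-1024}   # assumed chunk size in sectors
    echo $(( (lba - part_start) / chunk_sectors ))
}

lba_to_stripe 23240269256   # first group of pending sectors
lba_to_stripe 23387031568   # second group of pending sectors
```

Under these assumptions both results fall inside the two 1024-stripe windows
checked above.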
I then tried debugfs again and the error still happened.
I then repeated the block and inode checks.
$ sudo fdisk -l /dev/sde
Disk /dev/sde: 10.91 TiB, 12000138625024 bytes, 23437770752 sectors
Device Start End Sectors Size Type
/dev/sde1 2048 23437768703 23437766656 10.9T Linux filesystem
The array is a 7-disk raid-6 so 5 data disks.
$ sudo sh -c '(lo=$((23240269256-2048)) ; lo="$((lo*5))" ; lo="$((lo/8))" ; echo "testb $lo 1" ; debugfs -R "testb $lo 1" /dev/md127)'
testb 14525167005 1
debugfs 1.47.2 (1-Jan-2025)
/dev/md127: Block bitmap checksum does not match bitmap while reading
allocation bitmaps
testb: Filesystem not open
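
The arithmetic in that one-liner, spelled out as a small sketch. The helper
name lba_to_fsblock is hypothetical. It assumes 512-byte sectors and 4 KiB
filesystem blocks (8 sectors per block), and it is only an approximation: it
ignores the md data offset and the chunk/parity layout, and simply scales the
member-disk offset by the 5 data disks.

```shell
#!/bin/bash
# Hypothetical helper: approximate filesystem block for a member-disk LBA.
# Assumptions: 512-byte sectors, 4 KiB fs blocks (8 sectors), 5 data disks,
# md data offset and parity rotation ignored.
lba_to_fsblock() {
    local lba=$1
    local part_start=${2:-2048}   # start sector of /dev/sde1 (from fdisk)
    local data_disks=${3:-5}      # 7-disk raid-6 => 5 data disks
    local sectors_per_block=8     # assumed 4096-byte fs block / 512-byte sector
    echo $(( (lba - part_start) * data_disks / sectors_per_block ))
}

lba_to_fsblock 23240269256   # prints 14525167005, the block passed to testb
```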
Does anyone have an idea what the problem is?
This weekend is backup day so I will probably run a full raid 'check' or maybe
a full array raid6check after that.
TIA
--
Eyal at Home ([email protected])