Hello everyone,
I've conducted some crash tests (unplugging drives, the machine,
terminating and restarting ceph systemd services) with Ceph 12.2.0 on
Ubuntu and quite easily managed to corrupt what appears to be rocksdb's
log replay on a bluestore OSD:
# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2/
[...]
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859]
Recovered from manifest file:db/MANIFEST-000975
succeeded,manifest_file_number is 975, next_file_number is 1008,
last_sequence is 51965907, log_number is 0,prev_log_number is
0,max_column_family is 0
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867]
Column family [default] (ID 0), log number is 1005
4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job": 1,
"event": "recovery_started", "log_files": [1003, 1005]}
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #1003 mode 0
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482]
Recovering log #1005 mode 0
3 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424]
db/001005.log: dropping 3225 bytes; Corruption: missing start of
fragmented record(2)
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217] Shutdown:
canceling all background work
4 rocksdb:
[/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343] Shutdown
complete
-1 rocksdb: Corruption: missing start of fragmented record(2)
-1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening db:
1 bluefs umount
1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close
If I understand this correctly, rocksdb is just trying to replay its
WAL logs, of which "001005.log" is presumably corrupted. It then throws
an error that stops everything.
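To make the error message more concrete, here is a rough sketch of how I
understand the WAL record framing, based on RocksDB's documented log file
format: 32 KiB blocks, a 7-byte header (checksum, length, type), and
records split into FIRST/MIDDLE/LAST fragments when they cross a block
boundary. The function names are my own, the CRC is not verified, and the
newer "recyclable" record types are ignored, so this only illustrates the
check and is not a repair tool. As far as I can tell, the "(2)" variant of
the message corresponds to a LAST fragment that shows up without a
preceding FIRST.

```python
import struct

BLOCK_SIZE = 32768           # WAL files are written in 32 KiB blocks
HEADER_SIZE = 7              # crc32c (4) + payload length (2) + type (1)
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4


def make_fragment(rtype, payload):
    """Hypothetical helper: build one fragment with a dummy checksum."""
    return struct.pack("<IHB", 0, len(payload), rtype) + payload


def iter_fragments(data):
    """Yield (type, payload) for each physical fragment in the log bytes."""
    pos = 0
    while pos + HEADER_SIZE <= len(data):
        left_in_block = BLOCK_SIZE - pos % BLOCK_SIZE
        if left_in_block < HEADER_SIZE:
            pos += left_in_block          # block trailer is padding
            continue
        _crc, length, rtype = struct.unpack_from("<IHB", data, pos)
        if rtype == 0 and length == 0:
            pos += left_in_block          # zero-filled space, skip block
            continue
        start = pos + HEADER_SIZE
        yield rtype, data[start:start + length]
        pos = start + length


def read_records(data):
    """Reassemble logical records; collect (reason, bytes_dropped) errors."""
    records, errors = [], []
    pending = None                        # fragments of a record in progress
    for rtype, payload in iter_fragments(data):
        if rtype == FULL:
            records.append(payload)
            pending = None
        elif rtype == FIRST:
            pending = [payload]
        elif rtype == MIDDLE:
            if pending is None:
                errors.append(("missing start (middle)", len(payload)))
            else:
                pending.append(payload)
        elif rtype == LAST:
            if pending is None:
                errors.append(("missing start (last)", len(payload)))
            else:
                records.append(b"".join(pending + [payload]))
                pending = None
        else:
            errors.append(("unknown record type", len(payload)))
    return records, errors


if __name__ == "__main__":
    # A good record followed by an orphaned LAST fragment of 3225 bytes.
    log = make_fragment(FULL, b"ok") + make_fragment(LAST, b"x" * 3225)
    print(read_records(log))
```

In this toy model the reader could simply drop the orphaned fragment and
carry on; whether rocksdb does that or aborts seems to depend on its WAL
recovery mode, which is the "mode 0" in the log above.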
I did try to mount the bluestore, as I was assuming that would probably
be where I'd find rocksdb's files, but that doesn't seem to be possible
either:
# ceph-objectstore-tool --op fsck --data-path /var/lib/ceph/osd/ceph-2/
--mountpoint /mnt/bluestore-repair/
fsck failed: (5) Input/output error
# ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-2
--mountpoint /mnt/bluestore-repair/
Mount failed with '(5) Input/output error'
# ceph-objectstore-tool --op fuse --force --skip-journal-replay
--data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
Mount failed with '(5) Input/output error'
Adding --debug shows the ultimate culprit is just the above rocksdb
error again.
Q: Is there some way in which I can tell rocksdb to truncate, delete or
skip the respective log entries? Or can I get access to rocksdb's files
in some other way, so that I can manipulate them or delete the corrupted
WAL files manually?
-Michael