Hi Igor,

I have now fixed my wrong OSD debug config to:

    [osd.7]
        debug bluefs = 20
        debug bdev = 20
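(The same debug levels can also be raised without editing ceph.conf. The lines below are only a minimal sketch, assuming a bare-metal deployment managed by systemd and an OSD that can still reach the monitors during startup; the OSD id matches the one discussed in this thread.)

    # raise BlueFS / block-device logging for osd.7 only, then restart it
    # so the failing startup is captured at debug level 20
    ceph config set osd.7 debug_bluefs 20
    ceph config set osd.7 debug_bdev 20
    systemctl restart ceph-osd@7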
You can download the debug log from: https://we.tl/t-3e4do1PQGj

Thanks,
Sebastian


> On 21.12.2021, at 19:44, Igor Fedotov <igor.fedo...@croit.io> wrote:
>
> Hi Sebastian,
>
> first of all, I'm not sure this issue has the same root cause as Francois' one. Highly likely it's just another BlueFS/RocksDB data corruption which is indicated in the same way.
>
> In this respect I would rather mention this one, reported just yesterday:
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/M2ZRZD4725SRPFE5MMZPI7JBNO23FNU6/
>
> So, similarly, I'd like to ask some questions and collect more data. Please find the list below:
>
> 1) Is this a bare-metal or containerized deployment?
>
> 2) What's the output of "hdparm -W <dev>" for the devices in question? Is write caching enabled at the disk controller?
>
> 3) Could you please share the broken OSD's startup log with debug-bluefs set to 20?
>
> 4) Could you please export the bluefs files via ceph-bluestore-tool (this might need some extra space to keep all the bluefs data on the target filesystem) and share the content of the db/002182.sst file? The first 4M would generally be sufficient if it's huge.
>
> 5) Have you seen RocksDB data corruption on this cluster before?
>
> 6) What's the disk hardware for these OSDs - disk drives and controllers?
>
> 7) Did you reboot the nodes or just restart the OSDs? Did all the issues happen on the same node or on different nodes? How many OSDs were restarted in total?
>
> 8) Is it correct that this is an HDD-only setup, with no standalone SSD/NVMe for WAL/DB?
>
> 9) Would you be able to run some long-lasting (and potentially data-corrupting) experiments on this cluster in an attempt to pinpoint the issue? I'm thinking about periodic OSD shutdowns under load to catch the corrupting event, with a raised debug level for that specific OSD. The major problem with debugging this bug is that we can see its consequences, but we have no clue about what was happening when the actual corruption occurred. Hence we need to reproduce it somehow. So please let me know if we can use your cluster/help for that...
>
>
> Thanks in advance,
>
> Igor
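(For reference, questions 2 and 4 above can be answered with commands along the following lines. This is only a sketch: the device name and the export directory are placeholders, the OSD must be stopped before exporting, and the export needs enough free space to hold the OSD's BlueFS data.)

    # query the drive's volatile write-cache state; "-W 0" would disable it
    # (on many drives the setting does not survive a power cycle)
    hdparm -W /dev/sdX
    hdparm -W 0 /dev/sdX

    # export the BlueFS files (including db/002182.sst) from the stopped OSD
    ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-7 --out-dir /mnt/bluefs-export-osd-7

    # keep only the first 4 MiB of the suspect SST, as requested
    head -c 4M /mnt/bluefs-export-osd-7/db/002182.sst > /tmp/002182-head.sst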
> On 12/21/2021 7:47 PM, Sebastian Mazza wrote:
>> Hi all,
>>
>> after a reboot of the cluster, 3 OSDs cannot be started. The OSDs exit with the following error message:
>>
>> 2021-12-21T01:01:02.209+0100 7fd368cebf00  4 rocksdb: [db_impl/db_impl.cc:396] Shutdown: canceling all background work
>> 2021-12-21T01:01:02.209+0100 7fd368cebf00  4 rocksdb: [db_impl/db_impl.cc:573] Shutdown complete
>> 2021-12-21T01:01:02.209+0100 7fd368cebf00 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>> 2021-12-21T01:01:02.213+0100 7fd368cebf00 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db:
>> 2021-12-21T01:01:02.213+0100 7fd368cebf00  1 bluefs umount
>> 2021-12-21T01:01:02.213+0100 7fd368cebf00  1 bdev(0x559bbe0ea800 /var/lib/ceph/osd/ceph-7/block) close
>> 2021-12-21T01:01:02.293+0100 7fd368cebf00  1 bdev(0x559bbe0ea400 /var/lib/ceph/osd/ceph-7/block) close
>> 2021-12-21T01:01:02.537+0100 7fd368cebf00 -1 osd.7 0 OSD:init: unable to mount object store
>> 2021-12-21T01:01:02.537+0100 7fd368cebf00 -1 ** ERROR: osd init failed: (5) Input/output error
>>
>> I found a similar problem in this mailing list:
>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/MJLVS7UPJ5AZKOYN3K2VQW7WIOEQGC5V/#MABLFA4FHG6SX7YN4S6BGSCP6DOAX6UE
>>
>> In this thread, Francois was able to successfully repair his OSD data with `ceph-bluestore-tool fsck`. I tried to run:
>> `ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-7 -l /var/log/ceph/bluestore-tool-fsck-osd-7.log --log-level 20 > /var/log/ceph/bluestore-tool-fsck-osd-7.out 2>&1`
>> But that results in:
>>
>> 2021-12-21T16:44:18.455+0100 7fc54ef7a240 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>> 2021-12-21T16:44:18.455+0100 7fc54ef7a240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db:
>> fsck failed: (5) Input/output error
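(Side note: 9863518390377041911 is 0x88e241b785f4cff7, RocksDB's block-based table footer magic, so "found 0" suggests the last bytes of db/002182.sst read back as zeros. Assuming the file has been exported with bluefs-export as sketched above, this can be checked directly; the export path below is just the placeholder used earlier.)

    # the footer magic lives in the last 8 bytes of an SST file (stored little-endian);
    # all zeros there would match the "found 0" in the error above
    tail -c 64 /mnt/bluefs-export-osd-7/db/002182.sst | xxd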
>> I also tried to run `ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-7 repair`. But that also fails with:
>>
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  0 bluestore(/var/lib/ceph/osd/ceph-7) _open_db_and_around read-only:0 repair:0
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) open path /var/lib/ceph/osd/ceph-7/block
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) open size 12000134430720 (0xae9ffc00000, 11 TiB) block_size 4096 (4 KiB) rotational discard not supported
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluestore(/var/lib/ceph/osd/ceph-7) _set_cache_sizes cache_size 1073741824 meta 0.45 kv 0.45 data 0.06
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) open path /var/lib/ceph/osd/ceph-7/block
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) open size 12000134430720 (0xae9ffc00000, 11 TiB) block_size 4096 (4 KiB) rotational discard not supported
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-7/block size 11 TiB
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs mount
>> 2021-12-21T17:34:06.780+0100 7f35765f7240  1 bluefs _init_alloc shared, id 1, capacity 0xae9ffc00000, block size 0x10000
>> 2021-12-21T17:34:06.904+0100 7f35765f7240  1 bluefs mount shared_bdev_used = 0
>> 2021-12-21T17:34:06.904+0100 7f35765f7240  1 bluestore(/var/lib/ceph/osd/ceph-7) _prepare_db_environment set db_paths to db,11400127709184 db.slow,11400127709184
>> 2021-12-21T17:34:06.908+0100 7f35765f7240 -1 rocksdb: Corruption: Bad table magic number: expected 9863518390377041911, found 0 in db/002182.sst
>> 2021-12-21T17:34:06.908+0100 7f35765f7240 -1 bluestore(/var/lib/ceph/osd/ceph-7) _open_db erroring opening db:
>> 2021-12-21T17:34:06.908+0100 7f35765f7240  1 bluefs umount
>> 2021-12-21T17:34:06.908+0100 7f35765f7240  1 bdev(0x55fce5a1ac00 /var/lib/ceph/osd/ceph-7/block) close
>> 2021-12-21T17:34:07.072+0100 7f35765f7240  1 bdev(0x55fce5a1a800 /var/lib/ceph/osd/ceph-7/block) close
>>
>> The cluster is not in production, so I can remove all corrupt pools and delete the OSDs. However, I would like to understand what went wrong in order to avoid such a situation in the future.
>>
>> I will provide the OSD logs from the time around the server reboot at the following link: https://we.tl/t-fArHXTmSM7
>>
>> Ceph version: 16.2.6
>>
>> Thanks,
>> Sebastian
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io