Hi Frank,

I'm glad this thread is more about understanding what's going on, as opposed to a quick fix. Normally, people in this situation should just zap and redeploy, like you said.
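One quick check while everything else is healthy: whether the mons have already trimmed the epochs this OSD keeps asking for. Something along these lines should show the range of osdmaps the mons still keep (IIRC the fields in the ceph report output are called osdmap_first_committed and osdmap_last_committed -- double-check the names on your release; jq is only for readability):

ceph report 2>/dev/null | jq '.osdmap_first_committed, .osdmap_last_committed'

If 2898132 and friends are far below osdmap_first_committed, those epochs are long gone cluster-wide, which would fit the "got 0 bytes" errors.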
The next thing I'd do is fuse-mount the OSD and see which osdmaps it has -- try to read them, etc. Does the OSD have the osdmap epoch it is warning about in the logs?

ceph-objectstore-tool --data-path </path/to/osd> --op fuse --mountpoint /mnt

Inside /mnt you'll see the PGs and a meta folder, IIRC. Inside meta you will find the osdmaps.

Cheers, dan

--
Dan van der Ster
CTO @ CLYSO
https://clyso.com | dan.vanders...@clyso.com

On Tue, Oct 22, 2024 at 12:14 AM Frank Schilder <fr...@dtu.dk> wrote:
>
> Hi Dan,
>
> I don't remember exactly when I took them down. Maybe a month ago? The reason was a fatal IO error due to connectivity issues. It's 2 SSDs with 4 OSDs each, and they were installed in JBODs. We have 12 of those and, unfortunately, it seems that 3 have issues with SSDs even though they have dedicated slots for SSD/NVMe drives. The hallmark is SAS messages containing "blk_update" (usually blk_update_request with a drive reset). Any attempts to fix that (reseating, moving, etc.) failed. We had like 40-50 messages per disk per hour in the log; it didn't impact performance though. So it was a nuisance on the lower-priority list.
>
> Some time ago, maybe a month, in one of these JBODs the connection finally gave up, for both SSDs at the same time. I'm pretty sure it was a bus error and the disks are fine. I stopped these OSDs and the data is recovered, no problems here. The disks are part of our FS metadata pool and we have plenty, so there was no rush. They store a ton of objects though, and I try to avoid a full re-write of everything (life-time fetishist).
>
> Yesterday, we started a major disk replacement operation after evacuating about 40 HDDs. As part of this, we are moving all SSDs from the JBODs to the servers, and I tried to get these two disks up yesterday with the result reported below. We are not done yet with the maintenance operation and I can pull logs after we are done, possibly next week. We are not in a rush to get these disks back up and I'm also prepared to just zap and redeploy them.
>
> My interest in this case is along the lines of "I would like to know what is going on" and "is there a better way than zap+redeploy". Others might be in a situation where they don't have the luxury of all data being healthy, and here we have a chance to experiment without any risk.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan.vanders...@clyso.com>
> Sent: Tuesday, October 22, 2024 12:04 AM
> To: Frank Schilder
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] failed to load OSD map for epoch 2898146, got 0 bytes
>
> Hi Frank,
>
> Do you have some more info about these OSDs -- how long were they down for? Were they down because of some IO errors?
>
> Is it possible that the OSD thinks it stored those osdmaps but IO errors are preventing them from being loaded?
>
> I know the log is large, but can you share at least a snippet of when this starts? Preferably with debug_osd = 10.
>
> Thanks, Dan
>
> --
> Dan van der Ster
> CTO @ CLYSO
> https://clyso.com | dan.vanders...@clyso.com
>
> On Mon, Oct 21, 2024 at 1:32 PM Frank Schilder <fr...@dtu.dk> wrote:
> >
> > Hi Dan,
> >
> > Maybe not.
> > Looking at the output of
> >
> > grep -B 1 -e "2971464 failed to load OSD map for epoch 2898132" /var/log/ceph/ceph-osd.1004.log
> >
> > which searches for lines that start a cycle and also prints the line before, there might be some progress, but I'm not sure:
> >
> > 2024-10-21T17:41:40.173+0200 7fad509a1700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898208, got 0 bytes
> > 2024-10-21T17:41:40.173+0200 7fad4a194700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:40.610+0200 7fad519a3700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898678, got 0 bytes
> > 2024-10-21T17:41:40.610+0200 7fad4e19c700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:41.340+0200 7fad4c198700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898293, got 0 bytes
> > 2024-10-21T17:41:41.340+0200 7fad4f19e700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:41.347+0200 7fad4e99d700 -1 osd.1004 2971464 failed to load OSD map for epoch 2899238, got 0 bytes
> > 2024-10-21T17:41:41.347+0200 7fad4c999700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898132, got 0 bytes
> >
> > The loop seems to run longer and longer. The problem is, though, that by the time it got here it had already written something like 1 GB of log. The loop seems to repeat over all epochs for each such iteration, so the log output is quadratic in the number of epochs to catch up with. I still have like 100000 to go and I doubt I have disks large enough to collect the resulting logs. The logging is probably also a total performance killer.
> >
> > Is it possible to suppress the massive log spam so that I can let it run until it is marked up? These messages seem not to be related to a log level. If absolutely necessary, I could start the OSD manually with logging to disk disabled.
> >
> > Thanks and best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Dan van der Ster <dan.vanders...@clyso.com>
> > Sent: Monday, October 21, 2024 9:03 PM
> > To: Frank Schilder
> > Cc: ceph-users@ceph.io
> > Subject: Re: [ceph-users] failed to load OSD map for epoch 2898146, got 0 bytes
> >
> > Hi Frank,
> >
> > Are you sure it's looping over the same epochs? It looks like that old osd is trying to catch up on all the osdmaps it missed while it was down. (And those old maps are probably trimmed from all the mons and osds, based on the "got 0 bytes" error.) Eventually it should catch up to the current epoch (e 2971464 according to your log), and then the PGs can go active.
> >
> > Cheers, Dan
> >
> > --
> > Dan van der Ster
> > CTO @ CLYSO
> > https://clyso.com | dan.vanders...@clyso.com
> >
> > On Mon, Oct 21, 2024 at 9:13 AM Frank Schilder <fr...@dtu.dk> wrote:
> > >
> > > Hi all,
> > >
> > > I have a strange problem on an Octopus (latest) cluster. We had a couple of SSD OSDs down for a while and brought them up again today. For some reason, these OSDs don't come up and flood the log with messages like
> > >
> > > osd.1004 2971464 failed to load OSD map for epoch 2898146, got 0 bytes
> > >
> > > These messages cycle through the same epochs over and over again.
I did > > > not really fine too much help, there is an old thread about a similar/the > > > same problem on a home lab cluster, with new OSDs though, I believe. I > > > couldn't really find useful information. The OSDs seem to boot fine and > > > then end up in something like a death loop. Below some snippets from the > > > OSD log. > > > > > > Any hints appreciated. > > > Thanks and best regards, > > > Frank > > > > > > After OSD start, everything looks normal up to here: > > > > > > 2024-10-21T17:41:39.136+0200 7fad73cf6f00 0 osd.1004 2971464 load_pgs > > > opened 205 pgs > > > 2024-10-21T17:41:39.140+0200 7fad73cf6f00 -1 osd.1004 2971464 > > > log_to_monitors {default=true} > > > 2024-10-21T17:41:39.150+0200 7fad73cf6f00 -1 osd.1004 2971464 > > > mon_cmd_maybe_osd_create fail: 'osd.1004 has already bound to class > > > 'fs_meta', can not reset class to 'ssd'; use 'ceph osd crush > > > rm-device-class <id>' to remove old class first': (16) Device or > > > resource busy > > > 2024-10-21T17:41:39.155+0200 7fad519a3700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898132, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898132, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898133, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898134, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898135, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898136, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4f99f700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898132, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4f99f700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898133, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898132, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898133, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898134, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898135, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898136, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898137, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad73cf6f00 0 osd.1004 2971464 done with > > > init, starting boot process > > > 2024-10-21T17:41:39.155+0200 7fad73cf6f00 1 osd.1004 2971464 start_boot > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898138, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898139, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898140, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to > > > load OSD map for epoch 2898141, got 0 bytes > > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed 
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898143, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898144, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898145, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to load OSD map for epoch 2898146, got 0 bytes
> > >
> > > These messages repeat over and over again with some others of this form showing up every now and then:
> > >
> > > 2024-10-21T17:41:39.476+0200 7fad651ca700 4 rocksdb: [db/compaction_job.cc:1332] [default] [JOB 12] Generated table #82879: 76571 keys, 67866714 bytes
> > > 2024-10-21T17:41:39.688+0200 7fad651ca700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1729525299690000, "cf_name": "default", "job": 12, "event": "table_file_creation", "file_number": 82879, "file_size": 67866714, "table_properties": {"data_size": 67111697, "index_size": 562601, "filter_size": 191557, "raw_key_size": 4823973, "raw_average_key_size": 63, "raw_value_size": 62631087, "raw_average_value_size": 817, "num_data_blocks": 15644, "num_entries": 76571, "filter_policy_name": "rocksdb.BuiltinBloomFilter"}}
> > >
> > > And another occasion:
> > >
> > > 2024-10-21T17:41:40.520+0200 7fad651ca700 4 rocksdb: [db/compaction_job.cc:1332] [default] [JOB 12] Generated table #82880: 76774 keys, 67868330 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to load OSD map for epoch 2899234, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to load OSD map for epoch 2899235, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to load OSD map for epoch 2899236, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad651ca700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1729525300521403, "cf_name": "default", "job": 12, "event": "table_file_creation", "file_number": 82880, "file_size": 67868330, "table_properties": {"data_size": 67113021, "index_size": 562509, "filter_size": 191941, "raw_key_size": 4836742, "raw_average_key_size": 62, "raw_value_size": 62623274, "raw_average_value_size": 815, "num_data_blocks": 15630, "num_entries": 76774, "filter_policy_name": "rocksdb.BuiltinBloomFilter"}}
> > >
> > > =================
> > > Frank Schilder
> > > AIT Risø Campus
> > > Bygning 109, rum S14

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io