Hi Frank,

I'm glad this thread is more about understanding what's going on, as
opposed to a quick fix. Normally people in this situation should just
zap and redeploy, like you said.

The next thing I'd do is fuse-mount the OSD and see which osdmaps it
has -- try to read them, etc. Does the OSD have the osdmap epoch it is
warning about in the logs?

ceph-objectstore-tool --data-path </path/to/osd> --op fuse --mountpoint /mnt

Inside /mnt you'll see the PGs and a meta folder, IIRC. Inside meta
you will find the osdmaps.
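
If you want to poke at individual maps, something along these lines
(commands from memory; the epoch is just the one from your log, and the
data path is a placeholder):

  # with the OSD stopped and the fuse mount from above in place:
  find /mnt/meta -iname '*osdmap*' | head

  # or skip the fuse mount and pull a single map straight out of the store:
  ceph-objectstore-tool --data-path </path/to/osd> \
      --op get-osdmap --epoch 2898146 --file /tmp/osdmap.2898146
  osdmaptool --print /tmp/osdmap.2898146

If get-osdmap returns an empty or unreadable file for the epochs in the
error messages, that would match the "got 0 bytes" symptom.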

Cheers, Dan

--
Dan van der Ster
CTO @ CLYSO
https://clyso.com | dan.vanders...@clyso.com


On Tue, Oct 22, 2024 at 12:14 AM Frank Schilder <fr...@dtu.dk> wrote:
>
> Hi Dan,
>
> I don't remember exactly when I took them down. Maybe a month ago? The reason
> was a fatal IO error due to connectivity issues. It's 2 SSDs with 4 OSDs each,
> and they were installed in JBODs. We have 12 of those and, unfortunately, it
> seems that 3 have issues with SSDs even though they have dedicated slots for
> SSD/NVMe drives. The hallmark is SAS messages containing "blk_update"
> (usually blk_update_request with a drive reset). All attempts to fix that
> (reseating, moving, etc.) failed. We saw something like 40-50 messages per
> disk per hour in the log, though they didn't impact performance. So it was a
> nuisance on the lower-priority list.
>
> Some time ago, maybe a month, the connection in one of these JBODs finally
> gave up, for both SSDs at the same time. I'm pretty sure it was a bus error
> and the disks are fine. I stopped these OSDs and the data has recovered, no
> problems here. The disks are part of our FS metadata pool and we have
> plenty, so there was no rush. They store a ton of objects, though, and I would
> like to avoid a full re-write of everything (lifetime fetishist).
>
> Yesterday, we started a major disk replacement operation after evacuating
> about 40 HDDs. As part of this, we are moving all SSDs from the JBODs to the
> servers, and I tried to get these two disks up yesterday with the result
> reported below. We are not done with the maintenance operation yet, and I can
> pull logs after we are done, possibly next week. We are not in a rush to get
> these disks back up, and I'm also prepared to just zap and redeploy them.
>
> My interest in this case is along the lines of "I would like to know what is
> going on" and "is there a better way than zap+redeploy?". Others might be in a
> situation where they don't have the luxury of all data being healthy, and here
> we have a chance to experiment without any risk.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <dan.vanders...@clyso.com>
> Sent: Tuesday, October 22, 2024 12:04 AM
> To: Frank Schilder
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] failed to load OSD map for epoch 2898146, got 0 
> bytes
>
> Hi Frank,
>
> Do you have some more info about these OSDs -- how long were they down
> for? Were they down because of some IO errors?
>
> Is it possible that the OSD thinks it stored those osdmaps but IO
> errors are preventing them from being loaded?
>
> I know the log is large, but can you share at least a snippet of when
> this starts? Preferably with debug_osd = 10.
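>
> For example, something along these lines should do it (the config-set
> variant persists across restarts):
>
>   ceph config set osd.1004 debug_osd 10
>
> or add --debug_osd=10 to the ceph-osd command line if you start the OSD
> by hand.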
>
> Thanks, Dan
>
> --
> Dan van der Ster
> CTO @ CLYSO
> https://clyso.com | dan.vanders...@clyso.com
>
> On Mon, Oct 21, 2024 at 1:32 PM Frank Schilder <fr...@dtu.dk> wrote:
> >
> > Hi Dan,
> >
> > Maybe not. Looking at the output of
> >
> > grep -B 1 -e "2971464 failed to load OSD map for epoch 2898132" \
> >     /var/log/ceph/ceph-osd.1004.log
> >
> > which searches for lines that start a cycle and also prints the line before
> > each match, there might be some progress, but I'm not sure:
> >
> > 2024-10-21T17:41:40.173+0200 7fad509a1700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898208, got 0 bytes
> > 2024-10-21T17:41:40.173+0200 7fad4a194700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:40.610+0200 7fad519a3700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898678, got 0 bytes
> > 2024-10-21T17:41:40.610+0200 7fad4e19c700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:41.340+0200 7fad4c198700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898293, got 0 bytes
> > 2024-10-21T17:41:41.340+0200 7fad4f19e700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898132, got 0 bytes
> > --
> > 2024-10-21T17:41:41.347+0200 7fad4e99d700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2899238, got 0 bytes
> > 2024-10-21T17:41:41.347+0200 7fad4c999700 -1 osd.1004 2971464 failed to 
> > load OSD map for epoch 2898132, got 0 bytes
> >
> > The loop seems to run longer and longer. The problem, though, is that by the
> > time it got here it had already written about 1 GB of log. The loop seems to
> > repeat over all epochs on each iteration, so the log output is quadratic in
> > the number of epochs to catch up on. I still have about 100,000 epochs to go
> > and I doubt I have disks large enough to collect the resulting logs. The
> > logging is probably also a total performance killer.
> >
> > Is it possible to suppress the massive log spam so that I can let it run
> > until it is marked up? These messages don't seem to be tied to a debug
> > level. If absolutely necessary, I could start the OSD manually with logging
> > to disk disabled.
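> >
> > What I have in mind is something like this (untested sketch; I'm assuming
> > the log_to_file option behaves as expected on Octopus, otherwise pointing
> > log_file at /dev/null would be the fallback):
> >
> >   ceph config set osd.1004 log_to_file false   # then restart the OSD
> >   # or: ceph config set osd.1004 log_file /dev/null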
> >
> > Thanks and best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Dan van der Ster <dan.vanders...@clyso.com>
> > Sent: Monday, October 21, 2024 9:03 PM
> > To: Frank Schilder
> > Cc: ceph-users@ceph.io
> > Subject: Re: [ceph-users] failed to load OSD map for epoch 2898146, got 0 
> > bytes
> >
> > Hi Frank,
> >
> > Are you sure it's looping over the same epochs?
> > It looks like that old OSD is trying to catch up on all the osdmaps it
> > missed while it was down. (And those old maps have probably been trimmed
> > from all the mons and OSDs, based on the "got 0 bytes" error.)
> > Eventually it should catch up to the current epoch (2971464 according to
> > your log), and then the PGs can go active.
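> >
> > You can watch the catch-up from the OSD host via the admin socket,
> > something like this (field names from memory):
> >
> >   ceph daemon osd.1004 status    # oldest_map / newest_map for this OSD
> >   ceph osd dump | head -1        # current cluster epoch
> >
> > newest_map should keep climbing towards the cluster epoch.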
> >
> > Cheers, Dan
> >
> > --
> > Dan van der Ster
> > CTO @ CLYSO
> > https://clyso.com | dan.vanders...@clyso.com
> >
> >
> >
> >
> > On Mon, Oct 21, 2024 at 9:13 AM Frank Schilder <fr...@dtu.dk> wrote:
> > >
> > > Hi all,
> > >
> > > I have a strange problem on a latest-Octopus cluster. We had a couple
> > > of SSD OSDs down for a while and brought them back up today. For some
> > > reason, these OSDs don't come up and flood the log with messages like
> > >
> > > osd.1004 2971464 failed to load OSD map for epoch 2898146, got 0 bytes
> > >
> > > These messages cycle through the same epochs over and over again. I did
> > > not really find much help; there is an old thread about a similar (or the
> > > same) problem on a home-lab cluster, though with new OSDs, I believe. I
> > > couldn't really find useful information there. The OSDs seem to boot fine
> > > and then end up in something like a death loop. Below are some snippets
> > > from the OSD log.
> > >
> > > Any hints appreciated.
> > > Thanks and best regards,
> > > Frank
> > >
> > > After OSD start, everything looks normal up to here:
> > >
> > > 2024-10-21T17:41:39.136+0200 7fad73cf6f00  0 osd.1004 2971464 load_pgs 
> > > opened 205 pgs
> > > 2024-10-21T17:41:39.140+0200 7fad73cf6f00 -1 osd.1004 2971464 
> > > log_to_monitors {default=true}
> > > 2024-10-21T17:41:39.150+0200 7fad73cf6f00 -1 osd.1004 2971464 
> > > mon_cmd_maybe_osd_create fail: 'osd.1004 has already bound to class 
> > > 'fs_meta', can not reset class to 'ssd'; use 'ceph osd crush
> > >  rm-device-class <id>' to remove old class first': (16) Device or 
> > > resource busy
> > > 2024-10-21T17:41:39.155+0200 7fad519a3700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898132, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898132, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898133, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898134, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898135, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad511a2700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898136, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4f99f700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898132, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4f99f700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898133, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898132, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898133, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898134, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898135, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898136, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898137, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad73cf6f00  0 osd.1004 2971464 done with 
> > > init, starting boot process
> > > 2024-10-21T17:41:39.155+0200 7fad73cf6f00  1 osd.1004 2971464 start_boot
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898138, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898139, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898140, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898141, got 0 bytes
> > > 2024-10-21T17:41:39.155+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898142, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898143, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898144, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898145, got 0 bytes
> > > 2024-10-21T17:41:39.156+0200 7fad4b196700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2898146, got 0 bytes
> > >
> > >
> > > These messages repeat over and over again with some others of this form 
> > > showing up every now and then:
> > >
> > > 2024-10-21T17:41:39.476+0200 7fad651ca700  4 rocksdb: 
> > > [db/compaction_job.cc:1332] [default] [JOB 12] Generated table #82879: 
> > > 76571 keys, 67866714 bytes
> > > 2024-10-21T17:41:39.688+0200 7fad651ca700  4 rocksdb: EVENT_LOG_v1 
> > > {"time_micros": 1729525299690000, "cf_name": "default", "job": 12, 
> > > "event": "table_file_creation", "file_number": 82879, "file_size": 
> > > 67866714, "table_properties": {"data_size": 67111697, "index_size": 
> > > 562601, "filter_size": 191557, "raw_key_size": 4823973, 
> > > "raw_average_key_size": 63, "raw_value_size": 62631087, 
> > > "raw_average_value_size": 817, "num_data_blocks": 15644, "num_entries": 
> > > 76571, "filter_policy_name": "rocksdb.BuiltinBloomFilter"}}
> > >
> > >
> > > And on another occasion:
> > >
> > > 2024-10-21T17:41:40.520+0200 7fad651ca700  4 rocksdb: 
> > > [db/compaction_job.cc:1332] [default] [JOB 12] Generated table #82880: 
> > > 76774 keys, 67868330 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2899234, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2899235, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad501a0700 -1 osd.1004 2971464 failed to 
> > > load OSD map for epoch 2899236, got 0 bytes
> > > 2024-10-21T17:41:40.520+0200 7fad651ca700  4 rocksdb: EVENT_LOG_v1 
> > > {"time_micros": 1729525300521403, "cf_name": "default", "job": 12, 
> > > "event": "table_file_creation", "file_number": 82880, "file_size": 
> > > 67868330, "table_properties": {"data_size": 67113021, "index_size": 
> > > 562509, "filter_size": 191941, "raw_key_size": 4836742, 
> > > "raw_average_key_size": 62, "raw_value_size": 62623274, 
> > > "raw_average_value_size": 815, "num_data_blocks": 15630, "num_entries": 
> > > 76774, "filter_policy_name": "rocksdb.BuiltinBloomFilter"}}
> > >
> > > =================
> > > Frank Schilder
> > > AIT Risø Campus
> > > Bygning 109, rum S14
> > > _______________________________________________
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
