IIRC you can just wait and it will start and resolve by itself. Perhaps do a
systemctl stop once it is failed, and then a start, after you see a message like
"couldnt start osd" in the OSD log.
HTH
Mehmet
On 13 December 2024 at 12:03:20 CET, Frank Schilder wrote:
>Hi all,
>
>we had to bring the OSDs ba
Hi all,
we had to bring the OSDs back up prior to an upgrade from octopus to pacific.
Unfortunately, instructions we found for an off-line update of the osdmap did
not work. We first tried a command like
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-556/ --op set-osdmap --file osd
Best regards,
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: Friday, November 8, 2024 6:13 PM
To: Dan van der Ster
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes
Hi Dan,
I have collected a 134M log file (11M compressed) of the startup with
debug_osd=20/20. Do you have access to the upload area of the ceph-devs (the
ceph-post-file destination)? If not, any preferred way I can send it to you?
To execute the ceph-objectstore-tool mount command it looks lik
From: Dan van der Ster
Sent: Monday, October 21, 2024 3:03:41 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes
Hi Frank,
Are you sure it's looping over the same epochs?
It looks like th
Hi Frank,
I'm glad this thread is more about understanding what's going on, as
opposed to a quick fix. Normally people in this situation should just
zap and redeploy, like you said.
The next thing I'd do is fuse-mount the OSD and see which osdmaps it
has -- try to read them, etc. Does the OSD hav
Hi Dan,
I don't remember exactly when I took them down. Maybe a month ago? The reason
was a fatal IO error due to connectivity issues. It's 2 SSDs with 4 OSDs each
and they were installed in JBODs. We have 12 of those and, unfortunately, it
seems that 3 have issues with SSDs even though they hav
Hi Frank,
Do you have some more info about these OSDs -- how long were they down
for? Were they down because of some IO errors?
Is it possible that the OSD thinks it stored those osdmaps but IO
errors are preventing them from being loaded?
I know the log is large, but can you share at least a sn
Hi Dan,
Maybe not. Looking at the output of
grep -B 1 -e "2971464 failed to load OSD map for epoch 2898132" /var/log/ceph/ceph-osd.1004.log
which searches for lines that start a cycle and also prints the line before,
there might be some progress, but I'm not sure:
2024-10-21T17:41:40.173+0200 7
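A side note on the question of whether the OSD keeps looping over the same epochs: one way to check could be to count how often each epoch shows up in the failure messages. The following is only a minimal sketch. It assumes the log lines contain the literal text "failed to load OSD map for epoch <number>" as in the subject of this thread, and count_failed_epochs is a hypothetical helper name, not part of any Ceph tooling.

```shell
#!/bin/sh
# Sketch: count how often each epoch appears in
# "failed to load OSD map for epoch <number>" messages of an OSD log.
# Assumption: the message text matches the wording in the thread subject.
# count_failed_epochs is a hypothetical helper, not a Ceph command.
count_failed_epochs() {
    # $1: path to an OSD log file
    # grep -o prints only the matching part of each line,
    # awk keeps the last field (the epoch number),
    # sort -n | uniq -c gives one "count epoch" pair per distinct epoch.
    grep -o 'failed to load OSD map for epoch [0-9][0-9]*' "$1" |
        awk '{ print $NF }' |
        sort -n |
        uniq -c
}
```

It could be called as count_failed_epochs /var/log/ceph/ceph-osd.1004.log. If every epoch is listed with a count of 1, the OSD is probably working through the missing maps one by one. Counts that keep growing for the same epochs would rather point to a loop.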
Hi Frank,
Are you sure it's looping over the same epochs?
It looks like that old osd is trying to catch up on all the osdmaps it
missed while it was down. (And those old maps are probably trimmed
from all the mons and osds, based on the "got 0 bytes" error).
Eventually it should catch up to the cu