[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-12-15 Thread ceph
If I recall correctly, you can just wait and it will start and resolve by itself. Perhaps do a systemctl stop, wait until it is failed, and then start it after you have a message like "couldnt start osd" in the osd log. Hth Mehmet On 13 December 2024 12:03:20 CET, Frank Schilder wrote: >Hi all, > >we had to bring the OSDs ba

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-12-13 Thread Frank Schilder
Hi all, we had to bring the OSDs back up prior to an upgrade from octopus to pacific. Unfortunately, instructions we found for an off-line update of the osdmap did not work. We first tried a command like ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-556/ --op set-osdmap --file osd

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-12-02 Thread Eugen Block
gards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: Friday, November 8, 2024 6:13 PM To: Dan van der Ster Cc: ceph-users@ceph.io Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes Hi Da

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-12-01 Thread Frank Schilder
ilder Sent: Friday, November 8, 2024 6:13 PM To: Dan van der Ster Cc: ceph-users@ceph.io Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes Hi Dan, I have collected a 134M log file (11M compressed) of the startup with debug_osd=20/20. Do you have access to the upload

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-11-08 Thread Frank Schilder
Hi Dan, I have collected a 134M log file (11M compressed) of the startup with debug_osd=20/20. Do you have access to the upload area of the ceph-devs (the ceph-post-file destination)? If not, any preferred way I can send it to you? To execute the ceph-objectstore-tool mount command it looks lik

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-24 Thread Alex Walender
_ From: Dan van der Ster Sent: Monday, October 21, 2024 3:03:41 PM To: Frank Schilder Cc: ceph-users@ceph.io Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes Hi Frank, Are you sure it's looping over the same epochs? It looks like th

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-22 Thread Dan van der Ster
Hi Frank, I'm glad this thread is more about understanding what's going on, as opposed to a quick fix. Normally people in this situation should just zap and redeploy, like you said. The next thing I'd do is fuse-mount the OSD and see which osdmaps it has -- try to read them, etc. Does the OSD hav

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-22 Thread Frank Schilder
Hi Dan, I don't remember exactly when I took them down. Maybe a month ago? The reason was a fatal IO error due to connectivity issues. It's 2 SSDs with 4 OSDs each and they were installed in JBODs. We have 12 of those and, unfortunately, it seems that 3 have issues with SSDs even though they hav

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-21 Thread Dan van der Ster
Hi Frank, Do you have some more info about these OSDs -- how long were they down for? Were they down because of some IO errors? Is it possible that the OSD thinks it stored those osdmaps but IO errors are preventing them from being loaded? I know the log is large, but can you share at least a sn

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-21 Thread Frank Schilder
Hi Dan, maybe not. Looking at the output of grep -B 1 -e "2971464 failed to load OSD map for epoch 2898132" /var/log/ceph/ceph-osd.1004.log that searches for lines that start a cycle and also print the line before, there might be some progress, but I'm not sure: 2024-10-21T17:41:40.173+0200 7
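
For readers who want to repeat this kind of search from a script, here is a minimal Python sketch of what the grep -B 1 call above does: it pairs every matching line with the line just before it. The function names are made up for illustration, the match is a plain substring test rather than a regular expression, and unlike grep it emits one pair per match even when matches are adjacent.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def matches_with_previous(
    lines: Iterable[str], needle: str
) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (previous_line, matching_line) for each line containing needle.

    previous_line is None when the very first line matches.
    """
    previous: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\n")
        if needle in line:
            yield previous, line
        previous = line


def scan_log(path: str, needle: str) -> List[Tuple[Optional[str], str]]:
    """Hypothetical helper: run the search over a log file on disk."""
    with open(path, "r", errors="replace") as handle:
        return list(matches_with_previous(handle, needle))
```

It could be called as scan_log("/var/log/ceph/ceph-osd.1004.log", "2971464 failed to load OSD map for epoch 2898132"), using the path and search string from the message above.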

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-21 Thread Vladimir Sigunov
r 21, 2024 3:03:41 PM To: Frank Schilder Cc: ceph-users@ceph.io Subject: [ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes Hi Frank, Are you sure it's looping over the same epochs? It looks like that old osd is trying to catch up on all the osdmaps it missed while it was

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-10-21 Thread Dan van der Ster
Hi Frank, Are you sure it's looping over the same epochs? It looks like that old osd is trying to catch up on all the osdmaps it missed while it was down. (And those old maps are probably trimmed from all the mons and osds, based on the "got 0 bytes" error). Eventually it should catch up to the cu
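
One rough way to answer the "is it looping over the same epochs?" question from the log itself is to pull out every epoch named in a "failed to load OSD map for epoch" message and check whether any epoch shows up more than once. The sketch below assumes only that wording (taken from the subject line); the function names and the summary fields are invented for illustration and say nothing about how the OSD works internally.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Wording taken from the thread subject; the rest of the log line is ignored.
EPOCH_RE = re.compile(r"failed to load OSD map for epoch (\d+)")


def failed_epochs(lines: Iterable[str]) -> List[int]:
    """Return the epochs from all 'failed to load OSD map' lines, in log order."""
    epochs: List[int] = []
    for line in lines:
        match = EPOCH_RE.search(line)
        if match:
            epochs.append(int(match.group(1)))
    return epochs


def summarize(epochs: List[int]) -> Dict[str, Optional[int]]:
    """Summarize the epoch sequence to help tell a loop from steady progress."""
    counts = Counter(epochs)
    return {
        "first": epochs[0] if epochs else None,
        "last": epochs[-1] if epochs else None,
        "lowest": min(epochs) if epochs else None,
        "highest": max(epochs) if epochs else None,
        "distinct": len(counts),
        # A value above 1 means at least one epoch was requested again.
        "max_repeats": max(counts.values(), default=0),
    }
```

If max_repeats stays at 1 and the epochs keep rising, the OSD is probably working through the maps it missed. Repeated epochs suggest it is cycling over the same range.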