Good morning,

If the osdmap doesn't exist on that OSD yet (which obviously it doesn't), you can use the --force flag to create it. I played around a bit on a test cluster to try to reproduce this: I stopped one OSD at epoch 20990, waited until the epoch had increased, and tried the same:

[ceph: root@nautilus /]# ceph osd getmap > osdmap-20993.bin
got osdmap epoch 20993
[ceph: root@nautilus /]# ceph-objectstore-tool --op set-osdmap --data-path /var/lib/ceph/osd/ceph-0/ --file osdmap-20993.bin
osdmap (#-1:9885c172:::osdmap.20993:0#) does not exist.


So I repeated it with the force flag:

[ceph: root@nautilus /]# ceph-objectstore-tool --op set-osdmap --data-path /var/lib/ceph/osd/ceph-0/ --file osdmap-20993.bin --force
osdmap (#-1:9885c172:::osdmap.20993:0#) does not exist.
Creating a new epoch.


The OSD then started successfully. But nothing was broken in this cluster, so even though that worked for me, there's no guarantee the force flag will work for you, too. There might be a different underlying issue; maybe you're barking up the wrong tree with regard to osdmaps. But I guess it's worth a shot on one OSD.
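If you do try it, a careful sequence on a single OSD could look something like this (just a sketch; osd.11 and the systemd unit name are assumptions based on your earlier mail, in a cephadm deployment you'd stop/start the daemon via 'ceph orch daemon stop/start osd.11' instead):

ceph osd set noout                         # avoid rebalancing while the OSD is down
ceph osd getmap > /tmp/osdmap-current.bin  # fetch the current map from the MONs
systemctl stop ceph-osd@11                 # the OSD must be stopped before using the tool
ceph-objectstore-tool --op set-osdmap \
    --data-path /var/lib/ceph/osd/ceph-11/ \
    --file /tmp/osdmap-current.bin --force
systemctl start ceph-osd@11                # then watch the OSD log for errors
ceph osd unset noout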

Did I interpret correctly that the most affected pool is the EC pool with 11 chunks and min_size 9? According to one of those logs, CRUSH can't find 4 of those chunks, which means rebuilding the OSDs wouldn't work and would most likely result in data loss. So trying to bring back at least enough of those down OSDs would be my approach as well.
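To double-check that, the EC profile and the affected PGs can be inspected like this (pool name, profile name and PG id are placeholders, take the real ones from your 'ceph health detail' output):

ceph osd pool get <pool> erasure_code_profile   # which EC profile the pool uses
ceph osd erasure-code-profile get <profile>     # shows k and m of that profile
ceph osd pool get <pool> min_size
ceph health detail | grep -E 'down|incomplete'  # lists the affected PGs
ceph pg <pgid> query                            # "up"/"acting" show which shards CRUSH can still map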

And is there a history that led to more than 20 down OSDs? That could be helpful to understand what happened. Was it really just the (successful) upgrade?
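If you haven't already, the crash history and the OSD logs around the upgrade window might tell you more (just the usual suspects, nothing specific to your setup):

ceph crash ls                    # crashes recorded by the crash module, with timestamps
ceph crash info <crash_id>       # backtrace of a specific crash
journalctl -u ceph-osd@11 --since "<upgrade time>"   # log of one OSD since a given time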

Regards,
Eugen


Quoting Huseyin Cotuk <[email protected]>:

Hi again,

By the way, I ran into a similar problem a few years ago, so some time ago I set the following config parameter to 5000.

osd_map_cache_size
<https://docs.ceph.com/en/reef/rados/configuration/osd-config-ref/#confval-osd_map_cache_size>
The number of OSD maps to keep cached.
type: int
default: 50
I could not find any other related config parameters.
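For reference, setting and verifying it at runtime goes through the usual config interface (the value is just the one mentioned above):

ceph config set osd osd_map_cache_size 5000
ceph config get osd osd_map_cache_size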

BR,
Huseyin Cotuk
[email protected]




On 24 Oct 2025, at 21:14, Huseyin Cotuk <[email protected]> wrote:

Hi Eugen,

I have already tried the method below, which I stated before. I got the current osdmap and tried to set it via ceph-objectstore-tool, but the command failed with this error:

ceph osd getmap 72555 > /tmp/osd_map_72555
CEPH_ARGS="--bluestore-ignore-data-csum" ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-11/ --op set-osdmap --file /tmp/osd_map_72555

osdmap (#-1:9c8e9ef2:::osdmap.72555:0#) does not exist.

https://www.mail-archive.com/[email protected]/msg11545.html

BR,
Huseyin
[email protected]





_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]

