Hi Eugen,
I removed 12 OSDs with 'ceph orch osd rm ID --replace' last week on
Pacific, and even though only 10 OSDs drained their PGs at a time (the
other 2 were waiting for an available 'slot', as expected), all 12 OSDs
were removed successfully in the end.
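In case it helps, it was roughly something like this (the OSD IDs below
are just placeholders, not the real ones):

  # queue the OSDs for removal, keeping their slots for later replacement
  for id in 10 11 12 13 14 15 16 17 18 19 20 21; do
      ceph orch osd rm $id --replace
  done

  # then watch the drain/removal queue
  ceph orch osd rm status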
Cheers,
Frédéric.
Hi, thanks for chiming in. I believe there's a slot limit of 10 for
the removal queue, at least I recall reading that somewhere a while
ago, so that would explain the 10 parallel drains you mention. I also
don't see any such issues on customer clusters, which is why I still
suspect the drives...
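For anyone following along, the removal queue can be inspected while
the drains are running; the exact output varies a bit by release, but
it shows the queued OSDs, their state, and the remaining PG count:

  # list OSDs queued for removal and their drain progress
  ceph orch osd rm status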
I have a feeling that this could be related to the drives, but I have
no real proof. I drained the SSD OSDs yesterday; hours later I wanted
to remove the OSDs (no PGs were left on them) via a for loop with the
orchestrator (ceph orch osd rm ID --force --zap). The first one got
removed quit