If you've really extracted all the PGs from the down OSDs, you should have
been able to inject them into new OSDs and continue on from there with just
rebalancing activity. The use of mark_unfound_lost_revert complicates
matters a bit, but I'm not sure what the behavior would be if you just put
them

Our Ceph cluster stopped responding to requests two weeks ago, and I have
been trying to fix it since then. After a semi-hard reboot, we had 11-ish
OSDs "fail", spread across two hosts, with the pool size set to two. I was
able to extract a copy of every PG that resided solely on the nonfunctional