The thing I've seen a lot is where an OSD gets marked down because of a failed drive, then adds itself right back again.
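For anyone hitting the same stall, here's a rough sketch of the checks discussed below, as I understand them (these run against a live cluster; the OSD id at the end is just an example):

```shell
# List each pool's replication settings; IO can stall when size == min_size,
# because losing a single replica then drops the PG below min_size.
for pool in $(ceph osd pool ls); do
    echo -n "$pool: "
    echo -n "size=$(ceph osd pool get "$pool" size | awk '{print $2}') "
    echo "min_size=$(ceph osd pool get "$pool" min_size | awk '{print $2}')"
done

# See how long the monitors wait before marking an unreported OSD down.
ceph config get mon mon_osd_report_timeout

# Manually mark a dead OSD down so clients stop waiting on it
# (12 is a placeholder id; use the id from 'ceph osd tree').
ceph osd down 12
```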
On Fri, Jun 28, 2019 at 9:12 AM Robert LeBlanc <rob...@leblancnet.us> wrote:

> I'm not sure why the monitor did not mark it down after 600 seconds
> (default). The reason it is so long is that you don't want to move data
> around unnecessarily if the osd is just being rebooted/restarted. Usually,
> you will still have min_size OSDs available for all PGs that will allow IO
> to continue. Then when the down timeout expires it will start backfilling
> and recovering the PGs that were affected. Double check that size !=
> min_size for your pools.
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Jun 27, 2019 at 5:26 PM Bryan Henderson <bry...@giraffe-data.com>
> wrote:
>
>> What does it take for a monitor to consider an OSD down which has been
>> dead as a doornail since the cluster started?
>>
>> A couple of times, I have seen 'ceph status' report an OSD was up, when
>> it was quite dead. Recently, a couple of OSDs were on machines that
>> failed to boot up after a power failure. The rest of the Ceph cluster
>> came up, though, and reported all OSDs up and in. I/Os stalled, probably
>> because they were waiting for the dead OSDs to come back.
>>
>> I waited 15 minutes, because the manual says if the monitor doesn't hear
>> a heartbeat from an OSD in that long (default value of
>> mon_osd_report_timeout), it marks it down. But it didn't. I did
>> "osd down" commands for the dead OSDs and the status changed to down and
>> I/O started working.
>>
>> And wouldn't even 15 minutes of grace be unacceptable if it means I/Os
>> have to wait that long before falling back to a redundant OSD?
>>
>> --
>> Bryan Henderson                                   San Jose, California
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com