Something I've seen a lot is an OSD getting marked down because of a failed
drive, then adding itself right back again.


On Fri, Jun 28, 2019 at 9:12 AM Robert LeBlanc <rob...@leblancnet.us> wrote:

> I'm not sure why the monitor did not mark it down after 600 seconds
> (default). The reason it is so long is that you don't want to move data
> around unnecessarily if the osd is just being rebooted/restarted. Usually,
> you will still have min_size OSDs available for all PGs that will allow IO
> to continue. Then when the down timeout expires it will start backfilling
> and recovering the PGs that were affected. Double check that size !=
> min_size for your pools.
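The size/min_size point above can be sketched with a tiny model (a hypothetical helper for illustration, not Ceph code): IO to a PG continues only while at least min_size of its replicas are up, so size == min_size leaves no margin for even a single OSD failure.

```python
def pg_allows_io(up_replicas: int, min_size: int) -> bool:
    # Simplified model: a PG keeps serving IO only while at least
    # min_size of its replicas are up.
    return up_replicas >= min_size

# size=3, min_size=2: one OSD down, IO continues
print(pg_allows_io(2, 2))   # True
# size=2, min_size=2: one OSD down, IO blocks until recovery
print(pg_allows_io(1, 2))   # False
```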
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Jun 27, 2019 at 5:26 PM Bryan Henderson <bry...@giraffe-data.com>
> wrote:
>
>> What does it take for a monitor to consider an OSD down which has been
>> dead as a doornail since the cluster started?
>>
>> A couple of times, I have seen 'ceph status' report an OSD as up when it
>> was quite dead.  Recently, a couple of OSDs were on machines that failed
>> to boot up after a power failure.  The rest of the Ceph cluster came up,
>> though, and reported all OSDs up and in.  I/Os stalled, probably because
>> they were waiting for the dead OSDs to come back.
>>
>> I waited 15 minutes, because the manual says that if the monitor doesn't
>> hear a heartbeat from an OSD in that long (the default value of
>> mon_osd_report_timeout), it marks it down.  But it didn't.  I issued
>> "osd down" commands for the dead OSDs, the status changed to down, and
>> I/O started working.
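For what it's worth, the rule the manual describes can be sketched like this (a simplified model of the monitor's behavior, not its actual implementation; the 900-second value below is just the 15-minute figure from the text, passed in as a parameter rather than asserted as the default):

```python
def monitor_should_mark_down(last_beacon_ts: float, now: float,
                             report_timeout: float) -> bool:
    # Simplified model: the monitor marks an OSD down once it has gone
    # report_timeout seconds without hearing from that OSD.
    return (now - last_beacon_ts) > report_timeout

# With a 900 s (15-minute) timeout, an OSD silent for 16 minutes
# should have been marked down:
print(monitor_should_mark_down(0.0, 960.0, 900.0))  # True
# An OSD silent for only 5 minutes should not:
print(monitor_should_mark_down(0.0, 300.0, 900.0))  # False
```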
>>
>> And wouldn't even 15 minutes of grace be unacceptable if it means I/Os
>> have to wait that long before falling back to a redundant OSD?
>>
>> --
>> Bryan Henderson                                   San Jose, California
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>