Re: [ceph-users] Fwd: HW failure cause client IO drops

M Ranga Swami Reddy Tue, 16 Apr 2019 01:44:45 -0700

The OSD processes/daemons kept running as usual, so Ceph did not mark
those OSDs down or out.
But the battery failure led to a high temperature, which increased CPU
utilization, which in turn increased the OSD response time. As a result,
other OSDs failed to respond in time, causing extremely slow or no IO.
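
To illustrate why a slow but still running OSD is not marked down, here is
a minimal sketch. It is a simplified model, not Ceph's actual code: it
assumes peers report an OSD as failed only when no heartbeat reply arrives
within a grace period (osd_heartbeat_grace, assumed here to be 20 seconds).
The function names and the 1-second "slow" threshold are hypothetical.

```python
# Simplified model (an assumption, not Ceph's implementation): a peer reports
# an OSD as failed only when no heartbeat reply arrived within the grace period.
HEARTBEAT_GRACE_S = 20.0  # assumed value of osd_heartbeat_grace


def reported_failed(seconds_since_last_reply: float,
                    grace_s: float = HEARTBEAT_GRACE_S) -> bool:
    """Return True if a peer would report the OSD as failed."""
    return seconds_since_last_reply > grace_s


def describe(seconds_since_last_reply: float, op_latency_s: float) -> str:
    """Classify an OSD with hypothetical labels for illustration."""
    if reported_failed(seconds_since_last_reply):
        return "down"
    if op_latency_s > 1.0:  # arbitrary threshold, chosen only for this sketch
        return "slow but up"
    return "healthy"
```

In this model an OSD that still answers heartbeats every few seconds but
takes many seconds per operation stays "slow but up", so client IO that
depends on it stalls without the OSD ever being marked down.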




On Tue, Apr 16, 2019 at 12:23 PM Eugen Block <[email protected]> wrote:

> Good morning,
>
> The OSDs are usually marked out after 10 minutes; that's when
> rebalancing starts. But the I/O should not drop during that time, so this
> could be related to your pool configuration. If you have a replicated
> pool of size 3 and also set min_size to 3, the I/O would pause if a
> node or OSD fails. So more information about the cluster would help;
> can you share that?
>
> ceph osd tree
> ceph osd pool ls detail
>
> Were all pools affected or just specific pools?
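
The size/min_size rule described above can be sketched as simple
arithmetic. This is an illustrative model only, assuming a placement group
serves I/O while the number of available replicas is at least min_size;
the function name pool_accepts_io is hypothetical and not part of any Ceph
API.

```python
def pool_accepts_io(size: int, min_size: int, failed_replicas: int) -> bool:
    """Return True if a PG with `failed_replicas` replicas unavailable
    still has at least `min_size` replicas and can therefore serve I/O."""
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    available = max(size - failed_replicas, 0)
    return available >= min_size


if __name__ == "__main__":
    # size=3, min_size=3: one failed replica pauses I/O.
    print(pool_accepts_io(3, 3, 1))
    # size=3, min_size=2: one failed replica is tolerated.
    print(pool_accepts_io(3, 2, 1))
```

With size 3 and min_size 3, losing a single replica leaves 2 available,
which is below min_size, so I/O pauses; with min_size 2 the same failure is
tolerated.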
>
> Regards,
> Eugen
>
>
> Zitat von M Ranga Swami Reddy <[email protected]>:
>
> > Hello - Recently we had an issue with a storage node's battery failure,
> > which caused Ceph client IO to drop to '0' bytes. This means the Ceph
> > cluster couldn't perform IO operations until the node was taken out.
> > This is not expected from Ceph: when some HW fails, the respective OSDs
> > should be marked out/down and IO should continue as is.
> >
> > Please let me know if anyone has seen similar behavior, and whether this
> > issue has been resolved.
> >
> > Thanks
> > Swami
>
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

