The OSD processes/daemons kept running as is, so Ceph did not mark those OSDs down or out. But the battery failure led to high temperature, which increased CPU utilization, which in turn increased OSD response times, so other OSDs failed to respond in time. That caused extremely slow or no I/O.
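To illustrate why a slow but still running OSD is neither marked down nor out, here is a minimal sketch. It is a simplified model, not Ceph's actual code: the function name `osd_state` is made up, and the two constants are assumed to match the defaults of `osd_heartbeat_grace` (20 seconds) and `mon_osd_down_out_interval` (600 seconds, the "10 minutes" mentioned in the quoted reply).

```python
# Simplified model of OSD state transitions; not Ceph's real implementation.
HEARTBEAT_GRACE_S = 20     # assumed default of osd_heartbeat_grace
DOWN_OUT_INTERVAL_S = 600  # assumed default of mon_osd_down_out_interval


def osd_state(seconds_since_last_heartbeat: float) -> str:
    """Return a coarse OSD state based only on heartbeat age.

    Client request latency is deliberately not an input: an OSD that
    answers heartbeats in time stays up/in however slowly it serves I/O.
    """
    if seconds_since_last_heartbeat <= HEARTBEAT_GRACE_S:
        return "up/in"
    if seconds_since_last_heartbeat <= HEARTBEAT_GRACE_S + DOWN_OUT_INTERVAL_S:
        return "down/in"   # marked down, data not yet rebalanced
    return "down/out"      # marked out, rebalancing starts


if __name__ == "__main__":
    for age in (5, 60, 1000):
        print(age, osd_state(age))
```

In this model the overheated node's OSDs stay in the first branch, which matches what we saw: they kept answering heartbeats, so nothing was marked down and client I/O kept being routed to them.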
On Tue, Apr 16, 2019 at 12:23 PM Eugen Block <ebl...@nde.ag> wrote:

> Good morning,
>
> the OSDs are usually marked out after 10 minutes, that's when
> rebalancing starts. But the I/O should not drop during that time, this
> could be related to your pool configuration. If you have a replicated
> pool of size 3 and also set min_size to 3 the I/O would pause if a
> node or OSD fails. So more information about the cluster would help,
> can you share that?
>
> ceph osd tree
> ceph osd pool ls detail
>
> Were all pools affected or just specific pools?
>
> Regards,
> Eugen
>
>
> Quoting M Ranga Swami Reddy <swamire...@gmail.com>:
>
> > Hello - Recently we had an issue with a storage node's battery failure,
> > which caused ceph client IO to drop to '0' bytes. This means the ceph
> > cluster couldn't perform IO operations until the node was taken out.
> > This is not expected from Ceph: when some HW fails, the respective OSDs
> > should be marked out/down and IO should continue as is.
> >
> > Please let me know if anyone has seen similar behavior and whether this
> > issue is resolved.
> >
> > Thanks
> > Swami
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
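As a rough illustration of the min_size point in Eugen's message quoted above, the rule can be sketched as follows. This is a hypothetical helper, not a Ceph API; `pg_serves_io` and its parameters are names made up for the example.

```python
def pg_serves_io(size: int, min_size: int, replicas_up: int) -> bool:
    """Return True if a replicated placement group would accept client I/O.

    size        -- configured number of replicas for the pool
    min_size    -- minimum replicas that must be available to serve I/O
    replicas_up -- replicas currently available (hypothetical input)
    """
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    return replicas_up >= min_size


if __name__ == "__main__":
    # size 3, min_size 3: losing a single replica pauses I/O.
    print(pg_serves_io(3, 3, 2))
    # size 3, min_size 2: I/O continues with one replica missing.
    print(pg_serves_io(3, 2, 2))
```

The actual values for each pool appear in the `ceph osd pool ls detail` output requested above.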