Hi,

After a couple of months of almost no issues, our Ceph cluster has started to have
frequent failures. Just this week it has failed about three times.


The issue appears to be that an MDS or Monitor fails and then all clients
hang. After that, all clients need to be forcibly restarted.


Has anyone else run into this, or does anyone have suggestions on how to remedy it?


The architecture for our setup is:

3 each of MON and MDS instances (co-located) on 2 CPU, 4 GB RAM servers

12 OSDs (SSD) on 1 CPU, 1 GB RAM servers


Ceph v10.2.5

Clients connect via the CephFS kernel driver.


I'd also like to note I'm relatively new to Ceph and I'm here on behalf of the 
person who set the cluster up, so any information is appreciated.


Thank you for your time,

Rich
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
