One of my CephFS servers just reported an "mdsY: Client XXXXX failing to respond to capability release" error. The client in question was behaving strangely (for example, it would not allow files to be deleted). Restarting the affected server cleared the issue.

I see there have been a few posts about this, perhaps related to the MDS cache size. Does anyone know of any tuning that can prevent this from happening, or is this a bug? I have plenty of RAM available to increase the MDS cache size if necessary; it is currently left at the default value. Is there a tuning guide for the MDS? I can't find any recommendations in the docs.
The cluster is running Ceph 10.2.5 and I'm using the kernel CephFS client (kernel version 4.9.0).

Things I've already investigated:

* Log files (kernel, syslog, etc.): nothing unusual at all
* Historical graphs of CPU, memory, network, etc.: nothing unusual, plenty of resources available
* Historical graphs of overall cluster load/IO: nothing out of the ordinary
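One further check that might help narrow this down is seeing which client holds the most capabilities on the MDS. The sketch below is only an illustration, not something taken from the docs: it assumes the JSON printed by `ceph daemon mds.<name> session ls` has been saved as text, and that each session entry carries `id` and `num_caps` fields. The helper name and the sample values are hypothetical.

```python
import json


def sessions_by_caps(session_ls_json):
    """Return (client id, num_caps) pairs, highest cap count first.

    Assumes the input is the JSON list printed by
    `ceph daemon mds.<name> session ls`, with "id" and "num_caps"
    fields on each session (an assumption; check your own output).
    """
    sessions = json.loads(session_ls_json)
    pairs = [(s["id"], s.get("num_caps", 0)) for s in sessions]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Hypothetical sample, not real cluster output.
sample = '[{"id": 4101, "num_caps": 120}, {"id": 4102, "num_caps": 98000}]'

for client_id, caps in sessions_by_caps(sample):
    print(client_id, caps)
```

A client sitting near the top of that list with a cap count close to the MDS cache limit would be the first one I'd look at.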
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com