[ceph-users] "Failed to authpin" results in large number of blocked requests

2019-03-28 Thread Zoë O'Connell
We're running a Ceph Mimic (13.2.4) cluster which is predominantly used for CephFS. We recently switched to multiple active MDSes to cope with load on the cluster, but are experiencing large numbers of blocked requests when research staff run large experiments. The erro

[ceph-users] Recovery from "FAILED assert(omap_num_objs <= MAX_OBJECTS)"

2019-08-27 Thread Zoë O'Connell
We have run into what looks like bug 36094 (https://tracker.ceph.com/issues/36094) on our 13.2.6 cluster, and unfortunately one of our ranks (Rank 1) now won't start: it comes up for a few seconds before the assigned MDS crashes again with the log entries below. It would appear that OpenFileTa