On Fri, Oct 30, 2020 at 2:13 AM Frank Schilder <fr...@dtu.dk> wrote:
>
> Dear cephers,
>
> I have a somewhat strange situation. I have the health warning:
>
> # ceph health detail
> HEALTH_WARN 3 clients failing to respond to capability release
> MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release
>     mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617
>     mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358
>     mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150
>
> However, these clients are not busy right now. Also, they hold almost 
> nothing; see snippets from "session ls" below. It is possible that a very
> I/O-intensive application was running on these nodes and these release requests 
> got stuck. How do I resolve this issue? Can I just evict the client?
>
> Version is mimic 13.2.8. Note that we execute a drop cache command after a 
> job finishes on these clients. It's possible that the clients dropped the caps 
> already before the MDS request was handled/received.
Can you share any config changes you've made on the MDS?

Also, Mimic is EOL as you probably know. Please upgrade :)

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
