On Mon, Jan 21, 2019 at 11:43 AM ST Wong (ITSC) <s...@itsc.cuhk.edu.hk> wrote:
>
> Hi, we’re trying mimic on a VM farm.  It consists of 4 OSD hosts (8 OSDs)
> and 3 MONs.  We tried mounting as RBD and CephFS (fuse and kernel mount)
> on different clients without problem.
Is this an upgraded or a fresh cluster?

> Then one day we performed a failover test and stopped one of the OSDs.
> Not sure if it’s related, but after that test the RBD client freezes when
> trying to mount the rbd device.
>
> Steps to reproduce:
>
> # modprobe rbd
>
> (dmesg)
> [ 309.997587] Key type dns_resolver registered
> [ 310.043647] Key type ceph registered
> [ 310.044325] libceph: loaded (mon/osd proto 15/24)
> [ 310.054548] rbd: loaded
>
> # rbd -n client.acapp1 map 4copy/foo
> /dev/rbd0
>
> # rbd showmapped
> id pool  image snap device
> 0  4copy foo   -    /dev/rbd0
>
> Then it hangs if I try to mount or reboot the server after rbd map.
> There are a lot of errors in dmesg, e.g.:
>
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: blacklist of client74700 failed: -13
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: failed to acquire lock: -13
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: no lock owners detected
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: client74700 seems dead, breaking lock
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: blacklist of client74700 failed: -13
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: failed to acquire lock: -13
> Jan 20 03:43:32 acapp1 kernel: rbd: rbd0: no lock owners detected

Does client.acapp1 have permission to blacklist other clients?  You can
check with "ceph auth get client.acapp1".  If not, follow step 6 of
http://docs.ceph.com/docs/master/releases/luminous/#upgrade-from-jewel-or-kraken.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
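
For reference, the -13 in those dmesg lines is EACCES: the kernel client sees
a stale exclusive-lock owner, but the cluster refuses the blacklist request,
so it cannot safely break the lock.  A rough sketch of the caps fix referenced
above, assuming client.acapp1 only needs RBD access to the 4copy pool (check
"ceph auth get client.acapp1" first and keep any other caps it already has):

  # show the current caps for the client
  ceph auth get client.acapp1

  # give it the rbd profile, which includes permission to blacklist
  # dead lock owners (alternatively, keep the existing caps and add
  # 'allow command "osd blacklist"' to the mon caps, as in the
  # luminous release notes)
  ceph auth caps client.acapp1 mon 'profile rbd' osd 'profile rbd pool=4copy'

Then unmap and re-map the image on the client and retry the mount:

  rbd unmap /dev/rbd0
  rbd -n client.acapp1 map 4copy/foo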