You cannot force the MDS out of the "replay" state, for the obvious reason of keeping data consistent. You might raise mds_beacon_grace to a reasonably high value, which would allow the MDS to replay the journal without being marked laggy and eventually blacklisted.
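For example, something along these lines (a sketch only; the 600-second value is arbitrary and the exact commands may need adjusting for your setup, since the grace period is evaluated by the monitors it has to reach them, not just the MDS daemons):

    # temporarily raise the grace period on the running daemons
    ceph tell mon.* injectargs '--mds_beacon_grace 600'
    ceph tell mds.* injectargs '--mds_beacon_grace 600'

    # or persist it in ceph.conf (the default is 15 seconds)
    [global]
    mds_beacon_grace = 600

Remember to drop it back to the default once the MDS ranks are active again.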
________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Alessandro De Salvo <alessandro.desa...@roma1.infn.it>
Sent: Monday, January 8, 2018 7:40:59 PM
To: Lincoln Bryant; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] cephfs degraded on ceph luminous 12.2.2

Thanks Lincoln,

indeed, as I said the cluster is recovering, so there are pending ops:

    pgs:     21.034% pgs not active
             1692310/24980804 objects degraded (6.774%)
             5612149/24980804 objects misplaced (22.466%)
             458 active+clean
             329 active+remapped+backfill_wait
             159 activating+remapped
             100 active+undersized+degraded+remapped+backfill_wait
              58 activating+undersized+degraded+remapped
              27 activating
              22 active+undersized+degraded+remapped+backfilling
               6 active+remapped+backfilling
               1 active+recovery_wait+degraded

If it's just a matter of waiting for the system to complete the recovery that's fine, I'll deal with that, but I was wondering if there is a more subtle problem here.

OK, I'll wait for the recovery to complete and see what happens, thanks.

Cheers,

    Alessandro

On 08/01/18 17:36, Lincoln Bryant wrote:
> Hi Alessandro,
>
> What is the state of your PGs? Inactive PGs have blocked CephFS
> recovery on our cluster before. I'd try to clear any blocked ops and
> see if the MDSes recover.
>
> --Lincoln
>
> On Mon, 2018-01-08 at 17:21 +0100, Alessandro De Salvo wrote:
>> Hi,
>>
>> I'm running on ceph luminous 12.2.2 and my cephfs suddenly degraded.
>>
>> I have 2 active mds instances and 1 standby. All the active instances
>> are now in replay state and show the same error in the logs:
>>
>> ---- mds1 ----
>>
>> 2018-01-08 16:04:15.765637 7fc2e92451c0  0 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 164
>> starting mds.mds1 at -
>> 2018-01-08 16:04:15.785849 7fc2e92451c0  0 pidfile_write: ignore empty --pid-file
>> 2018-01-08 16:04:20.168178 7fc2e1ee1700  1 mds.mds1 handle_mds_map standby
>> 2018-01-08 16:04:20.278424 7fc2e1ee1700  1 mds.1.20635 handle_mds_map i am now mds.1.20635
>> 2018-01-08 16:04:20.278432 7fc2e1ee1700  1 mds.1.20635 handle_mds_map state change up:boot --> up:replay
>> 2018-01-08 16:04:20.278443 7fc2e1ee1700  1 mds.1.20635 replay_start
>> 2018-01-08 16:04:20.278449 7fc2e1ee1700  1 mds.1.20635 recovery set is 0
>> 2018-01-08 16:04:20.278458 7fc2e1ee1700  1 mds.1.20635 waiting for osdmap 21467 (which blacklists prior instance)
>>
>> ---- mds2 ----
>>
>> 2018-01-08 16:04:16.870459 7fd8456201c0  0 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process (unknown), pid 295
>> starting mds.mds2 at -
>> 2018-01-08 16:04:16.881616 7fd8456201c0  0 pidfile_write: ignore empty --pid-file
>> 2018-01-08 16:04:21.274543 7fd83e2bc700  1 mds.mds2 handle_mds_map standby
>> 2018-01-08 16:04:21.314438 7fd83e2bc700  1 mds.0.20637 handle_mds_map i am now mds.0.20637
>> 2018-01-08 16:04:21.314459 7fd83e2bc700  1 mds.0.20637 handle_mds_map state change up:boot --> up:replay
>> 2018-01-08 16:04:21.314479 7fd83e2bc700  1 mds.0.20637 replay_start
>> 2018-01-08 16:04:21.314492 7fd83e2bc700  1 mds.0.20637 recovery set is 1
>> 2018-01-08 16:04:21.314517 7fd83e2bc700  1 mds.0.20637 waiting for osdmap 21467 (which blacklists prior instance)
>> 2018-01-08 16:04:21.393307 7fd837aaf700  0 mds.0.cache creating system inode with ino:0x100
>> 2018-01-08 16:04:21.397246 7fd837aaf700  0 mds.0.cache creating system inode with ino:0x1
>>
>> The cluster is recovering as we are changing some of the osds, and there
>> are a few slow/stuck requests, but I'm not sure if this is the cause, as
>> there is apparently no data loss (until now).
>>
>> How can I force the MDSes to quit the replay state?
>>
>> Thanks for any help,
>>
>>
>>     Alessandro
>>
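Regarding Lincoln's suggestion above about inactive PGs and blocked ops, a few illustrative commands for checking that (osd.<id> is a placeholder for whichever OSDs are reported with slow requests):

    ceph health detail                        # lists stuck requests and the OSDs involved
    ceph pg dump_stuck inactive               # PGs that are not yet active
    ceph osd blocked-by                       # OSDs blocking peering of others
    ceph daemon osd.<id> dump_ops_in_flight   # run on the OSD host: ops currently stuck in flight

The MDS journal lives in the metadata pool, so if any of that pool's PGs are among the inactive ones, replay can stall until they become active again.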
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com