Dear Eugen,

This works. So, in conclusion: removing the leftover peer config with

ceph config-key rm "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2"

and disabling mirroring on the filesystem with

ceph fs snapshot mirror disable prodfs

yields:

ceph fs snapshot mirror peer_list prodfs
Error EINVAL: filesystem prodfs is not mirrored

Re-enabling mirroring on prodfs:

ceph fs snapshot mirror enable prodfs
ceph fs snapshot mirror peer_list prodfs

now yields:

{}

So: very grateful for your help, and happily proceeding with re-enabling the mirroring solution.
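For anyone who finds this thread later, here is the whole sequence in one place. The two UUIDs are the stale peer entries from my cluster, as shown by ceph config-key ls; substitute whatever that command lists on yours:

# list leftover peer entries for the mirrored filesystem
ceph config-key ls | grep "peer/prodfs"

# remove each stale entry left behind by the destroyed target cluster
ceph config-key rm "cephfs/mirror/peer/prodfs/f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5"
ceph config-key rm "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2"

# disable and re-enable snapshot mirroring so the peer list starts out empty
ceph fs snapshot mirror disable prodfs
ceph fs snapshot mirror enable prodfs

# should now return {}
ceph fs snapshot mirror peer_list prodfs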
Again: thank you,
Jan Zeinstra

On Mon, 14 Apr 2025 at 10:29, Eugen Block <ebl...@nde.ag> wrote:

> I stopped my peer cluster to simulate it's gone, removing the peer
> didn't work (although with a different message than yours), but
> disabling snapshot mirroring did work:
>
> site-a:~ # ceph fs snapshot mirror disable cephfs
> {}
> site-a:~ # ceph fs snapshot mirror peer_list cephfs
> Error EINVAL: filesystem cephfs is not mirrored
>
> I assume that this should suffice to start from scratch, but maybe you
> can confirm.
>
> Quoting Eugen Block <ebl...@nde.ag>:
>
> > Please don't drop the list from your responses.
> >
> > Have you tried disabling mirroring on prodfs after removing the
> > keys? I haven't had too many chances to play around yet, no prod
> > cluster using that.
> >
> > Quoting Jan Zeinstra <j...@delectat.nl>:
> >
> >> Thanks for the response,
> >> I got:
> >> ceph config-key ls |grep "peer/prodfs"
> >> "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2",
> >> "cephfs/mirror/peer/prodfs/f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5",
> >>
> >> Since there is no mirroring at all going on, I saw no harm in deleting them both.
> >> ceph config-key rm "cephfs/mirror/peer/prodfs/f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5"
> >> key deleted
> >> ceph config-key rm "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2"
> >> key deleted
> >>
> >> And sure enough:
> >> ceph config-key ls |grep "peer/prodfs"
> >> resulted in nothing found.
> >>
> >> However:
> >> ceph fs snapshot mirror peer_list prodfs
> >> gave
> >> {"f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5": {"client_name": "client.mirror_remote", "site_name": "bk-site", "fs_name": "prodfs"}}
> >>
> >> Also after redeploying the daemon.
> >>
> >> Searching the config for 'bk-site' and '9afbfe8cc1c5' yielded nothing.
> >>
> >> Do you have more suggestions where to look?
> >>
> >> (Mirroring is still stuck trying to find the now nonexistent remote cluster:
> >> cephadm logs --name cephfs-mirror.s1mon.lvlkwp
> >>
> >> Apr 14 08:02:36 s1mon systemd[1]: Started Ceph cephfs-mirror.s1mon.lvlkwp for d0ea284a-8a16-11ee-9232-5934f0f00ec2.
> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: set uid:gid to 167:167 (ceph:ceph)
> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process cephfs-mirror, pid 2
> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: pidfile_write: ignore empty --pid-file
> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: mgrc service_daemon_register cephfs-mirror.23174196 metadata {arch=x86_64,ceph_release=reef,ceph_version=ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable),ceph_version_short=18.2.4,container_hostna>
> >> Apr 14 08:02:40 s1mon cephfs-mirror[2097671]: cephfs::mirror::Utils connect: error connecting to bk-site: (2) No such file or directory
> >> Apr 14 08:02:40 s1mon cephfs-mirror[2097671]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (2) No such file or directory
> >> Apr 14 08:02:40 s1mon conmon[2097665]: unable to get monitor info from DNS SRV with service name: ceph-mon
> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 failed for service _ceph-mon._tcp
> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 monclient: get_monmap_and_config cannot identify monitors to contact
> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 cephfs::mirror::Utils connect: error connecting to bk-site: (2) No such file or directory
> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (2) No such file or directory
> >> )
> >>
> >> On Fri, 11 Apr 2025 at 14:42, Eugen Block <ebl...@nde.ag> wrote:
> >>
> >>> Hi,
> >>>
> >>> I would expect that you have a similar config-key entry:
> >>>
> >>> ceph config-key ls |grep "peer/cephfs"
> >>> "cephfs/mirror/peer/cephfs/18c02021-8902-4e3f-bc17-eaf48331cc56",
> >>>
> >>> Maybe removing that peer would already suffice?
> >>>
> >>>
> >>> Quoting Jan Zeinstra <j...@delectat.nl>:
> >>>
> >>>> Hi,
> >>>> This is my first post to the forum and I don't know if it's appropriate,
> >>>> but I'd like to express my gratitude to all people working hard on ceph,
> >>>> because I think it's a fantastic piece of software.
> >>>>
> >>>> The problem I'm having is caused by me; we had a well-working ceph fs
> >>>> mirror solution; let's call it source cluster A, and target cluster B.
> >>>> Source cluster A is a modest cluster consisting of 6 instances: 3 OSD
> >>>> instances and 3 mon instances. The OSD instances each have 3 disks (HDDs)
> >>>> and 3 OSD daemons, totalling 9 OSD daemons and 9 HDDs. Target cluster B is
> >>>> a single-node system having 3 OSD daemons and 3 HDDs. Both clusters run
> >>>> ceph 18.2.4 reef. Both clusters use Ubuntu 22.04 as OS throughout. Both
> >>>> systems are installed using cephadm.
> >>>> I have destroyed cluster B and have built it from the ground up (I made a
> >>>> mistake in PG sizing in the original cluster).
> >>>> Now I find I cannot create/reinstate the mirroring between the two ceph fs
> >>>> filesystems, and I suspect there is a peer left behind in the filesystem of
> >>>> the source, pointing to the now non-existent target cluster.
> >>>> When I do 'ceph fs snapshot mirror peer_list prodfs', I get:
> >>>> '{"f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5": {"client_name": "client.mirror_remote", "site_name": "bk-site", "fs_name": "prodfs"}}'
> >>>> When I try to delete it with 'ceph fs snapshot mirror peer_remove prodfs f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5', I get:
> >>>> 'Error EACCES: failed to remove peeraccess denied: does your client key have mgr caps? See http://docs.ceph.com/en/latest/mgr/administrator/#client-authentication',
> >>>> but the logging of the daemon points to the more likely reason of failure:
> >>>> ----
> >>>> Apr 08 12:54:26 s1mon systemd[1]: Started Ceph cephfs-mirror.s1mon.lvlkwp for d0ea284a-8a16-11ee-9232-5934f0f00ec2.
> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: set uid:gid to 167:167 (ceph:ceph)
> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process cephfs-mirror, pid 2
> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: pidfile_write: ignore empty --pid-file
> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: mgrc service_daemon_register cephfs-mirror.22849497 metadata {arch=x86_64,ceph_release=reef,ceph_version=ceph version 18.2.4 (e7ad5345525c7a>
> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: remote monitor host=[v2:172.17.16.12:3300/0,v1:172.17.16.12:6789/0]
> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57c51ba640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57d81e0640 -1 cephfs::mirror::Utils connect: error connecting to bk-site: (13) Permission denied
> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::Utils connect: error connecting to bk-site: (13) Permission denied
> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57d81e0640 -1 cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cl>
> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (13) Permission denied
> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.362+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.386+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.430+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.466+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.767+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.811+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.851+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.891+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> >>>> ----
> >>>> Is there any chance I can get the mirroring daemon to forget about the cluster I lost?
> >>>>
> >>>> Best regards,
> >>>> Jan Zeinstra

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io