Glad it works for you. May I ask, did you have difficulties setting up
mirroring? I tried half a year ago and failed. When I tried again last
week, I succeeded, but I faced several issues along the way. I would
like to improve the docs (I'm in contact with Zac about that), but if
you didn't encounter any issues and everything was clear, maybe I'm
the issue here. ;-)
Anyway, if you could share your setup experience, I could compare and
figure out which part of the docs could use some improvement.
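
For reference, the sequence that worked in this thread can be sketched as a
small shell helper. This is only a sketch under assumptions: the function
names list_peer_keys and reset_mirroring are made up for illustration, the
ceph commands and the cephfs/mirror/peer/<fs>/<uuid> key layout are the ones
quoted below, and the behaviour was only observed on 18.2.4 Reef. Removing
the keys discards the stored peer configuration, so check the output of
list_peer_keys before running the reset.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the ceph commands are the
# ones used in this thread.

# Print the config-key entries that hold peer definitions for one file system.
list_peer_keys() {
    fs_name="$1"
    ceph config-key ls | grep -o "cephfs/mirror/peer/${fs_name}/[^\"]*"
}

# Remove every stale peer key, then disable and re-enable snapshot mirroring.
# The final peer_list is expected to print {} if no peer is left.
reset_mirroring() {
    fs_name="$1"
    for peer_key in $(list_peer_keys "$fs_name"); do
        ceph config-key rm "$peer_key"
    done
    ceph fs snapshot mirror disable "$fs_name"
    ceph fs snapshot mirror enable "$fs_name"
    ceph fs snapshot mirror peer_list "$fs_name"
}
```

After sourcing the file, it would be called as: reset_mirroring prodfs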
Quoting Jan Zeinstra <j...@delectat.nl>:
> Dear Eugen,
>
> This works. So, in conclusion, the assumption is confirmed: removing the
> config key with:
> ceph config-key rm "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2"
> and disabling mirroring on the fs:
> ceph fs snapshot mirror disable prodfs
>
> yields:
> ceph fs snapshot mirror peer_list prodfs
> Error EINVAL: filesystem prodfs is not mirrored
>
> Re-enabling prodfs mirroring:
> ceph fs snapshot mirror enable prodfs
> ceph fs snapshot mirror peer_list prodfs
>
> yields:
> {}
>
> So: very grateful for your help, and happily proceeding with re-enabling
> the mirroring solution.
> Again:
> Thank you,
>
> Jan Zeinstra
>
>
> On Mon, 14 Apr 2025 at 10:29, Eugen Block <ebl...@nde.ag> wrote:
>
>> I stopped my peer cluster to simulate that it's gone. Removing the peer
>> didn't work (although with a different message than yours), but
>> disabling snapshot mirroring did work:
>>
>> site-a:~ # ceph fs snapshot mirror disable cephfs
>> {}
>> site-a:~ # ceph fs snapshot mirror peer_list cephfs
>> Error EINVAL: filesystem cephfs is not mirrored
>>
>> I assume that this should suffice to start from scratch, but maybe you
>> can confirm.
>>
>> Quoting Eugen Block <ebl...@nde.ag>:
>>
>> > Please don't drop the list from your responses.
>> >
>> > Have you tried disabling mirroring on prodfs after removing the
>> > keys? I haven't had too many chances to play around yet, as I have no
>> > prod cluster using that.
>> >
>> > Quoting Jan Zeinstra <j...@delectat.nl>:
>> >
>> >> Thanks for the response,
>> >> I got:
>> >> ceph config-key ls |grep "peer/prodfs"
>> >> "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2",
>> >> "cephfs/mirror/peer/prodfs/f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5",
>> >>
>> >> Since there is no mirroring at all going on, I saw no harm in deleting
>> >> them both.
>> >> ceph config-key rm "cephfs/mirror/peer/prodfs/f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5"
>> >> key deleted
>> >> ceph config-key rm "cephfs/mirror/peer/prodfs/b308e268-c7f9-4401-a69a-e625955087f2"
>> >> key deleted
>> >>
>> >> And sure enough:
>> >> ceph config-key ls |grep "peer/prodfs"
>> >> resulted in nothing found
>> >>
>> >> However:
>> >> ceph fs snapshot mirror peer_list prodfs
>> >> gave
>> >> {"f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5": {"client_name":
>> >> "client.mirror_remote", "site_name": "bk-site", "fs_name": "prodfs"}}
>> >>
>> >> Also after redeploying the daemon.
>> >>
>> >> Searching the config for 'bk-site' and '9afbfe8cc1c5' yielded nothing.
>> >>
>> >> Do you have more suggestions on where to look?
>> >>
>> >> (Mirroring is still stuck trying to find the now nonexistent remote
>> >> cluster:
>> >> cephadm logs --name cephfs-mirror.s1mon.lvlkwp
>> >>
>> >> Apr 14 08:02:36 s1mon systemd[1]: Started Ceph cephfs-mirror.s1mon.lvlkwp for d0ea284a-8a16-11ee-9232-5934f0f00ec2.
>> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: set uid:gid to 167:167 (ceph:ceph)
>> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process cephfs-mirror, pid 2
>> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: pidfile_write: ignore empty --pid-file
>> >> Apr 14 08:02:36 s1mon cephfs-mirror[2097671]: mgrc service_daemon_register cephfs-mirror.23174196 metadata {arch=x86_64,ceph_release=reef,ceph_version=ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable),ceph_version_short=18.2.4,container_hostna>
>> >> Apr 14 08:02:40 s1mon cephfs-mirror[2097671]: cephfs::mirror::Utils connect: error connecting to bk-site: (2) No such file or directory
>> >> Apr 14 08:02:40 s1mon cephfs-mirror[2097671]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (2) No such file or directory
>> >> Apr 14 08:02:40 s1mon conmon[2097665]: unable to get monitor info from DNS SRV with service name: ceph-mon
>> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 failed for service _ceph-mon._tcp
>> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 monclient: get_monmap_and_config cannot identify monitors to contact
>> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 cephfs::mirror::Utils connect: error connecting to bk-site: (2) No such file or directory
>> >> Apr 14 08:02:40 s1mon conmon[2097665]: 2025-04-14T06:02:40.831+0000 7ff4ed4c1640 -1 cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (2) No such file or directory
>> >> )
>> >>
>> >> On Fri, 11 Apr 2025 at 14:42, Eugen Block <ebl...@nde.ag> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I would expect that you have a similar config-key entry:
>> >>>
>> >>> ceph config-key ls |grep "peer/cephfs"
>> >>> "cephfs/mirror/peer/cephfs/18c02021-8902-4e3f-bc17-eaf48331cc56",
>> >>>
>> >>> Maybe removing that peer would already suffice?
>> >>>
>> >>>
>> >>> Quoting Jan Zeinstra <j...@delectat.nl>:
>> >>>
>> >>>> Hi,
>> >>>> This is my first post to the forum and I don't know if it's appropriate,
>> >>>> but I'd like to express my gratitude to all the people working hard on
>> >>>> Ceph, because I think it's a fantastic piece of software.
>> >>>>
>> >>>> The problem I'm having is caused by me. We had a well-working CephFS
>> >>>> mirror solution; let's call them source cluster A and target cluster B.
>> >>>> Source cluster A is a modest cluster consisting of 6 instances: 3 OSD
>> >>>> instances and 3 mon instances. The OSD instances each have 3 disks (HDDs)
>> >>>> and 3 OSD daemons, totalling 9 OSD daemons and 9 HDDs. Target cluster B is
>> >>>> a single-node system with 3 OSD daemons and 3 HDDs. Both clusters run
>> >>>> Ceph 18.2.4 Reef. Both clusters use Ubuntu 22.04 as the OS throughout.
>> >>>> Both systems were installed using cephadm.
>> >>>> I have destroyed cluster B and rebuilt it from the ground up (I made a
>> >>>> mistake in PG sizing in the original cluster).
>> >>>> Now I find I cannot create/reinstate the mirroring between the two CephFS
>> >>>> file systems, and I suspect there is a peer left behind in the file system
>> >>>> of the source, pointing to the now non-existent target cluster.
>> >>>> When I do 'ceph fs snapshot mirror peer_list prodfs', I get:
>> >>>> '{"f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5": {"client_name": "client.mirror_remote", "site_name": "bk-site", "fs_name": "prodfs"}}'
>> >>>> When I try to delete it with 'ceph fs snapshot mirror peer_remove prodfs
>> >>>> f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5', I get: 'Error EACCES: failed to
>> >>>> remove peeraccess denied: does your client key have mgr caps? See
>> >>>> http://docs.ceph.com/en/latest/mgr/administrator/#client-authentication',
>> >>>> but the logging of the daemon points to the more likely reason for the
>> >>>> failure:
>> >>>> ----
>> >>>> Apr 08 12:54:26 s1mon systemd[1]: Started Ceph cephfs-mirror.s1mon.lvlkwp for d0ea284a-8a16-11ee-9232-5934f0f00ec2.
>> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: set uid:gid to 167:167 (ceph:ceph)
>> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable), process cephfs-mirror, pid 2
>> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: pidfile_write: ignore empty --pid-file
>> >>>> Apr 08 12:54:26 s1mon cephfs-mirror[310088]: mgrc service_daemon_register cephfs-mirror.22849497 metadata {arch=x86_64,ceph_release=reef,ceph_version=ceph version 18.2.4 (e7ad5345525c7a>
>> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: remote monitor host=[v2:172.17.16.12:3300/0,v1:172.17.16.12:6789/0]
>> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57c51ba640 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
>> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57d81e0640 -1 cephfs::mirror::Utils connect: error connecting to bk-site: (13) Permission denied
>> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::Utils connect: error connecting to bk-site: (13) Permission denied
>> >>>> Apr 08 12:54:30 s1mon conmon[310082]: 2025-04-08T10:54:30.365+0000 7f57d81e0640 -1 cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cl>
>> >>>> Apr 08 12:54:30 s1mon cephfs-mirror[310088]: cephfs::mirror::PeerReplayer(f3ea4e15-6d77-4f28-aacb-9afbfe8cc1c5) init: error connecting to remote cluster: (13) Permission denied
>> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.362+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.386+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.430+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 09 00:00:16 s1mon conmon[310082]: 2025-04-08T22:00:16.466+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 09 00:00:16 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.767+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.811+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.851+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 10 00:00:01 s1mon conmon[310082]: 2025-04-09T22:00:01.891+0000 7f57d99e3640 -1 received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm()>
>> >>>> Apr 10 00:00:01 s1mon cephfs-mirror[310088]: received signal: Hangup from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
>> >>>> ----
>> >>>> Is there any chance I can get the mirroring daemon to forget about the
>> >>>> cluster I lost?
>> >>>>
>> >>>> Best regards, Jan Zeinstra
>> >>>> _______________________________________________
>> >>>> ceph-users mailing list -- ceph-users@ceph.io
>> >>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> >>>
>> >>>