[ceph-users] CephFS: Isolating folders for different users
Hello everyone,

I would like to set up my CephFS with different directories exclusively accessible by corresponding clients. By this, I mean e.g. /dir_a only accessible by client.a and /dir_b only by client.b. From the documentation I gathered that having client caps like

client.a
        key:
        caps: [mds] allow rw fsname=cephfs path=/dir_a
        caps: [mon] allow r fsname=cephfs
        caps: [osd] allow rw tag cephfs data=cephfs
client.b
        key:
        caps: [mds] allow rw fsname=cephfs path=/dir_b
        caps: [mon] allow r fsname=cephfs
        caps: [osd] allow rw tag cephfs data=cephfs

is not enough, since it only restricts the clients' access to the metadata pool. So to restrict access to the data, I created pools for each of the directories, e.g. cephfs_a_data and cephfs_b_data. To make the data end up in the right pool, I set attributes through cephfs-shell:

setxattr /dir_a ceph.dir.layout.pool cephfs_a_data
setxattr /dir_b ceph.dir.layout.pool cephfs_b_data

Through trial and error, I found out that the following client caps work with this setup:

client.a
        key:
        caps: [mds] allow rw fsname=cephfs path=/dir_a
        caps: [mon] allow r fsname=cephfs
        caps: [osd] allow rwx pool=cephfs_a_data
client.b
        key:
        caps: [mds] allow rw fsname=cephfs path=/dir_b
        caps: [mon] allow r fsname=cephfs
        caps: [osd] allow rwx pool=cephfs_b_data

With only rw on the OSDs, I was not able to write in the mounted dirs.

Now the question: Since I established this setup more or less through trial and error, I was wondering if there is a more elegant/better approach than what is outlined above?

Thank you for your help!

Best regards,
Jonas
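For reference, the setup described above condensed into a sketch (pool, client, and directory names as in the mail; the /mnt/cephfs mount point is an assumption, not from the original post):

    # create and attach the per-directory data pools
    ceph osd pool create cephfs_a_data
    ceph osd pool create cephfs_b_data
    ceph fs add_data_pool cephfs cephfs_a_data
    ceph fs add_data_pool cephfs cephfs_b_data

    # pin each directory to its pool via a file layout; setfattr on a mounted
    # client is equivalent to cephfs-shell's setxattr
    setfattr -n ceph.dir.layout.pool -v cephfs_a_data /mnt/cephfs/dir_a
    setfattr -n ceph.dir.layout.pool -v cephfs_b_data /mnt/cephfs/dir_b

    # path-restricted caps with per-pool OSD access, matching what worked above
    ceph auth get-or-create client.a \
        mds 'allow rw fsname=cephfs path=/dir_a' \
        mon 'allow r fsname=cephfs' \
        osd 'allow rwx pool=cephfs_a_data'
    ceph auth get-or-create client.b \
        mds 'allow rw fsname=cephfs path=/dir_b' \
        mon 'allow r fsname=cephfs' \
        osd 'allow rwx pool=cephfs_b_data'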
[ceph-users] Re: CephFS: Isolating folders for different users
Thank you very much! Works like a charm, except for one thing: I gave my clients the MDS caps 'allow rws path=' to also be able to create snapshots from the client, but `mkdir .snap/test` still returns

mkdir: cannot create directory ‘.snap/test’: Operation not permitted

Do you have an idea what might be the issue here?

Best regards,
Jonas

PS: A happy new year to everyone!

On 23.12.22 10:05, Kai Stian Olstad wrote:
> On 22.12.2022 15:47, Jonas Schwab wrote:
>> Now the question: Since I established this setup more or less through trial and error, I was wondering if there is a more elegant/better approach than what is outlined above?
>
> You can use namespaces so you don't need separate pools. Unfortunately the documentation is sparse on the subject. I use it with subvolumes like this:
>
> # Create a subvolume
> ceph fs subvolume create <volume name> <subvolume name> --pool_layout <data pool> --namespace-isolated
>
> The subvolume is created with the namespace fsvolumens_<subvolume name>. You can also find the name with
> ceph fs subvolume info <volume name> <subvolume name> | jq -r .pool_namespace
>
> # Create a user with access to the subvolume and the namespace
> ## First find the path to the subvolume
> ceph fs subvolume getpath <volume name> <subvolume name>
>
> ## Create the user
> ceph auth get-or-create client.<name> mon 'allow r' osd 'allow rw pool=<data pool> namespace=fsvolumens_<subvolume name>'
>
> I have found this by looking at how Openstack does it and some trial and error.
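An end-to-end sketch of the namespace approach with the placeholders filled in (hypothetical names: filesystem cephfs, data pool cephfs_data, subvolume subvol_a, client a; the mds cap is an addition here so the client can actually mount the returned path):

    # create an isolated subvolume and look up its RADOS namespace and path
    ceph fs subvolume create cephfs subvol_a --namespace-isolated
    ceph fs subvolume info cephfs subvol_a | jq -r .pool_namespace   # -> fsvolumens_subvol_a
    ceph fs subvolume getpath cephfs subvol_a                        # -> /volumes/_nogroup/subvol_a/<uuid>

    # client restricted to that path and namespace (no separate pool needed)
    SUBVOL_PATH=$(ceph fs subvolume getpath cephfs subvol_a)
    ceph auth get-or-create client.a \
        mds "allow rw fsname=cephfs path=$SUBVOL_PATH" \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_data namespace=fsvolumens_subvol_a'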
[ceph-users] CephFS: Questions regarding Namespaces, Subvolumes and Mirroring
Dear everyone,

I have several questions regarding CephFS connected to namespaces, subvolumes and snapshot mirroring:

1. How to display/create namespaces used for isolating subvolumes?
I have created multiple subvolumes with the option --namespace-isolated, so I was expecting to see the namespaces returned from ceph fs subvolume info also returned by rbd namespace ls --format=json. But the latter command just returns an empty list. Are the namespaces used for rbd and CephFS different ones?

2. Can CephFS snapshot mirroring also be applied to subvolumes?
I tried this, but without success. Is there something to take into account rather than just mirroring the directory, or is it just not possible right now?

3. Can xattrs for namespaces and pools also be mirrored?
Or more specifically, is there a way to preserve the namespace and pool layout of mirrored directories?

Thank you for your help!

Best regards,
Jonas
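If it helps to cross-check question 1: the namespaces created by --namespace-isolated appear to be RADOS object namespaces inside the CephFS data pool rather than RBD namespaces, so they can be inspected with rados instead of rbd namespace ls. A small sketch, assuming a data pool named cephfs_data and a subvolume subvol_a:

    # the namespace the subvolume was isolated into
    ceph fs subvolume info cephfs subvol_a | jq -r .pool_namespace   # e.g. fsvolumens_subvol_a

    # RADOS-level view of the CephFS data pool
    rados -p cephfs_data ls --all                       # prints "<namespace> <object>" per object
    rados -p cephfs_data ls -N fsvolumens_subvol_a      # only objects in that namespace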
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
No, didn't issue any commands to the OSDs.

On 2025-04-10 17:28, Eugen Block wrote:
> Did you stop the OSDs?
>
> Quoting Jonas Schwab:
>> Thank you very much! I now started the first step, namely "Collect the map from each OSD host". As I have a cephadm deployment, I will have to execute ceph-objectstore-tool within each container. Unfortunately, this produces the error "Mount failed with '(11) Resource temporarily unavailable'". Does anybody know how to solve this?
>>
>> Best regards,
>> Jonas
>>
>> On 2025-04-10 16:04, Robert Sander wrote:
>>> Hi Jonas,
>>>
>>> On 4/10/25 at 16:01, Jonas Schwab wrote:
>>>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup.
>>>
>>> There is a procedure to recover the MON-DB from the OSDs:
>>> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
>>>
>>> Regards

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
Thank you for the help! Does that mean stopping the container and mounting the LV?

On 2025-04-10 17:38, Eugen Block wrote:
> You have to stop the OSDs in order to mount them with the objectstore tool.
>
> Quoting Jonas Schwab:
>> No, didn't issue any commands to the OSDs.
>>
>> On 2025-04-10 17:28, Eugen Block wrote:
>>> Did you stop the OSDs?
>>>
>>> Quoting Jonas Schwab:
>>>> Thank you very much! I now started the first step, namely "Collect the map from each OSD host". As I have a cephadm deployment, I will have to execute ceph-objectstore-tool within each container. Unfortunately, this produces the error "Mount failed with '(11) Resource temporarily unavailable'". Does anybody know how to solve this?
>>>>
>>>> Best regards,
>>>> Jonas
>>>>
>>>> On 2025-04-10 16:04, Robert Sander wrote:
>>>>> Hi Jonas,
>>>>>
>>>>> On 4/10/25 at 16:01, Jonas Schwab wrote:
>>>>>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup.
>>>>>
>>>>> There is a procedure to recover the MON-DB from the OSDs:
>>>>> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
>>>>>
>>>>> Regards

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
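A sketch of what "stop the OSD and run the objectstore tool" looks like on a cephadm host, following the recovery-using-osds procedure linked in the thread (fsid and OSD id are examples):

    # stop just this OSD (the systemd unit name includes the cluster fsid)
    systemctl stop ceph-<fsid>@osd.4.service      # or: cephadm unit --name osd.4 stop

    # enter a container with the OSD's data directory mounted at the usual path
    cephadm shell --name osd.4

    # inside the shell: extract/update the mon store from this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 \
        --op update-mon-db --mon-store-path /tmp/mon-store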
[ceph-users] Urgent help: I accidentally nuked all my Monitor
Hello everyone,

I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup.

I am very grateful for all help!

Best regards,
Jonas Schwab
[ceph-users] Re: OSDs ignore memory limit
Yes, it's the ceph-osd processes filling up the RAM.

On 2025-04-09 15:13, Eugen Block wrote:
> I noticed the quite high reported memory stats for OSDs as well on a recently upgraded customer cluster, now running 18.2.4. But checking the top output etc. doesn't confirm those values. I don't really know where they come from, tbh. Can you confirm that those are actually OSD processes filling up the RAM?
>
> Quoting Jonas Schwab:
>> Hello everyone,
>>
>> I recently have many problems with OSDs using much more memory than they are supposed to (> 10 GB), leading to the node running out of memory and killing processes. Does someone have ideas why the daemons seem to completely ignore the set memory limits? See e.g. the following:
>>
>> $ ceph orch ps ceph2-03
>> NAME                    HOST      PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
>> mon.ceph2-03            ceph2-03          running (3h)    1s ago     2y      501M    2048M  19.2.1   f2efb0401a30  d876fc30f741
>> node-exporter.ceph2-03  ceph2-03  *:9100  running (3h)    1s ago     17M    46.5M        -  1.7.0    72c9c2088986  d32ec4d266ea
>> osd.4                   ceph2-03          running (26m)   1s ago     2y     10.2G    3310M  19.2.1   f2efb0401a30  b712a86dacb2
>> osd.11                  ceph2-03          running (5m)    1s ago     2y     3458M    3310M  19.2.1   f2efb0401a30  f3d7705325b4
>> osd.13                  ceph2-03          running (3h)    1s ago     6d     2059M    3310M  19.2.1   f2efb0401a30  980ee7e11252
>> osd.17                  ceph2-03          running (114s)  1s ago     2y     3431M    3310M  19.2.1   f2efb0401a30  be7319fda00b
>> osd.23                  ceph2-03          running (30m)   1s ago     2y     10.4G    3310M  19.2.1   f2efb0401a30  9cfb86c4b34a
>> osd.29                  ceph2-03          running (8m)    1s ago     2y     4923M    3310M  19.2.1   f2efb0401a30  d764930bb557
>> osd.35                  ceph2-03          running (14m)   1s ago     2y     7029M    3310M  19.2.1   f2efb0401a30  6a4113adca65
>> osd.59                  ceph2-03          running (2m)    1s ago     2y     2821M    3310M  19.2.1   f2efb0401a30  8871d6d4f50a
>> osd.61                  ceph2-03          running (49s)   1s ago     2y     1090M    3310M  19.2.1   f2efb0401a30  3f7a0ed17ac2
>> osd.67                  ceph2-03          running (7m)    1s ago     2y     4541M    3310M  19.2.1   f2efb0401a30  eea0a6bcefec
>> osd.75                  ceph2-03          running (3h)    1s ago     2y     1239M    3310M  19.2.1   f2efb0401a30  5a801902340d
>>
>> Best regards,
>> Jonas

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
[ceph-users] OSDs ignore memory limit
Hello everyone,

I recently have many problems with OSDs using much more memory than they are supposed to (> 10 GB), leading to the node running out of memory and killing processes. Does someone have ideas why the daemons seem to completely ignore the set memory limits? See e.g. the following:

$ ceph orch ps ceph2-03
NAME                    HOST      PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.ceph2-03            ceph2-03          running (3h)    1s ago     2y      501M    2048M  19.2.1   f2efb0401a30  d876fc30f741
node-exporter.ceph2-03  ceph2-03  *:9100  running (3h)    1s ago     17M    46.5M        -  1.7.0    72c9c2088986  d32ec4d266ea
osd.4                   ceph2-03          running (26m)   1s ago     2y     10.2G    3310M  19.2.1   f2efb0401a30  b712a86dacb2
osd.11                  ceph2-03          running (5m)    1s ago     2y     3458M    3310M  19.2.1   f2efb0401a30  f3d7705325b4
osd.13                  ceph2-03          running (3h)    1s ago     6d     2059M    3310M  19.2.1   f2efb0401a30  980ee7e11252
osd.17                  ceph2-03          running (114s)  1s ago     2y     3431M    3310M  19.2.1   f2efb0401a30  be7319fda00b
osd.23                  ceph2-03          running (30m)   1s ago     2y     10.4G    3310M  19.2.1   f2efb0401a30  9cfb86c4b34a
osd.29                  ceph2-03          running (8m)    1s ago     2y     4923M    3310M  19.2.1   f2efb0401a30  d764930bb557
osd.35                  ceph2-03          running (14m)   1s ago     2y     7029M    3310M  19.2.1   f2efb0401a30  6a4113adca65
osd.59                  ceph2-03          running (2m)    1s ago     2y     2821M    3310M  19.2.1   f2efb0401a30  8871d6d4f50a
osd.61                  ceph2-03          running (49s)   1s ago     2y     1090M    3310M  19.2.1   f2efb0401a30  3f7a0ed17ac2
osd.67                  ceph2-03          running (7m)    1s ago     2y     4541M    3310M  19.2.1   f2efb0401a30  eea0a6bcefec
osd.75                  ceph2-03          running (3h)    1s ago     2y     1239M    3310M  19.2.1   f2efb0401a30  5a801902340d

Best regards,
Jonas

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
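A few commands for cross-checking the configured limit against what the OSD itself accounts for (osd.4 and the jq filter are just examples; MEM LIM in ceph orch ps corresponds to osd_memory_target):

    ceph config get osd osd_memory_target              # global default
    ceph config get osd osd_memory_target_autotune     # cephadm autotuning on/off
    ceph config show osd.4 osd_memory_target           # effective value for one OSD
    ceph tell osd.4 dump_mempools | jq .mempool.total  # memory the OSD tracks internally
    ceph tell osd.4 heap stats                         # allocator (tcmalloc) view

Worth keeping in mind that osd_memory_target is a best-effort target rather than a hard limit, so a daemon can exceed it, e.g. under heavy recovery or allocator fragmentation.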
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
I realized I have access to a data directory of a monitor I removed just before the oopsie happened. Can I launch a ceph-mon from that? If I just try to launch ceph-mon, it commits suicide:

2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 not in monmap and have been in a quorum before; must have been removed
2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 commit suicide!
2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 failed to initialize

On 2025-04-10 16:01, Jonas Schwab wrote:
> Hello everyone,
>
> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup.
>
> I am very grateful for all help!
>
> Best regards,
> Jonas Schwab

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
Thank you very much! I now started the first step, namely "Collect the map from each OSD host". As I have a cephadm deployment, I will have to execute ceph-objectstore-tool within each container. Unfortunately, this produces the error "Mount failed with '(11) Resource temporarily unavailable'". Does anybody know how to solve this?

Best regards,
Jonas

On 2025-04-10 16:04, Robert Sander wrote:
> Hi Jonas,
>
> On 4/10/25 at 16:01, Jonas Schwab wrote:
>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup.
>
> There is a procedure to recover the MON-DB from the OSDs:
> https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
>
> Regards
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
Again, thank you very much for your help! The container is not there any more, but I discovered that the "old" mon data still exists. I have the same situation for two mons I removed at the same time:

$ monmaptool --print monmap1
monmaptool: monmap file monmap1
epoch 29
fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
last_changed 2025-04-10T14:16:21.203171+0200
created 2021-02-26T14:02:29.522695+0100
min_mon_release 19 (squid)
election_strategy: 1
0: [v2:10.127.239.2:3300/0,v1:10.127.239.2:6789/0] mon.ceph2-02
1: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
2: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
3: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05

$ monmaptool --print monmap2
monmaptool: monmap file monmap2
epoch 30
fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
last_changed 2025-04-10T14:16:43.216713+0200
created 2021-02-26T14:02:29.522695+0100
min_mon_release 19 (unknown)
election_strategy: 1
0: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
1: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
2: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05

Would it be feasible to move the data from node1 (which still contains node2 as mon) to node2, or would that just result in even more mess?

On 2025-04-10 19:57, Eugen Block wrote:
> It can work, but it might be necessary to modify the monmap first, since it's complaining that it has been removed from it. Are you familiar with the monmaptool (https://docs.ceph.com/en/latest/man/8/monmaptool/)? The procedure is similar to changing a monitor's IP address the "messy way" (https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address-advanced-method). I also wrote a blog post on how to do it with cephadm: https://heiterbiswolkig.blogs.nde.ag/2020/12/18/cephadm-changing-a-monitors-ip-address/
>
> But before changing anything, I'd first inspect what the current status is. You can get the current monmap from within the mon container (is it still there?):
>
> cephadm shell --name mon.<mon-id>
> ceph-monstore-tool /var/lib/ceph/mon/<mon-dir> get monmap -- --out monmap
> monmaptool --print monmap
>
> You can paste the output here, if you want.
>
> Quoting Jonas Schwab:
>> I realized I have access to a data directory of a monitor I removed just before the oopsie happened. Can I launch a ceph-mon from that? If I just try to launch ceph-mon, it commits suicide:
>>
>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 not in monmap and have been in a quorum before; must have been removed
>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 commit suicide!
>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 failed to initialize
>>
>> On 2025-04-10 16:01, Jonas Schwab wrote:
>>> Hello everyone,
>>>
>>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup. I am very grateful for all help!
>>>
>>> Best regards,
>>> Jonas Schwab

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
I edited the monmap to include only rgw2-06 and then followed https://docs.ceph.com/en/squid/rados/operations/add-or-rm-mons/#adding-a-monitor-manual to create a new monitor. Unfortunately, `ceph-mon -i mon.rgw2-06 --public-addr 10.127.239.63 -f` crashed with the traceback seen in the attachment.

On 2025-04-10 20:34, Eugen Block wrote:
> It depends a bit. Which mon do the OSDs still know about? You can check /var/lib/ceph/<fsid>/osd.X/config to retrieve that piece of information. I'd try to revive one of them. Do you still have the mon store.db for all of the mons, or at least one of them? Just to be safe, back up all the store.db directories. Then modify a monmap to contain the one you want to revive by removing the other ones. Back up your monmap files as well. Then inject the modified monmap into the daemon and try starting it.
>
> Quoting Jonas Schwab:
>> Again, thank you very much for your help! The container is not there any more, but I discovered that the "old" mon data still exists. I have the same situation for two mons I removed at the same time:
>>
>> $ monmaptool --print monmap1
>> monmaptool: monmap file monmap1
>> epoch 29
>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>> last_changed 2025-04-10T14:16:21.203171+0200
>> created 2021-02-26T14:02:29.522695+0100
>> min_mon_release 19 (squid)
>> election_strategy: 1
>> 0: [v2:10.127.239.2:3300/0,v1:10.127.239.2:6789/0] mon.ceph2-02
>> 1: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>> 2: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>> 3: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>
>> $ monmaptool --print monmap2
>> monmaptool: monmap file monmap2
>> epoch 30
>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>> last_changed 2025-04-10T14:16:43.216713+0200
>> created 2021-02-26T14:02:29.522695+0100
>> min_mon_release 19 (unknown)
>> election_strategy: 1
>> 0: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>> 1: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>> 2: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>
>> Would it be feasible to move the data from node1 (which still contains node2 as mon) to node2, or would that just result in even more mess?
>>
>> On 2025-04-10 19:57, Eugen Block wrote:
>>> It can work, but it might be necessary to modify the monmap first, since it's complaining that it has been removed from it. Are you familiar with the monmaptool (https://docs.ceph.com/en/latest/man/8/monmaptool/)? The procedure is similar to changing a monitor's IP address the "messy way" (https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address-advanced-method). I also wrote a blog post on how to do it with cephadm: https://heiterbiswolkig.blogs.nde.ag/2020/12/18/cephadm-changing-a-monitors-ip-address/
>>>
>>> But before changing anything, I'd first inspect what the current status is. You can get the current monmap from within the mon container (is it still there?):
>>>
>>> cephadm shell --name mon.<mon-id>
>>> ceph-monstore-tool /var/lib/ceph/mon/<mon-dir> get monmap -- --out monmap
>>> monmaptool --print monmap
>>>
>>> You can paste the output here, if you want.
>>>
>>> Quoting Jonas Schwab:
>>>> I realized I have access to a data directory of a monitor I removed just before the oopsie happened. Can I launch a ceph-mon from that? If I just try to launch ceph-mon, it commits suicide:
>>>>
>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 not in monmap and have been in a quorum before; must have been removed
>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 commit suicide!
>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 failed to initialize
>>>>
>>>> On 2025-04-10 16:01, Jonas Schwab wrote:
>>>>> Hello everyone,
>>>>>
>>>>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup. I am very grateful for all help!
>>>>>
>>>>> Best regards,
>>>>> Jonas Schwab

--
Jonas Schwab
Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
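For reference, the monmap surgery attempted above written out as a sketch (back everything up first; this assumes it is run from inside "cephadm shell --name mon.rgw2-06", where the mon's data directory is mounted at the usual /var/lib/ceph/mon/ceph-rgw2-06 path):

    # keep only rgw2-06 in the map
    monmaptool --rm ceph2-02 --rm rgw2-04 --rm rgw2-05 monmap2
    monmaptool --print monmap2

    # inject the trimmed map into the surviving mon, then start it in the
    # foreground; note that -i takes the bare ID ("rgw2-06"), not "mon.rgw2-06"
    ceph-mon -i rgw2-06 --inject-monmap monmap2
    ceph-mon -i rgw2-06 --public-addr 10.127.239.63 -f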
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
I solved the problem with executing ceph-mon. Among others, -i mon.rgw2-06 was not the correct option, but rather -i rgw2-06.

Unfortunately, that brought the next problem: The cluster now shows "100.000% pgs unknown", which is probably because the monitor data is not completely up to date, but rather in the state it was in before I switched over to other mons. A few minutes or so after that, the cluster crashed and I lost the mons. I guess this outdated cluster map is probably unusable? All services seem to be running fine and there are no network obstructions. Should I instead go with this: https://docs.ceph.com/en/squid/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds ?

I actually already tried the latter option, but ran into the error

rocksdb: [db/db_impl/db_impl_open.cc:2086] DB::Open() failed: IO error: while open a file for lock: /var/lib/ceph/mon/ceph-ceph2-01/store.db/LOCK: Permission denied

even though I double-checked that the permissions and ownership on the replacing store.db are properly set.

On 2025-04-10 22:45, Jonas Schwab wrote:
> I edited the monmap to include only rgw2-06 and then followed https://docs.ceph.com/en/squid/rados/operations/add-or-rm-mons/#adding-a-monitor-manual to create a new monitor. Unfortunately, `ceph-mon -i mon.rgw2-06 --public-addr 10.127.239.63 -f` crashed with the traceback seen in the attachment.
>
> On 2025-04-10 20:34, Eugen Block wrote:
>> It depends a bit. Which mon do the OSDs still know about? You can check /var/lib/ceph/<fsid>/osd.X/config to retrieve that piece of information. I'd try to revive one of them. Do you still have the mon store.db for all of the mons, or at least one of them? Just to be safe, back up all the store.db directories. Then modify a monmap to contain the one you want to revive by removing the other ones. Back up your monmap files as well. Then inject the modified monmap into the daemon and try starting it.
>>
>> Quoting Jonas Schwab:
>>> Again, thank you very much for your help! The container is not there any more, but I discovered that the "old" mon data still exists. I have the same situation for two mons I removed at the same time:
>>>
>>> $ monmaptool --print monmap1
>>> monmaptool: monmap file monmap1
>>> epoch 29
>>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>>> last_changed 2025-04-10T14:16:21.203171+0200
>>> created 2021-02-26T14:02:29.522695+0100
>>> min_mon_release 19 (squid)
>>> election_strategy: 1
>>> 0: [v2:10.127.239.2:3300/0,v1:10.127.239.2:6789/0] mon.ceph2-02
>>> 1: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>>> 2: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>>> 3: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>>
>>> $ monmaptool --print monmap2
>>> monmaptool: monmap file monmap2
>>> epoch 30
>>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>>> last_changed 2025-04-10T14:16:43.216713+0200
>>> created 2021-02-26T14:02:29.522695+0100
>>> min_mon_release 19 (unknown)
>>> election_strategy: 1
>>> 0: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>>> 1: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>>> 2: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>>
>>> Would it be feasible to move the data from node1 (which still contains node2 as mon) to node2, or would that just result in even more mess?
>>>
>>> On 2025-04-10 19:57, Eugen Block wrote:
>>>> It can work, but it might be necessary to modify the monmap first, since it's complaining that it has been removed from it. Are you familiar with the monmaptool (https://docs.ceph.com/en/latest/man/8/monmaptool/)? The procedure is similar to changing a monitor's IP address the "messy way" (https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address-advanced-method). I also wrote a blog post on how to do it with cephadm: https://heiterbiswolkig.blogs.nde.ag/2020/12/18/cephadm-changing-a-monitors-ip-address/
>>>>
>>>> But before changing anything, I'd first inspect what the current status is. You can get the current monmap from within the mon container (is it still there?):
>>>>
>>>> cephadm shell --name mon.<mon-id>
>>>> ceph-monstore-tool /var/lib/ceph/mon/<mon-dir> get monmap -- --out monmap
>>>> monmaptool --print monmap
>>>>
>>>> You can paste the output here, if you want.
>>>>
>>>> Quoting Jonas Schwab:
>>>>> I realized I have access to a data directory of a monitor I removed just before the oopsie happened. Can I launch a ceph-mon from that? If I just try to launch ceph-mon, it commits suicide:
>>>>>
>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 not in monmap and have been in a quorum before; must have been removed
>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 commit suicide!
>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 failed to initialize
>>>>>
>>>>> On 2025-04-10 16:01, Jonas Schwab wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> I believe I accidentally nuked all monitors of my cluster (please don't ask how). Is there a way to recover from this disaster? I have a cephadm setup. I am very grateful for all help!
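On the "Permission denied" for store.db/LOCK: a possible cause in a cephadm deployment is numeric ownership, since the containerized mon runs as the ceph user (UID/GID 167 in the upstream container images) rather than root. A sketch, with <fsid> standing in for the cluster fsid:

    # on the host, the mon's data lives under the cluster fsid
    ls -ln /var/lib/ceph/<fsid>/mon.ceph2-01/store.db | head
    chown -R 167:167 /var/lib/ceph/<fsid>/mon.ceph2-01/store.db

    # or from inside the mon's container shell, where the same directory is
    # mounted at the legacy path
    cephadm shell --name mon.ceph2-01
    chown -R ceph:ceph /var/lib/ceph/mon/ceph-ceph2-01/store.db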
[ceph-users] Re: Urgent help: I accidentally nuked all my Monitor
Yes, mgrs are running as intended. It just seems that mons and OSDs don't recognize each other, because the monitor map is outdated.

On 2025-04-11 07:07, Eugen Block wrote:
> Is at least one mgr running? PG states are reported by the mgr daemon.
>
> Quoting Jonas Schwab:
>> I solved the problem with executing ceph-mon. Among others, -i mon.rgw2-06 was not the correct option, but rather -i rgw2-06.
>>
>> Unfortunately, that brought the next problem: The cluster now shows "100.000% pgs unknown", which is probably because the monitor data is not completely up to date, but rather in the state it was in before I switched over to other mons. A few minutes or so after that, the cluster crashed and I lost the mons. I guess this outdated cluster map is probably unusable? All services seem to be running fine and there are no network obstructions. Should I instead go with this: https://docs.ceph.com/en/squid/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds ?
>>
>> I actually already tried the latter option, but ran into the error `rocksdb: [db/db_impl/db_impl_open.cc:2086] DB::Open() failed: IO error: while open a file for lock: /var/lib/ceph/mon/ceph-ceph2-01/store.db/LOCK: Permission denied` even though I double-checked that the permissions and ownership on the replacing store.db are properly set.
>>
>> On 2025-04-10 22:45, Jonas Schwab wrote:
>>> I edited the monmap to include only rgw2-06 and then followed https://docs.ceph.com/en/squid/rados/operations/add-or-rm-mons/#adding-a-monitor-manual to create a new monitor. Unfortunately, `ceph-mon -i mon.rgw2-06 --public-addr 10.127.239.63 -f` crashed with the traceback seen in the attachment.
>>>
>>> On 2025-04-10 20:34, Eugen Block wrote:
>>>> It depends a bit. Which mon do the OSDs still know about? You can check /var/lib/ceph/<fsid>/osd.X/config to retrieve that piece of information. I'd try to revive one of them. Do you still have the mon store.db for all of the mons, or at least one of them? Just to be safe, back up all the store.db directories. Then modify a monmap to contain the one you want to revive by removing the other ones. Back up your monmap files as well. Then inject the modified monmap into the daemon and try starting it.
>>>>
>>>> Quoting Jonas Schwab:
>>>>> Again, thank you very much for your help! The container is not there any more, but I discovered that the "old" mon data still exists. I have the same situation for two mons I removed at the same time:
>>>>>
>>>>> $ monmaptool --print monmap1
>>>>> monmaptool: monmap file monmap1
>>>>> epoch 29
>>>>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>>>>> last_changed 2025-04-10T14:16:21.203171+0200
>>>>> created 2021-02-26T14:02:29.522695+0100
>>>>> min_mon_release 19 (squid)
>>>>> election_strategy: 1
>>>>> 0: [v2:10.127.239.2:3300/0,v1:10.127.239.2:6789/0] mon.ceph2-02
>>>>> 1: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>>>>> 2: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>>>>> 3: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>>>>
>>>>> $ monmaptool --print monmap2
>>>>> monmaptool: monmap file monmap2
>>>>> epoch 30
>>>>> fsid 6d0d4ed4-0052-4eb9-9d9d-e6872ba7ee96
>>>>> last_changed 2025-04-10T14:16:43.216713+0200
>>>>> created 2021-02-26T14:02:29.522695+0100
>>>>> min_mon_release 19 (unknown)
>>>>> election_strategy: 1
>>>>> 0: [v2:10.127.239.61:3300/0,v1:10.127.239.61:6789/0] mon.rgw2-04
>>>>> 1: [v2:10.127.239.63:3300/0,v1:10.127.239.63:6789/0] mon.rgw2-06
>>>>> 2: [v2:10.127.239.62:3300/0,v1:10.127.239.62:6789/0] mon.rgw2-05
>>>>>
>>>>> Would it be feasible to move the data from node1 (which still contains node2 as mon) to node2, or would that just result in even more mess?
>>>>>
>>>>> On 2025-04-10 19:57, Eugen Block wrote:
>>>>>> It can work, but it might be necessary to modify the monmap first, since it's complaining that it has been removed from it. Are you familiar with the monmaptool (https://docs.ceph.com/en/latest/man/8/monmaptool/)? The procedure is similar to changing a monitor's IP address the "messy way" (https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address-advanced-method). I also wrote a blog post on how to do it with cephadm: https://heiterbiswolkig.blogs.nde.ag/2020/12/18/cephadm-changing-a-monitors-ip-address/
>>>>>>
>>>>>> But before changing anything, I'd first inspect what the current status is. You can get the current monmap from within the mon container (is it still there?):
>>>>>>
>>>>>> cephadm shell --name mon.<mon-id>
>>>>>> ceph-monstore-tool /var/lib/ceph/mon/<mon-dir> get monmap -- --out monmap
>>>>>> monmaptool --print monmap
>>>>>>
>>>>>> You can paste the output here, if you want.
>>>>>>
>>>>>> Quoting Jonas Schwab:
>>>>>>> I realized I have access to a data directory of a monitor I removed just before the oopsie happened. Can I launch a ceph-mon from that? If I just try to launch ceph-mon, it commits suicide:
>>>>>>>
>>>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 not in monmap and have been in a quorum before; must have been removed
>>>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???) e29 commit suicide!
>>>>>>> 2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 failed to initialize