[ceph-users] Re: ceph status not showing correct monitor services
You can add a mon manually to the monmap, but that requires downtime of the mons. Here's an example [1] of how to modify the monmap (including a network change, which you don't need, of course). But that would be my last resort; first I would try to find out why the MON fails to join the quorum. What is that mon.a001s016 logging, and what are the other two logging? Do you have another host where you could place a mon daemon to see if that works?

[1] https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#example-procedure

Quoting "Adiga, Anantha":

# ceph mon stat
e6: 2 mons at {a001s017=[v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0],a001s018=[v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0]}, election epoch 162, leader 0 a001s018, quorum 0,1 a001s018,a001s017

# ceph orch ps | grep mon
mon.a001s016  a001s016  running (3h)   6m ago   3h   527M  2048M  16.2.5  6e73176320aa  39db8cfba7e1
mon.a001s017  a001s017  running (22h)  47s ago  1h   993M  2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018  a001s018  running (5w)   48s ago  2y  1167M  2048M  16.2.5  6e73176320aa  7d2bb6d41f54

# ceph mgr stat
{
    "epoch": 1130365,
    "available": true,
    "active_name": "a001s016.ctmoay",
    "num_standby": 1
}

# ceph orch ps | grep mgr
mgr.a001s016.ctmoay  a001s016  *:8443  running (18M)  109s ago  23M  518M  -  16.2.5  6e73176320aa  169cafcbbb99
mgr.a001s017.bpygfm  a001s017  *:8443  running (19M)  5m ago    23M  501M  -  16.2.5  6e73176320aa  97257195158c
mgr.a001s018.hcxnef  a001s018  *:8443  running (20M)  5m ago    23M  113M  -  16.2.5  6e73176320aa  21ba5896cee2

# ceph orch ls --service_name=mgr --export
service_type: mgr
service_name: mgr
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

# ceph orch ls --service_name=mon --export
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

-----Original Message-----
From: Adiga, Anantha
Sent: Monday, April 1, 2024 6:06 PM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: ceph status not showing correct monitor services

# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

a001s016 is outside quorum, see below

# ceph tell mon.a001s017 mon_status
{
    "name": "a001s017",
    "rank": 1,
    "state": "peon",
    "election_epoch": 162,
    "quorum": [0, 1],
    "quorum_age": 79938,
    "features": {
        "required_con": "2449958747317026820",
        "required_mon": ["kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging"],
        "quorum_con": "4540138297136906239",
        "quorum_mon": ["kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging"]
    },
    "outside_quorum": [],
    "extra_probe_peers": [
        {
            "addrvec": [
                {"type": "v2", "addr": "10.45.128.26:3300", "nonce": 0},
                {"type": "v1", "addr": "10.45.128.26:6789", "nonce": 0}
            ]
        }
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 6,
        "fsid": "604d56db-2fab-45db-a9ea-c418f9a8cca8",
        "modified": "2024-03-31T23:54:18.692983Z",
        "created": "2021-09-30T16:15:12.884602Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "features": {
            "persistent": ["kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging"],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "a001s018",
                "public_addrs": {
                    "addrvec": [
                        {"type": "v2", "addr": "10.45.128.28:3300", "nonce": 0},
                        {"type": "v1", "addr": "10.45.128.28:6789", "nonce": 0
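For reference, the monmap edit described in [1] comes down to a few monmaptool steps. A hedged sketch using the names and addresses from this thread: all mons must be stopped first, the mon store should be backed up, and in a cephadm cluster these commands have to run inside the mon's container against its --mon-data path.

```
# Extract the current monmap from one of the surviving mons:
ceph-mon -i a001s018 --extract-monmap /tmp/monmap

# Inspect it, then add the missing mon (v2/v1 addresses assumed from the
# extra_probe_peers output above):
monmaptool /tmp/monmap --print
monmaptool /tmp/monmap --addv a001s016 [v2:10.45.128.26:3300,v1:10.45.128.26:6789]

# Inject the edited map into each mon, then start them again:
ceph-mon -i a001s018 --inject-monmap /tmp/monmap
```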
[ceph-users] Re: Drained A Single Node Host On Accident
Hi,

without knowing the whole story, to cancel OSD removal you can run this command:

ceph orch osd rm stop <osd_id>

Regards,
Eugen

Quoting "adam.ther":

Hello,

I have a single-node host with a VM as a backup MON, MGR, etc. This has caused all OSDs to be pending as 'deleting'. Can I safely cancel this deletion request?

Regards,
Adam
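A hedged sketch of what that looks like in practice (the OSD id is a placeholder):

```
# Show removals currently scheduled or in progress:
ceph orch osd rm status

# Cancel the scheduled removal of one OSD (id 3 is a placeholder):
ceph orch osd rm stop 3
```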
[ceph-users] Re: Replace block drives of combined NVME+HDD OSDs
Hi,

here's the link to the docs [1] on how to replace OSDs.

ceph orch osd rm <osd_id> --replace --zap [--force]

This should zap both the data drive and the DB LV (yes, its data is useless without the data drive); I'm not sure how it will handle the data drive not being accessible, though. One thing I'm not sure about is how your spec file will be handled. Since the drive letters can change, I recommend using a more generic approach, for example the rotational flags and drive sizes instead of paths. But if the drive letters won't change for the replaced drives, it should work. I also don't expect an impact on the rest of the OSDs (except for backfilling, of course).

Regards,
Eugen

[1] https://docs.ceph.com/en/latest/cephadm/services/osd/#replacing-an-osd

Quoting Zakhar Kirpichenko:

Hi,

Unfortunately, some of our HDDs failed and we need to replace these drives, which are parts of "combined" OSDs (DB/WAL on NVME, block storage on HDD). All OSDs are defined with a service definition similar to this one:

```
service_type: osd
service_id: ceph02_combined_osd
service_name: osd.ceph02_combined_osd
placement:
  hosts:
  - ceph02
spec:
  data_devices:
    paths:
    - /dev/sda
    - /dev/sdb
    - /dev/sdc
    - /dev/sdd
    - /dev/sde
    - /dev/sdf
    - /dev/sdg
    - /dev/sdh
    - /dev/sdi
  db_devices:
    paths:
    - /dev/nvme0n1
    - /dev/nvme1n1
  filter_logic: AND
  objectstore: bluestore
```

In the above example, HDDs `sda` and `sdb` are not readable and data cannot be copied over to new HDDs. NVME partitions of `nvme0n1` with DB/WAL data are intact, but I guess that data is useless. I think the best approach is to replace the dead drives and completely rebuild each affected OSD. How should we go about this, preferably in a way that other OSDs on the node remain unaffected and operational?

I would appreciate any advice or pointers to the relevant documentation.

Best regards,
Zakhar
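As a hedged illustration of the "more generic approach" Eugen suggests, the same service could match devices by properties instead of paths (the size bound is an assumption; adjust it to the actual hardware):

```
service_type: osd
service_id: ceph02_combined_osd
placement:
  hosts:
  - ceph02
spec:
  data_devices:
    rotational: 1        # the HDDs, regardless of /dev/sdX naming
  db_devices:
    rotational: 0
    size: '1TB:'         # assumption: DB/WAL NVMEs are 1 TB or larger
  filter_logic: AND
  objectstore: bluestore
```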
[ceph-users] Re: cephfs inode backtrace information
On 29/03/2024 04:18, Niklas Hambüchen wrote:
> Hi Loïc,
> I'm surprised by that high storage amount; my "default" pool uses only
> ~512 bytes per file, not ~32 KiB like in your pool. That's a 64x
> difference! (See also my other response to the original post I just sent.)
> I'm using Ceph 16.2.1.

Hello,

We actually traced the source of this issue: a configuration mistake (the data pool was not set properly on a client directory). The directories for this client had "a few" large (tens of GiB) files, which were stored in the "default" pool and used up a lot of space.

With this client's data moved where it belongs:

[ceph: root@NODE /]# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
hdd    6.1 PiB  3.8 PiB  2.3 PiB  2.3 PiB   37.15
ssd    52 TiB   49 TiB   3.2 TiB  3.2 TiB   6.04
TOTAL  6.1 PiB  3.9 PiB  2.3 PiB  2.3 PiB   36.89

--- POOLS ---
POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics  2   1     710 MiB  664      2.1 GiB  0      15 TiB
cephfs_EC_data         3   8192  1.7 PiB  606.79M  2.1 PiB  38.13  2.8 PiB
cephfs_metadata        4   128   101 GiB  14.55M   304 GiB  0.64   15 TiB
cephfs_default         5   128   0 B      162.90M  0 B      0      15 TiB
[...]

So the "correct" stored value for the default pool should be 0 bytes.

Loïc.

--
| Loïc Tortay - IN2P3 Computing Centre |
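For context, the per-directory data pool Loïc refers to is controlled by a file-layout extended attribute. A hedged example (the mount point and directory are placeholders; note that existing files keep their old layout until they are rewritten):

```
# Direct new files under the client directory to the EC data pool:
setfattr -n ceph.dir.layout.pool -v cephfs_EC_data /mnt/cephfs/client_dir

# Verify:
getfattr -n ceph.dir.layout.pool /mnt/cephfs/client_dir
```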
[ceph-users] CEPH Quincy installation with multipathd enabled
Greetings community,

We have a setup comprising 6 servers hosting CentOS 8 Minimal Installation with Ceph Quincy version 18.2.2, supported by 20Gbps fiber-optic NICs and dual Intel Xeon processors. We bootstrapped the installation on the first node and then expanded to the others using the cephadm method, with the monitor services deployed on 5 of these nodes as well as 3 manager nodes. Each server has an NVMe boot disk as well as a 1 TB SATA SSD over which the OSDs are deployed. An EC profile was created with k=3 and m=3, serving a CephFS filesystem on top with NFS exports to serve other servers.

Up to this point, the setup was quite stable in the sense that upon emergency reboot or network connection failure the OSDs did not fail and remained functional/started normally after reboot.

At a certain point in our project, we had the need to activate the multipathd service, adding the boot drive partition and the Ceph SSD to its blacklist so they would not be initialized for use by an mpath partition. The blacklist goes like so:

boot blacklist:
===
blacklist {
    wwid "eui."
}

SATA SSD blacklist:
===
blacklist {
    wwid "naa."
}

The above blacklist configuration ensures that both the boot disk and Ceph's OSD function properly, with the following being the lsblk output:

NAME                     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                        8:0    0 894.3G  0 disk
└─ceph---osd--block--    252:3    0 894.3G  0 lvm
nvme0n1                  259:0    0 238.5G  0 disk
├─nvme0n1p1              259:1    0   600M  0 part /boot/efi
├─nvme0n1p2              259:2    0     1G  0 part /boot
└─nvme0n1p3              259:3    0 236.9G  0 part
  ├─centos-root          252:0    0   170G  0 lvm  /
  ├─centos-swap          252:1    0  23.4G  0 lvm  [SWAP]
  ├─centos-var_log_audit 252:2    0   7.5G  0 lvm  /var/log/audit
  ├─centos-home          252:4    0    26G  0 lvm  /home
  └─centos-var_log       252:5    0    10G  0 lvm  /var/log

In addition to the above multipathd configuration, we have use_devicesfile=1 in /etc/lvm/lvm.conf, with the /etc/lvm/devices/system.devices file being like so, with PVID taken from the output of the pvdisplay command and the IDNAME value extracted from the output of "ls -lha /dev/disk/by-id":

VERSION=1.1.1
IDTYPE=sys_wwid IDNAME=eui. DEVNAME=/dev/nvme0n1p3 PVID= PART=3
IDTYPE=sys_wwid IDNAME=naa. DEVNAME=/dev/sda PVID=

Issues started when performing certain tests of the system's integrity, most important of which are emergency shutdowns and reboots of all the nodes. The behavior that follows is that the OSDs are not started automatically and their respective LVM volumes do not show properly (except on a single node, for some reason); hence the lsblk output changes like the snippet below, requiring us to reboot the nodes one by one until all the OSDs are back online:

NAME                     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                        8:0    0 894.3G  0 disk
nvme0n1                  259:0    0 238.5G  0 disk
├─nvme0n1p1              259:1    0   600M  0 part /boot/efi
├─nvme0n1p2              259:2    0     1G  0 part /boot
└─nvme0n1p3              259:3    0 236.9G  0 part
  ├─centos-root          252:0    0   170G  0 lvm  /
  ├─centos-swap
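Not a root-cause fix, but a hedged sketch of bringing the OSDs back without serial reboots when this happens, assuming the ceph LVs are merely inactive after boot and not claimed by multipath (host name is a placeholder):

```
# Check whether the ceph LVs are present but inactive:
lvs -o lv_name,vg_name,lv_active

# Activate all LVM volume groups:
vgchange -ay

# Then have cephadm re-activate the OSDs on the host:
ceph cephadm osd activate server-1
```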
[ceph-users] Pacific 16.2.15 `osd noin`
Hi,

I'm adding a few OSDs to an existing cluster, and the cluster is running with `osd noout,noin`:

  cluster:
    id:     3f50555a-ae2a-11eb-a2fc-ffde44714d86
    health: HEALTH_WARN
            noout,noin flag(s) set

Specifically, `noin` is documented as "prevents booting OSDs from being marked in". But freshly added OSDs were immediately marked `up` and `in`:

  services:
    ...
    osd: 96 osds: 96 up (since 5m), 96 in (since 6m); 338 remapped pgs
         flags noout,noin

# ceph osd tree in | grep -E "osd.11|osd.12|osd.26"
11  hdd  9.38680  osd.11  up  1.0  1.0
12  hdd  9.38680  osd.12  up  1.0  1.0
26  hdd  9.38680  osd.26  up  1.0  1.0

Is this expected behavior? Do I misunderstand the purpose of the `noin` option?

Best regards,
Zakhar
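For context, a hedged note: if memory serves, the `noin` flag only keeps previously-out OSDs from being marked back in, while brand-new OSDs are governed by a separate mon option, which would explain the behavior above. A sketch of the knobs involved (treat the config option as an assumption to verify against your release):

```
# The cluster-wide flag, as set above:
ceph osd set noin

# Per-OSD variants also exist:
ceph osd add-noin osd.11
ceph osd rm-noin osd.11

# Hedged assumption: newly created OSDs are marked "in" according to this
# mon option, independently of the noin flag:
ceph config set mon mon_osd_auto_mark_new_in false
```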
[ceph-users] Re: Replace block drives of combined NVME+HDD OSDs
Thank you, Eugen. It was actually very straightforward. I'm happy to report back that there were no issues with removing and zapping the OSDs whose data devices were unavailable. I had to manually remove stale dm entries, but that was it.

/Z

On Tue, 2 Apr 2024 at 11:00, Eugen Block wrote:
> [...]
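For anyone searching later, a hedged example of what clearing such stale device-mapper entries can look like (the mapping name is a placeholder):

```
# List leftover mappings; zapped OSDs can leave ceph-*-osd--block-* entries behind:
dmsetup ls | grep ceph

# Remove a specific stale mapping (name is a placeholder):
dmsetup remove ceph--<vg_uuid>-osd--block--<lv_uuid>
```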
[ceph-users] Re: Questions about rbd flatten command
Do these RBD volumes have a full feature set? I would think that fast-diff and object-map would speed this up.

> On Apr 2, 2024, at 00:36, Henry lol wrote:
>
> I'm not sure, but it seems that read and write operations are
> performed for all objects in the rbd.
> If so, is there any method to apply QoS to the flatten operation?
>
> On Mon, Apr 1, 2024 at 11:59 PM, Henry lol wrote:
>>
>> Hello,
>>
>> I executed multiple 'rbd flatten' commands simultaneously on a client.
>> The elapsed time of each flatten job increased as the number of jobs
>> increased, and network I/O was nearly full.
>>
>> so, I have two questions.
>> 1. isn't the flatten job running within the ceph cluster? Why is
>> client-side network I/O so high?
>> 2. How can I apply QoS to each flatten job to reduce network I/O?
>>
>> Sincerely,
[ceph-users] Re: Questions about rbd flatten command
Yes, they do. Actually, the read/write ops will be skipped as you said.

Also, is it possible to limit the maximum network throughput per flatten operation or per image? I want to avoid the scenario where the flatten operation fully saturates the network.
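One hedged avenue for the throttling question: flatten is driven by the client through librbd (which is why the client-side network fills up), and librbd's QoS options can be set per image, so they may rein in a flatten as well. This is an assumption, not verified against flatten specifically; pool/image names and the value are placeholders:

```
# Cap throughput for one image to ~100 MiB/s:
rbd config image set mypool/myimage rbd_qos_bps_limit 104857600

# Inspect and remove the limit afterwards:
rbd config image list mypool/myimage | grep qos
rbd config image remove mypool/myimage rbd_qos_bps_limit
```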
[ceph-users] Re: Replace block drives of combined NVME+HDD OSDs
Nice, thanks for the info.

Quoting Zakhar Kirpichenko:
> [...]
[ceph-users] Re: ceph status not showing correct monitor services
Hi Eugen,

Currently there are only three nodes, but I can add a node to the cluster and check it out. I will take a look at the mon logs.

Thank you,
Anantha

-----Original Message-----
From: Eugen Block
Sent: Tuesday, April 2, 2024 12:19 AM
To: Adiga, Anantha
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services
> [...]
"2021-09-30T16:15:12.884602Z", > "min_mon_release": 16, > "min_mon_release_name": "pacific", > "election_strategy": 1, > "disallowed_leaders: ": "", > "stretch_mode": false, > "features": { > "persistent": [ > "kraken", > "luminous", > "mimic", > "osdmap-prune", > "nautilus", > "octopus", >
[ceph-users] "ceph orch daemon add osd" deploys broken OSD
Hi everybody.

I've faced a situation where I cannot redeploy an OSD on a new disk. I need to replace osd.30 because the disk keeps reporting I/O problems. I do `ceph orch daemon osd.30 --replace`.

Then I zap the DB:

```
root@server-2:/# ceph-volume lvm zap /dev/ceph-db/db-88
--> Zapping: /dev/ceph-db/db-88
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-db/db-88 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
 stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0247342 s, 424 MB/s
--> Zapping successful for:
```

And now zap DATA:

```
root@server-2:/# ceph-volume lvm zap /dev/sdn
--> Zapping: /dev/sdn
--> --destroy was not specified, but zapping a whole device will remove the partition table
Running command: /usr/bin/dd if=/dev/zero of=/dev/sdn bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 1.35239 s, 7.8 MB/s
--> Zapping successful for:
```

Okay, now the disk is ready and the orchestrator confirms it:

```
root@server-1:~# ceph orch device ls host server-2 --refresh
server-2  /dev/sdn  hdd  ST18000NM008J_5000c500d80398bf  16.3T  Yes  4m ago
```

Now it's time for the orchestrator to add the new OSD:

```
root@server-1:~# ceph orch daemon add osd server-2:data_devices=/dev/sdn,db_devices=/dev/ceph-db/db-88
Created no osd(s) on host server-2; already created?
```

But this leaves osd.30 in state down. If I try to run the systemd service manually, it cannot start because:

```
Apr 02 12:30:41 server-2 systemd[1]: Started Ceph osd.30 for ea98e312-dfd9-11ee-a226-33f018c3a407.
Apr 02 12:30:41 server-2 bash[3316003]: /bin/bash: /var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/osd.30/unit.run: No such file or directory
Apr 02 12:30:41 server-2 systemd[1]: ceph-ea98e312-dfd9-11ee-a226-33f018c3a407@osd.30.service: Main process exited, code=exited, status=127/n/a
Apr 02 12:30:41 server-2 bash[3316014]: /bin/bash: /var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/osd.30/unit.poststop: No such file or directory
Apr 02 12:30:41 server-2 systemd[1]: ceph-ea98e312-dfd9-11ee-a226-33f018c3a407@osd.30.service: Failed with result 'exit-code'.
Apr 02 12:30:51 server-2 systemd[1]: ceph-ea98e312-dfd9-11ee-a226-33f018c3a407@osd.30.service: Scheduled restart job, restart counter is at 1.
Apr 02 12:30:51 server-2 systemd[1]: Stopped Ceph osd.30 for ea98e312-dfd9-11ee-a226-33f018c3a407.
```

And even if I try to redeploy osd.30 with `ceph orch osd redeploy osd.30`, I get this error in `ceph -W cephadm`:

```
2024-04-02T12:41:39.856767+ mgr.server-2.opelxj (mgr.2994187) 5453 : cephadm [INF] Reconfiguring daemon osd.30 on server-2
2024-04-02T12:41:41.048352+ mgr.server-2.opelxj (mgr.2994187) 5454 : cephadm [ERR] cephadm exited with an error code: 1, stderr: Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-ea98e312-dfd9-11ee-a226-33f018c3a407-osd-30
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error response from daemon: No such container: ceph-ea98e312-dfd9-11ee-a226-33f018c3a407-osd-30
Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-ea98e312-dfd9-11ee-a226-33f018c3a407-osd.30
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error response from daemon: No such container: ceph-ea98e312-dfd9-11ee-a226-33f018c3a407-osd.30
Reconfig daemon osd.30 ...
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 10700, in
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 10688, in main
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 6620, in command_deploy_from
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 6638, in _common_deploy
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line , in _dispatch_deploy
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 3792, in deploy_daemon
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/cephadm.8c89112927b45a1984d03fb02785df709234bdb856619c217e1ad5d54aebef2b/__main__.py", line 3078, in create_daemon_dirs
  File "/usr/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/var/lib/ceph/ea98e312-dfd9-11ee-a226-33f018c3a407/ceph
```
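A hedged sketch of one possible cleanup path for this kind of half-deployed daemon, where osd.30 never got its unit files created (names come from the log above; this is not a verified fix):

```
# Check whether osd.30 still exists as a "destroyed" placeholder from the --replace:
ceph osd tree | grep -w osd.30

# Remove the stale daemon record whose unit.run was never written:
ceph orch daemon rm osd.30 --force

# Retry the deployment afterwards:
ceph orch daemon add osd server-2:data_devices=/dev/sdn,db_devices=/dev/ceph-db/db-88
```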
[ceph-users] Multi-MDS
Hello,

I did the configuration to activate multi-MDS in Ceph. The parameters I entered looked like this: 3 active, 1 standby.

I also placed the distributed pinning configuration at the root of the mounted dir of the storage:

setfattr -n ceph.dir.pin.distributed -v 1 /

This configuration is working well, but the balance between the MDS ranks is not OK. Look:

RANK  STATE   MDS                          ACTIVITY       DNS     INOS    DIRS   CAPS
0     active  lovelace.ceph05-ceph.bpqxla  Reqs: 91 /s    1396k   1078k   179k   176k
1     active  lovelace.ceph01-ceph.rncaqh  Reqs: 18 /s    862k    571k    110k   292k
2     active  lovelace.ceph02-ceph.yarywe  Reqs: 1155 /s  12804k  12830k  1251k  11672k

Is there any extra configuration to improve this balance that I haven't done?

Thanks,
Rafael.
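For reference, a hedged sketch of the pinning knobs involved (paths are placeholders): ceph.dir.pin.distributed spreads the immediate children of a directory across ranks, while an explicit pin forces a whole subtree onto one rank, which can help when a single hot tree overloads one MDS.

```
# Distributed (ephemeral) pinning, as already applied at the root:
setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs

# Explicitly pin a hot subtree to rank 1 instead:
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/hot_dir

# Remove an explicit pin:
setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/hot_dir
```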
[ceph-users] Re: cephadm: daemon osd.x on yyy is in error state
Probably `ceph mgr fail` will help.
[ceph-users] cephadm shell version not consistent across monitors
Hi,

We are still running Ceph Pacific with cephadm and we have run into a peculiar issue. When we run the `cephadm shell` command on monitor1, the container we get runs Ceph 16.2.9. However, when we run the same command on monitor2, the container runs 16.2.15, which is the current version of the cluster. Why does it do that, and is there a way to force it to 16.2.15 on monitor1?

Please note that both monitors have the same configuration. Cephadm has been pulled from GitHub for both monitors instead of using the package manager's version.

--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.
[ceph-users] Re: cephadm shell version not consistent across monitors
From what I can see with the most recent cephadm binary on pacific, unless you have the CEPHADM_IMAGE env variable set, it does a `podman images --filter label=ceph=True --filter dangling=false` (or docker) and takes the first image in the list, which seems to be sorted by creation time by default. If you want to guarantee what you get, you can run `cephadm --image <image> shell` and it will try to use the image specified. You could also try that env variable (although I haven't tried it in a very long time, if I'm honest, so hopefully it works correctly). If nothing else, just looking at the output of that podman command and removing the images that appear before the 16.2.15 one in the list should work.

On Tue, Apr 2, 2024 at 5:03 PM J-P Methot wrote:
> [...]
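A hedged sketch of the checks Adam describes (the image name is an example):

```
# See which local images cephadm would pick from (first hit wins):
podman images --filter label=ceph=True --filter dangling=false

# Force a specific image for the shell:
cephadm --image quay.io/ceph/ceph:v16.2.15 shell

# Or via the environment variable:
CEPHADM_IMAGE=quay.io/ceph/ceph:v16.2.15 cephadm shell
```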
[ceph-users] Re: Pacific Bug?
https://tracker.ceph.com/issues/64428 should be it. Backports are done for quincy, reef, and squid, and the patch will be present in the next release for each of those versions. There isn't a pacific backport as, afaik, there are no more pacific releases planned.

On Fri, Mar 29, 2024 at 6:03 PM Alex wrote:
> Hi again Adam :-)
>
> Would you happen to have the Bug Tracker issue for the label bug?
>
> Thanks.
[ceph-users] Re: Failed adding back a node
Hi Adam,

Re-deploying didn't work, but `ceph config dump` showed that one of the container_image entries specified 16.2.10-160. After we removed that setting, it instantly redeployed the OSDs. Thanks again for your help.
[ceph-users] Re: Are we logging IRC channels?
I'll start working on the needed configurations and let you know.

On Sat, Mar 23, 2024, 12:09 PM Anthony D'Atri wrote:
> I fear this will raise controversy, but in 2024 what's the value in
> perpetuating an interface from early-1980s BITNET batch operating systems?
>
> On Mar 23, 2024, at 5:45 AM, Janne Johansson wrote:
>
>>> Sure! I think Wido just did it all unofficially, but afaik we've lost
>>> all of those records now. I don't know if Wido still reads the mailing
>>> list but he might be able to chime in. There was a ton of knowledge in
>>> the irc channel back in the day. With slack, it feels like a lot of
>>> discussions have migrated into different channels, though #ceph still
>>> gets some community traffic (and a lot of hardware design discussion).
>>
>> It's also a bit cumbersome to be on IRC when someone pastes 655 lines
>> of text on slack, then edits a whitespace or comma that ended up wrong
>> and we get a total dump of 655 lines again from the gateway.
>>
>> --
>> May the most significant bit of your life be positive.
[ceph-users] Re: ceph status not showing correct monitor services
Hi Eugen,

Noticed this in the config dump: why is only "mon.a001s016" listed? And this is the one that is not listed in "ceph -s".

mon           advanced  auth_allow_insecure_global_id_reclaim   false
mon           advanced  auth_expose_insecure_global_id_reclaim  false
mon           advanced  mon_compact_on_start                    true
mon.a001s016  basic     container_image  docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586  *
mgr           advanced  mgr/cephadm/container_image_base           docker.io/ceph/daemon
mgr           advanced  mgr/cephadm/container_image_node_exporter  docker.io/prom/node-exporter:v0.17.0

  cluster:
    id:     604d56db-2fab-45db-a9ea-c418f9a8cca8
    health: HEALTH_OK

  services:
    mon: 2 daemons, quorum a001s018,a001s017 (age 45h)
    mgr: a001s016.ctmoay(active, since 28h), standbys: a001s017.bpygfm
    mds: 1/1 daemons up, 2 standby
    osd: 36 osds: 36 up (since 29h), 36 in (since 2y)
    rgw: 3 daemons active (3 hosts, 1 zones)

var lib mon unit.image:

a001s016:
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s016/unit.image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586

a001s017:
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s017/unit.image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586

a001s018:
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s018/unit.image
docker.io/ceph/daemon:latest-pacific

ceph image tag, digest from docker inspect of: ceph/daemon latest-pacific 6e73176320aa 2 years ago 1.27GB
==
a001s016:
"Id": "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": ["ceph/daemon:latest-pacific"],
"RepoDigests": ["ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"]

a001s017:
"Id": "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": ["ceph/daemon:latest-pacific"],
"RepoDigests": ["ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"]

a001s018:
"Id": "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": ["ceph/daemon:latest-pacific"],
"RepoDigests": ["ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"]

-----Original Message-----
From: Adiga, Anantha
Sent: Tuesday, April 2, 2024 10:42 AM
To: Eugen Block
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: ceph status not showing correct monitor services
> [...]
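A hedged sketch of how that stray per-daemon image pin could be inspected and cleared (whether it explains the quorum problem is an open question, so treat this as a diagnostic direction rather than a fix):

```
# Confirm which builds the daemons actually run:
ceph versions

# Drop the per-daemon container_image override noticed above:
ceph config rm mon.a001s016 container_image

# Redeploy the mon so it picks up the default image:
ceph orch daemon redeploy mon.a001s016
```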
[ceph-users] Re: RGW Data Loss Bug in Octopus 15.2.0 through 15.2.6
Jonas Nemeiksis wrote:
> Hello,
>
> Maybe your issue is related to this: https://tracker.ceph.com/issues/63642
>
> On Wed, Mar 27, 2024 at 7:31 PM xu chenhui wrote:
>> Hi, Eric Ivancich
>> I have a similar problem in ceph version 16.2.5. Has this problem been
>> completely resolved in the Pacific version?
>> Our bucket has no lifecycle rules and no copy operation. This is a very
>> serious data loss issue for us and it happens occasionally in our
>> environment.
>>
>> Detailed description:
>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/XQRUOEPZ7Y…
>>
>> thanks.

Hi,

My problem is different from https://tracker.ceph.com/issues/63642. All of the multipart and shadow objects were lost and only the head object remains in our environment. Maybe this is a new issue that happens with low probability; I haven't reproduced it yet. Is there any other information that can help locate the root cause or reduce data loss?

thanks.
[ceph-users] RBD image metric
Hi,

I'm trying to pull some metrics out of Ceph about RBD image sizes, but I haven't found anything beyond pool-related metrics. Is there any per-image metric, or do I need to collect this myself with some third-party tool?

Thank you
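A few hedged starting points from the CLI (pool and image names are placeholders); per-image figures generally have to be pulled this way, or via a custom exporter, rather than from the default pool metrics:

```
# Provisioned vs. actually used space, per image (fast when the
# object-map/fast-diff features are enabled):
rbd du mypool
rbd du mypool/myimage

# Live per-image IOPS/throughput via the mgr rbd_support module:
rbd perf image iostat mypool
```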
[ceph-users] Re: rgw s3 bucket policies limitations (on users)
Hey Garcetto,

On 29.03.24 4:13 PM, garcetto wrote:
> I am trying to set bucket policies to allow different users to access the
> same bucket with different permissions, but it seems that is not yet
> supported, am I wrong?
> https://docs.ceph.com/en/reef/radosgw/bucketpolicy/#limitations
> "We do not yet support setting policies on users, groups, or roles."

Maybe check out my previous, somewhat similar question:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/S2TV7GVFJTWPYA6NVRXDL2JXYUIQGMIN/

And PR https://github.com/ceph/ceph/pull/44434 could also be of interest. I would love for RGW to support more detailed bucket policies, especially with external / Keystone authentication.

Regards,
Christian
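Worth noting (hedged): the limitation quoted above is about attaching policies to users, groups, or roles; a policy attached to the bucket itself can still name individual RGW users as principals, which covers the "different users, different permissions on one bucket" case. A sketch, with user IDs and the bucket name as placeholders:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/alice"]},
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
    },
    {
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/bob"]},
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
    }
  ]
}
```

Applied with, e.g., `s3cmd setpolicy policy.json s3://mybucket`.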