[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-06 Thread Brad Hubbard
On Thu, Nov 7, 2024 at 2:27 PM Brad Hubbard wrote: > > On Wed, Nov 6, 2024 at 2:05 AM Yuri Weinstein wrote: > > > > > powercycle - Brad pls approve > > Approved. The issue in both cases is > https://tracker.ceph.com/issues/68864 which is an issue in building an > executable run as workload for th

[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-06 Thread Brad Hubbard
On Wed, Nov 6, 2024 at 2:05 AM Yuri Weinstein wrote: > > powercycle - Brad pls approve Approved. The issue in both cases is https://tracker.ceph.com/issues/68864 which is an issue in building an executable run as workload for the test, not the test itself. > > ceph-volume - Guillaume pls take a

[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-06 Thread Yuri Weinstein
Hi Nizam https://tracker.ceph.com/issues/68861 (test for https://github.com/ceph/ceph/pull/60634) assigned for your review/approval I will cherry-pick this ^ plus https://github.com/ceph/ceph/pull/60366 https://github.com/ceph/ceph/pull/59142 as soon as you approve. Thx On Wed, Nov 6, 2024 at 6

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Tim Holloway
OK. Here's comprehensive info on ceph and spec files: https://documentation.suse.com/ses/7.1/single-html/ses-deployment/index.html#cephadm-service-and-placement-specs I was remembering correctly. You can use one file to deploy multiple services. In essence, this replaces a lot of the stuff that
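[Editor's note: for reference, a minimal multi-document spec of the kind Tim describes might look like the following; service types, labels and the host pattern here are illustrative, not taken from the thread.]

  # one file, several services, separated by '---'
  service_type: mon
  placement:
    count: 3
  ---
  service_type: mgr
  placement:
    label: mgr
  ---
  service_type: osd
  service_id: default_osds
  placement:
    host_pattern: '*'
  spec:
    data_devices:
      rotational: 1

Applied in one go with `ceph orch apply -i cluster-spec.yaml`.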

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Adam King
Quick comment on the CLI args vs. the spec file. It actually shouldn't allow you to do both for any flags that affect the service. If you run `ceph orch apply -i ` it will only make use of the spec file and should return an error if flags that affect the service like `--unmanaged` or `--p
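[Editor's note: a sketch of the two mutually exclusive invocation styles Adam is referring to; the file and filesystem names are hypothetical.]

  # spec-file style: all service parameters come from the YAML
  ceph orch apply -i mds-spec.yaml

  # CLI style: parameters passed as flags instead
  ceph orch apply mds myfs --placement="3"

  # combining '-i' with service-affecting flags such as --unmanaged or
  # --placement is expected to be rejected, per Adam's note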

[ceph-users] Re: Unable to add OSD

2024-11-06 Thread Adam King
I see you mentioned apparmor and MongoDB, so I guess there's a chance you found https://tracker.ceph.com/issues/66389 already (your traceback also looks the same). Other than making sure the relevant apparmor file it's parsing doesn't contain settings with spaces, or trying to manually apply the fi
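[Editor's note: to check for the condition described in that tracker (profile names containing spaces), something along these lines should surface any offending entries; the awk split on ' (' is an assumption about the usual format of the profiles file.]

  # list apparmor profile names and flag any that contain a space
  sudo awk -F' \\(' '{print $1}' /sys/kernel/security/apparmor/profiles | grep ' '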

[ceph-users] Re: [EXTERNAL] Re: Ceph Multisite Version Compatibility

2024-11-06 Thread Alex Hussein-Kershaw (HE/HIM)
I made some progress understanding this. It seems the RGW is aware that the sync is behind, despite not reporting it on "sync status".

  $ radosgw-admin sync status
    realm d2fa006d-7ced-423f-8510-9ac494c4f4ec (geored_realm)
    zonegroup 583c773c-b7e5-4e7f-a51e-c602237ec9c6 (geored_zg)
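[Editor's note: a couple of commands that sometimes surface lag that plain "sync status" hides; the zone name is a placeholder.]

  radosgw-admin data sync status --source-zone=<other-zone>
  radosgw-admin sync error list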

[ceph-users] Re: Squid 19.2.0 balancer causes restful requests to be lost

2024-11-06 Thread Ernesto Puerta
> We use the restful API for monitoring (using the Ceph for Zabbix Agent 2 > plugin, as Zabbix is the over-arching monitoring platform in the data > centre) Chris, just FYI: the "restful" mgr module was deprecated 4 years ago [1] and will be removed in v20 (Tentacle). [2] Something similar will ha
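[Editor's note: if you do end up migrating off the restful module, the built-in prometheus exporter is a common alternative; a minimal sketch follows. Whether the Zabbix Agent 2 Ceph plugin can scrape it instead is a separate question.]

  ceph mgr module enable prometheus
  # default listen port is 9283; override only if needed
  ceph config set mgr mgr/prometheus/server_port 9283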

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Frédéric Nass
- On 6 Nov 24, at 12:29, Eugen Block ebl...@nde.ag wrote: > Dave, > > I noticed that the advanced osd spec docs are missing a link to > placement-by-pattern-matching docs (thanks to Zac and Adam for picking > that up): > > https://docs.ceph.com/en/latest/cephadm/services/#placement-by-pa

[ceph-users] Re: Unable to add OSD

2024-11-06 Thread Tim Holloway
1. Make sure you have enough RAM on ceph-1 and that "df -h /" shows the system disk is less than 70% full (managed services eat a LOT of disk space!)
2. Check your SELinux audit log to make sure nothing's being blocked.
3. Check your /var/lib/ceph and /var/lib/ceph/16a56cdf-9bb4-11ef-
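[Editor's note: commands covering the checks Tim lists; the fsid directory is a placeholder for your own cluster id.]

  free -h                         # available RAM on ceph-1
  df -h /                         # system disk below ~70% full?
  ausearch -m avc -ts recent      # recent SELinux denials, if auditd is running
  ls -l /var/lib/ceph /var/lib/ceph/<fsid>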

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Tim Holloway
On 11/6/24 11:04, Frédéric Nass wrote: ... You could enumerate all hosts one by one or use a pattern like 'ceph0[1-2]'. You may also use regex patterns depending on the version of Ceph that you're using. Check [1]. Regex patterns should be available in the next minor Quincy release, 17.2.8. [1] htt
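[Editor's note: a spec using the glob form mentioned here would look roughly like this; the service id and device filter are illustrative. The regex variant is described in the placement docs referenced above.]

  service_type: osd
  service_id: osd_ceph01_02
  placement:
    host_pattern: 'ceph0[1-2]'
  spec:
    data_devices:
      rotational: 1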

[ceph-users] Re: Ceph Multisite Version Compatibility

2024-11-06 Thread Eugen Block
Hi Alex, I don't have a real good answer, just wanted to mention that one of our customers had some issues with multi-site when they were on the same major version (Octopus) but not on the same minor version. But it wasn't that the sync didn't work at all, it worked in general. Only from

[ceph-users] Re: Backfill full osds

2024-11-06 Thread Anthony D'Atri
I’ve successfully used a *temporary* relax of the ratios to get out of a sticky situation, but I must qualify that with an admonition to make SURE that you move them back ASAP. Note that backfillfull_ratio is enforced to be lower than full_ratio, so depending on how close to the precipice you s
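[Editor's note: the temporary adjustment Anthony describes is done with the set-*-ratio commands; as he says, move them back as soon as the backfill completes. The values here are examples only.]

  ceph osd dump | grep ratio              # current nearfull/backfillfull/full ratios
  ceph osd set-backfillfull-ratio 0.92    # temporary bump
  # ...let backfill finish, then restore the default
  ceph osd set-backfillfull-ratio 0.90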

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Frédéric Nass
- On 1 Nov 24, at 19:28, Dave Hall kdh...@binghamton.edu wrote: > Tim, > > Actually, the links Eugen shared earlier were sufficient. I ended up with >
> service_type: osd
> service_name: osd
> placement:
>   host_pattern: 'ceph01'
> spec:
>   data_devices:
>     rotational: 1
> db_d
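[Editor's note: one way to sanity-check a spec like this before it touches any disks is the orchestrator's dry-run mode; the file name is hypothetical.]

  ceph orch apply -i osd-spec.yaml --dry-run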

[ceph-users] Pacific: mgr loses osd removal queue

2024-11-06 Thread Eugen Block
Hi, I'm not sure if this has been asked before, or if there's an existing tracker issue already. It's difficult to reproduce it on my lab clusters. I'm testing some new SSD OSDs on a Pacific cluster (16.2.15) and noticed that if we instruct the orchestrator to remove two or three OSDs (is
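[Editor's note: for context, the removal queue Eugen refers to is the one driven by these commands; the OSD ids are placeholders.]

  ceph orch osd rm 12 13        # queue the OSDs for draining and removal
  ceph orch osd rm status       # show what the mgr thinks is still queued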

[ceph-users] Backfill full osds

2024-11-06 Thread Szabo, Istvan (Agoda)
Hi, We have a 60% full cluster where unfortunately the pg autoscaler hasn't been used, and over time it generated gigantic PGs which, when moved from one OSD to another, fill the target OSD and block writes. Tried to add 2 new nodes but before they can be utilised other OSDs get full. Trying now backfill full
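[Editor's note: to gauge how oversized the PGs are and what the autoscaler would do about them, something like the following can help; the pool name is a placeholder, and whether to split PGs while OSDs are this full is a separate judgment call.]

  ceph osd pool autoscale-status
  ceph osd pool set <pool> pg_autoscale_mode on    # or raise pg_num manually in small steps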

[ceph-users] Re: [EXTERNAL] Re: Ceph Multisite Version Compatibility

2024-11-06 Thread Alex Hussein-Kershaw (HE/HIM)
Hi Eugen, Thanks for the suggestions. It has worked for me before. It's certainly possible it's a misconfiguration, however I've reproduced this on upgrade of some long lived systems that have been happily syncing away on Octopus for several years. Definitely keen to understand if I'm missing

[ceph-users] Re: Setting temporary CRUSH "constraint" for planned cross-datacenter downtime

2024-11-06 Thread Frédéric Nass
Hi Niklas, To explain the 33% misplaced objects after you move a host to another DC, one would have to check the current crush rule (ceph osd getcrushmap | crushtool -d -) and which OSDs the PGs are mapped to before and after the move operation (ceph pg dump). Regarding the replicated crush rul
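[Editor's note: the two checks mentioned here, spelled out as commands; the output file names are arbitrary.]

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt    # inspect the rules in plain text
  ceph pg dump pgs_brief                       # PG -> OSD mappings before/after the move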

[ceph-users] Re: OSD refuse to start

2024-11-06 Thread Albert Shih
On 05/11/2024 at 13:59:34-0500, Tim Holloway wrote: Hi, Thanks for your long answer. > That can be a bit sticky. > > First, check to see if you have a /var/log/messages file. The dmesg log > isn't always as complete. Forgot to say: no relevant message, at least to my understanding, just mess
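[Editor's note: on a cephadm-managed host the per-daemon logs are often easier to reach via cephadm/journald than via /var/log/messages; two hedged examples, with the daemon id as a placeholder.]

  cephadm logs --name osd.3     # wraps journalctl for that daemon's systemd unit
  dmesg -T | tail -n 50         # kernel messages with human-readable timestamps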

[ceph-users] Unable to add OSD

2024-11-06 Thread tpDev Tester
Hi, I'm trying to add OSDs to my new cluster (Ubuntu 24.04 + podman). Four devices are listed as available:

  root@ceph-1:~# ceph-volume inventory

  Device Path      Size      Device nodes    rotates  available  Model name
  /dev/nvme0n1     1.82 TB   nvme0n1         False    True
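[Editor's note: given devices showing as available, the usual next step is one of these; the host and device names come from the inventory above, and whether to consume all available devices is a policy choice.]

  ceph orch daemon add osd ceph-1:/dev/nvme0n1    # one specific device
  ceph orch apply osd --all-available-devices     # or let the orchestrator take them all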

[ceph-users] Re: Backfill full osds

2024-11-06 Thread Szabo, Istvan (Agoda)
This is the correct link sorry: https://gist.github.com/Badb0yBadb0y/f29af56ab724603ac5fc385a680c4316 From: Szabo, Istvan (Agoda) Sent: Wednesday, November 6, 2024 10:09 PM To: Eugen Block ; ceph-users@ceph.io Subject: Re: [ceph-users] Re: Backfill full osds Tha

[ceph-users] Re: Backfill full osds

2024-11-06 Thread Szabo, Istvan (Agoda)
Thank you, I've collected some outputs here: https://gist.githubusercontent.com/Badb0yBadb0y/f29af56ab724603ac5fc385a680c4316/raw/95959203701a8cc0c85312a69dc77de25fc347d9/gistfile1.txt From: Eugen Block Sent: Wednesday, November 6, 2024 9:11 PM To: ceph-users@ce

[ceph-users] Re: [External Email] Re: Recreate Destroyed OSD

2024-11-06 Thread Eugen Block
Dave, I noticed that the advanced osd spec docs are missing a link to placement-by-pattern-matching docs (thanks to Zac and Adam for picking that up): https://docs.ceph.com/en/latest/cephadm/services/#placement-by-pattern-matching But according to that, your host_pattern specification shou

[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-06 Thread Nizamudeen A
Hi Yuri, The major item I see in the dashboard where we will need help testing would be this fix, which fixes the teuthology failures in quincy: https://github.com/ceph/ceph/pull/60634 The other one is https://github.com/ceph/ceph/pull/60366, which doesn't n

[ceph-users] Re: Backfill full osds

2024-11-06 Thread Eugen Block
Hi, depending on the actual size of the PGs and OSDs, it could be sufficient to temporarily increase the backfillfull_ratio (default 90%) to 91% or 92%; at 95% the cluster is considered full, so you need to be really careful with those ratios. If you provided more details about the cur
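[Editor's note: to see how close individual OSDs actually are to those thresholds before changing anything:]

  ceph osd df            # per-OSD utilisation and PG counts
  ceph health detail     # lists which OSDs are currently nearfull/backfillfull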