[ceph-users] Re: Ceph symbols for v15_2_0 in pacific libceph-common

2025-01-16 Thread Bill Scales
Hi, Nothing to worry about here; you are using the correct symbols. The v15_2_0 in symbols like ceph::buffer::v15_2_0::ptr::copy_out is an API version, not the code version. There have not been any API changes to ceph::buffer for several years, so it still has v15_2_0 even in the latest squid re

[ceph-users] Re: MDS hung in purge_stale_snap_data after populating cache

2025-01-16 Thread Frank Schilder
The MDS was up overnight and it started showing CPU load again. I added a screenshot to the imgur post (https://imgur.com/a/mds-hung-purge-stale-snap-data-after-populating-cache-RF7ExSP). Unfortunately, it's only the messenger threads. The MDS seems to idle around. Best regards,

[ceph-users] Re: Many misplaced PG's, full OSD's and a good amount of manual intervention to keep my Ceph cluster alive.

2025-01-16 Thread Janne Johansson
On Thu, 16 Jan 2025 at 00:08, Bruno Gomes Pessanha <bruno.pessa...@gmail.com> wrote: > Hi everyone. Yes. All the tips definitely helped! Now I have more free space in the pools, the number of misplaced PGs decreased a lot, and the std deviation of OSD usage is lower. The storage looks way

[ceph-users] Re: More objects misplaced than exist?

2025-01-16 Thread Andre Tann
Hi Anthony, replying to the list as well... On 16.01.25 at 15:52, Anthony D'Atri wrote: > When I see anomalous status my first thought is to manually fail over the mgr. I stopped the active mgr and another took over, but the status is still the same: data: volumes: 1/1 healthy pools: 4

[ceph-users] Re: Modify or override ceph_default_alerts.yml

2025-01-16 Thread Redouane Kachach
Hi Eugen, Not sure if that will work or not (I didn't try it myself) but there's an option to configure the ceph alerts path in cephadm: Option( 'prometheus_alerts_path', type='str', default='/etc/prometheus/ceph/ceph_default_alerts.yml',

[ceph-users] Re: MDS hung in purge_stale_snap_data after populating cache

2025-01-16 Thread Frank Schilder
I think I finally found the moment where everything goes downhill. Please take a look at this comment: https://tracker.ceph.com/issues/69547?next_issue_id=69546#note-4 . This looks a lot like a timeout, but I have no clue what to look for. Any hint is greatly appreciated. Thanks and best regar

[ceph-users] Re: [EXTERNAL] Re: Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Alex Hussein-Kershaw (HE/HIM)
Oh, actually I spoke too soon. That does work, but it also exposes HTTP on port 80. 🙁 beast port=80 ssl_port=7480 ssl_certificate=/etc/ssl/certs/server.crt ssl_private_key=/etc/ssl/private/server.key From: Alex Hussein-Kershaw (HE/HIM) Sent: Thur
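One way to see the problem in the frontend string quoted above is to split it into its key=value options: with the beast frontend, `port` is the plain HTTP listener and `ssl_port` is the TLS listener, so both are active here. The following is only a minimal Python sketch; `parse_rgw_frontends` is a hypothetical helper written for illustration and is not part of Ceph or cephadm.

```python
def parse_rgw_frontends(value: str) -> tuple[str, dict[str, str]]:
    """Split a frontend string into its name and a dict of key=value options.

    Hypothetical helper for illustration only; not a Ceph or cephadm API.
    """
    name, *pairs = value.split()
    options = {}
    for pair in pairs:
        key, _, val = pair.partition("=")
        options[key] = val
    return name, options


# Frontend string exactly as quoted in the message above.
frontends = (
    "beast port=80 ssl_port=7480 "
    "ssl_certificate=/etc/ssl/certs/server.crt "
    "ssl_private_key=/etc/ssl/private/server.key"
)

name, options = parse_rgw_frontends(frontends)
plain_http_port = options.get("port")  # plain HTTP listener, if configured
tls_port = options.get("ssl_port")     # TLS listener, if configured

if plain_http_port is not None:
    print(f"{name}: plain HTTP is still exposed on port {plain_http_port}")
if tls_port is not None:
    print(f"{name}: TLS is served on port {tls_port}")
```

A check like this only inspects the string; it does not say which spec field produced the extra `port=80`.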

[ceph-users] Re: Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Alex Hussein-Kershaw (HE/HIM)
I had a look at the code and came to the conclusion that this isn't possible currently, but I think I can make a small code change to support this. I've raised a tracker: Bug #69567: Cephadm: Specifying RGW Certs & Keys By Filepath - Orchestrator - Ceph, an

[ceph-users] Re: Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Redouane Kachach
You are getting the double option because of "ssl: true"; try disabling ssl, since you are passing the arguments and certificates by hand! Another option is to have cephadm generate the certificates for you by setting the `generate_cert` field in the spec to true. But I'm not sure if that works fo

[ceph-users] Re: [EXTERNAL] Re: Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Alex Hussein-Kershaw (HE/HIM)
Amazing. How did I miss that? Dropping "ssl: true" and adding "ssl_port=1234" to the rgw_frontend_extra_args values has me sorted. Many thanks! From: Redouane Kachach Sent: Thursday, January 16, 2025 4:39 PM To: Alex Hussein-Kershaw (HE/HIM) Cc: ceph-users Subj

[ceph-users] Re: Modify or override ceph_default_alerts.yml

2025-01-16 Thread Eugen Block
Hi Redo, I've been looking into the templates and have a question. Maybe you could help clarify. I understand that I can create custom alerts and inject them with: ceph config-key set mgr/cephadm/services/prometheus/alerting/custom_alerts.yml -i custom_alerts.yml It works when I want

[ceph-users] More objects misplaced than exist?

2025-01-16 Thread Andre Tann
Hi all, # ceph -w ... volumes: 1/1 healthy pools: 4 pools, 2081 pgs objects: 3.72M objects, 12 TiB usage: 36 TiB used, 226 TiB / 262 TiB avail pgs: 18590764/11154459 objects misplaced (166.667%) 2081 active+clean+remapped How can more objects be misplaced
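The percentage in the status output above is just the ratio of the two counters on the `pgs:` line. The short Python sketch below reproduces that arithmetic from the quoted numbers only. The factor of three copies per object is an assumption made for illustration (the post itself does not state it), and the sketch does not explain why the misplaced count exceeds the total.

```python
# Counters copied from the quoted "objects misplaced" line.
misplaced = 18_590_764
total = 11_154_459

# Percentage as printed in the status output.
pct = misplaced / total * 100
print(f"{pct:.3f}%")

# Assumption for illustration: the total counts object copies, three per object.
copies_assumed = 3
objects, remainder = divmod(total, copies_assumed)
print(f"{objects:,} objects if each has {copies_assumed} copies (remainder {remainder})")
```

Under that assumption the total divides evenly into about 3.72 million objects, which matches the `objects: 3.72M` line, so the denominator appears to count copies while the numerator is larger than that count.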

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-16 Thread Thomas Byrne - STFC UKRI
Hi Frédéric, We've had an internal discussion, and we would love to share our experience as a case study. If you still think this would be of interest, please let us know what we need to do. We have had 5 monitors on this cluster from about 2018 I think. I actually did a quick investigation in

[ceph-users] Re: issue with new AWS cli when upload: MissingContentLength

2025-01-16 Thread Christian Rohmann
I added Matt on CC, as he is the one who implemented the new checksum capabilities I reference below ... On 16.01.25 8:26 AM, Szabo, Istvan (Agoda) wrote: Amazon released a new version of their cli today https://github.com/aws/aws-cli/tags and it seems to break our stuff with the following erro

[ceph-users] Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Alex Hussein-Kershaw (HE/HIM)
Hi Folks, Looking for some advice on RGW service specs and Cephadm. I've read the docs here: RGW Service — Ceph Documentation Using a service spec I can deploy a RGW: service_type: rgw service_id: '25069123' service_name: rgw.25069123 placem

[ceph-users] Re: More objects misplaced than exist?

2025-01-16 Thread Andre Tann
On 16.01.25 at 16:53, Andre Tann wrote:
--- POOLS ---
POOL             ID  PGS   STORED   OBJECTS  USED    %USED   MAX AVAIL
.mgr             1   1     7.6 MiB  3        15 MiB  100.00  0 B
ReplicationPool  2   1024  8.0 TiB  2.11M    24 TiB  100.00  0 B
cephfs_data

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-16 Thread Guillaume ABRIOUX
Hello, The Rook CI must be failing because a `ceph-bluestore-tool` backport [1] is missing. This backport was merged ~6 hours ago. [1] https://github.com/ceph/ceph/pull/60543 Regards, -- Guillaume Abrioux Software Engineer From: Travis Nielsen Sent: Wednesday

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-16 Thread Yuri Weinstein
Does this have to be cherry-picked to 19.2.1? What tests are to be rerun if yes? On Thu, Jan 16, 2025 at 11:21 AM Guillaume ABRIOUX wrote: > > Hello, > > The rook ci must be failing because a `ceph-bluestore-tool` backport [1] is > missing. > This backport was merged ~6 hours ago. > > [1] https

[ceph-users] Re: MDS hung in purge_stale_snap_data after populating cache

2025-01-16 Thread Bailey Allison
Frank, Are you able to share an up-to-date ceph config dump and ceph daemon mds.X perf dump | grep strays from the cluster? We're just getting through our comically long ceph outage, so I'd like to be able to share the love here hahahaha Regards, Bailey Allison Service Team Lead 45Driv

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-16 Thread Travis Nielsen
I confirmed that the Rook CI is now passing with the latest squid devel image that was pushed a couple hours ago, including this fix. It is a blocker for properly starting OSDs, at least for Rook. Guillaume was also able to repro the related issue outside Rook. So yes, please it needs to be include

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-16 Thread Travis Nielsen
Although, I am not clear on the difference between the squid branch, where we started seeing this issue last week, and the 19.2.1 branch, so Guillaume or the RADOS team should confirm for sure. On Thu, Jan 16, 2025 at 1:38 PM Travis Nielsen wrote: > I confirmed that the Rook CI is now passing with the l

[ceph-users] Re: [EXTERNAL] Re: Cephadm: Specifying RGW Certs & Keys By Filepath

2025-01-16 Thread Redouane Kachach
That's strange... in the code I can see that rgw_frontend_port is used when it is set, so I can't see why you get port=80 ... can you please post your spec? On Thu, Jan 16, 2025 at 7:09 PM Alex Hussein-Kershaw (HE/HIM) <alex...@microsoft.com> wrote: > Oh, actually I spoke too soon. That does work, but