[ceph-users] Re: [EXTERNAL] Re: RGW Multisite with a Self-Signed CA

2024-07-16 Thread Alex Hussein-Kershaw (HE/HIM)
Hi Eugen, Thanks for the advice - I've tried something similar but no luck (my base OS is RHEL 9, so the paths don't quite line up with yours; I have no /var/lib/ca-certificates directory). I can curl from inside the RGW container to the remote site and it is happy with the certificate, as per below
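A minimal sketch of the RHEL 9 equivalent of the SUSE trust path mentioned above, with a hypothetical CA file name and remote endpoint; whether a containerized RGW sees the host trust store depends on what cephadm bind-mounts:

    # RHEL 9 system trust store (there is no /var/lib/ca-certificates here)
    cp myca.pem /etc/pki/ca-trust/source/anchors/
    update-ca-trust extract
    # verify the chain against the remote zone's endpoint
    curl -v https://remote-rgw.example.com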

[ceph-users] Re: Large omap in index pool even if properly sharded and not "OVER"

2024-07-16 Thread Szabo, Istvan (Agoda)
Finally this article was the solution: https://access.redhat.com/solutions/6450561 The main point is to trim the bilog: radosgw-admin bilog trim --bucket="bucket-name" --bucket-id="bucket-id" Then scrub, and done! Happy day! From: Frédéric Nass Sent: Monday, July 15,
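A minimal sketch of that workflow, assuming a hypothetical bucket name/ID and that the PG to deep-scrub is the one named in the large-omap warning:

    # confirm there are stale bilog entries for the bucket
    radosgw-admin bilog list --bucket="bucket-name" | head
    # trim them
    radosgw-admin bilog trim --bucket="bucket-name" --bucket-id="bucket-id"
    # deep-scrub the PG that reported the large omap object so the warning clears
    ceph pg deep-scrub <pgid>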

[ceph-users] Re: Schödinger's OSD

2024-07-16 Thread Tim Holloway
Interesting. And thanks for the info. I did a quick look-around. The admin node, which is one of the mixed-osd machines, has these packages installed: centos-release-ceph-pacific-1.0-2.el9.noarch cephadm-16.2.14-2.el9s.noarch libcephfs2-16.2.15-1.el9s.x86_64 python3-ceph-common-16.2.15-1.el9s.x86

[ceph-users] Unable to mount with 18.2.2

2024-07-16 Thread Albert Shih
Hi everyone, My Ceph cluster currently runs 18.2.2 and ceph -s says everything is OK: root@cthulhu1:/var/lib/ceph/crash# ceph -s cluster: id: 9c5bb196-c212-11ee-84f3-c3f2beae892d health: HEALTH_OK services: mon: 5 daemons, quorum cthulhu1,cthulhu5,cthulhu3,cthulhu4,cthulhu2 (age

[ceph-users] Re: Schödinger's OSD

2024-07-16 Thread Tim Holloway
OK. I deleted the questionable stuff with this command: dnf erase ceph-mgr-modules-core-16.2.15-1.el9s.noarch ceph-mgr-diskprediction-local-16.2.15-1.el9s.noarch ceph-mgr-16.2.15-1.el9s.x86_64 ceph-mds-16.2.15-1.el9s.x86_64 ceph-mon-16.2.15-1.el9s.x86_64 That left these two: centos-release-
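A minimal way to list whatever Ceph packages remain after the removal, assuming plain rpm/dnf tooling on the node:

    # show any remaining Ceph-related RPMs
    rpm -qa | grep -Ei 'ceph|rados' | sort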

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread Eugen Block
Are all clients trying to connect to the same Ceph cluster? Have you compared their ceph.conf files? Maybe during the upgrade something went wrong and an old file was applied or something? Quoting Albert Shih: Hi everyone, My Ceph cluster currently runs 18.2.2 and ceph -s says everything a
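A minimal sketch of comparing the client configs, assuming hypothetical hostnames reachable over ssh:

    # identical checksums mean identical ceph.conf files
    for h in client1 client2 client3; do ssh "$h" md5sum /etc/ceph/ceph.conf; done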

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread Albert Shih
On 16/07/2024 at 12:38:08+, Eugen Block wrote: Hi, > Are all clients trying to connect to the same Ceph cluster? Have you Yes. > compared their ceph.conf files? Maybe during the upgrade something went Yes, they are identical, same ceph.conf everywhere. My ceph.conf is deployed by Puppet.

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread David C.
Hi Albert, I think it's related to your network change. Can you send me the output of "ceph report"? On Tue, Jul 16, 2024 at 14:34, Albert Shih wrote: > Hi everyone > > My Ceph cluster currently runs 18.2.2 and ceph -s says everything is OK > > root@cthulhu1:/var/lib/ceph/crash# ceph -s >

[ceph-users] Re: Schödinger's OSD

2024-07-16 Thread Eugen Block
Hi, The final machine is operational and I'm going to leave it, but it does show one quirk. Dashboard and osd tree show its OSD as up/running, but "ceph orch ps" shows it as "stopped". My guess is that ceph orch is looking for the container OSD and doesn't notice the legacy OSD. I assume so, yes
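If the goal is for cephadm to manage that OSD as well, a minimal sketch of converting it, assuming a hypothetical OSD id:

    # list daemons as cephadm sees them; pre-cephadm ones show style "legacy"
    cephadm ls
    # adopt the legacy OSD into a cephadm-managed container
    cephadm adopt --style legacy --name osd.3
    # it should then show as running in "ceph orch ps"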

[ceph-users] How to detect condition for offline compaction of RocksDB?

2024-07-16 Thread Rudenko Aleksandr
Hi, We have a big Ceph cluster (RGW use case) with a lot of big buckets (10-500M objects, 31-1024 shards) and a lot of I/O generated by many clients. The index pool is placed on enterprise SSDs. We have about 120 SSDs (replication 3) and about 90 GB of OMAP data on each drive. About 75 PGs on each SSD for
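A minimal sketch of one way to watch per-OSD omap growth and run an offline compaction, assuming a hypothetical OSD id and a non-containerized default data path:

    # the OMAP column shows how much omap data each OSD carries
    ceph osd df
    # offline compaction: stop the OSD, compact its RocksDB, start it again
    systemctl stop ceph-osd@12
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
    systemctl start ceph-osd@12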

[ceph-users] Re: Unable to mount with 18.2.2

2024-07-16 Thread David C.
Albert, The network is OK. However, strangely, the OSDs and MDS did not activate msgr v2 (msgr v2 was activated on the mons). It is possible to bypass this by adding the "ms_mode=legacy" mount option, but you need to find out why msgr v2 is not activated. On Tue, Jul 16, 2024 at 15:18, Albert Shih wrote: >
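A minimal sketch of checking which daemons advertise msgr v2 and of the workaround, assuming a hypothetical mount point, client name, and a kernel recent enough to know the ms_mode option:

    # daemons with msgr v2 enabled advertise [v2:...,v1:...] address pairs
    ceph mon dump | grep addr
    ceph osd dump | grep 'v2:'
    # workaround: force the kernel client onto the v1 protocol
    mount -t ceph cthulhu1:6789:/ /mnt/cephfs -o name=myclient,secretfile=/etc/ceph/myclient.secret,ms_mode=legacy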

[ceph-users] Re: Node Exporter keep failing while upgrading cluster in Air-gapped ( isolated environment ).

2024-07-16 Thread Saif Mohammad
Hello Adam, Thanks for the prompt response. We have the below image in our private registry for node-exporter: 192.168.1.10:5000/prometheus/node-exporter v1.5.0 0da6a335fe13 19 months ago 22.5MB But upon Ceph upgrade, we are getting the mentioned image ( quay.io/prometheus/node-expo

[ceph-users] Re: How to detect condition for offline compaction of RocksDB?

2024-07-16 Thread Joshua Baergen
Hello Aleksandr, What you're probably experiencing is tombstone accumulation, a known issue for Ceph's use of rocksdb. > 1. Why can't automatic compaction manage this on its own? rocksdb compaction is normally triggered by level fullness and not tombstone counts. However, there is a feature in r
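As a stop-gap while tombstones accumulate, compaction can also be triggered by hand; a minimal sketch, assuming a hypothetical OSD id:

    # online compaction of one OSD's RocksDB (expect elevated latency on that OSD while it runs)
    ceph tell osd.12 compact
    # or sweep all OSDs, ideally a few at a time if client latency matters
    ceph tell osd.* compact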

[ceph-users] Re: Node Exporter keep failing while upgrading cluster in Air-gapped ( isolated environment ).

2024-07-16 Thread Adam King
I wouldn't worry about the image the config option gives you right now; the one in your local repo looks like the same version. For isolated deployments like this, the default options aren't going to work, as they'll always point to images that require internet access to pull. I'd just update the con
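A minimal sketch of pointing cephadm's node-exporter image at the local registry quoted above and redeploying, assuming that registry is reachable from every host:

    ceph config set mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0
    # redeploy so the daemons are recreated from the new image
    ceph orch redeploy node-exporter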

[ceph-users] Re: How to detect condition for offline compaction of RocksDB?

2024-07-16 Thread Frédéric Nass
Hi Rudenko, There's been a bug [1] in the past preventing the BlueFS spillover alert from popping up in ceph -s due to some code refactoring. You might just be facing spillover without noticing. I'm saying this because you're running v16.2.13 and this bug was fixed in v16.2.14 (by [3], based on Pacifi
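A minimal sketch of checking for spillover directly, independent of the health alert, assuming a hypothetical OSD id and access to its admin socket:

    # BLUEFS_SPILLOVER appears here once the alert works
    ceph health detail
    # per OSD: non-zero slow_used_bytes means DB data has spilled onto the slow device
    ceph daemon osd.12 perf dump bluefs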