[ceph-users] Re: ceph-ansible LARGE OMAP in RGW pool

2025-03-14 Thread Danish Khan
Dear Frédéric, 1/ Identify the shards with the most sync error log entries: I have identified that the shard causing the issue is shard 31, but almost all of the errors show only one object of a bucket. The object exists in the master zone, but I'm not sure why the replication site is unabl
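For reference, a minimal sketch of how the noisiest sync-error shards can be identified on the replication site, assuming jq is available and that `radosgw-admin sync error list` returns one JSON object per shard with shard_id and entries fields:

    # Count error entries per shard and list the busiest ones
    radosgw-admin sync error list \
      | jq -r '.[] | "\(.shard_id) \(.entries | length)"' \
      | sort -k2 -nr | head

    # Inspect the entries for the suspect shard (31 in this thread)
    radosgw-admin sync error list \
      | jq '.[] | select(.shard_id == 31) | .entries[]'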

[ceph-users] Re: Adding device class to CRUSH rule without data movement

2025-03-14 Thread Anthony D'Atri
I haven’t used reclassify in a while, but does the output look like the below, specifying the device class with item_name? { "rule_id": 3, "rule_name": "tsECpool", "type": 3, "steps": [ { "op": "set_choo
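As a rough sketch, the rule named in the snippet above can be dumped like this; with device classes in use, the take step's item_name typically appears as a shadow root such as "default~hdd":

    # Show the steps of the rule discussed above, including item_name
    ceph osd crush rule dump tsECpool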

[ceph-users] Re: Massive performance issues

2025-03-14 Thread Anthony D'Atri
> > - On 14 Mar 25, at 8:40, joachim kraftmayer joachim.kraftma...@clyso.com > wrote: > >> Hi Thomas & Anthony, >> >> Anthony provided great recommendations. Danke! >> ssd read performance: >> I find the total number of pg per ssd osd too low; it can be twice as high. They’re short o

[ceph-users] Re: Attention: Documentation

2025-03-14 Thread Dan van der Ster
Thanks Joel -- adding Zac. Cheers, dan On Fri, Mar 14, 2025 at 11:08 AM Joel Davidow wrote: > > While reading about mons, I've come across two documentation issues. > > 1. The 2013 blog "Monitors and Paxos, a chat with Joao" has a broken link. > >- The broken link returns a 404 >- The ba

[ceph-users] Re: Attention: Documentation

2025-03-14 Thread Zac Dover
I’m on it as well. I’ll take care of it, Anthony. This arrived at 3 AM on Saturday here, so I am just now waking up and seeing it. Sent from [Proton Mail](https://proton.me/mail/home) for iOS On Sat, Mar 15, 2025 at 08:06, Anthony D'Atri <anthony.da...@gmail.com

[ceph-users] Re: Attention: Documentation

2025-03-14 Thread Anthony D'Atri
I’ll PR this over the weekend. I found the missing graphics via the Wayback Machine. A lot has changed since then; you might get fresher context from the online docs. > On Mar 14, 2025, at 5:08 PM, Dan van der Ster wrote: > > Thanks Joel -- adding Zac. > > Cheers, dan > >> On Fri, Mar

[ceph-users] Re: [RGW] Full replication gives stale recovering shard

2025-03-14 Thread Gilles Mocellin
On 2025-03-12 19:33, Gilles Mocellin wrote: Hello Cephers, Since I didn't have any progress on my side, I share my problem again, hoping for some clues. --- Since I was not confident in my replication status, I've run a radosgw sync init, one after the other, in both of my zones. Since
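A minimal sketch of how to watch whether the stale recovering shard drains after the sync init, assuming radosgw-admin is run against each zone (the bucket name below is hypothetical):

    # Run on each zone; a shard stuck in "recovering" should eventually
    # disappear from this output if replication is making progress
    radosgw-admin sync status

    # Per-bucket view, if a specific bucket is suspected
    radosgw-admin bucket sync status --bucket=mybucket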

[ceph-users] Re: Adding device class to CRUSH rule without data movement

2025-03-14 Thread Eugen Block
The crushtool would do that with the --reclassify flag. There was a thread here on this list a couple of months ago. I'm on my mobile, so I don't have a link for you right now. But the docs should also contain some examples, if I'm not mistaken. Quoting Hector Martin: Hi, I have an old
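For reference, a sketch of the --reclassify workflow from the crushtool docs, assuming the existing rules take the "default" root and that all current OSDs are HDDs:

    # Export the current CRUSH map
    ceph osd getcrushmap -o original.map

    # Rewrite the map so existing rules reference the hdd device class
    # without changing placement
    crushtool -i original.map --reclassify \
        --reclassify-root default hdd \
        -o adjusted.map

    # Verify that (almost) no mappings change before injecting the new map
    crushtool -i original.map --compare adjusted.map
    ceph osd setcrushmap -i adjusted.map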

[ceph-users] Adding device class to CRUSH rule without data movement

2025-03-14 Thread Hector Martin
Hi, I have an old Mimic cluster that I'm doing some cleanup work on and adding SSDs, before upgrading to a newer version. As part of adding SSDs, I need to switch the existing CRUSH rules to only use the HDD device class first. Is there some way of doing this that doesn't result in 100% data move

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Stefan Kooman
On 14-03-2025 10:44, Janne Johansson wrote: I'll leave it to the devs to discuss this one. It would be nice if the defaults for newly created clusters also came with the global reclaim id thing disabled, so we didn't have to manually enable msgrv2 (and disable v1 possibly as per this thread) an
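The manual steps being referred to would look roughly like the following sketch (option and command names from the standard Ceph tooling; adjust per release):

    # Enable msgr v2 on the mons
    ceph mon enable-msgr2

    # Disable the insecure global_id reclaim behavior
    ceph config set mon auth_allow_insecure_global_id_reclaim false

    # Ask daemons not to bind msgr v1; mons may still bind v1, per this thread
    ceph config set global ms_bind_msgr1 false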

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Radoslaw Zarzynski
Just a supplement. I'm aware there are different stances on the client compatibility ranging from perpetual to N-3. Yet, being "compatible by default" is somewhat different from having "optional compatibility" (by opt-in). On Fri, Mar 14, 2025 at 11:34 AM Radoslaw Zarzynski wrote: > > Thank you t

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Radoslaw Zarzynski
Thank you for the bug report! Yeah, there was not much concern about disabling v1, AFAIK. > In what Ceph release will v1 be disabled by default on new clusters? This is a really good question that has many common points with the old discussions on our guarantees toward compatibility with clients. S

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Janne Johansson
> >> I'll leave it to the devs to discuss this one. > > > > It would be nice if the defaults for newly created clusters also came > > with the global reclaim id thing disabled, so we didn't have to > > manually enable msgrv2 (and disable v1 possibly as per this thread) > > and also disable the recl

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Stefan Kooman
On 14-03-2025 09:53, Janne Johansson wrote: On 13-03-2025 16:08, Frédéric Nass wrote: If ceph-mon respected ms_bind_msgr1 = false, then one could add --ms-bind-msgr1=false as extra_entrypoint_args in the mon service_type [1], so as to have any ceph-mon daemons deployed or redeployed using msgr

[ceph-users] Re: How to (permanently) disable msgr v1 on Ceph?

2025-03-14 Thread Janne Johansson
> On 13-03-2025 16:08, Frédéric Nass wrote: > > If ceph-mon respected ms_bind_msgr1 = false, then one could add > > --ms-bind-msgr1=false as extra_entrypoint_args in the mon service_type [1], > > so as to have any ceph-mon daemons deployed or redeployed using msgr v2 > > exclusively. > > Unfortu
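For context, the quoted suggestion would look roughly like the following cephadm spec sketch (the placement count is arbitrary here, and as the thread notes, ceph-mon reportedly does not honor the flag):

    # Hypothetical mon spec using extra_entrypoint_args
    cat > mon-spec.yaml <<'EOF'
    service_type: mon
    placement:
      count: 3
    extra_entrypoint_args:
      - "--ms-bind-msgr1=false"
    EOF
    ceph orch apply -i mon-spec.yaml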

[ceph-users] Re: Massive performance issues

2025-03-14 Thread Frédéric Nass
Hi folks, I would also run an iostat -dmx 1 on host 'lenticular' during the fio benchmark just to make sure osd.10 is not being badly hammered with I/Os, which could be capping the cluster's HDD performance due to the very high number of PGs this OSD is involved in. > 10   hdd   3.63869   1.0
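Roughly, that check on host 'lenticular' would be (mapping osd.10 to its block device is left to ceph-volume):

    # Find which block device backs osd.10 on this host
    ceph-volume lvm list

    # While the fio benchmark runs, watch %util and r_await/w_await
    # for the device behind osd.10
    iostat -dmx 1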

[ceph-users] Re: Massive performance issues

2025-03-14 Thread Joachim Kraftmayer
Hi Thomas & Anthony, Anthony provided great recommendations. SSD read performance: I find the total number of PGs per SSD OSD too low; it can be twice as high. HDD read performance: What makes me a little suspicious is that the maximum throughput of about 120 MB/s is exactly the maximum of a 1 Gbi
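A quick sketch of checks for both hypotheses (the interface name below is an assumption):

    # PGS column: compare PG counts across the SSD OSDs
    ceph osd df tree

    # ~120 MB/s is suspiciously close to 1 Gbit/s line rate
    ethtool eth0 | grep -i speed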