[ceph-users] How can I increase or decrease the number of osd backfilling instantly

2024-06-27 Thread Jaemin Joo
Hi All, I'd like to speed up or slow down OSD recovery in Ceph v18.2.1. According to the page ( https://www.suse.com/ko-kr/support/kb/doc/?id=19693 ), I understand that osd_max_backfills and osd_recovery_max_active have to be increased or decreased, but it seems not to impact the number of osd ba
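
A minimal sketch (not from the thread) of how these knobs are usually changed at runtime. Note that v18 (Reef) defaults to the mClock scheduler, which ignores osd_max_backfills and osd_recovery_max_active unless the override flag is set; the values below are examples only.

    ceph config set osd osd_mclock_override_recovery_settings true
    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_max_active 8
    # Alternatively, switch the mClock profile instead of touching the raw limits:
    ceph config set osd osd_mclock_profile high_recovery_ops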

[ceph-users] Re: Viability of NVMeOF/TCP for VMWare

2024-06-27 Thread Alexander Patrakov
For NFS (e.g., as implemented by NFS-ganesha), the situation is also quite stupid. Without high availability (HA), it works (that is, until you update NFS-Ganesha version), but corporate architects won't let you deploy any system without HA, because, in their view, non-HA systems are not productio

[ceph-users] Re: Huge amounts of objects orphaned by lifecycle policy.

2024-06-27 Thread Adam Prycki
Hi Casey, I did use the full `radosgw-admin gc process --include-all`. I know about that background gc delay. Is running `radosgw-admin gc process --include-all` from the terminal any different from the gc process running in the background? I wonder if I should use it while trying to recreate this issue.
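
For context, a hedged sketch of inspecting and draining the RGW garbage collector from the CLI; the config lookup at the end is an assumption about where the background delay being discussed is configured.

    radosgw-admin gc list --include-all      # queued entries, including those still inside the delay window
    radosgw-admin gc process --include-all   # process them now instead of waiting for the background cycle
    ceph config get client.rgw rgw_gc_obj_min_wait   # per-object delay (seconds) before gc may act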

[ceph-users] Ceph Days London CFP Deadline

2024-06-27 Thread Noah Lehman
Hi everyone, This is a reminder that the deadline to submit a speaking proposal for Ceph Days London is this Sunday, June 30th. Be sure to submit a proposal before then! Details here. Best, Noah

[ceph-users] Re: Huge amounts of objects orphaned by lifecycle policy.

2024-06-27 Thread Casey Bodley
hi Adam, On Thu, Jun 27, 2024 at 4:41 AM Adam Prycki wrote: > > Hello, > > I have a question. Do people use rgw lifecycle policies in production? > I had big hopes for this technology but in practice it seems to be very > unreliable. > > Recently I've been testing different pool layouts and using

[ceph-users] Re: Viability of NVMeOF/TCP for VMWare

2024-06-27 Thread Anthony D'Atri
There are folks actively working on this gateway and there's a Slack channel. I haven't used it myself yet. My understanding is that ESXi supports NFS. Some people have had good success mounting KRBD volumes on a gateway system or VM and re-exporting via NFS. > On Jun 27, 2024, at 09:01, Dr
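
A rough sketch of the KRBD-plus-NFS re-export approach mentioned here; pool, image, and export names are placeholders, and HA of the gateway host is left out entirely.

    rbd create vmware-pool/esxi-datastore --size 10T
    rbd map vmware-pool/esxi-datastore          # returns a device such as /dev/rbd0
    mkfs.xfs /dev/rbd0
    mkdir -p /export/esxi-datastore && mount /dev/rbd0 /export/esxi-datastore
    echo '/export/esxi-datastore 10.0.0.0/24(rw,no_root_squash,sync)' >> /etc/exports
    exportfs -ra                                # ESXi then mounts this as an NFS datastore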

[ceph-users] Re: ceph rgw zone create fails EINVAL

2024-06-27 Thread Daniel Gryniewicz
I would guess that it probably does, but I don't know for sure. Daniel On 6/26/24 10:04 AM, Adam King wrote: Interesting. Given this is coming from a radosgw-admin call being done from within the rgw mgr module, I wonder if a radosgw-admin log file is ending up in the active mgr container whe

[ceph-users] Viability of NVMeOF/TCP for VMWare

2024-06-27 Thread Drew Weaver
Howdy, I recently saw that Ceph has a gateway which allows VMWare ESXi to connect to RBD. We had another gateway like this a while back, the iSCSI gateway. The iSCSI gateway ended up being... let's say problematic. Is there any reason to believe that NVMeOF will also end up on the floor and has

[ceph-users] Large omap in index pool even if properly sharded and not "OVER"

2024-06-27 Thread Szabo, Istvan (Agoda)
Hi, I have a pretty big bucket which is sharded with 1999 shards, so in theory it can hold close to 200m objects (199,900,000). Currently it has 54m objects. The bucket limit check also looks good: "bucket": "xyz", "tenant": "", "num_objects": 53619489, "num_shards": 1999, "objects_per_shard": 26823,
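
The arithmetic behind those numbers, plus the checks usually involved (the threshold lookup at the end is an assumption about what triggers the LARGE_OMAP_OBJECTS warning):

    # 1999 shards * ~100,000 objects per shard = ~199,900,000 objects of nominal capacity
    # 53,619,489 objects / 1999 shards         = ~26,823 objects per shard
    radosgw-admin bucket limit check
    ceph config get osd osd_deep_scrub_large_omap_object_key_threshold   # default 200000 keys per omap object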

[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-27 Thread Dhairya Parmar
Ivan, before resetting the journal, could you take a backup of your journal using `cephfs-journal-tool export` [0] and send it to us through `ceph-post-file` [1] or any other means you're comfortable with? [0] https://docs.ceph.com/en/latest/cephfs/cephfs-journal-tool/#example-journal-import-exp
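
A minimal sketch of the backup-and-upload step being requested; the rank and file name are examples.

    cephfs-journal-tool --rank=<fs_name>:0 journal export backup.bin
    ceph-post-file -d "MDS journal for this thread" backup.bin   # prints an upload ID you can share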

[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-27 Thread Dhairya Parmar
Hi Ivan, The solution (which has been successful for us in the past) is to reset the journal. This would bring the fs back online and return the MDSes to a stable state, but some data would be lost—the data in the journal that hasn't been flushed to the backing store would be gone. Therefore, you
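
For reference, a hedged outline of the reset sequence from the CephFS disaster-recovery documentation; it is not necessarily the exact procedure intended here, <fs_name> is a placeholder, and it should only be run after taking the journal backup requested in the message above.

    cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries summary   # salvage what can still be flushed
    cephfs-journal-tool --rank=<fs_name>:0 journal reset        # newer releases may require an extra confirmation flag
    cephfs-table-tool all reset session                         # or <fs_name>:all with multiple filesystems
    ceph fs set <fs_name> joinable true                         # if the filesystem was marked not joinable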

[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-27 Thread Ivan Clayson
Hi Dhairya, We can induce the crash by simply restarting the MDS, and the crash seems to happen when an MDS goes from up:standby to up:replay. The MDS works through a few files in the log before eventually crashing; I've included the logs for this here (this is after I imported the backed
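
If it helps reproduce this, a speculative sketch of capturing more verbose MDS logs around the up:replay transition; the daemon name is a placeholder.

    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1
    ceph orch daemon restart mds.<fs_name>.<host>.<id>
    # lower the levels again once the crash has been captured
    ceph config set mds debug_mds 1
    ceph config set mds debug_ms 0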

[ceph-users] Re: OSD service specs in mixed environment

2024-06-27 Thread Frédéric Nass
Hi Torkil, Ruben, I see two theoretical ways to do this without an additional OSD service. One that probably doesn't work :-) and another one that could work depending on how the orchestrator prioritizes its actions based on service criteria. The one that probably doesn't work is by specifying mul
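
As an illustration only, a hedged sketch of an OSD service spec that combines several device criteria in one service; the values are placeholders rather than the actual layout discussed in the thread.

    service_type: osd
    service_id: hdd_with_ssd_db
    placement:
      host_pattern: 'osd-*'
    spec:
      data_devices:
        rotational: 1
        size: '10T:'        # only data devices of 10 TB and larger
      db_devices:
        rotational: 0
      filter_logic: AND     # all criteria must match (the default)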

[ceph-users] Huge amounts of objects orphaned by lifecycle policy.

2024-06-27 Thread Adam Prycki
Hello, I have a question. Do people use rgw lifecycle policies in production? I had big hopes for this technology but in practice it seems to be very unreliable. Recently I've been testing different pool layouts and using a lifecycle policy to move data between them. Once I've checked orphaned
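
For context, a hedged example of the kind of transition rule being described, applied with standard S3 tooling and then forced through by hand; the bucket name and storage class are placeholders, and the storage class must already exist in the zonegroup placement.

    # lc.json -- a single transition rule
    {
      "Rules": [
        { "ID": "move-to-cold-pool",
          "Status": "Enabled",
          "Filter": {"Prefix": ""},
          "Transitions": [{"Days": 30, "StorageClass": "COLD"}] }
      ]
    }
    # apply it and run lifecycle processing by hand:
    aws s3api put-bucket-lifecycle-configuration --bucket testbucket --lifecycle-configuration file://lc.json
    radosgw-admin lc list        # per-bucket lifecycle status
    radosgw-admin lc process     # run processing now instead of waiting for the work window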

[ceph-users] Re: pg deep-scrub control scheme

2024-06-27 Thread Frank Schilder
Sorry, the entry point is actually https://github.com/frans42/ceph-goodies/blob/main/doc/TuningScrub.md = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: Thursday, June 27, 2024 9:02 AM To: David Yang; Ceph U

[ceph-users] Re: pg deep-scrub control scheme

2024-06-27 Thread Frank Schilder
> Is there a calculation formula that can be used to easily configure > the scrub/deepscrub strategy? There is: https://github.com/frans42/ceph-goodies/blob/main/doc/RecommendationsForScrub.md Tested on Octopus with osd_op_queue=wpq, osd_op_queue_cut_off=high. Best regards, = F
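
The two queue settings referenced, plus scrub knobs that are commonly tuned alongside them; the values are examples, not the recommendations from the linked document, and the queue settings only take effect after an OSD restart.

    ceph config set osd osd_op_queue wpq
    ceph config set osd osd_op_queue_cut_off high
    ceph config set osd osd_max_scrubs 1                 # concurrent scrubs per OSD
    ceph config set osd osd_scrub_load_threshold 0.5
    ceph config set osd osd_deep_scrub_interval 1209600  # 14 days, in seconds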