[ceph-users] lifecycle for versioned bucket

2024-09-17 Thread Lukasz Borek
Hi, I'm having an issue with lifecycle jobs on an 18.2.4 cluster with a versioning-enabled bucket. /# radosgw-admin lc list [ { "bucket": ":mongobackup-prod:c3e0a369-71df-40f5-a5c0-51e859efe0e0.96754.1", "shard": "lc.0", "started": "Thu, 01 Jan 1970 00:00:00 GMT", "s
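(For illustration only, a minimal sketch of inspecting and manually triggering lifecycle processing for that bucket; the bucket name is taken from the listing above, and the lc subcommands are standard radosgw-admin ones, though behaviour can differ per release:)
# Show lifecycle state and last start time for all buckets
radosgw-admin lc list
# Show the lifecycle configuration applied to the bucket
radosgw-admin lc get --bucket mongobackup-prod
# Manually trigger lifecycle processing for just this bucket
radosgw-admin lc process --bucket mongobackup-prod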

[ceph-users] Re: [EXT] mclock scheduler kills clients IOs

2024-09-17 Thread Kai Stian Olstad
On Tue, Sep 17, 2024 at 04:22:40PM +0200, Denis Polom wrote: Hi, yes, the mclock scheduler doesn't look stable and ready for a production Ceph cluster. I just switched back to wpq and everything runs smoothly. In our cluster all IO stopped when I set 3 OSDs to out while running mclock. After sw

[ceph-users] Re: [EXT] mclock scheduler kills clients IOs

2024-09-17 Thread Denis Polom
Hi, yes, the mclock scheduler doesn't look stable and ready for a production Ceph cluster. I just switched back to wpq and everything runs smoothly. Thanks! On 17. 09. 24 13:14, Justin Mammarella wrote: Hi Denis, We have had the same issue with mclock, and now switch back to WPQ when draining
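(Sketch only, assuming a standard config-based switch; osd_op_queue requires an OSD restart to take effect:)
# Switch the OSD op queue scheduler back to wpq
ceph config set osd osd_op_queue wpq
ceph config get osd osd_op_queue
# Restart the OSDs for the change to apply, e.g. per node:
# systemctl restart ceph-osd.target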

[ceph-users] Re: Ceph RBD w/erasure coding

2024-09-17 Thread andre
I did have "allow_ec_overwrites" set, which is why I was stumped. The resolution in my case was editing my client.libvirt user, adding to the mon caps and allowing the erasure-coded pool in the osd caps: ceph auth caps client.libvirt mon 'allow r, allow command "osd blacklist", allow command
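(The full command is cut off above; purely as an illustration, a caps update of this shape would look roughly like the following, with placeholder pool names:)
# Placeholder pool names, not the poster's actual pools
ceph auth caps client.libvirt \
  mon 'allow r, allow command "osd blacklist"' \
  osd 'allow rwx pool=libvirt-pool, allow rwx pool=ec-data-pool'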

[ceph-users] Re: Ceph RBD w/erasure coding

2024-09-17 Thread andre
My VM was running Reef, actually. But I determined the issue to be write permissions on the erasure-coded pool. The resolution in my case was editing my client.libvirt user, adding to the mon caps and allowing the erasure-coded pool in the osd caps: ceph auth caps client.libvirt mon 'allow r, allo

[ceph-users] Re: Blocking/Stuck file

2024-09-17 Thread dominik.baack
Hi, Thank you for your input. We checked the MDS logs, mgr logs and ceph-fuse logs but did not find much. The e-mail was stuck in transfer for several days, so we already found the solution at the end of last week: a defective network interface handling the public traffic on one of our storage nodes.

[ceph-users] Radosgw bucket check fix doesn't do anything

2024-09-17 Thread Reid Guyett
Hello, I recently moved a bucket from one cluster to another using rclone. I noticed that the source bucket had around 35k objects and the destination bucket only had around 18k objects after the sync was completed. Source bucket stats showed: > radosgw-admin bucket stats --bucket mimir-pr
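(A hedged sketch of the check/fix invocation being discussed; the bucket name is a placeholder and flag behaviour varies between releases:)
# Report index inconsistencies for the bucket
radosgw-admin bucket check --bucket=<bucket-name>
# Attempt to repair the bucket index, also verifying object state
radosgw-admin bucket check --bucket=<bucket-name> --check-objects --fix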

[ceph-users] Re: [EXT] mclock scheduler kills clients IOs

2024-09-17 Thread Justin Mammarella
Hi Denis, We have had the same issue with mclock, and now switch back to WPQ when draining nodes. I couldn't identify the cause of the slow ops. Even with custom mclock tuning and one backfill per OSD, it was still causing client IO issues. From: Denis Polom Date: Tuesday, 17 September 2024 at
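(For reference, the kind of mclock tuning mentioned here might look like the following; osd_mclock_override_recovery_settings is only present on newer Quincy/Reef point releases, so treat this as a sketch:)
# Prioritise client IO over background recovery/backfill
ceph config set osd osd_mclock_profile high_client_ops
# Allow manual backfill limits to take effect under mclock (newer releases)
ceph config set osd osd_mclock_override_recovery_settings true
ceph config set osd osd_max_backfills 1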

[ceph-users] mclock scheduler kills clients IOs

2024-09-17 Thread Denis Polom
Hi guys, we have a Ceph cluster on Quincy 17.2.7 where we are draining 3 hosts (one from each failure domain). Each host has 13 HDDs, and there are another 38 hosts of the same size in each failure domain. There is a lot of free space. I've set the primary-affinity on the drained OSDs to 0
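(For context, setting primary affinity to 0 on the OSDs being drained is done per OSD, roughly like this; the OSD IDs are placeholders:)
# Stop these OSDs from acting as primaries so client reads move elsewhere
for id in 10 11 12; do
    ceph osd primary-affinity osd.$id 0
done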

[ceph-users] Re: [EXTERNAL] Re: Bucket Notifications v2 & Multisite Redundancy

2024-09-17 Thread Alex Hussein-Kershaw (HE/HIM)
Thanks 🙂 I've raised: Bug #68102: rgw: "radosgw-admin topic list" may contain duplicated data and redundant nesting, and Enhancement #68104: rgw: Add a "disable replication" flag to bucket notification configuration.

[ceph-users] Re: Ceph octopus version cluster not starting

2024-09-17 Thread Frank Schilder
Hi Amudhan, sounds like the dependency doesn't have a timeout. It would help if systemd logged a message every minute or so about a pending dependency (like on the boot screen). Not sure if this can be configured. Otherwise, you could add a timeout and make the units fail after 2-5
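(A rough sketch of such a timeout as a systemd drop-in; the unit name and value are assumptions and should be adjusted to the actual Ceph units in use:)
# e.g. systemctl edit ceph-osd@.service, then add:
[Unit]
# Cancel the start job if it is still waiting on dependencies after 5 minutes
JobTimeoutSec=300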