[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-03-18 Thread Giovanna Ratini
Hello Anthony, no, no QoS applied to VMs. The server has PCIe Gen 4.

ceph osd dump | grep pool
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 21 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
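For a quicker overview of the relevant pool settings than grepping the full osd dump, a small sketch using standard Ceph CLI commands (offered here as a suggestion, not part of the original thread):

    ceph osd pool ls detail          # size, min_size, pg_num and flags per pool
    ceph osd pool autoscale-status   # the PG autoscaler's current view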

[ceph-users] Re: Is it safe to set multiple OSD out across multiple failure domain?

2025-03-18 Thread Tyler Stachecki
On Tue, Mar 18, 2025, 5:02 PM Kai Stian Olstad wrote:
> On Mon, Mar 17, 2025 at 03:08:54PM +, Eugen Block wrote:
> > Before I replied, I wanted to renew my confidence and do a small test
> > in a lab environment. I also created a k4m2 pool with host as
> > failure-domain, started to write data c

[ceph-users] Re: ceph-osd/bluestore using page cache

2025-03-18 Thread Brian Marcotte
> The setting you're looking for is bluefs_buffered_io. This is very
> much a YMMV setting, so it's best to test with both modes, but I
> usually recommend turning it off for all but omap-intensive workloads
> (e.g. RGW index) ...

We're not using RGW, only RBD. Currently I find it hard to prevent
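A minimal sketch of toggling and verifying the setting with the standard config commands (the OSD id is an example); as noted above, both modes should be benchmarked before settling on one:

    ceph config set osd bluefs_buffered_io false   # BlueFS bypasses the kernel page cache
    ceph config get osd.0 bluefs_buffered_io       # confirm the effective value on one OSD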

[ceph-users] All Github Actions immediately blocked, except GH-official and Ceph-hosted ones

2025-03-18 Thread Ernesto Puerta
Hi Cephers, Due to the escalating situation in which leaked secrets from a previously compromised GH Action (tj-actions) [1] are being used to compromise more popular GH Actions (reviewdog) [2], *we have decided to immediately disable all GitHub Actions in all repositories, except the official GH ones, a

[ceph-users] Re: Is it safe to set multiple OSD out across multiple failure domain?

2025-03-18 Thread Kai Stian Olstad
On Mon, Mar 17, 2025 at 03:08:54PM +, Eugen Block wrote: Before I replied, I wanted to renew my confidence and do a small test in a lab environment. I also created a k4m2 pool with host as failure-domain, started to write data chunks into it in a while loop and then marked three of the OSDs
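For reference, a rough sketch of reproducing the lab test described above with standard Ceph commands (the profile name, pool name, and OSD ids are made-up examples):

    # EC profile with k=4, m=2 and host as the failure domain
    ceph osd erasure-code-profile set k4m2-host k=4 m=2 crush-failure-domain=host
    ceph osd pool create ec-test 32 32 erasure k4m2-host
    # write objects in a loop, then mark a few OSDs out across different hosts
    ceph osd out 3 7 11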

[ceph-users] Ceph User + Developer March Meetup happening tomorrow!

2025-03-18 Thread Laura Flores
Hi all, Join us at the User + Dev meeting tomorrow, where we'll be presenting some highlights from the Ceph User Stories survey and discussing next steps! RSVP here: https://www.meetup.com/ceph-user-group/events/306792345/ Thanks, Laura -- Laura Flores She/Her/Hers Software Engineer, Ceph S

[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-03-18 Thread Anthony D'Atri
> On Mar 18, 2025, at 2:13 PM, Giovanna Ratini wrote:
>
> Hello Anthony,
>
> no, no QoS applied to VMs.
>
> The server has PCIe Gen 4
>
> ceph osd dump | grep pool
> pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
> pg_num 1 pgp_num 1 autoscale_mode on last_ch

[ceph-users] Re: Adding OSD nodes

2025-03-18 Thread Laimis Juzeliūnas
Hi Sinan, Agree on the safe approach of using the upmap-remapped.py tool - it can help reduce unwanted data movement when new nodes are added. However, since these are new nodes being added and not old ones being removed or swapped, I suspect not much data movement going above the thresholds. In case y

[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-03-18 Thread Anthony D'Atri
> Then I tested on the *Proxmox host*, and the results were significantly
> better.

My Proxmox prowess is limited, but from my experience with other virtualization platforms, I have to ask if there is any QoS throttling applied to VMs. With OpenStack or DO there is often IOPS and/or throughp
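One quick way to check for per-disk throttling on a Proxmox VM is to look at the disk lines in its configuration; the VM ID below is an example, and the throttle keys are the ones Proxmox uses for IOPS/bandwidth limits:

    qm config 100 | grep -E 'virtio|scsi|sata|ide'
    # throttled disks carry options such as iops_rd=, iops_wr=, mbps_rd=, mbps_wr=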

[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-03-18 Thread Giovanna Ratini
Hello again, I tried running tests with --iodepth=16 and 32. The values got even worse.

* IOPS: 8.7k
* Bandwidth: 34.1 MiB/s (35.7 MB/s)
* Latency:
  * Avg: 7.3 ms
  * 99.9th percentile: 15.8 ms
* CPU usage: usr=0.74%, sys=5.60%

The problem seems to be only inside the VMs
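The exact fio job is not shown in the digest; a hypothetical invocation approximating a 4k random-write test at iodepth 16 against the VM's RBD-backed disk (the device path and all parameters are assumptions, not the original job) would look like:

    fio --name=vm-rbd-test --filename=/dev/vdb --direct=1 --ioengine=libaio \
        --rw=randwrite --bs=4k --iodepth=16 --numjobs=1 \
        --runtime=60 --time_based --group_reporting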

[ceph-users] Re: Adding OSD nodes

2025-03-18 Thread Frédéric Nass
Hi Sinan, The safest approach would be to use the upmap-remapped.py tool developed by Dan at CERN. See [1] for details. The idea is to leverage the upmap load balancer to progressively migrate the data to the new servers, minimizing performance impact on the cluster and clients. I like to cre
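A rough outline of that workflow using standard Ceph commands plus the CERN script (see [1]); the norebalance step and balancer settings reflect the commonly described pattern, not a quote from the thread:

    ceph osd set norebalance          # keep PGs from moving while OSDs are added
    # ... deploy the new OSD nodes ...
    ./upmap-remapped.py | sh          # pin remapped PGs back to their current OSDs
    ceph osd unset norebalance
    ceph balancer mode upmap          # let the balancer migrate PGs gradually
    ceph balancer on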

[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-03-18 Thread Anthony D'Atri
Which NVMe drive SKUs specifically? Are you running a recent kernel? Have you updated firmware on the NVMe devices?

> On Mar 11, 2025, at 6:55 AM, Giovanna Ratini wrote:
>
> However, even 80 MB per second with an NVMe drive is quite disappointing.
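To gather the details being asked for, a small sketch (assumes nvme-cli is installed):

    nvme list    # model/SKU and firmware revision for each NVMe device
    uname -r     # running kernel version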