Please do let me know how that strategy works out. When you change an OSD service spec (osd_spec), out of an abundance of caution the orchestrator does not retroactively apply it to existing OSDs; only OSDs created after the change pick it up. That behavior is exactly what you can exploit for a rolling migration: apply the new spec, then zap one OSD at a time and let it be redeployed under the new layout.
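Roughly, on a cephadm-managed cluster the workflow would look something like the sketch below. This is an untested outline, not a recipe: the service_id, host_pattern and device filters are placeholders you would replace with whatever your exported spec already contains, and note that osds_per_device is a field of the OSD service spec (drive group) rather than a ceph config key.

    # per the earlier suggestion, raise the PG target used by the autoscaler
    ceph config set global mon_target_pg_per_osd 250

    # export the current OSD service spec, then edit it
    ceph orch ls --export osd > osd-spec.yaml

    # osd-spec.yaml (illustrative; your service_id/placement will differ)
    service_type: osd
    service_id: all-flash
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 0
      osds_per_device: 2

    # apply the new spec -- existing OSDs are deliberately left untouched
    ceph orch apply -i osd-spec.yaml

    # then, one OSD at a time: drain, remove and zap it; the orchestrator
    # should redeploy the freed device as two OSDs under the new spec
    ceph orch osd rm <osd_id> --zap
    ceph orch osd rm status      # watch the drain/removal progress

Waiting for the cluster to return to HEALTH_OK between OSDs keeps the data movement bounded to one device's worth at a time.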
> On Apr 11, 2025, at 3:29 PM, Giovanna Ratini > <giovanna.rat...@uni-konstanz.de> wrote: > > Hello Eneko, > > I switched to KRDB, and I’m seeing slightly better performance now. > > For Switching: > https://forum.proxmox.com/threads/how-to-safely-enable-krbd-in-a-5-node-production-environment-running-7-4-19.159186/ > > NVMe performance remains disappointing, though... > They went from 35MB/s to 45MB/s. > > I’m planning to apply the change that Anthony recommended: > setting mon_target_pg_per_osd to 250 and configuring 2 osds_per_device. > This will take a bit of time. > ceph config set global mon_target_pg_per_osd 250 > ceph config set global osds_per_device 2 > > To split the drives into 2 OSDs each, > I’ll need to update the ceph orch ls --export OSD service spec, > > then zap an existing OSD, allow it to be rebuilt as two, and repeat the > process for the remaining ones. > > We'll see if this change helps. I’ll write the results here once it's done. > > Cheers, > > Gio > > root@gitlab:~# fio --name=registry-read --ioengine=libaio --rw=randread > --bs=4k --numjobs=4 --iodepth=16 --size=1G --runtime=60 > > registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) > 4096B-4096B, ioengine=libaio, iodepth=16 > ... > fio-3.33 > Starting 4 processes > Jobs: 4 (f=4): [r(4)][100.0%][r=91.7MiB/s][r=23.5k IOPS][eta 00m:00s] > registry-read: (groupid=0, jobs=1): err= 0: pid=2547: Fri Apr 11 21:02:31 2025 > read: IOPS=2756, BW=10.8MiB/s (11.3MB/s)(646MiB/60001msec) > slat (usec): min=50, max=8619, avg=360.14, stdev=217.84 > clat (usec): min=2, max=17259, avg=5441.99, stdev=1633.01 > lat (usec): min=108, max=17721, avg=5802.13, stdev=1728.71 > clat percentiles (usec): > | 1.00th=[ 1909], 5.00th=[ 2507], 10.00th=[ 2966], 20.00th=[ 3818], > | 30.00th=[ 4621], 40.00th=[ 5342], 50.00th=[ 5932], 60.00th=[ 6259], > | 70.00th=[ 6456], 80.00th=[ 6718], 90.00th=[ 6980], 95.00th=[ 7308], > | 99.00th=[ 9241], 99.50th=[10290], 99.90th=[13173], 99.95th=[13698], > | 99.99th=[16450] > bw ( KiB/s): min= 8456, max=22296, per=24.64%, avg=10937.08, > stdev=3222.24, samples=119 > iops : min= 2114, max= 5574, avg=2734.27, stdev=805.56, samples=119 > lat (usec) : 4=0.01%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01% > lat (msec) : 2=1.33%, 4=20.70%, 10=77.32%, 20=0.65% > cpu : usr=0.78%, sys=6.75%, ctx=165432, majf=0, minf=27 > IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, > >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, > >=64=0.0% > issued rwts: total=165408,0,0,0 short=0,0,0,0 dropped=0,0,0,0 > latency : target=0, window=0, percentile=100.00%, depth=16 > registry-read: (groupid=0, jobs=1): err= 0: pid=2548: Fri Apr 11 21:02:31 2025 > read: IOPS=2807, BW=11.0MiB/s (11.5MB/s)(658MiB/60001msec) > slat (usec): min=50, max=8950, avg=353.61, stdev=213.68 > clat (usec): min=2, max=17110, avg=5344.32, stdev=1642.90 > lat (usec): min=93, max=17575, avg=5697.93, stdev=1740.41 > clat percentiles (usec): > | 1.00th=[ 1844], 5.00th=[ 2409], 10.00th=[ 2868], 20.00th=[ 3687], > | 30.00th=[ 4490], 40.00th=[ 5276], 50.00th=[ 5866], 60.00th=[ 6194], > | 70.00th=[ 6390], 80.00th=[ 6587], 90.00th=[ 6915], 95.00th=[ 7242], > | 99.00th=[ 8979], 99.50th=[10159], 99.90th=[13042], 99.95th=[13829], > | 99.99th=[15926] > bw ( KiB/s): min= 8536, max=23624, per=25.10%, avg=11138.08, > stdev=3441.69, samples=119 > iops : min= 2134, max= 5906, avg=2784.52, stdev=860.42, samples=119 > lat (usec) : 4=0.01%, 
100=0.01%, 250=0.01%, 500=0.01%, 750=0.01% > lat (usec) : 1000=0.01% > lat (msec) : 2=1.80%, 4=22.21%, 10=75.40%, 20=0.58% > cpu : usr=0.98%, sys=6.72%, ctx=168450, majf=0, minf=25 > IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, > >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, > >=64=0.0% > issued rwts: total=168432,0,0,0 short=0,0,0,0 dropped=0,0,0,0 > latency : target=0, window=0, percentile=100.00%, depth=16 > registry-read: (groupid=0, jobs=1): err= 0: pid=2549: Fri Apr 11 21:02:31 2025 > read: IOPS=2773, BW=10.8MiB/s (11.4MB/s)(650MiB/60001msec) > slat (usec): min=46, max=8246, avg=357.89, stdev=213.33 > clat (usec): min=2, max=19652, avg=5408.19, stdev=1641.03 > lat (usec): min=411, max=20124, avg=5766.08, stdev=1738.36 > clat percentiles (usec): > | 1.00th=[ 1909], 5.00th=[ 2474], 10.00th=[ 2933], 20.00th=[ 3752], > | 30.00th=[ 4555], 40.00th=[ 5342], 50.00th=[ 5932], 60.00th=[ 6259], > | 70.00th=[ 6456], 80.00th=[ 6652], 90.00th=[ 6980], 95.00th=[ 7242], > | 99.00th=[ 9110], 99.50th=[10421], 99.90th=[12911], 99.95th=[14353], > | 99.99th=[16909] > bw ( KiB/s): min= 8432, max=22520, per=24.79%, avg=11004.77, > stdev=3330.83, samples=119 > iops : min= 2108, max= 5630, avg=2751.19, stdev=832.71, samples=119 > lat (usec) : 4=0.01%, 500=0.01%, 750=0.01%, 1000=0.01% > lat (msec) : 2=1.40%, 4=21.56%, 10=76.44%, 20=0.60% > cpu : usr=0.99%, sys=6.58%, ctx=166457, majf=0, minf=25 > IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, > >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, > >=64=0.0% > issued rwts: total=166442,0,0,0 short=0,0,0,0 dropped=0,0,0,0 > latency : target=0, window=0, percentile=100.00%, depth=16 > registry-read: (groupid=0, jobs=1): err= 0: pid=2550: Fri Apr 11 21:02:31 2025 > read: IOPS=2757, BW=10.8MiB/s (11.3MB/s)(646MiB/60001msec) > slat (usec): min=49, max=7497, avg=360.11, stdev=212.22 > clat (usec): min=2, max=19699, avg=5441.22, stdev=1616.73 > lat (usec): min=390, max=20175, avg=5801.33, stdev=1712.21 > clat percentiles (usec): > | 1.00th=[ 1909], 5.00th=[ 2540], 10.00th=[ 2999], 20.00th=[ 3818], > | 30.00th=[ 4621], 40.00th=[ 5407], 50.00th=[ 5932], 60.00th=[ 6259], > | 70.00th=[ 6456], 80.00th=[ 6652], 90.00th=[ 6980], 95.00th=[ 7308], > | 99.00th=[ 8979], 99.50th=[10159], 99.90th=[13042], 99.95th=[13829], > | 99.99th=[16057] > bw ( KiB/s): min= 8512, max=23152, per=24.65%, avg=10941.71, > stdev=3229.43, samples=119 > iops : min= 2128, max= 5788, avg=2735.43, stdev=807.36, samples=119 > lat (usec) : 4=0.01%, 500=0.01%, 1000=0.01% > lat (msec) : 2=1.39%, 4=20.78%, 10=77.28%, 20=0.54% > cpu : usr=0.80%, sys=6.75%, ctx=165463, majf=0, minf=27 > IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0% > submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, > >=64=0.0% > complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, > >=64=0.0% > issued rwts: total=165432,0,0,0 short=0,0,0,0 dropped=0,0,0,0 > latency : target=0, window=0, percentile=100.00%, depth=16 > > Run status group 0 (all jobs): > READ: bw=43.3MiB/s (45.4MB/s), 10.8MiB/s-11.0MiB/s (11.3MB/s-11.5MB/s), > io=2600MiB (2727MB), run=60001-60001msec > > Disk stats (read/write): > dm-0: ios=663651/273, merge=0/0, ticks=221100/28, in_queue=221128, > util=99.88%, aggrios=666145/189, aggrmerge=202/85, aggrticks=206340/50, > aggrin_queue=206423, 
aggrutil=66.45% > sda: ios=666145/189, merge=202/85, ticks=206340/50, in_queue=206423, > util=66.45% > > > Am 20.03.2025 um 16:57 schrieb Eneko Lacunza: >> Hi Chris, >> >> I tried KRBD, even with a newly created disk and after shuting down and >> starting VM again, but no measurable difference. >> >> Our Ceph is 18.2.4, that may be a factor to consider, but 9k -> 273k?! >> >> Maybe Giovanna can test KRBD option and report back... :) >> >> Cheers >> >> El 20/3/25 a las 16:19, Chris Palmer escribió: >>> HI Eneko >>> >>> No containers. In the Promox console go to Datacenter\Storage, click on the >>> storage you are using, then Edit. There is a tick box KRBD. With that set, >>> any virtual disks created in that storage will use KRBD rather than librbd. >>> So it applies to all VMs that use that storage. >>> >>> Chris >>> >>> On 20/03/2025 15:00, Eneko Lacunza wrote: >>>> >>>> Chris, you tested from a container? Or how do you configure a KRBD disk >>>> for a VM? >>>> >>>> El 20/3/25 a las 15:15, Chris Palmer escribió: >>>>> I just ran that command on one of my VMs. Salient details: >>>>> >>>>> * Ceph cluster 19.2.1 with 3 nodes, 4 x SATA disks with shared NVMe >>>>> DB/WAL, single 10g NICs >>>>> * Promox 8.3.5 cluster with 2 nodes (separate nodes to Ceph), single >>>>> 10g NICs , single 1g NICs for corosync >>>>> * Test VM was using KRBD R3 pool on HDD, iothread=1, aio=io_uring, >>>>> cache=writeback >>>>> >>>>> The results are very different: >>>>> >>>>> # fio --name=registry-read --ioengine=libaio --rw=randread --bs=4k >>>>> --numjobs=4 --size=1G --runtime=60 --group_reporting --iodepth=16 >>>>> registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, >>>>> (T) 4096B-4096B, ioengine=libaio, iodepth=16 >>>>> ... >>>>> fio-3.37 >>>>> Starting 4 processes >>>>> Jobs: 4 (f=4): [r(4)][-.-%][r=1080MiB/s][r=277k IOPS][eta 00m:00s] >>>>> registry-read: (groupid=0, jobs=4): err= 0: pid=13355: Thu Mar 20 >>>>> 13:57:05 2025 >>>>> read: IOPS=273k, BW=1068MiB/s (1120MB/s)(4096MiB/3835msec) >>>>> slat (usec): min=7, max=3802, avg=13.77, stdev= 6.41 >>>>> clat (nsec): min=599, max=4395.1k, avg=215298.68, stdev=38131.71 >>>>> lat (usec): min=11, max=4408, avg=229.07, stdev=40.01 >>>>> clat percentiles (usec): >>>>> | 1.00th=[ 194], 5.00th=[ 200], 10.00th=[ 202], 20.00th=[ 204], >>>>> | 30.00th=[ 206], 40.00th=[ 208], 50.00th=[ 210], 60.00th=[ 212], >>>>> | 70.00th=[ 215], 80.00th=[ 217], 90.00th=[ 227], 95.00th=[ 243], >>>>> | 99.00th=[ 367], 99.50th=[ 420], 99.90th=[ 594], 99.95th=[ 668], >>>>> | 99.99th=[ 963] >>>>> bw ( MiB/s): min= 920, max= 1118, per=100.00%, avg=1068.04, >>>>> stdev=16.81, samples=28 >>>>> iops : min=235566, max=286286, avg=273417.14, stdev=4303.79, >>>>> samples=28 >>>>> lat (nsec) : 750=0.01%, 1000=0.01% >>>>> lat (usec) : 20=0.01%, 50=0.01%, 100=0.01%, 250=96.06%, 500=3.67% >>>>> lat (usec) : 750=0.24%, 1000=0.02% >>>>> lat (msec) : 2=0.01%, 4=0.01%, 10=0.01% >>>>> cpu : usr=4.68%, sys=29.99%, ctx=1048987, majf=0, minf=102 >>>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >>>>> >=64=0.0% >>>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>> >=64=0.0% >>>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >>>>> >=64=0.0% >>>>> issued rwts: total=1048576,0,0,0 short=0,0,0,0 dropped=0,0,0,0 >>>>> latency : target=0, window=0, percentile=100.00%, depth=16 >>>>> >>>>> Run status group 0 (all jobs): >>>>> READ: bw=1068MiB/s (1120MB/s), 1068MiB/s-1068MiB/s >>>>> (1120MB/s-1120MB/s), io=4096MiB (4295MB), 
run=3835-3835msec >>>>> >>>>> Disk stats (read/write): >>>>> sdc: ios=999346/0, sectors=7994768/0, merge=0/0, ticks=10360/0, >>>>> in_queue=10361, util=95.49% >>>>> >>>>> >>>>> >>>>> On 20/03/2025 12:23, Eneko Lacunza wrote: >>>>>> Hi Giovanna, >>>>>> >>>>>> I just tested one of my VMs: >>>>>> # fio --name=registry-read --ioengine=libaio --rw=randread --bs=4k >>>>>> --numjobs=4 --size=1G --runtime=60 --group_reporting >>>>>> registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, >>>>>> (T) 4096B-4096B, ioengine=libaio, iodepth=1 >>>>>> registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, >>>>>> (T) 4096B-4096B, ioengine=libaio, iodepth=1 >>>>>> ... >>>>>> fio-3.33 >>>>>> Starting 4 processes >>>>>> registry-read: Laying out IO file (1 file / 1024MiB) >>>>>> registry-read: Laying out IO file (1 file / 1024MiB) >>>>>> registry-read: Laying out IO file (1 file / 1024MiB) >>>>>> registry-read: Laying out IO file (1 file / 1024MiB) >>>>>> Jobs: 4 (f=0): [f(4)][100.0%][r=33.5MiB/s][r=8578 IOPS][eta 00m:00s] >>>>>> registry-read: (groupid=0, jobs=4): err= 0: pid=24261: Thu Mar 20 >>>>>> 12:57:26 2025 >>>>>> read: IOPS=8538, BW=33.4MiB/s (35.0MB/s)(2001MiB/60001msec) >>>>>> slat (usec): min=309, max=4928, avg=464.54, stdev=73.15 >>>>>> clat (nsec): min=602, max=1532.4k, avg=1999.15, stdev=3724.16 >>>>>> lat (usec): min=310, max=4931, avg=466.54, stdev=73.36 >>>>>> clat percentiles (nsec): >>>>>> | 1.00th=[ 812], 5.00th=[ 884], 10.00th=[ 940], 20.00th=[ >>>>>> 1096], >>>>>> | 30.00th=[ 1368], 40.00th=[ 1576], 50.00th=[ 1720], 60.00th=[ >>>>>> 1832], >>>>>> | 70.00th=[ 1944], 80.00th=[ 2096], 90.00th=[ 2480], 95.00th=[ >>>>>> 3024], >>>>>> | 99.00th=[12480], 99.50th=[15808], 99.90th=[47360], >>>>>> 99.95th=[61696], >>>>>> | 99.99th=[90624] >>>>>> bw ( KiB/s): min=30448, max=35868, per=100.00%, avg=34155.76, >>>>>> stdev=269.75, samples=476 >>>>>> iops : min= 7612, max= 8966, avg=8538.87, stdev=67.43, >>>>>> samples=476 >>>>>> lat (nsec) : 750=0.06%, 1000=14.94% >>>>>> lat (usec) : 2=59.18%, 4=23.07%, 10=1.28%, 20=1.17%, 50=0.21% >>>>>> lat (usec) : 100=0.08%, 250=0.01%, 500=0.01% >>>>>> lat (msec) : 2=0.01% >>>>>> cpu : usr=1.04%, sys=5.50%, ctx=537639, majf=0, minf=36 >>>>>> IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >>>>>> >=64=0.0% >>>>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>>> >=64=0.0% >>>>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>>> >=64=0.0% >>>>>> issued rwts: total=512316,0,0,0 short=0,0,0,0 dropped=0,0,0,0 >>>>>> latency : target=0, window=0, percentile=100.00%, depth=1 >>>>>> >>>>>> Run status group 0 (all jobs): >>>>>> READ: bw=33.4MiB/s (35.0MB/s), 33.4MiB/s-33.4MiB/s >>>>>> (35.0MB/s-35.0MB/s), io=2001MiB (2098MB), run=60001-60001msec >>>>>> >>>>>> Results are worse than yours, but this is on a production (not very >>>>>> busy) pool with 4x3.84TB SATA disks (4 disks total vs ~15 disks in your >>>>>> case) and 10G network. >>>>>> >>>>>> VM cpu is x86_64_v3 and host CPU Ryzen 1700. >>>>>> >>>>>> I gest almost the same IOPS with --iodepth=16 . >>>>>> >>>>>> I tried moving the VM to a Ryzen 5900X and results are somewhat better: >>>>>> >>>>>> # fio --name=registry-read --ioengine=libaio --rw=randread --bs=4k >>>>>> --numjobs=4 --size=1G --runtime=60 --group_reporting --iodepth=16 >>>>>> registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, >>>>>> (T) 4096B-4096B, ioengine=libaio, iodepth=16 >>>>>> ... 
>>>>>> fio-3.33 >>>>>> Starting 4 processes >>>>>> Jobs: 4 (f=4): [r(4)][100.0%][r=45.4MiB/s][r=11.6k IOPS][eta 00m:00s] >>>>>> registry-read: (groupid=0, jobs=4): err= 0: pid=24282: Thu Mar 20 >>>>>> 13:18:23 2025 >>>>>> read: IOPS=11.6k, BW=45.5MiB/s (47.7MB/s)(2730MiB/60001msec) >>>>>> slat (usec): min=110, max=21206, avg=341.21, stdev=79.69 >>>>>> clat (nsec): min=1390, max=42395k, avg=5147009.08, stdev=475506.40 >>>>>> lat (usec): min=335, max=42779, avg=5488.22, stdev=498.03 >>>>>> clat percentiles (usec): >>>>>> | 1.00th=[ 4621], 5.00th=[ 4752], 10.00th=[ 4817], 20.00th=[ >>>>>> 4948], >>>>>> | 30.00th=[ 5014], 40.00th=[ 5080], 50.00th=[ 5080], 60.00th=[ >>>>>> 5145], >>>>>> | 70.00th=[ 5211], 80.00th=[ 5276], 90.00th=[ 5407], 95.00th=[ >>>>>> 5538], >>>>>> | 99.00th=[ 6194], 99.50th=[ 6783], 99.90th=[ 9765], >>>>>> 99.95th=[12125], >>>>>> | 99.99th=[24249] >>>>>> bw ( KiB/s): min=36434, max=48352, per=100.00%, avg=46612.18, >>>>>> stdev=300.09, samples=476 >>>>>> iops : min= 9108, max=12088, avg=11653.04, stdev=75.03, >>>>>> samples=476 >>>>>> lat (usec) : 2=0.01%, 500=0.01%, 750=0.01%, 1000=0.01% >>>>>> lat (msec) : 2=0.01%, 4=0.01%, 10=99.90%, 20=0.08%, 50=0.01% >>>>>> cpu : usr=0.98%, sys=4.18%, ctx=706399, majf=0, minf=99 >>>>>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >>>>>> >=64=0.0% >>>>>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >>>>>> >=64=0.0% >>>>>> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >>>>>> >=64=0.0% >>>>>> issued rwts: total=698956,0,0,0 short=0,0,0,0 dropped=0,0,0,0 >>>>>> latency : target=0, window=0, percentile=100.00%, depth=16 >>>>>> >>>>>> Run status group 0 (all jobs): >>>>>> READ: bw=45.5MiB/s (47.7MB/s), 45.5MiB/s-45.5MiB/s >>>>>> (47.7MB/s-47.7MB/s), io=2730MiB (2863MB), run=60001-60001msec >>>>>> >>>>>> I think we're limited by the IO thread. I suggest you try multiple disks >>>>>> with SCSI Virtio single. >>>>>> >>>>>> My VM conf: >>>>>> agent: 1 >>>>>> boot: order=scsi0;ide2;net0 >>>>>> cores: 2 >>>>>> cpu: x86-64-v3 >>>>>> ide2: none,media=cdrom >>>>>> memory: 2048 >>>>>> meta: creation-qemu=9.0.2,ctime=1739888364 >>>>>> name: elacunza-btrfs-test >>>>>> net0: virtio=BC:24:11:47:9B:58,bridge=vmbr0,firewall=1 >>>>>> numa: 0 >>>>>> ostype: l26 >>>>>> scsi0: proxmox_r3_ssd2:vm-112-disk-0,discard=on,iothread=1,size=15G >>>>>> scsihw: virtio-scsi-single >>>>>> smbios1: uuid=263ab229-4379-4abf-b6bf-615b98ccd3d4 >>>>>> sockets: 1 >>>>>> vmgenid: 13b7f2a4-2a42-4600-845a-da88f96ae6e8 >>>>>> >>>>>> I think this is a KVM/QEMU issue, not a Ceph issue :) Maybe you can get >>>>>> better suggestions in pve-user mailing list. >>>>>> >>>>>> Cheers >>>>>> >>>>>> El 20/3/25 a las 12:29, Giovanna Ratini escribió: >>>>>>> Hello Eneko, >>>>>>> >>>>>>> this is my configuration. The performance is similar across all VMs. I >>>>>>> am now checking GitLab, as that is where people are complaining the >>>>>>> most. 
>>>>>>> >>>>>>> agent: 1 >>>>>>> balloon: 65000 >>>>>>> bios: ovmf >>>>>>> boot: order=scsi0;net0 >>>>>>> cores: 10 >>>>>>> cpu: host >>>>>>> efidisk0: cephvm:vm-6506-disk-0,efitype=4m,size=528K >>>>>>> memory: 130000 >>>>>>> meta: creation-qemu=9.0.2,ctime=1734995123 >>>>>>> name: gitlab02 >>>>>>> net0: virtio=BC:24:11:6E:28:71,bridge=vmbr1,firewall=1 >>>>>>> numa: 0 >>>>>>> ostype: l26 >>>>>>> scsi0: >>>>>>> cephvm:vm-6506-disk-1,aio=native,cache=writeback,iothread=1,size=64G,ssd=1 >>>>>>> scsi1: >>>>>>> cephvm:vm-6506-disk-2,aio=native,cache=writeback,iothread=1,size=10T,ssd=1 >>>>>>> scsihw: virtio-scsi-single >>>>>>> smbios1: uuid=0a5294c0-c82a-40f2-aae4-f5880022a2ac >>>>>>> sockets: 2 >>>>>>> vmgenid: ea610fde-6c71-4b7f-9257-fa431a428e16 >>>>>>> >>>>>>> Cheers, >>>>>>> >>>>>>> Gio >>>>>>> >>>>>>> Am 20.03.2025 um 10:23 schrieb Eneko Lacunza: >>>>>>>> Hi Giovanna, >>>>>>>> >>>>>>>> Can you post VM's full config? >>>>>>>> >>>>>>>> Also, can you test with IO thread enabled and SCSI virtio single, and >>>>>>>> multiple disks? >>>>>>>> >>>>>>>> Cheers >>>>>>>> >>>>>>>> El 19/3/25 a las 17:27, Giovanna Ratini escribió: >>>>>>>>> >>>>>>>>> hello Eneko, >>>>>>>>> >>>>>>>>> Yes I did. No significant changes. :-( >>>>>>>>> Cheers, >>>>>>>>> >>>>>>>>> Gio >>>>>>>>> >>>>>>>>> >>>>>>>>> Am Mittwoch, März 19, 2025 13:09 CET, schrieb Eneko Lacunza >>>>>>>>> <elacu...@binovo.es>: >>>>>>>>> >>>>>>>>>> Hi Giovanna, >>>>>>>>>> >>>>>>>>>> Have you tried increasing iothreads option for the VM? >>>>>>>>>> >>>>>>>>>> Cheers >>>>>>>>>> >>>>>>>>>> El 18/3/25 a las 19:13, Giovanna Ratini escribió: >>>>>>>>>> > Hello Antony, >>>>>>>>>> > >>>>>>>>>> > no, no QoS applied to Vms. >>>>>>>>>> > >>>>>>>>>> > The Server has PCIe Gen 4 >>>>>>>>>> > >>>>>>>>>> > ceph osd dump | grep pool >>>>>>>>>> > pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash >>>>>>>>>> > rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 21 flags >>>>>>>>>> > hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application >>>>>>>>>> > mgr >>>>>>>>>> > read_balance_score 13.04 >>>>>>>>>> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 >>>>>>>>>> > object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on >>>>>>>>>> > last_change 598 lfor 0/598/596 flags hashpspool stripe_width 0 >>>>>>>>>> > application cephfs read_balance_score 2.02 >>>>>>>>>> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 >>>>>>>>>> > object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on >>>>>>>>>> > last_change 50 flags hashpspool stripe_width 0 pg_autoscale_bias 4 >>>>>>>>>> > pg_num_min 16 recovery_priority 5 application cephfs >>>>>>>>>> > read_balance_score 2.42 >>>>>>>>>> > pool 4 'cephvm' replicated size 3 min_size 2 crush_rule 0 >>>>>>>>>> > object_hash >>>>>>>>>> > rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 16386 >>>>>>>>>> > lfor 0/644/2603 flags hashpspool,selfmanaged_snaps stripe_width 0 >>>>>>>>>> > application rbd read_balance_score 1.52 >>>>>>>>>> > >>>>>>>>>> > I think, this is the default config. 🙈 >>>>>>>>>> > >>>>>>>>>> > I will search for my chassies supermicro upgrade. >>>>>>>>>> > >>>>>>>>>> > Thank you >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > Am 18.03.2025 um 17:57 schrieb Anthony D'Atri: >>>>>>>>>> >>> Then I tested on the *Proxmox host*, and the results were >>>>>>>>>> >>> significantly better. 
>>>>>>>>>> >> My Proxmox prowess is limited, but from my experience with other >>>>>>>>>> >> virtualization platforms, I have to ask if there is any QoS >>>>>>>>>> >> throttling applied to VMs. With OpenStack or DO there is often >>>>>>>>>> >> IOPS >>>>>>>>>> >> and/or throughput throttling via libvirt to mitigate noisy >>>>>>>>>> >> neighbors. >>>>>>>>>> >> >>>>>>>>>> >>> fio --name=host-test --filename=/dev/rbd0 --ioengine=libaio >>>>>>>>>> >>> --rw=randread --bs=4k --numjobs=4 --iodepth=32 --size=1G >>>>>>>>>> >>> --runtime=60 --group_reporting >>>>>>>>>> >>> >>>>>>>>>> >>> *IOPS*: *1.54M* >>>>>>>>>> >>> >>>>>>>>>> >>> # *Bandwidth*: *6032MiB/s (6325MB/s)* >>>>>>>>>> >>> # *Latency*: >>>>>>>>>> >>> >>>>>>>>>> >>> * *Avg*: *39.8µs* >>>>>>>>>> >>> * *99.9th percentile*: *71µs* >>>>>>>>>> >>> >>>>>>>>>> >>> # *CPU Usage*: *usr=22.60%, sys=77.13%* >>>>>>>>>> >>> # >>>>>>>>>> >>> >>>>>>>>>> >>> Am 18.03.2025 um 15:27 schrieb Anthony D'Atri: >>>>>>>>>> >>>> Which NVMe drive SKUs specifically? >>>>>>>>>> >>> # */dev/nvme6n1* – *KCD61LUL15T3* – 15.36 TB – SN: 6250A02QT5A8 >>>>>>>>>> >>> # */dev/nvme5n1* – *KCD61LUL15T3* – 15.36 TB – SN: 42R0A036T5A8 >>>>>>>>>> >>> # */dev/nvme4n1* – *KCD61LUL15T3* – 15.36 TB – SN: 6250A02UT5A8 >>>>>>>>>> >> Kioxia CD6. If you were using client-class drives all manner of >>>>>>>>>> >> performance issues would be expected. >>>>>>>>>> >> >>>>>>>>>> >> Is your server chassis at least PCIe Gen 4? If it’s Gen 3 that may >>>>>>>>>> >> hamper these drives. >>>>>>>>>> >> >>>>>>>>>> >> Also, how many of these are in your cluster? If it’s a small >>>>>>>>>> >> number >>>>>>>>>> >> you might still benefit from chopping each into at least 2 >>>>>>>>>> >> separate >>>>>>>>>> >> OSDs. >>>>>>>>>> >> >>>>>>>>>> >> And please send `ceph osd dump | grep pool`, having too few PGs >>>>>>>>>> >> wouldn’t do you any favors. >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >>>> Are you running a recent kernel? >>>>>>>>>> >>> penultimate: 6.8.12-8-pve (VM, yes) >>>>>>>>>> >> Groovy. If you were running like a CentOS 6 or CentOS 7 kernel >>>>>>>>>> >> then >>>>>>>>>> >> NVMe issues might be expected as old kernels had rudimentary NVMe >>>>>>>>>> >> support. >>>>>>>>>> >> >>>>>>>>>> >>>> Have you updated firmware on the NVMe devices? >>>>>>>>>> >>> No. >>>>>>>>>> >> Kioxia appears to not release firmware updates publicly but your >>>>>>>>>> >> chassis brand (Dell, HP, SMCI, etc) might have an update. >>>>>>>>>> >> e.g.https://www.dell.com/support/home/en-vc/drivers/driversdetails?driverid=7ny55 >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> >> If there is an available update I would strongly suggest >>>>>>>>>> >> applying. >>>>>>>>>> > >>>>>>>>>> >> >>>>>>>>>> >>> Thanks again, >>>>>>>>>> >>> >>>>>>>>>> >>> best regards, >>>>>>>>>> >>> Gio >>>>>>>>>> >>> >>>>>>>>>> >>> _______________________________________________ >>>>>>>>>> >>> ceph-users mailing list --ceph-users@ceph.io >>>>>>>>>> >>> To unsubscribe send an email toceph-users-le...@ceph.io >>>>>>>>>> > _______________________________________________ >>>>>>>>>> > ceph-users mailing list -- ceph-users@ceph.io >>>>>>>>>> > To unsubscribe send an email to ceph-users-le...@ceph.io >>>>>>>>>> >>>>>>>>>> Eneko Lacunza >>>>>>>>>> Zuzendari teknikoa | Director técnico >>>>>>>>>> Binovo IT Human Project >>>>>>>>>> >>>>>>>>>> Tel. +34 943 569 206 <tel:+34 943 569 206> | https://www.binovo.es >>>>>>>>>> Astigarragako Bidea, 2 - 2º izda. 
Oficina 10-11, 20180 Oiartzun >>>>>>>>>> >>>>>>>>>> https://www.youtube.com/user/CANALBINOVO >>>>>>>>>> https://www.linkedin.com/company/37269706/ >>>>>>>>>> _______________________________________________ >>>>>>>>>> ceph-users mailing list -- ceph-users@ceph.io >>>>>>>>>> To unsubscribe send an email to ceph-users-le...@ceph.io >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> EnekoLacunza >>>>>>>> >>>>>>>> Director Técnico | Zuzendari teknikoa >>>>>>>> >>>>>>>> Binovo IT Human Project >>>>>>>> >>>>>>>> 943 569 206 <tel:943 569 206> >>>>>>>> >>>>>>>> elacu...@binovo.es <mailto:elacu...@binovo.es> >>>>>>>> >>>>>>>> binovo.es <//binovo.es> >>>>>>>> >>>>>>>> Astigarragako Bidea, 2 - 2 izda. Oficina 10-11, 20180 Oiartzun >>>>>>>> >>>>>>>> >>>>>>>> youtube <https://www.youtube.com/user/CANALBINOVO/> >>>>>>>> linkedin <https://www.linkedin.com/company/37269706/> >>>>>>>> _______________________________________________ >>>>>>>> ceph-users mailing list -- ceph-users@ceph.io >>>>>>>> To unsubscribe send an email to ceph-users-le...@ceph.io >>>>>>> _______________________________________________ >>>>>>> ceph-users mailing list -- ceph-users@ceph.io >>>>>>> To unsubscribe send an email to ceph-users-le...@ceph.io >>>>>> >>>>>> Eneko Lacunza >>>>>> Zuzendari teknikoa | Director técnico >>>>>> Binovo IT Human Project >>>>>> >>>>>> Tel. +34 943 569 206 | https://www.binovo.es >>>>>> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun >>>>>> >>>>>> https://www.youtube.com/user/CANALBINOVO >>>>>> https://www.linkedin.com/company/37269706/ >>>>>> _______________________________________________ >>>>>> ceph-users mailing list -- ceph-users@ceph.io >>>>>> To unsubscribe send an email to ceph-users-le...@ceph.io >>>>> >>>> >>>> Eneko Lacunza >>>> Zuzendari teknikoa | Director técnico >>>> Binovo IT Human Project >>>> >>>> Tel. +34 943 569 206 |https://www.binovo.es >>>> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun >>>> >>>> https://www.youtube.com/user/CANALBINOVO >>>> https://www.linkedin.com/company/37269706/ >>>> _______________________________________________ >>>> ceph-users mailing list -- ceph-users@ceph.io >>>> To unsubscribe send an email to ceph-users-le...@ceph.io >>> >> >> Eneko Lacunza >> Zuzendari teknikoa | Director técnico >> Binovo IT Human Project >> >> Tel. +34 943 569 206 | https://www.binovo.es >> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun >> >> https://www.youtube.com/user/CANALBINOVO >> https://www.linkedin.com/company/37269706/ >> _______________________________________________ >> ceph-users mailing list -- ceph-users@ceph.io >> To unsubscribe send an email to ceph-users-le...@ceph.io > _______________________________________________ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io _______________________________________________ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io