Hi Martin,

I think what Peter is suggesting is that you rerun the test with --numjobs=128 and 
--iodepth=16 to see what your hardware is really capable of with this very 
small-block I/O workload.
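
For reference, that would essentially be the same command you posted below, with 
only the concurrency parameters changed and --group_reporting added so fio sums 
up the per-job results (the /dev/sdb target is taken from your example, adjust 
as needed):

fio --ioengine=libaio --filename=/dev/sdb --direct=1 --sync=1 --rw=write --bs=4K \
    --numjobs=128 --iodepth=16 --group_reporting --runtime=60 --time_based --name=fio

Comparing that result with your single-job run should tell you whether you are 
limited by per-I/O latency or by the aggregate throughput of the drives and the 
cluster.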

Regards,
Frédéric.


________________________________
From: Martin Gerhard Loschwitz <martin.loschw...@true-west.com>
Sent: Tuesday, 26 November 2024 22:08
To: Peter Linder
Cc: ceph-users@ceph.io 
Subject: [ceph-users] Re: 4k IOPS: miserable performance in All-SSD cluster

Here’s a benchmark of another setup I did a few months back, with NVMe flash 
drives and a Mellanox EVPN fabric (Spectrum ASIC) between the nodes (no RDMA). 
3 hosts and 24 drives in total. 

root@test01:~# fio --ioengine=libaio --filename=/dev/sdb --direct=1 --sync=1 
--rw=write --bs=4K --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio 
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=1 
fio-3.33 
Starting 1 process 
Jobs: 1 (f=1): [W(1)][100.0%][w=6966KiB/s][w=1741 IOPS][eta 00m:00s] 
fio: (groupid=0, jobs=1): err= 0: pid=115698: Tue May 28 16:54:38 2024 
  write: IOPS=1804, BW=7218KiB/s (7391kB/s)(423MiB/60001msec); 0 zone resets 
    slat (nsec): min=2872, max=92926, avg=5026.65, stdev=2710.03 
    clat (usec): min=419, max=4486, avg=548.34, stdev=54.66 
     lat (usec): min=461, max=4490, avg=553.37, stdev=55.02 
    clat percentiles (usec): 
     |  1.00th=[  486],  5.00th=[  502], 10.00th=[  510], 20.00th=[  523], 
     | 30.00th=[  529], 40.00th=[  537], 50.00th=[  545], 60.00th=[  553], 
     | 70.00th=[  562], 80.00th=[  570], 90.00th=[  586], 95.00th=[  594], 
     | 99.00th=[  660], 99.50th=[  758], 99.90th=[ 1156], 99.95th=[ 1287], 
     | 99.99th=[ 2606] 
   bw (  KiB/s): min= 6664, max= 8072, per=100.00%, avg=7225.95, stdev=268.19, 
samples=119 
   iops        : min= 1666, max= 2018, avg=1806.49, stdev=67.05, samples=119 
  lat (usec)   : 500=4.95%, 750=94.52%, 1000=0.38% 
  lat (msec)   : 2=0.13%, 4=0.02%, 10=0.01% 
  cpu          : usr=0.57%, sys=1.46%, ctx=108317, majf=0, minf=12 
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% 
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% 
     issued rwts: total=0,108275,0,0 short=0,0,0,0 dropped=0,0,0,0 
     latency   : target=0, window=0, percentile=100.00%, depth=1 

Run status group 0 (all jobs): 
  WRITE: bw=7218KiB/s (7391kB/s), 7218KiB/s-7218KiB/s (7391kB/s-7391kB/s), 
io=423MiB (443MB), run=60001-60001msec 

Disk stats (read/write): 
  sdb: ios=80/108093, merge=0/0, ticks=21/59172, in_queue=59193, util=99.96% 

This was from an instance inside VMware, so iSCSI was involved in the data path 
in addition to the normal Ceph replication, with Ceph itself being mostly 
out-of-the-box and standard. 

I wouldn’t consider 40 IOPS (or 400 in the SSD cluster) a bad value had I not 
seen substantially better results in the past. And even 1000 would be a very 
substantial improvement over what I see now. 

Best regards 
Martin 

-- 


Martin Gerhard Loschwitz 
Geschäftsführer / CEO, True West IT Services GmbH 
P +49 2433 5253130 
M +49 176 61832178 
A Schmiedegasse 24a, 41836 Hückelhoven, Deutschland 
R HRB 21985, Amtsgericht Mönchengladbach 
True West IT Services GmbH is compliant with the GDPR regulation on data 
protection and privacy in the European Union and the European Economic Area. 
You can request the information on how we collect and process your private data 
according to the law by contacting the email sender. 


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io