On Mon, Feb 13, 2023 at 01:50:13PM -0000, Stuart Henderson wrote:
> ...
> It may be worth checking whether mfs is actually helping -
> it's easy to assume that because it's in RAM it must be fast,
> but I've had machines where mfs was slower than SSD
> (https://marc.info/?l=openbsd-misc&m=164942119618029&w=2),
> also it's taking memory that could otherwise be used by
> buffer cache.

Hi All,

Since you mentioned it, I thought I would retry your dd test ...

# mount | grep /tmp
mfs:15266 on /tmp type mfs (asynchronous, local, nodev, nosuid, size=16777216 512-blocks)

% cd !$ ; for i in `jot 5`; do dd if=/dev/zero of=mfs bs=1m count=990 2>&1 | grep bytes; done
cd /tmp/dd_test ; for i in `jot 5`; do dd if=/dev/zero of=mfs bs=1m count=990 2>&1 | grep bytes; done
1038090240 bytes transferred in 1.376 secs (754215208 bytes/sec)
1038090240 bytes transferred in 1.189 secs (872536649 bytes/sec)
1038090240 bytes transferred in 1.227 secs (845718432 bytes/sec)
1038090240 bytes transferred in 1.186 secs (874866632 bytes/sec)
1038090240 bytes transferred in 1.254 secs (827186370 bytes/sec)

# mount | grep /fast
/dev/sd1l on /fast type ffs (local, nodev, nosuid, softdep)
# dmesg | grep sd1
sd1 at scsibus2 targ 1 lun 0: <NVMe, Samsung SSD 970, 1B2Q>
...

% cd /fast/dd_test ; for i in `jot 5`; do dd if=/dev/zero of=fast bs=1m count=990 2>&1 | grep bytes; done
1038090240 bytes transferred in 0.871 secs (1191076597 bytes/sec)
1038090240 bytes transferred in 0.635 secs (1633246669 bytes/sec)
1038090240 bytes transferred in 0.615 secs (1685529408 bytes/sec)
1038090240 bytes transferred in 0.605 secs (1714639562 bytes/sec)
1038090240 bytes transferred in 0.612 secs (1694489764 bytes/sec)


So for sequential writes, at least, the Samsung NVMe device is clearly faster than mfs ...
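
(One caveat I'm not sure about: since /tmp is mounted async and /fast uses softdep, dd's numbers may mostly reflect writes into the buffer cache rather than to the device itself. If I wanted the flush included in the timing, I'd try wrapping the run in something like the line below and watching the Cache figure in top(1) while it runs -- the output file name is arbitrary, and I'm assuming sync(8) actually waits for the dirty buffers here, which I haven't verified:)

% time sh -c 'dd if=/dev/zero of=ddtest bs=1m count=990 2>/dev/null; sync'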

However, I also tried comparing the same two filesystems with the
"Flexible IO Tester", fio (available as a package). When I used it to do
random 4 KB reads and writes, with an fsync after every write, the result
appears to be reversed: mfs comes out well ahead of the NVMe filesystem:

fio --name=rand_mmap_r+w --directory=/tmp/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=psync --fsync=1 --thread --numjobs=1 --group_reporting
...
Run status group 0 (all jobs):
   READ: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=30.0GiB 
(32.2GB), run=236394-236394msec
  WRITE: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=30.0GiB 
(32.2GB), run=236394-236394msec

% fio --name=rand_mmap_r+w --directory=/fast/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=psync --fsync=1 --thread --numjobs=1 --group_reporting
...
Run status group 0 (all jobs):
   READ: bw=34.8MiB/s (36.5MB/s), 34.8MiB/s-34.8MiB/s (36.5MB/s-36.5MB/s), 
io=20.4GiB (21.9GB), run=600000-600000msec
  WRITE: bw=34.8MiB/s (36.4MB/s), 34.8MiB/s-34.8MiB/s (36.4MB/s-36.4MB/s), 
io=20.4GiB (21.9GB), run=600000-600000msec

I wonder why that would be?
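
My only (unverified) guess is that the --fsync=1 after every 4k write is what dominates on the NVMe filesystem, while an fsync on an async mfs presumably has very little real work to do. If that's the explanation, dropping the per-write fsync (fio's default is fsync=0) and only syncing when the job finishes should narrow the gap; something like the following, where the test name is just a label I made up:

% fio --name=rand_r+w_nofsync --directory=/fast/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=psync --end_fsync=1 --thread --numjobs=1 --group_reporting

I haven't run that variant, so treat it as a sketch rather than a result.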

Disclaimer: I know almost nothing about fio; I've never used it before.
In particular, it isn't clear to me what the correct/best choice is for
the "ioengine" option. (I played around with a few different settings,
which is why "mmap" appears in the test name even though the runs below
use ioengine=psync.)
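
In case anyone wants to reproduce this with a different engine: as far as I can tell the portable choices in fio are the sync-style engines (sync, psync, vsync) and mmap, so an mmap run would look something like the line below -- same flags, only the engine swapped, and I haven't kept the exact output from my earlier experiments:

% fio --name=rand_mmap_r+w --directory=/fast/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=mmap --fsync=1 --thread --numjobs=1 --group_reporting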

This is on an 8th-generation i5 Intel NUC running a recent snapshot: 7.2
GENERIC.MP#1049

The CPU has 4 cores; hyperthreading is off. The underlying device for
"/fast" is a Samsung M.2 NVMe "stick":
nvme0: Samsung SSD 970 EVO Plus 500GB, firmware 1B2QEXM7 ...

The full output from fio is included below for anyone who might be
interested ...

Cheers,
Robb.


fio --name=rand_mmap_r+w --directory=/tmp/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=psync --fsync=1 --thread --numjobs=1 --group_reporting
rand_mmap_r+w: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=psync, iodepth=1
fio-3.33
Starting 1 thread
rand_mmap_r+w: Laying out IO file (1 file / 6144MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=134MiB/s,w=134MiB/s][r=34.3k,w=34.2k IOPS][eta 
00m:00s]
rand_mmap_r+w: (groupid=0, jobs=1): err= 0: pid=669956672: Wed Feb 15 13:52:03 
2023
  read: IOPS=33.3k, BW=130MiB/s (136MB/s)(30.0GiB/236394msec)
    clat (nsec): min=1523, max=1504.6k, avg=5387.11, stdev=1201.82
     lat (nsec): min=1580, max=1504.7k, avg=5450.15, stdev=1203.46
    clat percentiles (nsec):
     |  1.00th=[ 3632],  5.00th=[ 4576], 10.00th=[ 4832], 20.00th=[ 5024],
     | 30.00th=[ 5152], 40.00th=[ 5280], 50.00th=[ 5344], 60.00th=[ 5472],
     | 70.00th=[ 5600], 80.00th=[ 5792], 90.00th=[ 5984], 95.00th=[ 6176],
     | 99.00th=[ 6496], 99.50th=[ 6688], 99.90th=[13376], 99.95th=[18048],
     | 99.99th=[26240]
   bw (  KiB/s): min=126573, max=144312, per=100.00%, avg=133298.71, 
stdev=2476.36, samples=472
   iops        : min=31643, max=36078, avg=33324.48, stdev=619.06, samples=472
  write: IOPS=33.2k, BW=130MiB/s (136MB/s)(30.0GiB/236394msec); 0 zone resets
    clat (usec): min=3, max=1549, avg=13.84, stdev= 2.06
     lat (usec): min=3, max=1549, avg=13.92, stdev= 2.07
    clat percentiles (nsec):
     |  1.00th=[ 6624],  5.00th=[11712], 10.00th=[12352], 20.00th=[12864],
     | 30.00th=[13376], 40.00th=[13760], 50.00th=[14016], 60.00th=[14400],
     | 70.00th=[14656], 80.00th=[15040], 90.00th=[15552], 95.00th=[15936],
     | 99.00th=[16512], 99.50th=[16768], 99.90th=[26752], 99.95th=[34560],
     | 99.99th=[41216]
   bw (  KiB/s): min=127680, max=144753, per=100.00%, avg=133082.82, 
stdev=2194.44, samples=472
   iops        : min=31920, max=36188, avg=33270.51, stdev=548.59, samples=472
  lat (usec)   : 2=0.01%, 4=0.92%, 10=50.99%, 20=47.96%, 50=0.11%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%
  lat (msec)   : 2=0.01%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=1227, max=1556.5k, avg=4512.57, stdev=3612.39
    sync percentiles (nsec):
     |  1.00th=[ 1272],  5.00th=[ 1288], 10.00th=[ 1288], 20.00th=[ 1304],
     | 30.00th=[ 1304], 40.00th=[ 1352], 50.00th=[ 3216], 60.00th=[ 6496],
     | 70.00th=[ 6816], 80.00th=[ 8256], 90.00th=[ 9152], 95.00th=[ 9408],
     | 99.00th=[ 9792], 99.50th=[10048], 99.90th=[13632], 99.95th=[17792],
     | 99.99th=[35584]
  cpu          : usr=13.95%, sys=80.43%, ctx=3930455, majf=0, minf=2
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=7870630,7858010,0,15728645 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=30.0GiB 
(32.2GB), run=236394-236394msec
  WRITE: bw=130MiB/s (136MB/s), 130MiB/s-130MiB/s (136MB/s-136MB/s), io=30.0GiB 
(32.2GB), run=236394-236394msec


% fio --name=rand_mmap_r+w --directory=/fast/fio_test --rw=randrw --blocksize=4k --size=6g --io_size=60g --runtime=600 --ioengine=psync --fsync=1 --thread --numjobs=1 --group_reporting
rand_mmap_r+w: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=psync, iodepth=1
fio-3.33
Starting 1 thread
rand_mmap_r+w: Laying out IO file (1 file / 6144MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=34.4MiB/s,w=34.7MiB/s][r=8806,w=8887 IOPS][eta 
00m:00s]
rand_mmap_r+w: (groupid=0, jobs=1): err= 0: pid=547844160: Wed Feb 15 14:02:54 
2023
  read: IOPS=8912, BW=34.8MiB/s (36.5MB/s)(20.4GiB/600000msec)
    clat (nsec): min=1936, max=1338.2k, avg=7079.46, stdev=1693.74
     lat (usec): min=2, max=1338, avg= 7.16, stdev= 1.69
    clat percentiles (nsec):
     |  1.00th=[ 3184],  5.00th=[ 3952], 10.00th=[ 6624], 20.00th=[ 6944],
     | 30.00th=[ 7072], 40.00th=[ 7200], 50.00th=[ 7264], 60.00th=[ 7392],
     | 70.00th=[ 7456], 80.00th=[ 7584], 90.00th=[ 7712], 95.00th=[ 7840],
     | 99.00th=[ 8096], 99.50th=[ 8256], 99.90th=[15296], 99.95th=[22656],
     | 99.99th=[35584]
   bw (  KiB/s): min= 2156, max=40456, per=100.00%, avg=35685.30, 
stdev=1649.88, samples=1199
   iops        : min=  539, max=10114, avg=8921.12, stdev=412.46, samples=1199
  write: IOPS=8898, BW=34.8MiB/s (36.4MB/s)(20.4GiB/600000msec); 0 zone resets
    clat (usec): min=2, max=5449, avg=30.59, stdev= 8.59
     lat (usec): min=2, max=5449, avg=30.68, stdev= 8.59
    clat percentiles (nsec):
     |  1.00th=[ 4960],  5.00th=[ 8640], 10.00th=[24960], 20.00th=[29312],
     | 30.00th=[31872], 40.00th=[32640], 50.00th=[33024], 60.00th=[33536],
     | 70.00th=[34048], 80.00th=[34560], 90.00th=[35072], 95.00th=[35584],
     | 99.00th=[36608], 99.50th=[37120], 99.90th=[56576], 99.95th=[59136],
     | 99.99th=[71168]
   bw (  KiB/s): min= 2156, max=40873, per=100.00%, avg=35630.64, 
stdev=1555.53, samples=1199
   iops        : min=  539, max=10218, avg=8907.43, stdev=388.88, samples=1199
  lat (usec)   : 2=0.01%, 4=2.70%, 10=51.30%, 20=0.49%, 50=45.42%
  lat (usec)   : 100=0.09%, 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=22, max=34396, avg=36.21, stdev=25.32
    sync percentiles (usec):
     |  1.00th=[   24],  5.00th=[   24], 10.00th=[   25], 20.00th=[   25],
     | 30.00th=[   25], 40.00th=[   25], 50.00th=[   41], 60.00th=[   47],
     | 70.00th=[   47], 80.00th=[   48], 90.00th=[   49], 95.00th=[   51],
     | 99.00th=[   53], 99.50th=[   55], 99.90th=[   75], 99.95th=[   80],
     | 99.99th=[  126]
  cpu          : usr=5.17%, sys=44.93%, ctx=16028689, majf=0, minf=2
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=5347458,5339252,0,10686712 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=34.8MiB/s (36.5MB/s), 34.8MiB/s-34.8MiB/s (36.5MB/s-36.5MB/s), 
io=20.4GiB (21.9GB), run=600000-600000msec
  WRITE: bw=34.8MiB/s (36.4MB/s), 34.8MiB/s-34.8MiB/s (36.4MB/s-36.4MB/s), 
io=20.4GiB (21.9GB), run=600000-600000msec
