Public bug reported:

$ lsb_release -rd
Description:    Ubuntu 16.04.2 LTS
Release:        16.04

Package: fio (2.2.10-1ubuntu1) [universe]

I've been using fio to look at the effect of zfs recordsize on random
writes to a zfs dataset. After wasting the better part of a day trying
to understand fio's unrealistically high reported bandwidth I finally
used hexcurse to have a look at one of the files that fio allocated
before doing its randwrite benchmark. I discovered that chunks that were
supposed to be random data were in fact highly non-random; each "random"
chunk was identical (or nearly so) to the previous "random" chunk. If
the zfs dataset uses compression (lz4 is on by default), the data end up
being highly compressed before being written to disk, severely
distorting the reported benchmark results. For example, for this fio
command:

fio \
--name=random-writers \
--ioengine=sync \
--direct=0 \
--rw=randwrite \
--bs=512k \
--refill_buffers \
--scramble_buffers=0 \
--buffer_compress_percentage=50 \
--buffer_compress_chunk=512 \
--size=2g \
--time_based \
--runtime=20 \
--randrepeat=0 \
--norandommap \
--random_generator=lfsr \
--numjobs=8 \
--group_reporting

xenial's fio 2.2.10 reported an aggregate bandwidth of 3315 MB/s (for a
SATA 3.1 SSD) due to the highly compressible data:

# zfs get written,logicalused,compressratio tank/test
NAME       PROPERTY       VALUE    SOURCE
tank/test  written        2.03G    -
tank/test  logicalused    16.0G    -
tank/test  compressratio  7.94x    -
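
As a rough sanity check on these numbers, dividing fio's reported
bandwidth by the dataset's compressratio gives an estimate of the rate
at which data actually reached the SSD. This is only a sketch:
physical_rate is a made-up helper name, and the estimate ignores write
buffering (--direct=0) and zfs metadata overhead.

```shell
#!/bin/sh
# Hypothetical helper: estimate the on-disk write rate from the
# bandwidth fio reports and the compressratio zfs reports.
#   $1 = reported bandwidth (MB/s), $2 = compressratio without the "x"
physical_rate() {
    awk -v mbps="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", mbps / ratio }'
}

# Values from the fio 2.2.10 run above.
physical_rate 3315 7.94    # prints 417.5
```

Roughly 417 MB/s is a plausible rate for a SATA SSD, unlike the
3315 MB/s that fio reported.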

Not all of the above fio parameters (in particular, --random_generator)
are required in order to see the problem; this was the final command I
wound up with while trying to fix the non-random "random" data. I
finally gave up and compiled fio 2.21 from source:

https://github.com/axboe/fio/releases

Running the exact same command on the same zfs dataset on the same
hardware, fio 2.21 reports a believable aggregate bandwidth of 808 MiB/s
(848 MB/s) and zfs reports compressratio 1.59x, in line with what I'd
expect for data that is "50% random" as requested by the fio command.
Using hexcurse to look at a file fio generated before starting the
benchmark, the "random" file chunks now appear truly random; there is no
longer any obvious pattern repeated among successive "random" chunks.
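
A quicker check than eyeballing a hex dump is to compress part of the
file and compare sizes. The following is a minimal sketch, not what was
used for this report: compress_ratio is a made-up helper, gzip stands in
for lz4 (so the number will not match zfs's compressratio), and the file
name random-writers.0.0 assumes fio's default
<jobname>.<jobnum>.<filenum> naming.

```shell
#!/bin/sh
# Hypothetical helper: print raw_size / gzip_size for the first N bytes
# of a file (default 64 MiB). Truly random data gives a ratio very close
# to 1; repeated or zero-filled data gives a much larger ratio.
# Note: gzip only finds repeats within a 32 KiB window.
compress_ratio() {
    file=$1
    nbytes=${2:-67108864}
    raw=$(head -c "$nbytes" "$file" | wc -c)
    packed=$(head -c "$nbytes" "$file" | gzip -1 -c | wc -c)
    awk -v r="$raw" -v p="$packed" 'BEGIN { printf "%.2f\n", r / p }'
}

# Example (assumed file name, run in the directory fio wrote to):
#   compress_ratio random-writers.0.0
```

Running it against files laid out by both fio versions should show
whether the "random" chunks are as incompressible as requested.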

So this fio bug has been fixed upstream somewhere between 2.2.11 and
2.21. Short changelogs are available from:
http://brick.kernel.dk/snaps/

I was unable to determine exactly when the bug was fixed.

But I wanted to issue a public service announcement: DO NOT rely on
Xenial's fio 2.2.10 to use truly random data.

Until a more recent fio is available for Xenial, in repo or as PPA, I
recommend compiling fio from source. The bug reported here is fixed in
fio 2.21 (and probably some earlier versions).
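
Benchmark scripts can guard against picking up the older binary by
checking the version first. This is a sketch under two assumptions:
that `fio --version` prints a string such as "fio-2.2.10", and that GNU
sort (for -V) is available. The function names are made up, and 2.21 is
used as the threshold only because it is the one version confirmed good
here.

```shell
#!/bin/sh
# True if version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical guard: fail unless the fio on PATH is at least 2.21.
# Assumption: `fio --version` prints e.g. "fio-2.2.10".
check_fio() {
    have=$(fio --version 2>/dev/null | sed 's/^fio-//')
    if [ -z "$have" ] || ! version_ge "$have" 2.21; then
        echo "fio ${have:-not found}: random buffers may be compressible" >&2
        return 1
    fi
}
```

Calling `check_fio || exit 1` at the top of a benchmark script stops the
run before any misleading numbers are produced.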

** Affects: fio (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1703440

Title:
  xenial fio 2.2.10 randwrite: "random" data NOT random, highly
  compressible -> highly misleading output

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fio/+bug/1703440/+subscriptions

