Public bug reported:

$ lsb_release -rd
Description:    Ubuntu 16.04.2 LTS
Release:        16.04

Package: fio (2.2.10-1ubuntu1) [universe]

I've been using fio to look at the effect of zfs recordsize on random writes to a zfs dataset. After wasting the better part of a day trying to understand fio's unrealistically high reported bandwidth, I finally used hexcurse to have a look at one of the files that fio allocated before doing its randwrite benchmark. I discovered that chunks that were supposed to be random data were in fact highly non-random; each "random" chunk was identical (or nearly so) to the previous "random" chunk. If the zfs dataset uses compression (lz4 is on by default), the data end up being highly compressed before being written to disk, severely distorting the reported benchmark results.

For example, for this fio command:

    fio \
      --name=random-writers \
      --ioengine=sync \
      --direct=0 \
      --rw=randwrite \
      --bs=512k \
      --refill_buffers \
      --scramble_buffers=0 \
      --buffer_compress_percentage=50 \
      --buffer_compress_chunk=512 \
      --size=2g \
      --time_based \
      --runtime=20 \
      --randrepeat=0 \
      --norandommap \
      --random_generator=lfsr \
      --numjobs=8 \
      --group_reporting

xenial's fio 2.2.10 reported an aggregate bandwidth of 3315 MB/s (for a SATA 3.1 SSD) due to the highly compressible data:

    # zfs get written,logicalused,compressratio tank/test
    NAME       PROPERTY       VALUE  SOURCE
    tank/test  written        2.03G  -
    tank/test  logicalused    16.0G  -
    tank/test  compressratio  7.94x  -

Not all of the above fio parameters (in particular, --random_generator) are required in order to see the problem; this was the final command I wound up with while trying to fix the non-random "random" data.

I finally gave up and compiled fio 2.21 from source:

https://github.com/axboe/fio/releases

Running the exact same command on the same zfs dataset on the same hardware, fio 2.21 reports a believable aggregate bandwidth of 808 MiB/s (848 MB/s), and zfs reports compressratio 1.59x, in line with what I'd expect for data that is "50% random" as requested by the fio command.

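The manual inspection described above can be approximated with a short script. What follows is only a minimal sketch and not part of the original report: the function names, the 128 KiB record size, and the use of zlib as a stand-in for lz4 are all assumptions made for illustration, so the ratios it prints will not match zfs's compressratio exactly. It reports how well the data compresses record by record, and what fraction of fixed-size chunks are byte-for-byte identical to the chunk before them.

```python
import zlib


def compress_ratio(data: bytes, record_size: int = 128 * 1024) -> float:
    """Return uncompressed/compressed size, compressing each record on its own.

    zlib stands in for lz4 here (an assumption); record_size loosely mimics
    a zfs recordsize and is likewise an assumption, not a measured value.
    """
    if not data:
        raise ValueError("no data to analyze")
    compressed = 0
    for start in range(0, len(data), record_size):
        compressed += len(zlib.compress(data[start:start + record_size]))
    return len(data) / compressed


def repeated_chunk_fraction(data: bytes, chunk_size: int = 512) -> float:
    """Return the fraction of chunks identical to the immediately preceding chunk."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if len(chunks) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(chunks, chunks[1:]) if prev == cur)
    return repeats / (len(chunks) - 1)
```

To try it, read the first few MiB of one of the files fio laid out (rather than the whole 2g file) and pass the bytes to both functions. A compression ratio far above what the requested --buffer_compress_percentage would suggest, or a large fraction of repeated chunks, points to the problem described here. Note that the second function only catches exact repeats, not chunks that are merely "nearly" identical.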
When I use hexcurse to look at a file that fio 2.21 generated before starting the benchmark, the "random" file chunks now appear truly random; there is no longer any obvious pattern repeated among successive "random" chunks.

So this fio bug has been fixed upstream somewhere between 2.2.11 and 2.21. Short changelogs are available from:

http://brick.kernel.dk/snaps/

I was unable to determine exactly when the bug was fixed, but I wanted to issue a public service announcement: DO NOT rely on Xenial's fio 2.2.10 to use truly random data. Until a more recent fio is available for Xenial, in the repo or as a PPA, I recommend compiling fio from source. The bug reported here is fixed in fio 2.21 (and probably some earlier versions).

** Affects: fio (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1703440

Title:
  xenial fio 2.2.10 randwrite: "random" data NOT random, highly compressible -> highly misleading output