On 11/22/23 00:04, Eugene Grosbein wrote:
22.11.2023 13:49, Jonathan Chen wrote:
Hi,
I'm running a somewhat recent version of STABLE-13/amd64:
stable/13-n256681-0b7939d725ba: Fri Nov 10 08:48:36 NZDT 2023, and I'm seeing
some unusual behaviour with ZFS.
To reproduce:
1. one big empty disk, GPT scheme, 1 freebsd-zfs partition.
2. create a zpool, eg: tank
3. create 2 sub-filesystems, eg: tank/one, tank/two
4. fill each sub-filesystem with large files until the pool is ~80% full. In
my case I had 200 10GB files in each.
5. in one session run 'md5 tank/one/*'
6. in another session run 'md5 tank/two/*'
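In command form the setup is roughly the following (da0 is only a placeholder for my actual disk, with the default mountpoints under /tank):
  gpart create -s gpt da0
  gpart add -t freebsd-zfs da0
  zpool create tank da0p1
  zfs create tank/one
  zfs create tank/two
  # fill tank/one and tank/two with large files until the pool is ~80% full,
  # then in two separate sessions:
  md5 /tank/one/*
  md5 /tank/two/*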
For most of my runs, one of the sessions against a sub-filesystem will be
starved of I/O, while the other one is performant.
Is anyone else seeing this?
More details of the disk, disk controller, and FreeBSD version may be
helpful. If it is a SATA disk, there may be an impact from the drive's own
reordering of the command queue it is given (NCQ) on top of what
FreeBSD+ZFS have already scheduled, and the OS version (plus the ZFS
version, if it is not the bundled one) would tell us what I/O balancing is
currently present/available.
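Something along these lines should capture most of that (ada0 is just an example device name):
  uname -a                   # exact FreeBSD/kernel version
  zfs version                # userland and kernel-module ZFS versions
  camcontrol devlist         # disks and controllers as seen by CAM
  camcontrol identify ada0   # SATA details, including NCQ/tagged queueing support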
Please try repeating the test with atime updates disabled:
zfs set atime=off tank/one
zfs set atime=off tank/two
atime's impact is a write, and writes get priority, so if anything there
would be small breaks in the reads while that data is written out. I doubt
the scenario+hardware in this discussion is bottlenecking on writing atime
data for the access of these 10GB files, but it would be interesting to see.
On the other hand, I think it is atime that trashes a freshly created file
structure with many files on an otherwise smooth disk once the default cron
jobs pass over it: atime updates plus copy-on-write fragment the metadata
enough that basic things like listing directory contents get slow. I have
not properly tracked down the source of that repeatable issue yet. Accessing
data within the files doesn't seem to be impacted the same way as the
directory listing, though.
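One way to check whether atime writes are actually hitting the disk during the md5 runs would be something like:
  zfs get atime tank/one tank/two   # confirm the current setting
  zpool iostat -v tank 5            # watch read vs. write operations while md5 runs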
I was thinking that the prefetch-related sysctl settings might also affect
this access pattern; vfs.zfs.prefetch_disable=1 is probably the one I am
thinking of, but I normally don't tweak ZFS and related settings unless I
have to, as I usually find later that the tweaks become problems themselves.
I keep my system running smoother when I put it under excessive load by
using idprio and nice on the heavier non-interactive work.
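If someone does want to experiment with that, it would look roughly like this (the sysctl reverts at reboot unless added to /etc/sysctl.conf):
  sysctl vfs.zfs.prefetch_disable=1   # disable ZFS prefetch for a test run
  # and/or run the heavy readers at idle/reduced priority:
  idprio 31 md5 /tank/one/*
  nice -n 20 md5 /tank/two/*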
Does it make any difference?
Does it make any difference, if you import the pool with readonly=on instead?
Writing to a pool that is ~80% full is almost always slow for ZFS.
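For the record, that test would be roughly:
  zpool export tank
  zpool import -o readonly=on tank
  # repeat the two md5 runs, then export and re-import normally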
Been there, done that. It is painful, but I think it is more complicated
than just the total free-space counter crossing some threshold before that
issue shows up. Other performance issues also exist: I've had horrible I/O
on disks that never exceeded 20% used since being formatted.
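A quick way to see how full and how fragmented a pool actually is (the fragmentation column only describes free-space fragmentation, but it is still a useful hint):
  zpool list -o name,size,allocated,free,capacity,fragmentation tank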