On 01.11.19 11:28, Vladimir Sementsov-Ogievskiy wrote:
> 01.11.2019 13:20, Max Reitz wrote:
>> On 01.11.19 11:00, Max Reitz wrote:
>>> Hi,
>>>
>>> This series builds on the previous RFC. The workaround is now applied
>>> unconditionally, regardless of AIO mode and filesystem, because we
>>> don’t know those things for remote filesystems. Furthermore,
>>> bdrv_co_get_self_request() has been moved to block/io.c.
>>>
>>> Applying the workaround unconditionally is fine from a performance
>>> standpoint, because it should actually be dead code, thanks to patch 1
>>> (the elephant in the room). As far as I know, qcow2’s
>>> handle_alloc_space() is the only place where a block driver submits
>>> zero writes as part of normal I/O, such that they can occur
>>> concurrently with other write requests. It still makes sense to keep
>>> the workaround in file-posix, because we can’t really prevent other
>>> block drivers from submitting zero writes as part of normal I/O in
>>> the future.
>>>
>>> Anyway, let’s get to the elephant.
>>>
>>> From input by XFS developers
>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1765547#c7) it seems
>>> clear that c8bb23cbdbe causes fundamental performance problems on XFS
>>> with aio=native that cannot be fixed. In other cases, c8bb23cbdbe
>>> improves performance, or we wouldn’t have it.
>>>
>>> In general, avoiding performance regressions is more important than
>>> improving performance, unless the regressions are just a minor corner
>>> case or insignificant when compared to the improvement. The XFS
>>> regression is no minor corner case, and it isn’t insignificant.
>>> Laurent Vivier has found performance to decrease by as much as 88 %
>>> (on ppc64le, fio in a guest with 4k blocks, iodepth=8: 1662 kB/s,
>>> down from 13.9 MB/s).
>>
>> Ah, crap.
>>
>> I wanted to send this series as early today as possible to get as much
>> feedback as possible, so I’ve only started doing benchmarks now.
>>
>> The obvious
>>
>> $ qemu-img bench -t none -n -w -S 65536 test.qcow2
>>
>> on XFS takes like 6 seconds on master, and like 50 to 80 seconds with
>> c8bb23cbdbe reverted. So now on to guest tests...
>
> Aha, that's very interesting)  What about aio=native, which should be
> slowed down?  Could it be tested like this?
That is aio=native (-n).

But so far I don’t see any significant difference in guest tests, i.e.:

  fio --rw=write --bs=4k --iodepth=8 --runtime=1m --direct=1 \
      --ioengine=libaio --thread --numjobs=16 --size=2G --time_based

neither with 64 kB nor with 2 MB clusters.  (But only on XFS so far; I
still have to check ext4.)

(Reverting c8bb23cbdbe makes it like 1 to 2 % faster.)

Max
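PS: For anyone who wants to reproduce the comparison, here is a
consolidated sketch of the commands discussed above.  The image creation
step, the image size, and the fio --name parameter are assumptions of
mine; the thread only gives the bench and fio invocations themselves:

  # Assumed image setup; cluster size matches the 64 kB case
  # (use cluster_size=2097152 for the 2 MB case):
  $ qemu-img create -f qcow2 -o cluster_size=65536 test.qcow2 16G

  # Host-side benchmark on an XFS mount, as quoted above
  # (-t none: cache.direct=on; -n: aio=native; -w: write test;
  #  -S 65536: step of 64 kB between requests):
  $ qemu-img bench -t none -n -w -S 65536 test.qcow2

  # Guest-side benchmark, run inside a VM whose disk is backed by
  # test.qcow2 (--name is required by fio and assumed here):
  $ fio --name=bench --rw=write --bs=4k --iodepth=8 --runtime=1m \
        --direct=1 --ioengine=libaio --thread --numjobs=16 \
        --size=2G --time_based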