On Tue, Aug 04, 2020 at 03:37:26PM +0800, Derek Su wrote:
> Set the cache=none in virtiofsd and direct=1 in fio,
> here are the results and kvm-exit count in 5 seconds.
>
> --thread-pool-size=64 (default)
> seq read: 307 MB/s (kvm-exit count=1076463)
> seq write: 430 MB/s (kvm-exit count=1302493)
> rand 4KB read: 65.2k IOPS (kvm-exit count=1322899)
> rand 4KB write: 97.2k IOPS (kvm-exit count=1568618)
>
> --thread-pool-size=1
> seq read: 303 MB/s (kvm-exit count=1034614)
> seq write: 358 MB/s (kvm-exit count=1537735)
> rand 4KB read: 7995 IOPS (kvm-exit count=438348)
> rand 4KB write: 97.7k IOPS (kvm-exit count=1907585)
>
> The thread-pool-size=64 improves the rand 4KB read performance largely,
> but doesn't increase the kvm-exit count too much.
>
> In addition, the fio avg. clat of rand 4K write is 960us for
> thread-pool-size=64 and 7700us for thread-pool-size=1.
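For anyone who wants to reproduce this comparison, the setup above
corresponds roughly to invocations like the following (the socket path,
shared directory, and fio job parameters are illustrative, not Derek's
exact command lines):

  # host: virtiofsd with the default pool (64) or a single-threaded pool
  ./virtiofsd --socket-path=/tmp/vhostqemu -o source=/path/to/shared/dir \
      -o cache=none --thread-pool-size=64    # or --thread-pool-size=1

  # guest: 4KB random read job on the mounted virtiofs filesystem; use
  # --rw=read/write/randwrite (with a larger --bs for the sequential cases)
  # for the other rows
  fio --name=randread --filename=/mnt/virtiofs/testfile --size=1G \
      --rw=randread --bs=4k --direct=1 --ioengine=libaio --iodepth=64 \
      --runtime=30 --time_based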
These numbers make sense to me. The thread pool is generally faster.

Note that virtiofsd opens files without O_DIRECT, even with the
cache=none option. This explains why rand 4KB write reaches 97.7k IOPS
while rand 4KB read only does 7995 IOPS: the random reads result in page
cache misses on the host, while the writes are absorbed by the host page
cache.

I don't have a good explanation for why the thread pool was slower with
direct=0 though :(.

One way to investigate that is to check whether the I/O pattern
submitted by the guest is comparable between --thread-pool-size=64 and
--thread-pool-size=1. You could observe this by tracing virtiofsd's
preadv()/pwritev() system calls (see the example below).

If you find that --thread-pool-size=64 made more I/O requests with
smaller block sizes, then it's probably a timing issue: the guest page
cache responds differently because the virtiofsd thread pool completes
requests at a different rate. Maybe it affects how the guest page cache
is populated, and a slower virtiofsd leads to more efficient page cache
activity in the guest (-> fewer and bigger FUSE read/write requests)?
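To capture that on the host, something along these lines should do (a
sketch rather than a polished recipe; adjust the process matching to
your setup):

  # raw view of each request; the block size is the syscall return value
  strace -f -T -e trace=preadv,pwritev -p "$(pidof virtiofsd)"

  # or a histogram of completed request sizes with bpftrace
  bpftrace -e '
    tracepoint:syscalls:sys_exit_preadv  /comm == "virtiofsd"/ { @read_bytes  = hist(args->ret); }
    tracepoint:syscalls:sys_exit_pwritev /comm == "virtiofsd"/ { @write_bytes = hist(args->ret); }
  '

If the request count goes up and the per-request sizes shrink with
--thread-pool-size=64, that would support the page cache timing theory
above.

Stefan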