On Fri, Apr 04, 2025 at 02:49:08PM +0200, Hanna Czenczek wrote:
> On 27.03.25 16:55, Stefan Hajnoczi wrote:
> > On Tue, Mar 25, 2025 at 05:06:54PM +0100, Hanna Czenczek wrote:
> > > FUSE allows creating multiple request queues by "cloning" /dev/fuse FDs
> > > (via open("/dev/fuse") + ioctl(FUSE_DEV_IOC_CLONE)).
> > >
> > > We can use this to implement multi-threading.
> > >
> > > Note that the interface presented here differs from the multi-queue
> > > interface of virtio-blk: The latter maps virtqueues to iothreads, which
> > > allows processing multiple virtqueues in a single iothread. The
> > > equivalent (processing multiple FDs in a single iothread) would not make
> > > sense for FUSE because those FDs are used in a round-robin fashion by
> > > the FUSE kernel driver. Putting two of them into a single iothread will
> > > just create a bottleneck.
> >
> > This text might be outdated. virtio-blk's new iothread-vq-mapping
> > parameter provides the "array of iothreads" mentioned below and a way to
> > assign virtqueues to those IOThreads.
>
> Ah, yes. The difference is still that with FUSE, there is no such
> assignment, because it wouldn’t make sense. But I can change s/maps
> virtqueues/allows mapping virtqueues/, and s/differs from/is only a subset
> of/, if that’s alright.

Sure, thanks!

> > > Therefore, all we need is an array of iothreads, and we will create one
> > > "queue" (FD) per thread.
> > >
> > > These are the benchmark results when using four threads (compared to a
> > > single thread); note that fio still only uses a single job, but
> > > performance can still be improved because of said round-robin usage for
> > > the queues. (Not in the sync case, though, in which case I guess it
> > > just adds overhead.)
> >
> > Interesting. FUSE-over-io_uring seems to be different from
> > FUSE_DEV_IOC_CLONE here. It doesn't do round-robin. It uses CPU affinity
> > instead, handing requests to the io_uring context associated with the
> > current CPU when possible.
>
> Do you think that should have implications for the QAPI interface?

It would be helpful to document the behavior so users know when
round-robin or CPU affinity is used, but the parameter itself would be
unchanged: an array of IOThreads.

> > [...]
> >
> > >  qapi/block-export.json |   8 +-
> > >  block/export/fuse.c    | 214 +++++++++++++++++++++++++++++++++--------
> > >  2 files changed, 179 insertions(+), 43 deletions(-)
> > >
> > > diff --git a/qapi/block-export.json b/qapi/block-export.json
> > > index c783e01a53..0bdd5992eb 100644
> > > --- a/qapi/block-export.json
> > > +++ b/qapi/block-export.json
> > > @@ -179,12 +179,18 @@
> > >  #     mount the export with allow_other, and if that fails, try again
> > >  #     without. (since 6.1; default: auto)
> > >  #
> > > +# @iothreads: Enables multi-threading: Handle requests in each of the
> > > +#     given iothreads (instead of the block device's iothread, or the
> > > +#     export's "main" iothread). For this, the FUSE FD is duplicated so
> > > +#     there is one FD per iothread. (since 10.1)
> >
> > This option isn't FUSE-specific but FUSE is the first export type to
> > support it. Please add it to BlockExportOptions instead and refuse
> > export creation when the export type only supports 1 IOThread.
>
> Makes sense. I’ll try to go with what Kevin suggested, i.e. have @iothread
> be an alternate type.
>
> Hanna
>
> > Eric: Are you interested in implementing support for multiple IOThreads
> > in the NBD export? I remember some time ago we talked about NBD
> > multi-conn support, although maybe that was for the client rather than
> > the server.
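
As an aside on the round-robin vs. CPU affinity point above, here is a toy
model in C of the two queue-selection policies as they are described in this
thread. It is only a sketch: all names (RoundRobin, rr_pick, affinity_pick,
queues_used_rr, queues_used_affinity, MAX_QUEUES) are made up for
illustration, and the "cpu % nqueues" mapping is an assumption, not what the
kernel or QEMU actually does. It just counts how many queues a single
submitter ends up touching under each policy.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel or QEMU code. */
#define MAX_QUEUES 64

/* Round-robin state: each new request goes to the next queue in turn. */
typedef struct {
    size_t nqueues;
    size_t next;
} RoundRobin;

/* Return the queue for the next request and advance the cursor. */
size_t rr_pick(RoundRobin *rr)
{
    assert(rr->nqueues > 0);
    size_t q = rr->next;
    rr->next = (rr->next + 1) % rr->nqueues;
    return q;
}

/*
 * CPU affinity: the queue depends only on the submitting CPU.
 * The modulo mapping is an assumption made for this sketch.
 */
size_t affinity_pick(size_t cpu, size_t nqueues)
{
    assert(nqueues > 0);
    return cpu % nqueues;
}

/* Number of distinct queues hit by nreq requests under round-robin. */
size_t queues_used_rr(size_t nqueues, size_t nreq)
{
    assert(nqueues > 0 && nqueues <= MAX_QUEUES);
    RoundRobin rr = { .nqueues = nqueues, .next = 0 };
    unsigned char seen[MAX_QUEUES] = { 0 };
    size_t used = 0;

    for (size_t i = 0; i < nreq; i++) {
        size_t q = rr_pick(&rr);
        if (!seen[q]) {
            seen[q] = 1;
            used++;
        }
    }
    return used;
}

/*
 * Number of distinct queues hit by nreq requests that are all
 * submitted from the same CPU under the affinity policy.
 */
size_t queues_used_affinity(size_t nqueues, size_t nreq, size_t cpu)
{
    assert(nqueues > 0 && nqueues <= MAX_QUEUES);
    unsigned char seen[MAX_QUEUES] = { 0 };
    size_t used = 0;

    for (size_t i = 0; i < nreq; i++) {
        size_t q = affinity_pick(cpu, nqueues);
        if (!seen[q]) {
            seen[q] = 1;
            used++;
        }
    }
    return used;
}
```

With four queues and eight requests from one CPU, round-robin touches all
four queues in this model, while the affinity policy touches only one. That
difference is the kind of thing the documentation could spell out for users.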