On Fri, Apr 04, 2025 at 02:49:08PM +0200, Hanna Czenczek wrote:
> On 27.03.25 16:55, Stefan Hajnoczi wrote:
> > On Tue, Mar 25, 2025 at 05:06:54PM +0100, Hanna Czenczek wrote:
> > > FUSE allows creating multiple request queues by "cloning" /dev/fuse FDs
> > > (via open("/dev/fuse") + ioctl(FUSE_DEV_IOC_CLONE)).
> > > 
> > > We can use this to implement multi-threading.
> > > 
> > > Note that the interface presented here differs from the multi-queue
> > > interface of virtio-blk: The latter maps virtqueues to iothreads, which
> > > allows processing multiple virtqueues in a single iothread.  The
> > > equivalent (processing multiple FDs in a single iothread) would not make
> > > sense for FUSE because those FDs are used in a round-robin fashion by
> > > the FUSE kernel driver.  Putting two of them into a single iothread will
> > > just create a bottleneck.
> > This text might be outdated. virtio-blk's new iothread-vq-mapping
> > parameter provides the "array of iothreads" mentioned below and a way to
> > assign virtqueues to those IOThreads.
> 
> Ah, yes.  The difference is still that with FUSE, there is no such
> assignment, because it wouldn’t make sense.  But I can change s/maps
> virtqueues/allows mapping virtqueues/, and s/differs from/is only a subset
> of/, if that’s alright.

Sure, thanks!

> > > Therefore, all we need is an array of iothreads, and we will create one
> > > "queue" (FD) per thread.
> > > 
> > > These are the benchmark results when using four threads (compared to a
> > > single thread); note that fio still only uses a single job, but
> > > performance can still be improved because of said round-robin usage for
> > > the queues.  (Not in the sync case, though, in which case I guess it
> > > just adds overhead.)
> > Interesting. FUSE-over-io_uring seems to be different from
> > FUSE_DEV_IOC_CLONE here. It doesn't do round-robin. It uses CPU affinity
> > instead, handing requests to the io_uring context associated with the
> > current CPU when possible.
> 
> Do you think that should have implications for the QAPI interface?

It would be helpful to document the behavior so users know when
round-robin is used and when CPU affinity is used, but the parameter
itself would be unchanged: an array of IOThreads.

> 
> [...]
> 
> > >   qapi/block-export.json |   8 +-
> > >   block/export/fuse.c    | 214 +++++++++++++++++++++++++++++++++--------
> > >   2 files changed, 179 insertions(+), 43 deletions(-)
> > > 
> > > diff --git a/qapi/block-export.json b/qapi/block-export.json
> > > index c783e01a53..0bdd5992eb 100644
> > > --- a/qapi/block-export.json
> > > +++ b/qapi/block-export.json
> > > @@ -179,12 +179,18 @@
> > >   #     mount the export with allow_other, and if that fails, try again
> > >   #     without.  (since 6.1; default: auto)
> > >   #
> > > +# @iothreads: Enables multi-threading: Handle requests in each of the
> > > +#     given iothreads (instead of the block device's iothread, or the
> > > +#     export's "main" iothread).  For this, the FUSE FD is duplicated so
> > > +#     there is one FD per iothread.  (since 10.1)
> > This option isn't FUSE-specific but FUSE is the first export type to
> > support it. Please add it to BlockExportOptions instead and refuse
> > export creation when the export type only supports 1 IOThread.
> 
> Makes sense.  I’ll try to go with what Kevin suggested, i.e. have @iothread
> be an alternate type.
> 
> Hanna
> 
> > 
> > Eric: Are you interested in implementing support for multiple IOThreads
> > in the NBD export? I remember some time ago we talked about NBD
> > multi-conn support, although maybe that was for the client rather than
> > the server.
> 
