On Tue, Apr 29, 2025 at 01:01:34PM +0200, Markus Armbruster wrote:
> "Richard W.M. Jones" <rjo...@redhat.com> writes:
> 
> > On Tue, Apr 29, 2025 at 07:49:00AM +0200, Markus Armbruster wrote:
> >> Eric Blake <ebl...@redhat.com> writes:
> >> 
> >> > From: "Richard W.M. Jones" <rjo...@redhat.com>
> >> >
> >> > Add multi-conn option to the NBD client.  This commit just adds the
> >> > option, it is not functional.
> >> >
> >> > Setting this to a value > 1 permits multiple connections to the NBD
> >> > server; a typical value might be 4.  The default is 1, meaning only a
> >> > single connection is made.  If the NBD server does not advertise that
> >> > it is safe for multi-conn then this setting is forced to 1.
> >> >
> >> > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
> >> > [eblake: also expose it through QMP]
> >> > Signed-off-by: Eric Blake <ebl...@redhat.com>
> >> > ---
> >> >  qapi/block-core.json |  8 +++++++-
> >> >  block/nbd.c          | 24 ++++++++++++++++++++++++
> >> >  2 files changed, 31 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> >> > index 7f70ec6d3cb..5c10824f35b 100644
> >> > --- a/qapi/block-core.json
> >> > +++ b/qapi/block-core.json
> >> > @@ -4545,6 +4545,11 @@
> >> >  #     until successful or until @open-timeout seconds have elapsed.
> >> >  #     Default 0 (Since 7.0)
> >> >  #
> >> > +# @multi-conn: Request the number of parallel client connections to make
> >> > +#     to the server, up to 16.  If the server does not advertise support
> >> > +#     for multiple connections, or if this value is 0 or 1, all traffic
> >> > +#     is sent through a single connection.  Default 1 (Since 10.1)
> >> > +#
> >> 
> >> So we silently ignore @multi-conn when its value is (nonsensical) zero,
> >> and when the server doesn't let us honor the value.  Hmm.  Silently
> >> ignoring the user's wishes can result in confusion.  Should we reject
> >> instead?
> >
> > We could certainly reject 0.  It's also possible to reject the case
> > where multi-conn is not supported by the server, but is requested by
> > the client, but I feel that's a bit user-unfriendly.  After all,
> > multi-conn isn't essential for it to work, it's needed if you want
> > best performance.  (Maybe issue a warning in the code - below - where
> > we set multi-conn back to 1?  I don't know what qemu thinks about
> > warnings.)
> 
> QMP doesn't support warnings, so they go to stderr instead, where they
> may or may not be seen.
> 
> When I instruct a program to do X, I prefer it to do exactly X, and fail
> when that's not possible.  Correcting X behind my back may be friendly,
> until the day I spend quality time figuring out WTF is going on.
> 
> Perhaps this one is a matter of documentation.  As is, @multi-conn feels
> like "set the number of connections" to me, until I read the fine print,
> which contradicts it.  We could perhaps phrase it as a limit instead:
> enable multiple connections and simultaneously limit their number.
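
For comparison, the strict behaviour described above could look roughly
like the following.  This is only a sketch: multi_conn_check_strict and
its parameters are made-up names, not anything in the patch or in QEMU,
and server_can_multi_conn stands in for the NBD_FLAG_CAN_MULTI_CONN
test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_MULTI_CONN 16   /* cap taken from the patch */

/*
 * Hypothetical strict check: return 0 if the request can be honoured
 * exactly, or -1 with a message in errbuf if it cannot.  Nothing is
 * silently adjusted.
 */
static int multi_conn_check_strict(uint32_t requested,
                                   bool server_can_multi_conn,
                                   char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    /* Reject 0 and anything above the cap instead of clamping. */
    if (requested < 1 || requested > MAX_MULTI_CONN) {
        snprintf(errbuf, errlen, "multi-conn must be between 1 and %d",
                 MAX_MULTI_CONN);
        return -1;
    }

    /* More than one connection needs the server's multi-conn flag. */
    if (requested > 1 && !server_can_multi_conn) {
        snprintf(errbuf, errlen,
                 "server does not advertise multi-conn support");
        return -1;
    }

    return 0;
}
```

The cost of this variant is that a client asking for 4 connections
fails outright against a server without multi-conn, rather than
falling back to 1.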

It is tricky.  In nbdcopy we've preferred to go with "you suggest some
numbers and we'll pick something that works":

https://gitlab.com/nbdkit/libnbd/-/blob/master/copy/main.c?ref_type=heads#L446-L493

but also we do provide a way for you to find out what was selected:

https://gitlab.com/nbdkit/libnbd/-/blob/master/copy/main.c?ref_type=heads#L521

(I'm not claiming this is the best approach or suitable for everyone.)

In the context of qemu that might suggest having separate
multi_conn_requested and multi_conn fields, where the latter can be
queried over QMP to find out what is actually going on.  We could even
add multi_conn_max so that the MAX_MULTI_CONN constant can be read out.
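
A rough sketch of that split, purely for illustration.  MultiConnState
and multi_conn_resolve are hypothetical names (the patch only has a
single multi_conn field in BDRVNBDState), and the server_can_multi_conn
parameter stands in for the NBD_FLAG_CAN_MULTI_CONN check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MULTI_CONN 16   /* same cap as in the patch */

/* Hypothetical state; field names are illustrative only. */
typedef struct MultiConnState {
    uint32_t multi_conn_requested;  /* what the user asked for */
    uint32_t multi_conn;            /* what is actually in use */
} MultiConnState;

/*
 * Remember the request unchanged and compute the effective value
 * separately, so that both can be reported later.
 */
static void multi_conn_resolve(MultiConnState *s, uint32_t requested,
                               bool server_can_multi_conn)
{
    assert(s != NULL);
    s->multi_conn_requested = requested;

    if (!server_can_multi_conn || requested <= 1) {
        /* No server support, or 0/1 requested: single connection. */
        s->multi_conn = 1;
    } else if (requested > MAX_MULTI_CONN) {
        /* Cap the effective value, but keep the original request. */
        s->multi_conn = MAX_MULTI_CONN;
    } else {
        s->multi_conn = requested;
    }
}
```

A query would then report multi_conn, while multi_conn_requested keeps
whatever the user originally asked for.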

> >> >  # Features:
> >> >  #
> >> >  # @unstable: Member @x-dirty-bitmap is experimental.
> >> > @@ -4558,7 +4563,8 @@
> >> >              '*tls-hostname': 'str',
> >> >              '*x-dirty-bitmap': { 'type': 'str', 'features': [ 'unstable' ] },
> >> >              '*reconnect-delay': 'uint32',
> >> > -            '*open-timeout': 'uint32' } }
> >> > +            '*open-timeout': 'uint32',
> >> > +            '*multi-conn': 'uint32' } }
> >> >
> >> >  ##
> >> >  # @BlockdevOptionsRaw:
> >> > diff --git a/block/nbd.c b/block/nbd.c
> >> > index d5a2b21c6d1..5eb00e360af 100644
> >> > --- a/block/nbd.c
> >> > +++ b/block/nbd.c
> >> > @@ -48,6 +48,7 @@
> >> >
> >> >  #define EN_OPTSTR ":exportname="
> >> >  #define MAX_NBD_REQUESTS    16
> >> > +#define MAX_MULTI_CONN      16
> >> 
> >> Out of curiosity: where does this value come from?
> >
> > So I should note first this is a maximum, not a recommendation.
> 
> Is it the arbitrarily chosen maximum for QEMU, or is it the well-known
> maximum for NBD, or is it something else?

I don't recall exactly, but it was probably an ass-pulled number.

> > nbdcopy defaults to 4, which was derived from testing on a high end
> > (for 2024) AMD machine.  Above 4 performance doesn't increase any
> > further on that machine.  It's going to very much depend on how many
> > cores you have spare, how many TCP connections you want to open, and
> > how effectively the client and server handle parallelism.
> >
> > And imponderables like one we hit in virt-v2v: If accessing a VMware
> > server, the VMware server actually slows down as you add more
> > connections, even though it should theoretically support multi-conn.
> > We ended up forcing multi-conn to 1 in this case.  You can't know this
> > in advance from the client side.
> >
> >> >
> >> >  #define COOKIE_TO_INDEX(cookie) ((cookie) - 1)
> >> >  #define INDEX_TO_COOKIE(index)  ((index) + 1)
> >> > @@ -97,6 +98,7 @@ typedef struct BDRVNBDState {
> >> >      /* Connection parameters */
> >> >      uint32_t reconnect_delay;
> >> >      uint32_t open_timeout;
> >> > +    uint32_t multi_conn;
> >> >      SocketAddress *saddr;
> >> >      char *export;
> >> >      char *tlscredsid;
> >> > @@ -1840,6 +1842,15 @@ static QemuOptsList nbd_runtime_opts = {
> >> >                      "attempts until successful or until @open-timeout seconds "
> >> >                      "have elapsed. Default 0",
> >> >          },
> >> > +        {
> >> > +            .name = "multi-conn",
> >> > +            .type = QEMU_OPT_NUMBER,
> >> > +            .help = "If > 1 permit up to this number of connections to the "
> >> > +                    "server. The server must also advertise multi-conn "
> >> > +                    "support.  If <= 1, only a single connection is made "
> >> > +                    "to the server even if the server advertises multi-conn. "
> >> > +                    "Default 1",
> >> 
> >> This text implies the requested value is silently limited to the value
> >> provided by the server, unlike the doc comment above.  Although the
> >> "must" in "the server must" could also be understood as "error when it
> >> doesn't".
> >
> > I'll just note that multi-conn is a boolean flag advertised by the
> > server.  Servers don't advertise any preferred number of connections.
> > I don't know how to improve the text.
> 
> Let's get the QAPI schema doc comment right before we worry about this
> one.
> 
> >> > +        },
> >> >          { /* end of list */ }
> >> >      },
> >> >  };
> >> > @@ -1895,6 +1906,10 @@ static int nbd_process_options(BlockDriverState *bs, QDict *options,
> >> >
> >> >      s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
> >> >      s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
> >> > +    s->multi_conn = qemu_opt_get_number(opts, "multi-conn", 1);
> >> > +    if (s->multi_conn > MAX_MULTI_CONN) {
> >> > +        s->multi_conn = MAX_MULTI_CONN;
> >> > +    }
> >> 
> >> We silently cap the user's requested number to 16.  Not clear from QAPI
> >> schema doc comment; the "up to 16" there suggests more is an error.
> >> Should we error out instead?
> >> 
> >> >
> >> >      ret = 0;
> >> >
> >> > @@ -1949,6 +1964,15 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
> >> >
> >> >      nbd_client_connection_enable_retry(s->conn);
> >> >
> >> > +    /*
> >> > +     * We set s->multi_conn in nbd_process_options above, but now that
> >> > +     * we have connected if the server doesn't advertise that it is
> >> > +     * safe for multi-conn, force it to 1.
> >> > +     */
> >> > +    if (!(s->info.flags & NBD_FLAG_CAN_MULTI_CONN)) {
> >> > +        s->multi_conn = 1;
> >> > +    }
> >> > +
> >> >      return 0;
> >> >
> >> >  fail:
> >
> > Rich.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

