"Richard W.M. Jones" <rjo...@redhat.com> writes:

> On Tue, Apr 29, 2025 at 07:49:00AM +0200, Markus Armbruster wrote:
>> Eric Blake <ebl...@redhat.com> writes:
>> 
>> > From: "Richard W.M. Jones" <rjo...@redhat.com>
>> >
>> > Add multi-conn option to the NBD client.  This commit just adds the
>> > option, it is not functional.
>> >
>> > Setting this to a value > 1 permits multiple connections to the NBD
>> > server; a typical value might be 4.  The default is 1, meaning only a
>> > single connection is made.  If the NBD server does not advertise that
>> > it is safe for multi-conn then this setting is forced to 1.
>> >
>> > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com>
>> > [eblake: also expose it through QMP]
>> > Signed-off-by: Eric Blake <ebl...@redhat.com>
>> > ---
>> >  qapi/block-core.json |  8 +++++++-
>> >  block/nbd.c          | 24 ++++++++++++++++++++++++
>> >  2 files changed, 31 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/qapi/block-core.json b/qapi/block-core.json
>> > index 7f70ec6d3cb..5c10824f35b 100644
>> > --- a/qapi/block-core.json
>> > +++ b/qapi/block-core.json
>> > @@ -4545,6 +4545,11 @@
>> >  #     until successful or until @open-timeout seconds have elapsed.
>> >  #     Default 0 (Since 7.0)
>> >  #
>> > +# @multi-conn: Request the number of parallel client connections to make
>> > +#     to the server, up to 16.  If the server does not advertise support
>> > +#     for multiple connections, or if this value is 0 or 1, all traffic
>> > +#     is sent through a single connection.  Default 1 (Since 10.1)
>> > +#
>> 
>> So we silently ignore @multi-conn when its value is (nonsensical) zero,
>> and when the server doesn't let us honor the value.  Hmm.  Silently
>> ignoring the user's wishes can result in confusion.  Should we reject
>> instead?
>
> We could certainly reject 0.  It's also possible to reject the case
> where multi-conn is not supported by the server, but is requested by
> the client, but I feel that's a bit user-unfriendly.  After all,
> multi-conn isn't essential for it to work, it's needed if you want
> best performance.  (Maybe issue a warning in the code - below - where
> we set multi-conn back to 1?  I don't know what qemu thinks about
> warnings.)

QMP doesn't support warnings, so they go to stderr instead, where they
may or may not be seen.

When I instruct a program to do X, I prefer it to do exactly X, and fail
when that's not possible.  Correcting X behind my back may be friendly,
until the day I spent quality time figuring out WTF is going on.

Perhaps this one is a matter of documentation.  As is, @multi-conn feels
like "set the number of connections" to me, until I read the fine print,
which contradicts it.  We could perhaps phrase it as a limit instead:
enable multiple connections and simultaneously limit their number.

>> >  # Features:
>> >  #
>> >  # @unstable: Member @x-dirty-bitmap is experimental.
>> > @@ -4558,7 +4563,8 @@
>> >              '*tls-hostname': 'str',
>> >              '*x-dirty-bitmap': { 'type': 'str', 'features': [ 'unstable' 
>> > ] },
>> >              '*reconnect-delay': 'uint32',
>> > -            '*open-timeout': 'uint32' } }
>> > +            '*open-timeout': 'uint32',
>> > +            '*multi-conn': 'uint32' } }
>> >
>> >  ##
>> >  # @BlockdevOptionsRaw:
>> > diff --git a/block/nbd.c b/block/nbd.c
>> > index d5a2b21c6d1..5eb00e360af 100644
>> > --- a/block/nbd.c
>> > +++ b/block/nbd.c
>> > @@ -48,6 +48,7 @@
>> >
>> >  #define EN_OPTSTR ":exportname="
>> >  #define MAX_NBD_REQUESTS    16
>> > +#define MAX_MULTI_CONN      16
>> 
>> Out of curiosity: where does this value come from?
>
> So I should note first this is a maximum, not a recommendation.

Is it the arbitrarily chosen maximum for QEMU, or is it the well-known
maximum for NBD, or is it something else?

> nbdcopy defaults to 4, which was derived from testing on a high end
> (for 2024) AMD machine.  Above 4 performance doesn't increase any
> further on that machine.  It's going to very much depend on how many
> cores you have spare, how many TCP connections you want to open, and
> how effectively the client and server handle parallelism.
>
> And imponderables like one we hit in virt-v2v: If accessing a VMware
> server, the VMware server actually slows down as you add more
> connections, even though it should theoretically support multi-conn.
> We ended up forcing multi-conn to 1 in this case.  You can't know this
> in advance from the client side.
>
>> >
>> >  #define COOKIE_TO_INDEX(cookie) ((cookie) - 1)
>> >  #define INDEX_TO_COOKIE(index)  ((index) + 1)
>> > @@ -97,6 +98,7 @@ typedef struct BDRVNBDState {
>> >      /* Connection parameters */
>> >      uint32_t reconnect_delay;
>> >      uint32_t open_timeout;
>> > +    uint32_t multi_conn;
>> >      SocketAddress *saddr;
>> >      char *export;
>> >      char *tlscredsid;
>> > @@ -1840,6 +1842,15 @@ static QemuOptsList nbd_runtime_opts = {
>> >                      "attempts until successful or until @open-timeout 
>> > seconds "
>> >                      "have elapsed. Default 0",
>> >          },
>> > +        {
>> > +            .name = "multi-conn",
>> > +            .type = QEMU_OPT_NUMBER,
>> > +            .help = "If > 1 permit up to this number of connections to 
>> > the "
>> > +                    "server. The server must also advertise multi-conn "
>> > +                    "support.  If <= 1, only a single connection is made "
>> > +                    "to the server even if the server advertises 
>> > multi-conn. "
>> > +                    "Default 1",
>> 
>> This text implies the requested value is silently limited to the value
>> provided by the server, unlike the doc comment above.  Although the
>> "must" in "the sever must" could also be understood as "error when it
>> doesn't".
>
> I'll just note that multi-conn is a boolean flag advertised by the
> server.  Servers don't advertise any preferred number of connections.
> I don't know how to improve the text.

Let's get the QAPI schema doc comment right before we worry about this
one.

>> > +        },
>> >          { /* end of list */ }
>> >      },
>> >  };
>> > @@ -1895,6 +1906,10 @@ static int nbd_process_options(BlockDriverState 
>> > *bs, QDict *options,
>> >
>> >      s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);
>> >      s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0);
>> > +    s->multi_conn = qemu_opt_get_number(opts, "multi-conn", 1);
>> > +    if (s->multi_conn > MAX_MULTI_CONN) {
>> > +        s->multi_conn = MAX_MULTI_CONN;
>> > +    }
>> 
>> We silently cap the user's requested number to 16.  Not clear from QAPI
>> schema doc comment; the "up to 16" there suggests more is an error.
>> Should we error out instead?
>> 
>> >
>> >      ret = 0;
>> >
>> > @@ -1949,6 +1964,15 @@ static int nbd_open(BlockDriverState *bs, QDict 
>> > *options, int flags,
>> >
>> >      nbd_client_connection_enable_retry(s->conn);
>> >
>> > +    /*
>> > +     * We set s->multi_conn in nbd_process_options above, but now that
>> > +     * we have connected if the server doesn't advertise that it is
>> > +     * safe for multi-conn, force it to 1.
>> > +     */
>> > +    if (!(s->info.flags & NBD_FLAG_CAN_MULTI_CONN)) {
>> > +        s->multi_conn = 1;
>> > +    }
>> > +
>> >      return 0;
>> >
>> >  fail:
>
> Rich.


Reply via email to