On Tue, Apr 29, 2025 at 01:01:34PM +0200, Markus Armbruster wrote: > "Richard W.M. Jones" <rjo...@redhat.com> writes: > > > On Tue, Apr 29, 2025 at 07:49:00AM +0200, Markus Armbruster wrote: > >> Eric Blake <ebl...@redhat.com> writes: > >> > >> > From: "Richard W.M. Jones" <rjo...@redhat.com> > >> > > >> > Add multi-conn option to the NBD client. This commit just adds the > >> > option, it is not functional. > >> > > >> > Setting this to a value > 1 permits multiple connections to the NBD > >> > server; a typical value might be 4. The default is 1, meaning only a > >> > single connection is made. If the NBD server does not advertise that > >> > it is safe for multi-conn then this setting is forced to 1. > >> > > >> > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> > >> > [eblake: also expose it through QMP] > >> > Signed-off-by: Eric Blake <ebl...@redhat.com> > >> > --- > >> > qapi/block-core.json | 8 +++++++- > >> > block/nbd.c | 24 ++++++++++++++++++++++++ > >> > 2 files changed, 31 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/qapi/block-core.json b/qapi/block-core.json > >> > index 7f70ec6d3cb..5c10824f35b 100644 > >> > --- a/qapi/block-core.json > >> > +++ b/qapi/block-core.json > >> > @@ -4545,6 +4545,11 @@ > >> > # until successful or until @open-timeout seconds have elapsed. > >> > # Default 0 (Since 7.0) > >> > # > >> > +# @multi-conn: Request the number of parallel client connections to make > >> > +# to the server, up to 16. If the server does not advertise support > >> > +# for multiple connections, or if this value is 0 or 1, all traffic > >> > +# is sent through a single connection. Default 1 (Since 10.1) > >> > +# > >> > >> So we silently ignore @multi-conn when its value is (nonsensical) zero, > >> and when the server doesn't let us honor the value. Hmm. Silently > >> ignoring the user's wishes can result in confusion. Should we reject > >> instead? > > > > We could certainly reject 0. It's also possible to reject the case > > where multi-conn is not supported by the server, but is requested by > > the client, but I feel that's a bit user-unfriendly. After all, > > multi-conn isn't essential for it to work, it's needed if you want > > best performance. (Maybe issue a warning in the code - below - where > > we set multi-conn back to 1? I don't know what qemu thinks about > > warnings.) > > QMP doesn't support warnings, so they go to stderr instead, where they > may or may not be seen. > > When I instruct a program to do X, I prefer it to do exactly X, and fail > when that's not possible. Correcting X behind my back may be friendly, > until the day I spent quality time figuring out WTF is going on. > > Perhaps this one is a matter of documentation. As is, @multi-conn feels > like "set the number of connections" to me, until I read the fine print, > which contradicts it. We could perhaps phrase it as a limit instead: > enable multiple connections and simultaneously limit their number.
It is tricky. In nbdcopy we've preferred to go with "you suggest some numbers and we'll pick something that works": https://gitlab.com/nbdkit/libnbd/-/blob/master/copy/main.c?ref_type=heads#L446-L493 but also we do provide a way for you to find out what was selected: https://gitlab.com/nbdkit/libnbd/-/blob/master/copy/main.c?ref_type=heads#L521 (I'm not claiming this is the best approach or suitable for everyone.) In the context of qemu that might suggest having separate multi_conn_requested and multi_conn fields, where the latter can be queried over QMP to find out what is actually going on. Could even add multi_conn_max to allow MAX_MULTI_CONN constant to be read out. > >> > # Features: > >> > # > >> > # @unstable: Member @x-dirty-bitmap is experimental. > >> > @@ -4558,7 +4563,8 @@ > >> > '*tls-hostname': 'str', > >> > '*x-dirty-bitmap': { 'type': 'str', 'features': [ > >> > 'unstable' ] }, > >> > '*reconnect-delay': 'uint32', > >> > - '*open-timeout': 'uint32' } } > >> > + '*open-timeout': 'uint32', > >> > + '*multi-conn': 'uint32' } } > >> > > >> > ## > >> > # @BlockdevOptionsRaw: > >> > diff --git a/block/nbd.c b/block/nbd.c > >> > index d5a2b21c6d1..5eb00e360af 100644 > >> > --- a/block/nbd.c > >> > +++ b/block/nbd.c > >> > @@ -48,6 +48,7 @@ > >> > > >> > #define EN_OPTSTR ":exportname=" > >> > #define MAX_NBD_REQUESTS 16 > >> > +#define MAX_MULTI_CONN 16 > >> > >> Out of curiosity: where does this value come from? > > > > So I should note first this is a maximum, not a recommendation. > > Is it the arbitrarily chosen maximum for QEMU, or is it the well-known > maximum for NBD, or is it something else? I don't recall exactly, but it was probably an ass-pulled number. > > nbdcopy defaults to 4, which was derived from testing on a high end > > (for 2024) AMD machine. Above 4 performance doesn't increase any > > further on that machine. It's going to very much depend on how many > > cores you have spare, how many TCP connections you want to open, and > > how effectively the client and server handle parallelism. > > > > And imponderables like one we hit in virt-v2v: If accessing a VMware > > server, the VMware server actually slows down as you add more > > connections, even though it should theoretically support multi-conn. > > We ended up forcing multi-conn to 1 in this case. You can't know this > > in advance from the client side. > > > >> > > >> > #define COOKIE_TO_INDEX(cookie) ((cookie) - 1) > >> > #define INDEX_TO_COOKIE(index) ((index) + 1) > >> > @@ -97,6 +98,7 @@ typedef struct BDRVNBDState { > >> > /* Connection parameters */ > >> > uint32_t reconnect_delay; > >> > uint32_t open_timeout; > >> > + uint32_t multi_conn; > >> > SocketAddress *saddr; > >> > char *export; > >> > char *tlscredsid; > >> > @@ -1840,6 +1842,15 @@ static QemuOptsList nbd_runtime_opts = { > >> > "attempts until successful or until @open-timeout > >> > seconds " > >> > "have elapsed. Default 0", > >> > }, > >> > + { > >> > + .name = "multi-conn", > >> > + .type = QEMU_OPT_NUMBER, > >> > + .help = "If > 1 permit up to this number of connections to > >> > the " > >> > + "server. The server must also advertise multi-conn " > >> > + "support. If <= 1, only a single connection is > >> > made " > >> > + "to the server even if the server advertises > >> > multi-conn. " > >> > + "Default 1", > >> > >> This text implies the requested value is silently limited to the value > >> provided by the server, unlike the doc comment above. Although the > >> "must" in "the sever must" could also be understood as "error when it > >> doesn't". > > > > I'll just note that multi-conn is a boolean flag advertised by the > > server. Servers don't advertise any preferred number of connections. > > I don't know how to improve the text. > > Let's get the QAPI schema doc comment right before we worry about this > one. > > >> > + }, > >> > { /* end of list */ } > >> > }, > >> > }; > >> > @@ -1895,6 +1906,10 @@ static int nbd_process_options(BlockDriverState > >> > *bs, QDict *options, > >> > > >> > s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", > >> > 0); > >> > s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0); > >> > + s->multi_conn = qemu_opt_get_number(opts, "multi-conn", 1); > >> > + if (s->multi_conn > MAX_MULTI_CONN) { > >> > + s->multi_conn = MAX_MULTI_CONN; > >> > + } > >> > >> We silently cap the user's requested number to 16. Not clear from QAPI > >> schema doc comment; the "up to 16" there suggests more is an error. > >> Should we error out instead? > >> > >> > > >> > ret = 0; > >> > > >> > @@ -1949,6 +1964,15 @@ static int nbd_open(BlockDriverState *bs, QDict > >> > *options, int flags, > >> > > >> > nbd_client_connection_enable_retry(s->conn); > >> > > >> > + /* > >> > + * We set s->multi_conn in nbd_process_options above, but now that > >> > + * we have connected if the server doesn't advertise that it is > >> > + * safe for multi-conn, force it to 1. > >> > + */ > >> > + if (!(s->info.flags & NBD_FLAG_CAN_MULTI_CONN)) { > >> > + s->multi_conn = 1; > >> > + } > >> > + > >> > return 0; > >> > > >> > fail: > > > > Rich. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org