"Richard W.M. Jones" <rjo...@redhat.com> writes: > On Tue, Apr 29, 2025 at 07:49:00AM +0200, Markus Armbruster wrote: >> Eric Blake <ebl...@redhat.com> writes: >> >> > From: "Richard W.M. Jones" <rjo...@redhat.com> >> > >> > Add multi-conn option to the NBD client. This commit just adds the >> > option, it is not functional. >> > >> > Setting this to a value > 1 permits multiple connections to the NBD >> > server; a typical value might be 4. The default is 1, meaning only a >> > single connection is made. If the NBD server does not advertise that >> > it is safe for multi-conn then this setting is forced to 1. >> > >> > Signed-off-by: Richard W.M. Jones <rjo...@redhat.com> >> > [eblake: also expose it through QMP] >> > Signed-off-by: Eric Blake <ebl...@redhat.com> >> > --- >> > qapi/block-core.json | 8 +++++++- >> > block/nbd.c | 24 ++++++++++++++++++++++++ >> > 2 files changed, 31 insertions(+), 1 deletion(-) >> > >> > diff --git a/qapi/block-core.json b/qapi/block-core.json >> > index 7f70ec6d3cb..5c10824f35b 100644 >> > --- a/qapi/block-core.json >> > +++ b/qapi/block-core.json >> > @@ -4545,6 +4545,11 @@ >> > # until successful or until @open-timeout seconds have elapsed. >> > # Default 0 (Since 7.0) >> > # >> > +# @multi-conn: Request the number of parallel client connections to make >> > +# to the server, up to 16. If the server does not advertise support >> > +# for multiple connections, or if this value is 0 or 1, all traffic >> > +# is sent through a single connection. Default 1 (Since 10.1) >> > +# >> >> So we silently ignore @multi-conn when its value is (nonsensical) zero, >> and when the server doesn't let us honor the value. Hmm. Silently >> ignoring the user's wishes can result in confusion. Should we reject >> instead? > > We could certainly reject 0. It's also possible to reject the case > where multi-conn is not supported by the server, but is requested by > the client, but I feel that's a bit user-unfriendly. After all, > multi-conn isn't essential for it to work, it's needed if you want > best performance. (Maybe issue a warning in the code - below - where > we set multi-conn back to 1? I don't know what qemu thinks about > warnings.)
QMP doesn't support warnings, so they go to stderr instead, where they may or may not be seen. When I instruct a program to do X, I prefer it to do exactly X, and fail when that's not possible. Correcting X behind my back may be friendly, until the day I spent quality time figuring out WTF is going on. Perhaps this one is a matter of documentation. As is, @multi-conn feels like "set the number of connections" to me, until I read the fine print, which contradicts it. We could perhaps phrase it as a limit instead: enable multiple connections and simultaneously limit their number. >> > # Features: >> > # >> > # @unstable: Member @x-dirty-bitmap is experimental. >> > @@ -4558,7 +4563,8 @@ >> > '*tls-hostname': 'str', >> > '*x-dirty-bitmap': { 'type': 'str', 'features': [ 'unstable' >> > ] }, >> > '*reconnect-delay': 'uint32', >> > - '*open-timeout': 'uint32' } } >> > + '*open-timeout': 'uint32', >> > + '*multi-conn': 'uint32' } } >> > >> > ## >> > # @BlockdevOptionsRaw: >> > diff --git a/block/nbd.c b/block/nbd.c >> > index d5a2b21c6d1..5eb00e360af 100644 >> > --- a/block/nbd.c >> > +++ b/block/nbd.c >> > @@ -48,6 +48,7 @@ >> > >> > #define EN_OPTSTR ":exportname=" >> > #define MAX_NBD_REQUESTS 16 >> > +#define MAX_MULTI_CONN 16 >> >> Out of curiosity: where does this value come from? > > So I should note first this is a maximum, not a recommendation. Is it the arbitrarily chosen maximum for QEMU, or is it the well-known maximum for NBD, or is it something else? > nbdcopy defaults to 4, which was derived from testing on a high end > (for 2024) AMD machine. Above 4 performance doesn't increase any > further on that machine. It's going to very much depend on how many > cores you have spare, how many TCP connections you want to open, and > how effectively the client and server handle parallelism. > > And imponderables like one we hit in virt-v2v: If accessing a VMware > server, the VMware server actually slows down as you add more > connections, even though it should theoretically support multi-conn. > We ended up forcing multi-conn to 1 in this case. You can't know this > in advance from the client side. > >> > >> > #define COOKIE_TO_INDEX(cookie) ((cookie) - 1) >> > #define INDEX_TO_COOKIE(index) ((index) + 1) >> > @@ -97,6 +98,7 @@ typedef struct BDRVNBDState { >> > /* Connection parameters */ >> > uint32_t reconnect_delay; >> > uint32_t open_timeout; >> > + uint32_t multi_conn; >> > SocketAddress *saddr; >> > char *export; >> > char *tlscredsid; >> > @@ -1840,6 +1842,15 @@ static QemuOptsList nbd_runtime_opts = { >> > "attempts until successful or until @open-timeout >> > seconds " >> > "have elapsed. Default 0", >> > }, >> > + { >> > + .name = "multi-conn", >> > + .type = QEMU_OPT_NUMBER, >> > + .help = "If > 1 permit up to this number of connections to >> > the " >> > + "server. The server must also advertise multi-conn " >> > + "support. If <= 1, only a single connection is made " >> > + "to the server even if the server advertises >> > multi-conn. " >> > + "Default 1", >> >> This text implies the requested value is silently limited to the value >> provided by the server, unlike the doc comment above. Although the >> "must" in "the sever must" could also be understood as "error when it >> doesn't". > > I'll just note that multi-conn is a boolean flag advertised by the > server. Servers don't advertise any preferred number of connections. > I don't know how to improve the text. Let's get the QAPI schema doc comment right before we worry about this one. >> > + }, >> > { /* end of list */ } >> > }, >> > }; >> > @@ -1895,6 +1906,10 @@ static int nbd_process_options(BlockDriverState >> > *bs, QDict *options, >> > >> > s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0); >> > s->open_timeout = qemu_opt_get_number(opts, "open-timeout", 0); >> > + s->multi_conn = qemu_opt_get_number(opts, "multi-conn", 1); >> > + if (s->multi_conn > MAX_MULTI_CONN) { >> > + s->multi_conn = MAX_MULTI_CONN; >> > + } >> >> We silently cap the user's requested number to 16. Not clear from QAPI >> schema doc comment; the "up to 16" there suggests more is an error. >> Should we error out instead? >> >> > >> > ret = 0; >> > >> > @@ -1949,6 +1964,15 @@ static int nbd_open(BlockDriverState *bs, QDict >> > *options, int flags, >> > >> > nbd_client_connection_enable_retry(s->conn); >> > >> > + /* >> > + * We set s->multi_conn in nbd_process_options above, but now that >> > + * we have connected if the server doesn't advertise that it is >> > + * safe for multi-conn, force it to 1. >> > + */ >> > + if (!(s->info.flags & NBD_FLAG_CAN_MULTI_CONN)) { >> > + s->multi_conn = 1; >> > + } >> > + >> > return 0; >> > >> > fail: > > Rich.