On Tue, Sep 01, 2015 at 08:15:15PM +0800, Yuanhan Liu wrote:
> On Tue, Sep 01, 2015 at 01:07:11PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Sep 01, 2015 at 05:13:50PM +0800, Yuanhan Liu wrote:
> > > On Thu, Aug 13, 2015 at 12:18:38PM +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Aug 12, 2015 at 02:25:41PM +0800, Ouyang Changchun wrote:
> > > > > Based on patch by Nikolay Nikolaev:
> > > > > Vhost-user will implement the multi queue support in a similar way
> > > > > to what vhost already has - a separate thread for each queue.
> > > > > To enable the multi queue functionality - a new command line parameter
> > > > > "queues" is introduced for the vhost-user netdev.
> > > > >
> > > > > The RESET_OWNER change is based on commit:
> > > > > 294ce717e0f212ed0763307f3eab72b4a1bdf4d0
> > > > > If it is reverted, this patch needs to be updated accordingly.
> > > > >
> > > > > Signed-off-by: Nikolay Nikolaev <n.nikol...@virtualopensystems.com>
> > > > > Signed-off-by: Changchun Ouyang <changchun.ouy...@intel.com>
> > > [snip...]
> > > > > @@ -198,7 +203,7 @@ Message types
> > > > >
> > > > >       Id: 4
> > > > >       Equivalent ioctl: VHOST_RESET_OWNER
> > > > > -      Master payload: N/A
> > > > > +      Master payload: vring state description
> > > > >
> > > > >       Issued when a new connection is about to be closed. The Master will no
> > > > >       longer own this connection (and will usually close it).
> > > >
> > > > This is an interface change, isn't it?
> > > > We can't make it unconditional; we need to make it dependent
> > > > on a protocol flag.
> > >
> > > Hi Michael,
> > >
> > > I'm wondering why we need a payload here, as we don't do that for
> > > VHOST_SET_OWNER. I mean, stopping one or a few queue pairs when a
> > > connection is about to be closed doesn't make sense to me. Instead,
> > > we should clean up all queue pairs when the VHOST_RESET_OWNER message
> > > is received, right?
> >
> > We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.
>
> Yeah, second that.
>
> BTW, can we simply do the name conversion, just changing VHOST_RESET_OWNER
> to VHOST_RESET_DEVICE (or VHOST_STOP_DEVICE)? I guess it's doable in
> theory as long as we don't change the number, but I somehow feel it's not
> good practice.
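
(For reference, keeping the wire value while renaming only the identifier
would amount to something like the sketch below. The enum layout is assumed
to mirror hw/virtio/vhost-user.c and only the first few message types are
shown; the point is that message Id 4 itself does not change:)

typedef enum VhostUserRequest {
    VHOST_USER_NONE          = 0,
    VHOST_USER_GET_FEATURES  = 1,
    VHOST_USER_SET_FEATURES  = 2,
    VHOST_USER_SET_OWNER     = 3,
    VHOST_USER_RESET_DEVICE  = 4,   /* formerly VHOST_USER_RESET_OWNER; wire value kept */
    VHOST_USER_SET_MEM_TABLE = 5
    /* ... remaining message types unchanged ... */
} VhostUserRequest;
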
I think just renaming is fine; we are not changing the protocol at all.

> Maybe we could add it as a new vhost message and mark the old one
> as obsolete? That doesn't sound perfect either, as it reserves a number
> for a message we will not use any more.
>
> Also, we may rename VHOST_SET_OWNER to VHOST_INIT_DEVICE?

I think VHOST_SET_OWNER specifies who the master is?

> > And I agree, I don't think it needs a payload.
>
> Good to know.
>
> > > > >
> > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > > > > index 1f25cb3..9cd6c05 100644
> > > > > --- a/hw/net/vhost_net.c
> > > > > +++ b/hw/net/vhost_net.c
> > > [snip...]
> > > > >  static int net_vhost_user_init(NetClientState *peer, const char *device,
> > > > > -                               const char *name, CharDriverState *chr)
> > > > > +                               const char *name, CharDriverState *chr,
> > > > > +                               uint32_t queues)
> > > > >  {
> > > > >      NetClientState *nc;
> > > > >      VhostUserState *s;
> > > > > +    int i;
> > > > >
> > > > > -    nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> > > > > +    for (i = 0; i < queues; i++) {
> > > > > +        nc = qemu_new_net_client(&net_vhost_user_info, peer, device, name);
> > > > >
> > > > > -    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s",
> > > > > -             chr->label);
> > > > > +        snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to %s",
> > > > > +                 i, chr->label);
> > > > >
> > > > > -    s = DO_UPCAST(VhostUserState, nc, nc);
> > > > > +        s = DO_UPCAST(VhostUserState, nc, nc);
> > > > >
> > > > > -    /* We don't provide a receive callback */
> > > > > -    s->nc.receive_disabled = 1;
> > > > > -    s->chr = chr;
> > > > > -
> > > > > -    qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
> > > > > +        /* We don't provide a receive callback */
> > > > > +        s->nc.receive_disabled = 1;
> > > > > +        s->chr = chr;
> > > > > +        s->nc.queue_index = i;
> > > > >
> > > > > +        qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, s);
> > > > > +    }
> > > > >      return 0;
> > > > >  }
> > > > >
> > > > > @@ -225,6 +229,7 @@ static int net_vhost_check_net(QemuOpts *opts, void *opaque)
> > > >
> > > > There are two problems here:
> > > >
> > > > 1. We don't really know that the backend
> > > > is able to support the requested number of queues.
> > > > If not, everything will fail, silently.
> > > > A new message to query the # of queues could help, though
> > > > I'm not sure what can be done on failure. Fail the connection?
> > >
> > > What I'm thinking is we may do:
> > >
> > > - introduce a feature flag, for indicating we support MQ or not.
> > >
> > >   We query this flag only when the # of queues given is > 1, and exit
> > >   if it does not match.
> > >
> > > - invoke vhost_dev init repeatedly for the # of queues given, unless
> > >   something goes wrong, which basically means the backend
> > >   cannot support that # of queues; we then quit.
> > >
> > > We could, as you suggested, add another message to query
> > > the max # of queues the backend supports. However, given that we have
> > > to check the return value of setting up a single queue pair,
> > > which already gives feedback when the backend is not able to
> > > support the requested # of queues, we could save such a message,
> > > though it's easy to implement :)
> >
> > Problem is, we only set up queues when the device is started,
> > that is, when the guest is running.
>
> So we couldn't simply invoke 'exit()', right?
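
(As an aside, the feature-flag idea above could be sketched as below. The
flag name VHOST_USER_F_MQ and the helpers are made up for illustration
(they are not existing vhost-user definitions); the point is simply to
query the backend once and refuse an unsupported queue count up front:)

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical feature bit an MQ-capable backend would advertise. */
#define VHOST_USER_F_MQ (1ULL << 30)

/* Stand-in for a GET_FEATURES round trip to the backend. */
static uint64_t backend_get_features(void)
{
    return VHOST_USER_F_MQ;            /* pretend the backend supports MQ */
}

/* Return true if it is safe to bring up 'queues' queue pairs. */
static bool vhost_user_check_mq(uint32_t queues)
{
    if (queues <= 1) {
        return true;                   /* single queue: nothing to negotiate */
    }
    if (!(backend_get_features() & VHOST_USER_F_MQ)) {
        fprintf(stderr, "vhost-user backend lacks multiqueue support\n");
        return false;                  /* caller then fails the init/connection */
    }
    return true;
}

int main(void)
{
    return vhost_user_check_mq(4) ? 0 : 1;
}
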
> >
> > Doing this at connect time would mean we don't start a VM
> > that we can't then support.
>
> Sorry, I'm a bit confused then. You just said that we set up queues
> when the guest is running, but now you are saying that the VM hasn't been
> started yet at connect time. As far as I know, we set up queues when
> the socket is connected. Isn't that contradictory?
>
> > > >
> > > > 2. Each message (e.g. set memory table) is sent multiple times,
> > > > on the same socket.
> > >
> > > Yeah, since there is a single socket open there, it's not necessary
> > > to send messages like SET_MEM_TABLE multiple times. But for other
> > > messages that relate to a specific vring, we have to send them N times,
> > > don't we?
> >
> > We need to set up each vring, sure.
> >
> > > So, I'm wondering whether we could categorize the messages into two
> > > types: vring-specific and non-vring-specific. For vring-specific ones,
> > > we send them N times, with vhost_dev->vq_index telling which queue pair
> > > we are interested in.
> > >
> > > For non-vring-specific ones, we just send them once, for the first
> > > queue pair (vhost_dev->queue == 0), just like what we do for tap:
> > > we launch the qemu-ifup/down script only for the first queue pair.
> >
> > Sounds reasonable. Make this all internal to vhost-user:
> > no need for common vhost code to know about this distinction.
>
> Good to know, and I'll keep it in mind.
>
> Thanks for your comments.
>
> --yliu
>
> > > Comments? (And sorry if I made some silly comments, as I'm pretty
> > > new to this community; I have only been reading the code for about
> > > two weeks.)
> > >
> > > --yliu
> > >
> > > > >
> > > > >  int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> > > > >                          NetClientState *peer)
> > > > >  {
> > > > > +    uint32_t queues;
> > > > >      const NetdevVhostUserOptions *vhost_user_opts;
> > > > >      CharDriverState *chr;
> > > > >
> > > > > @@ -243,6 +248,12 @@ int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> > > > >          return -1;
> > > > >      }
> > > > >
> > > > > +    /* number of queues for multiqueue */
> > > > > +    if (vhost_user_opts->has_queues) {
> > > > > +        queues = vhost_user_opts->queues;
> > > > > +    } else {
> > > > > +        queues = 1;
> > > > > +    }
> > > > >
> > > > > -    return net_vhost_user_init(peer, "vhost_user", name, chr);
> > > > > +    return net_vhost_user_init(peer, "vhost_user", name, chr, queues);
> > > > >  }
> > > > > diff --git a/qapi-schema.json b/qapi-schema.json
> > > > > index f97ffa1..51e40ce 100644
> > > > > --- a/qapi-schema.json
> > > > > +++ b/qapi-schema.json
> > > > > @@ -2444,12 +2444,16 @@
> > > > >  #
> > > > >  # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: false).
> > > > >  #
> > > > > +# @queues: #optional number of queues to be created for multiqueue vhost-user
> > > > > +#          (default: 1) (Since 2.5)
> > > > > +#
> > > > >  # Since 2.1
> > > > >  ##
> > > > >  { 'struct': 'NetdevVhostUserOptions',
> > > > >    'data': {
> > > > >      'chardev': 'str',
> > > > > -    '*vhostforce': 'bool' } }
> > > > > +    '*vhostforce': 'bool',
> > > > > +    '*queues': 'uint32' } }
> > > > >
> > > > >  ##
> > > > >  # @NetClientOptions
> > > > > diff --git a/qemu-options.hx b/qemu-options.hx
> > > > > index ec356f6..dad035e 100644
> > > > > --- a/qemu-options.hx
> > > > > +++ b/qemu-options.hx
> > > > > @@ -1942,13 +1942,14 @@ The hubport netdev lets you connect a NIC to a QEMU "vlan" instead of a single
> > > > >  netdev.  @code{-net} and @code{-device} with parameter @option{vlan} create the
> > > > >  required hub automatically.
> > > > >
> > > > > -@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off]
> > > > > +@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n]
> > > > >
> > > > >  Establish a vhost-user netdev, backed by a chardev @var{id}. The chardev should
> > > > >  be a unix domain socket backed one. The vhost-user uses a specifically defined
> > > > >  protocol to pass vhost ioctl replacement messages to an application on the other
> > > > >  end of the socket. On non-MSIX guests, the feature can be forced with
> > > > > -@var{vhostforce}.
> > > > > +@var{vhostforce}. Use 'queues=@var{n}' to specify the number of queues to
> > > > > +be created for multiqueue vhost-user.
> > > > >
> > > > >  Example:
> > > > >  @example
> > > > > --
> > > > > 1.8.4.2
> > > > >
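
(Coming back to the vring-specific vs. non-vring-specific split discussed
above, here is a rough sketch of the idea. The message names and helpers are
illustrative only, not the actual QEMU vhost-user code: per-vring messages go
out once per queue pair, while device-wide ones such as SET_MEM_TABLE are sent
only for queue pair 0 on the shared socket.)

#include <stdbool.h>

/* Illustrative classification; the real message IDs live in vhost-user.c. */
typedef enum {
    MSG_SET_MEM_TABLE,   /* device-wide */
    MSG_SET_VRING_NUM,   /* per-vring */
    MSG_SET_VRING_ADDR,  /* per-vring */
    MSG_SET_VRING_KICK,  /* per-vring */
    MSG_SET_VRING_CALL   /* per-vring */
} MsgType;

static bool msg_is_vring_specific(MsgType type)
{
    return type != MSG_SET_MEM_TABLE;
}

/* 'queue_index' is the queue pair this vhost_dev instance backs
 * (cf. vhost_dev->vq_index in the discussion above). */
static bool should_send(MsgType type, int queue_index)
{
    if (msg_is_vring_specific(type)) {
        return true;              /* every queue pair programs its own vrings */
    }
    return queue_index == 0;      /* device-wide state goes out only once */
}

int main(void)
{
    /* SET_MEM_TABLE would only be sent for queue pair 0. */
    return should_send(MSG_SET_MEM_TABLE, 1) ? 1 : 0;
}

(On the command-line side, the queues option added by the patch would then be
used as, e.g., '-netdev vhost-user,chardev=char0,queues=2', where the chardev
id and queue count are illustrative values.)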