On Tue, Aug 25, 2015 at 03:25:54AM +0000, Ouyang, Changchun wrote: > Hi Michael, > > > -----Original Message----- > > From: snabb-de...@googlegroups.com [mailto:snabb- > > de...@googlegroups.com] On Behalf Of Michael S. Tsirkin > > Sent: Thursday, August 13, 2015 5:19 PM > > To: Ouyang, Changchun > > Cc: qemu-devel@nongnu.org; snabb-de...@googlegroups.com; > > thibaut.col...@6wind.com; n.nikol...@virtualopensystems.com; > > l...@snabb.co; Long, Thomas > > Subject: [snabb-devel] Re: [PATCH v6 1/2] vhost-user: add multi queue > > support > > > > On Wed, Aug 12, 2015 at 02:25:41PM +0800, Ouyang Changchun wrote: > > > Based on patch by Nikolay Nikolaev: > > > Vhost-user will implement the multi queue support in a similar way to > > > what vhost already has - a separate thread for each queue. > > > To enable the multi queue functionality - a new command line parameter > > > "queues" is introduced for the vhost-user netdev. > > > > > > The RESET_OWNER change is based on commit: > > > 294ce717e0f212ed0763307f3eab72b4a1bdf4d0 > > > If it is reverted, the patch need update for it accordingly. > > > > > > Signed-off-by: Nikolay Nikolaev <n.nikol...@virtualopensystems.com> > > > Signed-off-by: Changchun Ouyang <changchun.ouy...@intel.com> > > > --- > > > Changes since v5: > > > - fix the message descption for VHOST_RESET_OWNER in vhost-user txt > > > > > > Changes since v4: > > > - remove the unnecessary trailing '\n' > > > > > > Changes since v3: > > > - fix one typo and wrap one long line > > > > > > Changes since v2: > > > - fix vq index issue for set_vring_call > > > When it is the case of VHOST_SET_VRING_CALL, The vq_index is not > > initialized before it is used, > > > thus it could be a random value. The random value leads to crash in > > > vhost > > after passing down > > > to vhost, as vhost use this random value to index an array index. > > > - fix the typo in the doc and description > > > - address vq index for reset_owner > > > > > > Changes since v1: > > > - use s->nc.info_str when bringing up/down the backend > > > > > > docs/specs/vhost-user.txt | 7 ++++++- > > > hw/net/vhost_net.c | 3 ++- > > > hw/virtio/vhost-user.c | 11 ++++++++++- > > > net/vhost-user.c | 37 ++++++++++++++++++++++++------------- > > > qapi-schema.json | 6 +++++- > > > qemu-options.hx | 5 +++-- > > > 6 files changed, 50 insertions(+), 19 deletions(-) > > > > > > diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt > > > index 70da3b1..9390f89 100644 > > > --- a/docs/specs/vhost-user.txt > > > +++ b/docs/specs/vhost-user.txt > > > @@ -135,6 +135,11 @@ As older slaves don't support negotiating > > > protocol features, a feature bit was dedicated for this purpose: > > > #define VHOST_USER_F_PROTOCOL_FEATURES 30 > > > > > > +Multi queue support > > > +------------------- > > > +The protocol supports multiple queues by setting all index fields in > > > +the sent messages to a properly calculated value. > > > + > > > Message types > > > ------------- > > > > > > @@ -198,7 +203,7 @@ Message types > > > > > > Id: 4 > > > Equivalent ioctl: VHOST_RESET_OWNER > > > - Master payload: N/A > > > + Master payload: vring state description > > > > > > Issued when a new connection is about to be closed. The Master > > > will no > > > longer own this connection (and will usually close it). > > > > This is an interface change, isn't it? > > We can't make it unconditionally, need to make it dependent on a protocol > > flag.
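To be concrete, one possible way to make it conditional (just a sketch: the protocol feature bit name and number below are made up, and it assumes dev->protocol_features has already been filled in from GET_PROTOCOL_FEATURES):

    /* Illustrative only: this bit is not part of the spec today. */
    #define VHOST_USER_PROTOCOL_F_RESET_OWNER_STATE 1

    case VHOST_USER_RESET_OWNER:
        if (dev->protocol_features &
            (1ULL << VHOST_USER_PROTOCOL_F_RESET_OWNER_STATE)) {
            /* New-style slave: tell it which vring pair is being reset. */
            memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
            msg.state.index += dev->vq_index;
            msg.size = sizeof(m.state);
        }
        /* Old-style slave: keep the original empty payload. */
        break;

That way old slaves keep seeing the message they expect, and only slaves that acked the new bit get the vring state payload.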
Pls remember to fix this one. > > > > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index > > > 1f25cb3..9cd6c05 100644 > > > --- a/hw/net/vhost_net.c > > > +++ b/hw/net/vhost_net.c > > > @@ -159,6 +159,7 @@ struct vhost_net *vhost_net_init(VhostNetOptions > > > *options) > > > > > > net->dev.nvqs = 2; > > > net->dev.vqs = net->vqs; > > > + net->dev.vq_index = net->nc->queue_index; > > > > > > r = vhost_dev_init(&net->dev, options->opaque, > > > options->backend_type, options->force); @@ > > > -269,7 +270,7 @@ static void vhost_net_stop_one(struct vhost_net *net, > > > for (file.index = 0; file.index < net->dev.nvqs; ++file.index) { > > > const VhostOps *vhost_ops = net->dev.vhost_ops; > > > int r = vhost_ops->vhost_call(&net->dev, VHOST_RESET_OWNER, > > > - NULL); > > > + &file); > > > assert(r >= 0); > > > } > > > } > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index > > > 27ba035..fb11d4c 100644 > > > --- a/hw/virtio/vhost-user.c > > > +++ b/hw/virtio/vhost-user.c > > > @@ -219,7 +219,12 @@ static int vhost_user_call(struct vhost_dev *dev, > > unsigned long int request, > > > break; > > > > > > case VHOST_USER_SET_OWNER: > > > + break; > > > + > > > case VHOST_USER_RESET_OWNER: > > > + memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); > > > + msg.state.index += dev->vq_index; > > > + msg.size = sizeof(m.state); > > > break; > > > > > > case VHOST_USER_SET_MEM_TABLE: > > > @@ -262,17 +267,20 @@ static int vhost_user_call(struct vhost_dev *dev, > > unsigned long int request, > > > case VHOST_USER_SET_VRING_NUM: > > > case VHOST_USER_SET_VRING_BASE: > > > memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); > > > + msg.state.index += dev->vq_index; > > > msg.size = sizeof(m.state); > > > break; > > > > > > case VHOST_USER_GET_VRING_BASE: > > > memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); > > > + msg.state.index += dev->vq_index; > > > msg.size = sizeof(m.state); > > > need_reply = 1; > > > break; > > > > > > case VHOST_USER_SET_VRING_ADDR: > > > memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr)); > > > + msg.addr.index += dev->vq_index; > > > msg.size = sizeof(m.addr); > > > break; > > > > > > @@ -280,7 +288,7 @@ static int vhost_user_call(struct vhost_dev *dev, > > unsigned long int request, > > > case VHOST_USER_SET_VRING_CALL: > > > case VHOST_USER_SET_VRING_ERR: > > > file = arg; > > > - msg.u64 = file->index & VHOST_USER_VRING_IDX_MASK; > > > + msg.u64 = (file->index + dev->vq_index) & > > > + VHOST_USER_VRING_IDX_MASK; > > > msg.size = sizeof(m.u64); > > > if (ioeventfd_enabled() && file->fd > 0) { > > > fds[fd_num++] = file->fd; @@ -322,6 +330,7 @@ static int > > > vhost_user_call(struct vhost_dev *dev, unsigned long int request, > > > error_report("Received bad msg size."); > > > return -1; > > > } > > > + msg.state.index -= dev->vq_index; > > > memcpy(arg, &msg.state, sizeof(struct vhost_vring_state)); > > > break; > > > default: > > > diff --git a/net/vhost-user.c b/net/vhost-user.c index > > > 1d86a2b..904d8af 100644 > > > --- a/net/vhost-user.c > > > +++ b/net/vhost-user.c > > > @@ -121,35 +121,39 @@ static void net_vhost_user_event(void *opaque, > > int event) > > > case CHR_EVENT_OPENED: > > > vhost_user_start(s); > > > net_vhost_link_down(s, false); > > > - error_report("chardev \"%s\" went up", s->chr->label); > > > + error_report("chardev \"%s\" went up", s->nc.info_str); > > > break; > > > case CHR_EVENT_CLOSED: > > > net_vhost_link_down(s, true); > > > vhost_user_stop(s); > > > - error_report("chardev \"%s\" 
went down", s->chr->label); > > > + error_report("chardev \"%s\" went down", s->nc.info_str); > > > break; > > > } > > > } BTW this seems pretty hacky: you get multiple messages when one client connects. Why add multiple event listeners to the same chat device? > > > > > > static int net_vhost_user_init(NetClientState *peer, const char *device, > > > - const char *name, CharDriverState *chr) > > > + const char *name, CharDriverState *chr, > > > + uint32_t queues) > > > { > > > NetClientState *nc; > > > VhostUserState *s; > > > + int i; > > > > > > - nc = qemu_new_net_client(&net_vhost_user_info, peer, device, > > name); > > > + for (i = 0; i < queues; i++) { > > > + nc = qemu_new_net_client(&net_vhost_user_info, peer, device, > > > + name); > > > > > > - snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user to %s", > > > - chr->label); > > > + snprintf(nc->info_str, sizeof(nc->info_str), "vhost-user%d to > > > %s", > > > + i, chr->label); > > > > > > - s = DO_UPCAST(VhostUserState, nc, nc); > > > + s = DO_UPCAST(VhostUserState, nc, nc); > > > > > > - /* We don't provide a receive callback */ > > > - s->nc.receive_disabled = 1; > > > - s->chr = chr; > > > - > > > - qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, > > s); > > > + /* We don't provide a receive callback */ > > > + s->nc.receive_disabled = 1; > > > + s->chr = chr; > > > + s->nc.queue_index = i; > > > > > > + qemu_chr_add_handlers(s->chr, NULL, NULL, net_vhost_user_event, > > s); > > > + } > > > return 0; > > > } > > > > > > @@ -225,6 +229,7 @@ static int net_vhost_check_net(QemuOpts *opts, > > > void *opaque) > > > > > > There are two problems here: > > > > 1. we don't really know that the backend > > is able to support the requested number of queues. > > If not, everything will fail, silently. > > A new message to query the # of queues could help, though > > I'm not sure what can be done on failure. Fail connection? > > > > 2. each message (e.g. set memory table) is sent multiple times, > > on the same socket. > > > I think it is tough to resolve these 2 comments, as the current message is > either vhost-dev based or virt-queue based, > The multiple queues(pair) feature use multiple vhost-devs to implement itself. > For #1 > So the queue number is something should be seen in the upper level of > vhost-dev rather than inside the vhost-dev. > For each vhost-net, there are 2 virt-queues, one is for Rx the other is for > Tx. > introduce the virt-queue pair number into the vhost-dev? But I don't think it > is good, as for each vhost-dev, there is only one > virt-queue pair. > > Where should I put the virt-queue pair number to? I don't get the perfect > answer till now. Any suggestion is welcome. > > Could we assume the vhost backend has the ability to create enough virt-queue > pair(e.g. 0x8000 is the max) if qemu require > Vhost backend to do it. If it is correct, we don't need get virt-queue pair > number from vhost backend, as vhost backend can > Create all virt-queue pair required by qemu. > The virtio frontend(on guest) has the flexibility to enable which virt-queue > according to its own capability, qemu can do it by using > Set_vring_flag message to notify vhost backend. I'm reluctant to agree to this. Implementations tend to get this wrong, e.g. they would only test with 2 queues and assume everything is OK. With an explicit message, this seems more robust. Why is it so hard to implement? User specifies queues=X. Can't we simply validate that backend supports this # of queues? 
> For #2 > The memory table message is also vhost-dev based; it wouldn't hurt to send it > a few times, and the vhost backend could > keep it vhost-dev based too, or keep only one (keep the first one and > ignore the rest of the messages from the same connected fd). > Any other good suggestion is welcome too :-) Add code in vhost-user to skip sending the useless messages? Yes, they seem harmless, but implementations tend to develop dependencies on such bugs, and then we get to maintain them forever (see the sketch at the end of this mail). > > > > > > > int net_init_vhost_user(const NetClientOptions *opts, const char *name, > > > NetClientState *peer) { > > > + uint32_t queues; > > > const NetdevVhostUserOptions *vhost_user_opts; > > > CharDriverState *chr; > > > > > > @@ -243,6 +248,12 @@ int net_init_vhost_user(const NetClientOptions > > *opts, const char *name, > > > return -1; > > > } > > > > > > + /* number of queues for multiqueue */ > > > + if (vhost_user_opts->has_queues) { > > > + queues = vhost_user_opts->queues; > > > + } else { > > > + queues = 1; > > > + } > > > > > > - return net_vhost_user_init(peer, "vhost_user", name, chr); > > > + return net_vhost_user_init(peer, "vhost_user", name, chr, > > > + queues); > > > } > > > diff --git a/qapi-schema.json b/qapi-schema.json index > > > f97ffa1..51e40ce 100644 > > > --- a/qapi-schema.json > > > +++ b/qapi-schema.json > > > @@ -2444,12 +2444,16 @@ > > > # > > > # @vhostforce: #optional vhost on for non-MSIX virtio guests (default: > > false). > > > # > > > +# @queues: #optional number of queues to be created for multiqueue > > vhost-user > > > +# (default: 1) (Since 2.5) > > > +# > > > # Since 2.1 > > > ## > > > { 'struct': 'NetdevVhostUserOptions', > > > 'data': { > > > 'chardev': 'str', > > > - '*vhostforce': 'bool' } } > > > + '*vhostforce': 'bool', > > > + '*queues': 'uint32' } } > > > > > > ## > > > # @NetClientOptions > > > diff --git a/qemu-options.hx b/qemu-options.hx index ec356f6..dad035e > > > 100644 > > > --- a/qemu-options.hx > > > +++ b/qemu-options.hx > > > @@ -1942,13 +1942,14 @@ The hubport netdev lets you connect a NIC to a > > > QEMU "vlan" instead of a single netdev. @code{-net} and > > > @code{-device} with parameter @option{vlan} create the required hub > > automatically. > > > > > > -@item -netdev vhost-user,chardev=@var{id}[,vhostforce=on|off] > > > +@item -netdev > > > +vhost-user,chardev=@var{id}[,vhostforce=on|off][,queues=n] > > > > > > Establish a vhost-user netdev, backed by a chardev @var{id}. The > > > chardev should be a unix domain socket backed one. The vhost-user > > > uses a specifically defined protocol to pass vhost ioctl replacement > > > messages to an application on the other end of the socket. On > > > non-MSIX guests, the feature can be forced with -@var{vhostforce}. > > > +@var{vhostforce}. Use 'queues=@var{n}' to specify the number of > > > +queues to be created for multiqueue vhost-user. > > > > > > Example: > > > @example > > > -- > > > 1.8.4.2
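Coming back to #2, here is the sketch I mentioned: a small filter in vhost_user_call() so that only the first queue pair's vhost_dev talks to the slave for device-wide requests. The helper name is made up and the exact list of requests that qualify needs more thought:

    /* Requests that affect the whole device rather than one vring pair. */
    static bool vhost_user_one_time_request(VhostUserRequest request)
    {
        switch (request) {
        case VHOST_USER_SET_OWNER:
        case VHOST_USER_RESET_OWNER:
        case VHOST_USER_SET_MEM_TABLE:
            return true;
        default:
            return false;
        }
    }

    /* In vhost_user_call(), once the ioctl request has been translated
     * to its VHOST_USER_* equivalent: */
    if (dev->vq_index != 0 && vhost_user_one_time_request(msg.request)) {
        /* Already sent by the vhost_dev with vq_index == 0 on this
         * shared socket; don't repeat it for the other queue pairs. */
        return 0;
    }

With something like this the slave sees each device-wide message exactly once per connection, while the per-vring messages still go out once per queue pair.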