Since branching is imminent and I'm happy with the interface, I pushed the patch to master (after fixing some clang warnings) so that it gets into 2.6.
Sean, if we settle on a better interface to suit your needs, I'm fine with changing it in the next few days. Thanks for the series, Daniele 2016-08-15 8:11 GMT-07:00 Ciara Loftus <ciara.lof...@intel.com>: > This commit removes the 'dpdkvhostcuse' port type from the userspace > datapath. vhost-cuse ports are quickly becoming obsolete as the > vhost-user port type begins to support a greater feature-set thanks to > the addition of things like vhost-user multiqueue and potential > upcoming features like vhost-user client-mode and vhost-user reconnect. > The feature is also expected to be removed from DPDK soon. > > One potential drawback of the removal of this support is that a > userspace vHost port type is not available in OVS for use with older > versions of QEMU (pre v2.2). Considering v2.2 is nearly two years old > this should however be a low impact change. > > Signed-off-by: Ciara Loftus <ciara.lof...@intel.com> > Acked-by: Flavio Leitner <f...@sysclose.org> > Acked-by: Daniele Di Proietto <diproiet...@vmware.com> > Acked-by: Ilya Maximets <i.maxim...@samsung.com> > --- > INSTALL.DPDK-ADVANCED.md | 242 ----------------- > NEWS | 1 + > acinclude.m4 | 12 - > lib/netdev-dpdk.c | 108 +------- > rhel/README.RHEL | 2 - > rhel/etc_sysconfig_network-scripts_ifup-ovs | 7 - > utilities/automake.mk | 1 - > utilities/qemu-wrap.py | 389 > ---------------------------- > vswitchd/vswitch.xml | 12 - > 9 files changed, 5 insertions(+), 769 deletions(-) > delete mode 100755 utilities/qemu-wrap.py > > diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md > index dd36ae4..5d19f2b 100755 > --- a/INSTALL.DPDK-ADVANCED.md > +++ b/INSTALL.DPDK-ADVANCED.md > @@ -461,13 +461,6 @@ For users wanting to do packet forwarding using > kernel stack below are the steps > ``` > > ## <a name="vhost"></a> 6. Vhost Walkthrough > - > -DPDK 16.07 supports two types of vhost: > - > -1. vhost-user - enabled default > - > -2. 
vhost-cuse - Legacy, disabled by default > - > ### 6.1 vhost-user > > - Prerequisites: > @@ -621,241 +614,6 @@ DPDK 16.07 supports two types of vhost: > > Note: For information on libvirt and further tuning refer [libvirt]. > > -### 6.2 vhost-cuse > - > - - Prerequisites: > - > - QEMU version >= 2.2 > - > - - Enable vhost-cuse support > - > - 1. Enable vhost cuse support in DPDK > - > - Set `CONFIG_RTE_LIBRTE_VHOST_USER=n` in config/common_linuxapp > and follow the > - steps in 2.2 section of INSTALL.DPDK guide to build DPDK with cuse > support. > - OVS will detect that DPDK has vhost-cuse libraries compiled and in > turn will enable > - support for it in the switch and disable vhost-user support. > - > - 2. Insert the Cuse module > - > - `modprobe cuse` > - > - 3. Build and insert the `eventfd_link` module > - > - ``` > - cd $DPDK_DIR/lib/librte_vhost/eventfd_link/ > - make > - insmod $DPDK_DIR/lib/librte_vhost/eventfd_link.ko > - ``` > - > - - Adding vhost-cuse ports to Switch > - > - Unlike DPDK ring ports, DPDK vhost-cuse ports can have arbitrary > names. > - For vhost-cuse, the name of the port type is `dpdkvhostcuse` > - > - ``` > - ovs-vsctl add-port br0 vhost-cuse-1 -- set Interface vhost-cuse-1 > - type=dpdkvhostcuse > - ``` > - > - When attaching vhost-cuse ports to QEMU, the name provided during the > - add-port operation must match the ifname parameter on the QEMU cmd > line. > - > - - Adding vhost-cuse ports to VM > - > - vhost-cuse ports use a Linux* character device to communicate with > QEMU. > - By default it is set to `/dev/vhost-net`. It is possible to reuse this > - standard device for DPDK vhost, which makes setup a little simpler > but it > - is better practice to specify an alternative character device in > order to > - avoid any conflicts if kernel vhost is to be used in parallel. > - > - 1. This step is only needed if using an alternative character device. > - > - ``` > - ./utilities/ovs-vsctl --no-wait set Open_vSwitch . 
\ > - other_config:cuse-dev-name=my-vhost-net > - ``` > - > - In the example above, the character device to be used will be > - `/dev/my-vhost-net`. > - > - 2. In case of reusing kernel vhost character device, there would be > conflict > - user should remove it. > - > - `rm -rf /dev/vhost-net` > - > - 3. Configure virtio-net adapters > - > - The following parameters must be passed to the QEMU binary, repeat > - the below parameters for multiple devices. > - > - ``` > - -netdev tap,id=<id>,script=no,downscript=no,ifname=<name>,vhost=on > - -device virtio-net-pci,netdev=net1,mac=<mac> > - ``` > - > - The DPDK vhost library will negotiate its own features, so they > - need not be passed in as command line params. Note that as offloads > - are disabled this is the equivalent of setting > - > - `csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off` > - > - When using an alternative character device, it must be explicitly > - passed to QEMU using the `vhostfd` argument > - > - ``` > - -netdev tap,id=<id>,script=no,downscript=no,ifname=<name>, > vhost=on, > - vhostfd=<open_fd> -device virtio-net-pci,netdev=net1,mac=<mac> > - ``` > - > - The open file descriptor must be passed to QEMU running as a child > - process. This could be done with a simple python script. > - > - ``` > - #!/usr/bin/python > - fd = os.open("/dev/usvhost", os.O_RDWR) > - subprocess.call("qemu-system-x86_64 .... -netdev > tap,id=vhostnet0,\ > - vhost=on,vhostfd=" + fd +"...", shell=True) > - ``` > - > - 4. Configure huge pages > - > - QEMU must allocate the VM's memory on hugetlbfs. Vhost ports > access a > - virtio-net device's virtual rings and packet buffers mapping the > VM's > - physical memory on hugetlbfs. 
To enable vhost-ports to map the VM's > - memory into their process address space, pass the following > parameters > - to QEMU > - > - `-object memory-backend-file,id=mem,size=4096M,mem-path=/dev/ > hugepages, > - share=on -numa node,memdev=mem -mem-prealloc` > - > - - VM Configuration with QEMU wrapper > - > - The QEMU wrapper script automatically detects and calls QEMU with the > - necessary parameters. It performs the following actions: > - > - * Automatically detects the location of the hugetlbfs and inserts this > - into the command line parameters. > - * Automatically open file descriptors for each virtio-net device and > - inserts this into the command line parameters. > - * Calls QEMU passing both the command line parameters passed to the > - script itself and those it has auto-detected. > - > - Before use, you **must** edit the configuration parameters section of > the > - script to point to the correct emulator location and set additional > - settings. Of these settings, `emul_path` and `us_vhost_path` **must** > be > - set. All other settings are optional. > - > - To use directly from the command line simply pass the wrapper some of > the > - QEMU parameters: it will configure the rest. For example: > - > - ``` > - qemu-wrap.py -cpu host -boot c -hda <disk image> -m 4096 -smp 4 > - --enable-kvm -nographic -vnc none -net none -netdev tap,id=net1, > - script=no,downscript=no,ifname=if1,vhost=on -device virtio-net-pci, > - netdev=net1,mac=00:00:00:00:00:01 > - ``` > - > - - VM Configuration with libvirt > - > - If you are using libvirt, you must enable libvirt to access the > character > - device by adding it to controllers cgroup for libvirtd using the > following > - steps. > - > - 1. 
In `/etc/libvirt/qemu.conf` add/edit the following lines: > - > - ``` > - clear_emulator_capabilities = 0 > - user = "root" > - group = "root" > - cgroup_device_acl = [ > - "/dev/null", "/dev/full", "/dev/zero", > - "/dev/random", "/dev/urandom", > - "/dev/ptmx", "/dev/kvm", "/dev/kqemu", > - "/dev/rtc", "/dev/hpet", "/dev/net/tun", > - "/dev/<my-vhost-device>", > - "/dev/hugepages"] > - ``` > - > - <my-vhost-device> refers to "vhost-net" if using the > `/dev/vhost-net` > - device. If you have specificed a different name in the database > - using the "other_config:cuse-dev-name" parameter, please specify > that > - filename instead. > - > - 2. Disable SELinux or set to permissive mode > - > - 3. Restart the libvirtd process > - For example, on Fedora: > - > - `systemctl restart libvirtd.service` > - > - After successfully editing the configuration, you may launch your > - vhost-enabled VM. The XML describing the VM can be configured like so > - within the <qemu:commandline> section: > - > - 1. Set up shared hugepages: > - > - ``` > - <qemu:arg value='-object'/> > - <qemu:arg value='memory-backend-file,id= > mem,size=4096M,mem-path=/dev/hugepages,share=on'/> > - <qemu:arg value='-numa'/> > - <qemu:arg value='node,memdev=mem'/> > - <qemu:arg value='-mem-prealloc'/> > - ``` > - > - 2. Set up your tap devices: > - > - ``` > - <qemu:arg value='-netdev'/> > - <qemu:arg value='type=tap,id=net1,script=no,downscript=no, > ifname=vhost0,vhost=on'/> > - <qemu:arg value='-device'/> > - <qemu:arg value='virtio-net-pci,netdev= > net1,mac=00:00:00:00:00:01'/> > - ``` > - > - Repeat for as many devices as are desired, modifying the id, ifname > - and mac as necessary. 
> - > - Again, if you are using an alternative character device (other than > - `/dev/vhost-net`), please specify the file descriptor like so: > - > - `<qemu:arg value='type=tap,id=net3,script=no,downscript=no, > ifname=vhost0,vhost=on,vhostfd=<open_fd>'/>` > - > - Where <open_fd> refers to the open file descriptor of the character > device. > - Instructions of how to retrieve the file descriptor can be found in > the > - "DPDK vhost VM configuration" section. > - Alternatively, the process is automated with the qemu-wrap.py script, > - detailed in the next section. > - > - Now you may launch your VM using virt-manager, or like so: > - > - `virsh create my_vhost_vm.xml` > - > - - VM Configuration with libvirt & QEMU wrapper > - > - To use the qemu-wrapper script in conjuntion with libvirt, follow the > - steps in the previous section before proceeding with the following > steps: > - > - 1. Place `qemu-wrap.py` in libvirtd binary search PATH ($PATH) > - Ideally in the same directory that the QEMU binary is located. > - > - 2. Ensure that the script has the same owner/group and file > permissions > - as the QEMU binary. > - > - 3. Update the VM xml file using "virsh edit VM.xml" > - > - Set the VM to use the launch script. > - Set the emulator path contained in the `<emulator><emulator/>` > tags. > - For example, replace `<emulator>/usr/bin/qemu-kvm<emulator/>` with > - `<emulator>/usr/bin/qemu-wrap.py<emulator/>` > - > - 4. Edit the Configuration Parameters section of the script to point to > - the correct emulator location and set any additional options. If > you are > - using a alternative character device name, please set > "us_vhost_path" to the > - location of that device. The script will automatically detect and > insert > - the correct "vhostfd" value in the QEMU command line arguments. > - > - 5. 
Use virt-manager to launch the VM > - > ### 6.3 DPDK backend inside VM > > Please note that additional configuration is required if you want to run > diff --git a/NEWS b/NEWS > index 2fd3958..5dbcd1d 100644 > --- a/NEWS > +++ b/NEWS > @@ -71,6 +71,7 @@ Post-v2.5.0 > * Support for DPDK 16.07 > * Optional support for DPDK pdump enabled. > * Jumbo frame support > + * Remove dpdkvhostcuse port type. > - Increase number of registers to 16. > - ovs-benchmark: This utility has been removed due to lack of use and > bitrot. > diff --git a/acinclude.m4 b/acinclude.m4 > index aa57b47..5a6dca7 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -201,18 +201,6 @@ AC_DEFUN([OVS_CHECK_DPDK], [ > AC_LANG_PROGRAM( > [ > #include <rte_config.h> > -#if !RTE_LIBRTE_VHOST_USER > -#error > -#endif > - ], []) > - ], [], > - [AC_DEFINE([VHOST_CUSE], [1], [DPDK vhost-cuse support enabled, > vhost-user disabled.]) > - DPDK_EXTRA_LIB="-lfuse"]) > - > - AC_COMPILE_IFELSE([ > - AC_LANG_PROGRAM( > - [ > - #include <rte_config.h> > #if RTE_LIBRTE_VHOST_NUMA > #error > #endif > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index c767fd4..6998452 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -140,9 +140,6 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / > ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF)) > #define OVS_VHOST_QUEUE_DISABLED (-2) /* Queue was disabled by guest > and not > * yet mapped to another queue. > */ > > -#ifdef VHOST_CUSE > -static char *cuse_dev_name = NULL; /* Character device cuse_dev_name. 
> */ > -#endif > static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */ > > #define VHOST_ENQ_RETRY_NUM 8 > @@ -878,34 +875,6 @@ dpdk_dev_parse_name(const char dev_name[], const char > prefix[], > } > > static int > -vhost_construct_helper(struct netdev *netdev) OVS_REQUIRES(dpdk_mutex) > -{ > - if (rte_eal_init_ret) { > - return rte_eal_init_ret; > - } > - > - return netdev_dpdk_init(netdev, -1, DPDK_DEV_VHOST); > -} > - > -static int > -netdev_dpdk_vhost_cuse_construct(struct netdev *netdev) > -{ > - struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > - int err; > - > - if (rte_eal_init_ret) { > - return rte_eal_init_ret; > - } > - > - ovs_mutex_lock(&dpdk_mutex); > - strncpy(CONST_CAST(char *, dev->vhost_id), netdev->name, > - sizeof dev->vhost_id); > - err = vhost_construct_helper(netdev); > - ovs_mutex_unlock(&dpdk_mutex); > - return err; > -} > - > -static int > netdev_dpdk_vhost_user_construct(struct netdev *netdev) > { > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > @@ -942,7 +911,7 @@ netdev_dpdk_vhost_user_construct(struct netdev > *netdev) > fatal_signal_add_file_to_unlink(dev->vhost_id); > VLOG_INFO("Socket %s created for vhost-user port %s\n", > dev->vhost_id, name); > - err = vhost_construct_helper(netdev); > + err = netdev_dpdk_init(netdev, -1, DPDK_DEV_VHOST); > } > > ovs_mutex_unlock(&dpdk_mutex); > @@ -2499,7 +2468,7 @@ static void * > start_vhost_loop(void *dummy OVS_UNUSED) > { > pthread_detach(pthread_self()); > - /* Put the cuse thread into quiescent state. */ > + /* Put the vhost thread into quiescent state. 
*/ > ovsrcu_quiesce_start(); > rte_vhost_driver_session_start(); > return NULL; > @@ -2518,12 +2487,6 @@ dpdk_vhost_class_init(void) > } > > static int > -dpdk_vhost_cuse_class_init(void) > -{ > - return 0; > -} > - > -static int > dpdk_vhost_user_class_init(void) > { > return 0; > @@ -2985,29 +2948,6 @@ netdev_dpdk_vhost_user_reconfigure(struct netdev > *netdev) > return 0; > } > > -static int > -netdev_dpdk_vhost_cuse_reconfigure(struct netdev *netdev) > -{ > - struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > - > - ovs_mutex_lock(&dpdk_mutex); > - ovs_mutex_lock(&dev->mutex); > - > - netdev->n_txq = dev->requested_n_txq; > - netdev->n_rxq = 1; > - > - if (dev->requested_mtu != dev->mtu) { > - if (!netdev_dpdk_mempool_configure(dev)) { > - netdev_change_seq_changed(netdev); > - } > - } > - > - ovs_mutex_unlock(&dev->mutex); > - ovs_mutex_unlock(&dpdk_mutex); > - > - return 0; > -} > - > #define NETDEV_DPDK_CLASS(NAME, INIT, CONSTRUCT, DESTRUCT, \ > SET_CONFIG, SET_TX_MULTIQ, SEND, \ > GET_CARRIER, GET_STATS, \ > @@ -3091,8 +3031,8 @@ process_vhost_flags(char *flag, char *default_val, > int size, > > val = smap_get(ovs_other_config, flag); > > - /* Depending on which version of vhost is in use, process the > vhost-specific > - * flag if it is provided, otherwise resort to default value. > + /* Process the vhost-sock-dir flag if it is provided, otherwise > resort to > + * default value. 
> */ > if (val && (strlen(val) <= size)) { > changed = 1; > @@ -3316,9 +3256,7 @@ dpdk_init__(const struct smap *ovs_other_config) > bool auto_determine = true; > int err = 0; > cpu_set_t cpuset; > -#ifndef VHOST_CUSE > char *sock_dir_subcomponent; > -#endif > > if (!smap_get_bool(ovs_other_config, "dpdk-init", false)) { > VLOG_INFO("DPDK Disabled - to change this requires a restart.\n"); > @@ -3326,11 +3264,6 @@ dpdk_init__(const struct smap *ovs_other_config) > } > > VLOG_INFO("DPDK Enabled, initializing"); > - > -#ifdef VHOST_CUSE > - if (process_vhost_flags("cuse-dev-name", xstrdup("vhost-net"), > - PATH_MAX, ovs_other_config, &cuse_dev_name)) { > -#else > if (process_vhost_flags("vhost-sock-dir", xstrdup(ovs_rundir()), > NAME_MAX, ovs_other_config, > &sock_dir_subcomponent)) { > @@ -3353,7 +3286,6 @@ dpdk_init__(const struct smap *ovs_other_config) > free(sock_dir_subcomponent); > } else { > vhost_sock_dir = sock_dir_subcomponent; > -#endif > } > > argv = grow_argv(&argv, 0, 1); > @@ -3445,18 +3377,6 @@ dpdk_init__(const struct smap *ovs_other_config) > > ovs_thread_create("dpdk_watchdog", dpdk_watchdog, NULL); > > -#ifdef VHOST_CUSE > - /* Register CUSE device to handle IOCTLs. > - * Unless otherwise specified, cuse_dev_name is set to vhost-net. 
> - */ > - err = rte_vhost_driver_register(cuse_dev_name, 0); > - > - if (err != 0) { > - VLOG_ERR("CUSE device setup failure."); > - return; > - } > -#endif > - > dpdk_vhost_class_init(); > > #ifdef DPDK_PDUMP > @@ -3522,22 +3442,6 @@ static const struct netdev_class dpdk_ring_class = > netdev_dpdk_reconfigure, > netdev_dpdk_rxq_recv); > > -static const struct netdev_class OVS_UNUSED dpdk_vhost_cuse_class = > - NETDEV_DPDK_CLASS( > - "dpdkvhostcuse", > - dpdk_vhost_cuse_class_init, > - netdev_dpdk_vhost_cuse_construct, > - netdev_dpdk_vhost_destruct, > - NULL, > - NULL, > - netdev_dpdk_vhost_send, > - netdev_dpdk_vhost_get_carrier, > - netdev_dpdk_vhost_get_stats, > - NULL, > - NULL, > - netdev_dpdk_vhost_cuse_reconfigure, > - netdev_dpdk_vhost_rxq_recv); > - > static const struct netdev_class OVS_UNUSED dpdk_vhost_user_class = > NETDEV_DPDK_CLASS( > "dpdkvhostuser", > @@ -3560,11 +3464,7 @@ netdev_dpdk_register(void) > dpdk_common_init(); > netdev_register_provider(&dpdk_class); > netdev_register_provider(&dpdk_ring_class); > -#ifdef VHOST_CUSE > - netdev_register_provider(&dpdk_vhost_cuse_class); > -#else > netdev_register_provider(&dpdk_vhost_user_class); > -#endif > } > > void > diff --git a/rhel/README.RHEL b/rhel/README.RHEL > index 21b50cc..fec9c75 100644 > --- a/rhel/README.RHEL > +++ b/rhel/README.RHEL > @@ -36,8 +36,6 @@ assignments. The following OVS-specific variable names > are supported: > * "OVSDPDKRPort", if <name> is a DPDK ring port (name must > start with dpdkr and end with portid, eg "dpdkr0") > > - * "OVSDPDKVhostPort" if <name> is a DPDK vhost-cuse port > - > * "OVSDPDKVhostUserPort" if <name> is a DPDK vhost-user port > > * "OVSDPDKBond" if <name> is an OVS DPDK bond. 
> diff --git a/rhel/etc_sysconfig_network-scripts_ifup-ovs > b/rhel/etc_sysconfig_network-scripts_ifup-ovs > index be0f2dd..e49e6fe 100755 > --- a/rhel/etc_sysconfig_network-scripts_ifup-ovs > +++ b/rhel/etc_sysconfig_network-scripts_ifup-ovs > @@ -179,13 +179,6 @@ case "$TYPE" in > -- add-port "$OVS_BRIDGE" "$DEVICE" $OVS_OPTIONS \ > -- set Interface "$DEVICE" type=dpdkr > ${OVS_EXTRA+-- $OVS_EXTRA} > ;; > - OVSDPDVhostPort) > - ifup_ovs_bridge > - ovs-vsctl -t ${TIMEOUT} \ > - -- --if-exists del-port "$OVS_BRIDGE" "$DEVICE" \ > - -- add-port "$OVS_BRIDGE" "$DEVICE" $OVS_OPTIONS \ > - -- set Interface "$DEVICE" type=dpdkvhost > ${OVS_EXTRA+-- $OVS_EXTRA} > - ;; > OVSDPDKVhostUserPort) > ifup_ovs_bridge > ovs-vsctl -t ${TIMEOUT} \ > diff --git a/utilities/automake.mk b/utilities/automake.mk > index 9d5b425..380418a 100644 > --- a/utilities/automake.mk > +++ b/utilities/automake.mk > @@ -58,7 +58,6 @@ EXTRA_DIST += \ > utilities/ovs-test.in \ > utilities/ovs-vlan-test.in \ > utilities/ovs-vsctl-bashcomp.bash \ > - utilities/qemu-wrap.py \ > utilities/checkpatch.py > MAN_ROOTS += \ > utilities/ovs-appctl.8.in \ > diff --git a/utilities/qemu-wrap.py b/utilities/qemu-wrap.py > deleted file mode 100755 > index 7847c8c..0000000 > --- a/utilities/qemu-wrap.py > +++ /dev/null > @@ -1,389 +0,0 @@ > -#! /usr/bin/env python > -# > -# BSD LICENSE > -# > -# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. > -# All rights reserved. > -# > -# Redistribution and use in source and binary forms, with or without > -# modification, are permitted provided that the following conditions > -# are met: > -# > -# * Redistributions of source code must retain the above copyright > -# notice, this list of conditions and the following disclaimer. > -# * Redistributions in binary form must reproduce the above copyright > -# notice, this list of conditions and the following disclaimer in > -# the documentation and/or other materials provided with the > -# distribution. 
> -# * Neither the name of Intel Corporation nor the names of its > -# contributors may be used to endorse or promote products derived > -# from this software without specific prior written permission. > -# > -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS > -# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT > -# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR > -# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT > -# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > -# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > -# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > -# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > -# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > -# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE > -# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > -# > - > -##################################################################### > -# This script is designed to modify the call to the QEMU emulator > -# to support userspace vhost when starting a guest machine through > -# libvirt with vhost enabled. The steps to enable this are as follows > -# and should be run as root: > -# > -# 1. Place this script in a libvirtd's binary search PATH ($PATH) > -# A good location would be in the same directory that the QEMU > -# binary is located > -# > -# 2. Ensure that the script has the same owner/group and file > -# permissions as the QEMU binary > -# > -# 3. 
Update the VM xml file using "virsh edit VM.xml" > -# > -# 3.a) Set the VM to use the launch script > -# > -# Set the emulator path contained in the > -# <emulator><emulator/> tags > -# > -# e.g replace <emulator>/usr/bin/qemu-kvm<emulator/> > -# with <emulator>/usr/bin/qemu-wrap.py<emulator/> > -# > -# 3.b) Set the VM's device's to use vhost-net offload > -# > -# <interface type="network"> > -# <model type="virtio"/> > -# <driver name="vhost"/> > -# <interface/> > -# > -# 4. Enable libvirt to access our userpace device file by adding it to > -# controllers cgroup for libvirtd using the following steps > -# > -# 4.a) In /etc/libvirt/qemu.conf add/edit the following lines: > -# 1) cgroup_controllers = [ ... "devices", ... ] > -# 2) clear_emulator_capabilities = 0 > -# 3) user = "root" > -# 4) group = "root" > -# 5) cgroup_device_acl = [ > -# "/dev/null", "/dev/full", "/dev/zero", > -# "/dev/random", "/dev/urandom", > -# "/dev/ptmx", "/dev/kvm", "/dev/kqemu", > -# "/dev/rtc", "/dev/hpet", "/dev/net/tun", > -# "/dev/<devbase-name>-<index>", > -# "/dev/hugepages" > -# ] > -# > -# 4.b) Disable SELinux or set to permissive mode > -# > -# 4.c) Mount cgroup device controller > -# "mkdir /dev/cgroup" > -# "mount -t cgroup none /dev/cgroup -o devices" > -# > -# 4.d) Set hugetlbfs_mount variable - ( Optional ) > -# VMs using userspace vhost must use hugepage backed > -# memory. This can be enabled in the libvirt XML > -# config by adding a memory backing section to the > -# XML config e.g. > -# <memoryBacking> > -# <hugepages/> > -# </memoryBacking> > -# This memory backing section should be added after the > -# <memory> and <currentMemory> sections. This will add > -# flags "-mem-prealloc -mem-path <path>" to the QEMU > -# command line. The hugetlbfs_mount variable can be used > -# to override the default <path> passed through by libvirt. 
> -# > -# if "-mem-prealloc" or "-mem-path <path>" are not passed > -# through and a vhost device is detected then these options will > -# be automatically added by this script. This script will detect > -# the system hugetlbfs mount point to be used for <path>. The > -# default <path> for this script can be overidden by the > -# hugetlbfs_dir variable in the configuration section of this > script. > -# > -# > -# 4.e) Restart the libvirtd system process > -# e.g. on Fedora "systemctl restart libvirtd.service" > -# > -# > -# 4.f) Edit the Configuration Parameters section of this script > -# to point to the correct emulator location and set any > -# addition options > -# > -# The script modifies the libvirtd Qemu call by modifying/adding > -# options based on the configuration parameters below. > -# NOTE: > -# emul_path and us_vhost_path must be set > -# All other parameters are optional > -##################################################################### > - > - > -############################################# > -# Configuration Parameters > -############################################# > -#Path to QEMU binary > -emul_path = "/usr/local/bin/qemu-system-x86_64" > - > -#Path to userspace vhost device file > -# This filename should match the --dev-basename --dev-index parameters of > -# the command used to launch the userspace vhost sample application e.g. > -# if the sample app lauch command is: > -# ./build/vhost-switch ..... --dev-basename usvhost --dev-index 1 > -# then this variable should be set to: > -# us_vhost_path = "/dev/usvhost-1" > -us_vhost_path = "/dev/usvhost-1" > - > -#List of additional user defined emulation options. These options will > -#be added to all Qemu calls > -emul_opts_user = [] > - > -#List of additional user defined emulation options for vhost only. 
> -#These options will only be added to vhost enabled guests > -emul_opts_user_vhost = [] > - > -#For all VHOST enabled VMs, the VM memory is preallocated from hugetlbfs > -# Set this variable to one to enable this option for all VMs > -use_huge_all = 0 > - > -#Instead of autodetecting, override the hugetlbfs directory by setting > -#this variable > -hugetlbfs_dir = "" > - > -############################################# > - > - > -############################################# > -# ****** Do Not Modify Below this Line ****** > -############################################# > - > -import sys, os, subprocess > -import time > -import signal > - > - > -#List of open userspace vhost file descriptors > -fd_list = [] > - > -#additional virtio device flags when using userspace vhost > -vhost_flags = [ "csum=off", > - "gso=off", > - "guest_tso4=off", > - "guest_tso6=off", > - "guest_ecn=off" > - ] > - > -#String of the path to the Qemu process pid > -qemu_pid = "/tmp/%d-qemu.pid" % os.getpid() > - > -############################################# > -# Signal haldler to kill Qemu subprocess > -############################################# > -def kill_qemu_process(signum, stack): > - pidfile = open(qemu_pid, 'r') > - pid = int(pidfile.read()) > - os.killpg(pid, signal.SIGTERM) > - pidfile.close() > - > - > -############################################# > -# Find the system hugefile mount point. 
> -# Note: > -# if multiple hugetlbfs mount points exist > -# then the first one found will be used > -############################################# > -def find_huge_mount(): > - > - if (len(hugetlbfs_dir)): > - return hugetlbfs_dir > - > - huge_mount = "" > - > - if (os.access("/proc/mounts", os.F_OK)): > - f = open("/proc/mounts", "r") > - line = f.readline() > - while line: > - line_split = line.split(" ") > - if line_split[2] == 'hugetlbfs': > - huge_mount = line_split[1] > - break > - line = f.readline() > - else: > - print "/proc/mounts not found" > - exit (1) > - > - f.close > - if len(huge_mount) == 0: > - print "Failed to find hugetlbfs mount point" > - exit (1) > - > - return huge_mount > - > - > -############################################# > -# Get a userspace Vhost file descriptor > -############################################# > -def get_vhost_fd(): > - > - if (os.access(us_vhost_path, os.F_OK)): > - fd = os.open( us_vhost_path, os.O_RDWR) > - else: > - print ("US-Vhost file %s not found" %us_vhost_path) > - exit (1) > - > - return fd > - > - > -############################################# > -# Check for vhostfd. 
if found then replace > -# with our own vhost fd and append any vhost > -# flags onto the end > -############################################# > -def modify_netdev_arg(arg): > - > - global fd_list > - vhost_in_use = 0 > - s = '' > - new_opts = [] > - netdev_opts = arg.split(",") > - > - for opt in netdev_opts: > - #check if vhost is used > - if "vhost" == opt[:5]: > - vhost_in_use = 1 > - else: > - new_opts.append(opt) > - > - #if using vhost append vhost options > - if vhost_in_use == 1: > - #append vhost on option > - new_opts.append('vhost=on') > - #append vhostfd ption > - new_fd = get_vhost_fd() > - new_opts.append('vhostfd=' + str(new_fd)) > - fd_list.append(new_fd) > - > - #concatenate all options > - for opt in new_opts: > - if len(s) > 0: > - s+=',' > - > - s+=opt > - > - return s > - > - > -############################################# > -# Main > -############################################# > -def main(): > - > - global fd_list > - global vhost_in_use > - new_args = [] > - num_cmd_args = len(sys.argv) > - emul_call = '' > - mem_prealloc_set = 0 > - mem_path_set = 0 > - num = 0; > - > - #parse the parameters > - while (num < num_cmd_args): > - arg = sys.argv[num] > - > - #Check netdev +1 parameter for vhostfd > - if arg == '-netdev': > - num_vhost_devs = len(fd_list) > - new_args.append(arg) > - > - num+=1 > - arg = sys.argv[num] > - mod_arg = modify_netdev_arg(arg) > - new_args.append(mod_arg) > - > - #append vhost flags if this is a vhost device > - # and -device is the next arg > - # i.e -device -opt1,-opt2,...,-opt3,%vhost > - if (num_vhost_devs < len(fd_list)): > - num+=1 > - arg = sys.argv[num] > - if arg == '-device': > - new_args.append(arg) > - num+=1 > - new_arg = sys.argv[num] > - for flag in vhost_flags: > - new_arg = ''.join([new_arg,',',flag]) > - new_args.append(new_arg) > - else: > - new_args.append(arg) > - elif arg == '-mem-prealloc': > - mem_prealloc_set = 1 > - new_args.append(arg) > - elif arg == '-mem-path': > - mem_path_set = 1 > 
- new_args.append(arg) > - > - else: > - new_args.append(arg) > - > - num+=1 > - > - #Set Qemu binary location > - emul_call+=emul_path > - emul_call+=" " > - > - #Add prealloc mem options if using vhost and not already added > - if ((len(fd_list) > 0) and (mem_prealloc_set == 0)): > - emul_call += "-mem-prealloc " > - > - #Add mempath mem options if using vhost and not already added > - if ((len(fd_list) > 0) and (mem_path_set == 0)): > - #Detect and add hugetlbfs mount point > - mp = find_huge_mount() > - mp = "".join(["-mem-path ", mp]) > - emul_call += mp > - emul_call += " " > - > - #add user options > - for opt in emul_opts_user: > - emul_call += opt > - emul_call += " " > - > - #Add add user vhost only options > - if len(fd_list) > 0: > - for opt in emul_opts_user_vhost: > - emul_call += opt > - emul_call += " " > - > - #Add updated libvirt options > - iter_args = iter(new_args) > - #skip 1st arg i.e. call to this script > - next(iter_args) > - for arg in iter_args: > - emul_call+=str(arg) > - emul_call+= " " > - > - emul_call += "-pidfile %s " % qemu_pid > - #Call QEMU > - process = subprocess.Popen(emul_call, shell=True, > preexec_fn=os.setsid) > - > - for sig in [signal.SIGTERM, signal.SIGINT, signal.SIGHUP, > signal.SIGQUIT]: > - signal.signal(sig, kill_qemu_process) > - > - process.wait() > - > - #Close usvhost files > - for fd in fd_list: > - os.close(fd) > - #Cleanup temporary files > - if os.access(qemu_pid, os.F_OK): > - os.remove(qemu_pid) > - > - > - > -if __name__ == "__main__": > - main() > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 780bd2d..8331b49 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -286,18 +286,6 @@ > </p> > </column> > > - <column name="other_config" key="cuse-dev-name" > - type='{"type": "string"}'> > - <p> > - Specifies the name of the vhost-cuse character device to open > for > - vhost-cuse support. > - </p> > - <p> > - The default is vhost-net. 
Changing this value requires > restarting > - the daemon. > - </p> > - </column> > - > <column name="other_config" key="vhost-sock-dir" > type='{"type": "string"}'> > <p> > -- > 2.4.3 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev