"Traynor, Kevin" <kevin.tray...@intel.com> writes: >> -----Original Message----- >> From: Aaron Conole [mailto:acon...@redhat.com] >> Sent: Monday, January 4, 2016 9:47 PM >> To: dev@openvswitch.org; Flavio Leitner; Traynor, Kevin >> Subject: [PATCH v2 2/3] netdev-dpdk: Convert initialization from cmdline to >> db >> >> Existing DPDK integration is provided by use of command line options which >> must be split out and passed to librte in a special manner. However, this >> forces any configuration to be passed by way of a special DPDK flag, and >> interferes with ovs+dpdk packaging solutions. >> >> This commit delays dpdk initialization until the first DPDK netdev is added >> to the bridge, at which point ovs initializes librte. It pulls all of >> the config data from the OVS database, and assembles a new argv/argc >> pair to be passed along. >> >> Signed-off-by: Aaron Conole <acon...@redhat.com> >> --- >> v2: >> * Removed trailing whitespace >> * Followed for() loop brace coding style >> * Automatically enable DPDK when adding a DPDK enabled port >> * Fixed an issue on startup when DPDK enabled ports are present >> * Updated the documentation (including vswitch.xml) and documented all >> new parameters >> * Dropped the premature initialization test > > Hi, mostly very minor comments below,
Kevin, thanks for the review! > >> >> INSTALL.DPDK.md | 75 ++++++++++++---- >> lib/netdev-dpdk.c | 224 +++++++++++++++++++++++++++++++++++----------- >> -- >> lib/netdev-dpdk.h | 19 ++-- >> vswitchd/bridge.c | 3 + >> vswitchd/ovs-vswitchd.c | 25 +----- >> vswitchd/vswitch.xml | 102 +++++++++++++++++++++- >> 6 files changed, 341 insertions(+), 107 deletions(-) >> >> diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md >> index 96b686c..2dd2120 100644 >> --- a/INSTALL.DPDK.md >> +++ b/INSTALL.DPDK.md >> @@ -143,22 +143,59 @@ Using the DPDK with ovs-vswitchd: >> >> 5. Start vswitchd: >> >> - DPDK configuration arguments can be passed to vswitchd via `--dpdk` >> - argument. This needs to be first argument passed to vswitchd process. >> - dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter >> - for dpdk initialization. >> + DPDK configuration arguments can be passed to vswitchd via Open_vSwitch >> + other_config database. The recognized configuration options are listed. > > 'Defaults will be provided for all values not set.' Thanks, will add this. > >> + >> + * dpdk-lcore-mask >> + Specifies the CPU cores on which dpdk lcore threads should be spawned. >> + The DPDK lcore threads are used for DPDK library tasks, such as >> + library internal message processing, logging, etc. Value should be in >> + the form of a hex string (so '0x123') similar to the 'taskset' mask >> + input. > > 'CPU cores' and '0x123' imply it will be multiple cores in this case - better > to change to CPU core and 0x1 > > Also, I don't think the 0x prefix is accepted based on using pmd-cpu-mask. Hrrm.. okay. I used the documentation on pmd-cpu-mask as my guideline, so if that's broken it's a bug against pmd-cpu-mask as well (it's documented as a <hex-string>). >> + If not specified, the value will be determined by choosing the lowest >> + CPU core from initial cpu affinity list. Otherwise, the value will be >> + passed directly to the DPDK library. >> + For performance reasons, it is best to set this to a single core on >> + the system, rather than allow lcore threads to float. > > I suppose it depends on the system load and vswitch usage as to which will > give > better performance but I agree setting a dedicated core is at least more > deterministic and less risky, so statement is probably fine. Well, I used the advice in rings as my guideline here. I think it's correct to say that the workload is important. Okay, I'll think if there's a better comment to use here. >> + >> + * dpdk-mem-channels >> + This sets the number of memory spread channels per CPU socket. It is >> purely >> + an optimization flag. >> + >> + * dpdk-alloc-mem >> + This sets the total memory to preallocate from hugepages regardless of >> + processor socket. It is recommended to use dpdk-socket-mem instead. >> + >> + * dpdk-socket-mem >> + Comma separated list of memory to pre-allocate from hugepages on specific >> + sockets. >> + >> + * dpdk-hugepage-dir >> + Directory where hugetlbfs is mounted >> + >> + * cuse-dev-name >> + Option to set the vhost_cuse character device name. >> + >> + * vhost-sock-dir >> + Option to set the path to the vhost_user unix socket files. >> + >> + NOTE: Changing any of these options requires restarting the ovs-vswitchd >> + application. >> + >> + Open vSwitch can be started as normal. DPDK will not be initialized until >> + the first DPDK-enabled port is added to the bridge. >> >> ``` >> export DB_SOCK=/usr/local/var/run/openvswitch/db.sock >> - ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach >> + ovs-vswitchd unix:$DB_SOCK --pidfile --detach >> ``` >> >> If allocated more than one GB hugepage (as for IVSHMEM), set amount and >> use NUMA node 0 memory: >> >> ``` >> - ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ >> - -- unix:$DB_SOCK --pidfile --detach >> + ovs-vsctl --no-wait set Open_vSwitch . >> other_config:dpdk_socket_mem="1024,0" >> + ovs-vswitchd unix:$DB_SOCK --pidfile --detach >> ``` >> >> 6. Add bridge & ports >> @@ -521,11 +558,12 @@ have arbitrary names. >> `/usr/local/var/run/openvswitch/vhost-user-1`, which you must provide >> to your VM on the QEMU command line. More instructions on this can be >> found in the next section "DPDK vhost-user VM configuration" >> - Note: If you wish for the vhost-user sockets to be created in a >> - directory other than `/usr/local/var/run/openvswitch`, you may specify >> - another location on the ovs-vswitchd command line like so: >> >> - `./vswitchd/ovs-vswitchd --dpdk -vhost_sock_dir /my-dir -c 0x1 ...` >> + - If you wish for the vhost-user sockets to be created in a directory >> other >> + than `/usr/local/var/run/openvswitch`, you may specify another location >> + in the ovsdb like: >> + >> + `./vswitchd/ovs-vsctl set Open_vSwitch . other_config:vhost-sock- >> dir=path` > > --no-wait d'oh! >> >> DPDK vhost-user VM configuration: >> --------------------------------- >> @@ -638,14 +676,13 @@ DPDK vhost-cuse VM configuration: >> >> 1. This step is only needed if using an alternative character device. >> >> - The new character device filename must be specified on the vswitchd >> - commandline: >> + The new character device filename must be specified in the ovsdb: >> >> - `./vswitchd/ovs-vswitchd --dpdk --cuse_dev_name my-vhost-net -c 0x1 >> ...` >> + `./utilities/ovs-vsctl set Open_vSwitch . \ >> + other_config:cuse-dev-name=my-vhost-net` > > --no-wait d'oh! >> - Note that the `--cuse_dev_name` argument and associated string must be >> the first >> - arguments after `--dpdk` and come before the EAL arguments. In the >> example >> - above, the character device to be used will be `/dev/my-vhost-net`. >> + In the example above, the character device to be used will be >> + `/dev/my-vhost-net`. >> >> 2. This step is only needed if reusing the standard character device. It >> will >> conflict with the kernel vhost character device so the user must first >> @@ -758,8 +795,8 @@ steps. >> >> <my-vhost-device> refers to "vhost-net" if using the `/dev/vhost- >> net` >> device. If you have specificed a different name on the ovs-vswitchd >> - commandline using the "--cuse_dev_name" parameter, please specify >> that >> - filename instead. >> + database using the "other_config:cuse-dev-name" parameter, please >> + specify that filename instead. >> >> 2. Disable SELinux or set to permissive mode >> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >> index b42712f..2ce9f71 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -29,6 +29,7 @@ >> #include <stdio.h> >> #include <sys/types.h> >> #include <sys/stat.h> >> +#include <getopt.h> >> >> #include "dirs.h" >> #include "dp-packet.h" >> @@ -49,6 +50,8 @@ >> #include "timeval.h" >> #include "unixctl.h" >> #include "openvswitch/vlog.h" >> +#include "smap.h" >> +#include "vswitch-idl.h" >> >> #include "rte_config.h" >> #include "rte_mbuf.h" >> @@ -138,6 +141,10 @@ enum dpdk_dev_type { >> DPDK_DEV_VHOST = 1, >> }; >> >> +static const struct ovsrec_open_vswitch *ovs_last_cfg; >> + >> +void dpdk_init(void); >> + >> static int rte_eal_init_ret = ENODEV; >> >> static struct ovs_mutex dpdk_mutex = OVS_MUTEX_INITIALIZER; >> @@ -551,8 +558,20 @@ netdev_dpdk_cast(const struct netdev *netdev) >> static struct netdev * >> netdev_dpdk_alloc(void) >> { >> - struct netdev_dpdk *netdev = dpdk_rte_mzalloc(sizeof *netdev); >> - return &netdev->up; >> + struct netdev_dpdk *netdev; >> + >> + if (rte_eal_init_ret) { >> + dpdk_init(); > > This may be gone from the next patch, but there is protection inside > and outside of dpdk_init(); to prevent it executing twice. There might > be a way to get rid of one of them while still getting the return values > you want? I wanted to skip the dpdk_init path altogether, but you're probably right. Anyway, you're correct that it will need to be re-done from the patch series. >> + if (rte_eal_init_ret) { >> + return NULL; >> + } >> + } >> + >> + netdev = dpdk_rte_mzalloc(sizeof *netdev); >> + if (netdev) { >> + return &netdev->up; >> + } >> + return NULL; >> } >> >> static void >> @@ -657,7 +676,10 @@ vhost_construct_helper(struct netdev *netdev_) >> OVS_REQUIRES(dpdk_mutex) >> struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_); >> >> if (rte_eal_init_ret) { >> - return rte_eal_init_ret; >> + dpdk_init(); >> + if (rte_eal_init_ret) { >> + return rte_eal_init_ret; >> + } >> } >> >> rte_spinlock_init(&netdev->vhost_tx_lock); >> @@ -670,6 +692,13 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev_) >> struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_); >> int err; >> >> + if (rte_eal_init_ret) { >> + dpdk_init(); >> + if (rte_eal_init_ret) { >> + return rte_eal_init_ret; >> + } >> + } >> + >> ovs_mutex_lock(&dpdk_mutex); >> strncpy(netdev->vhost_id, netdev->up.name, sizeof(netdev->vhost_id)); >> err = vhost_construct_helper(netdev_); >> @@ -683,6 +712,13 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev_) >> struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_); >> int err; >> >> + if (rte_eal_init_ret) { >> + dpdk_init(); >> + if (rte_eal_init_ret) { >> + return rte_eal_init_ret; >> + } >> + } >> + >> ovs_mutex_lock(&dpdk_mutex); >> /* Take the name of the vhost-user port and append it to the location >> where >> * the socket is to be created, then register the socket. >> @@ -707,7 +743,10 @@ netdev_dpdk_construct(struct netdev *netdev) >> int err; >> >> if (rte_eal_init_ret) { >> - return rte_eal_init_ret; >> + dpdk_init(); >> + if (rte_eal_init_ret) { >> + return rte_eal_init_ret; >> + } >> } >> >> /* Names always start with "dpdk" */ >> @@ -1776,6 +1815,8 @@ new_device(struct virtio_net *dev) >> struct netdev_dpdk *netdev; >> bool exists = false; >> >> + >> + >> ovs_mutex_lock(&dpdk_mutex); >> /* Add device to the vhost port with the same name as that passed down. >> */ >> LIST_FOR_EACH(netdev, list_node, &dpdk_list) { >> @@ -2023,7 +2064,10 @@ netdev_dpdk_ring_construct(struct netdev *netdev) >> int err = 0; >> >> if (rte_eal_init_ret) { >> - return rte_eal_init_ret; >> + dpdk_init(); >> + if (rte_eal_init_ret) { >> + return rte_eal_init_ret; >> + } >> } >> >> ovs_mutex_lock(&dpdk_mutex); >> @@ -2111,10 +2155,14 @@ unlock_dpdk: >> >> static int >> process_vhost_flags(char *flag, char *default_val, int size, >> - char **argv, char **new_val) >> + const struct ovsrec_open_vswitch *ovs_cfg, >> + char **new_val) >> { >> + const char *val; >> int changed = 0; >> >> + val = smap_get(&ovs_cfg->other_config, flag); >> + >> /* Depending on which version of vhost is in use, process the vhost- >> specific >> * flag if it is provided on the vswitchd command line, otherwise resort >> to >> * a default value. >> @@ -2124,9 +2172,9 @@ process_vhost_flags(char *flag, char *default_val, int >> size, >> * For vhost-cuse: Process "-cuse_dev_name" to set the custom name of >> the >> * vhost-cuse character device. >> */ >> - if (!strcmp(argv[1], flag) && (strlen(argv[2]) <= size)) { >> + if (val && (strlen(val) <= size)) { >> changed = 1; >> - *new_val = strdup(argv[2]); >> + *new_val = strdup(val); >> VLOG_INFO("User-provided %s in use: %s", flag, *new_val); >> } else { >> VLOG_INFO("No %s provided - defaulting to %s", flag, default_val); >> @@ -2136,68 +2184,115 @@ process_vhost_flags(char *flag, char *default_val, >> int size, >> return changed; >> } >> >> -int >> -dpdk_init(int argc, char **argv) >> +static char ** >> +grow_argv(char ***argv, size_t cur_siz, size_t grow_by) >> { >> - int result; >> - int base = 0; >> - char *pragram_name = argv[0]; >> - int err; >> - int isset; >> - cpu_set_t cpuset; >> - >> - if (argc < 2 || strcmp(argv[1], "--dpdk")) >> - return 0; >> + char **new_argv = realloc(*argv, sizeof(char *) * (cur_siz + grow_by)); >> + return new_argv; >> +} >> >> - /* Remove the --dpdk argument from arg list.*/ >> - argc--; >> - argv++; >> +static int >> +get_dpdk_args(const struct ovsrec_open_vswitch *ovs_cfg, char ***argv) >> +{ >> + struct dpdk_options_map { >> + const char *ovs_configuration; >> + const char *dpdk_option; >> + bool default_enabled; >> + const char *default_value; >> + } opts[] = { >> + {"dpdk-lcore-mask", "-c", true, "0x1"}, >> + /* XXX: DPDK 2.2.0 support, the true should become false for -n */ >> + {"dpdk-mem-channels", "-n", true, "4"}, >> + {"dpdk-alloc-mem", "-m", false, NULL}, >> + {"dpdk-socket-mem", "--socket-mem", true, "1024,0"}, >> + {"dpdk-hugepage-dir", "--huge-dir", false, NULL}, >> + }; >> + int i, ret = 1; >> + >> + for(i = 0; i < (sizeof(opts) / sizeof(opts[0])); ++i) { >> + const char *lookup = smap_get(&ovs_cfg->other_config, >> + opts[i].ovs_configuration); >> + if(!lookup && opts[i].default_enabled) >> + lookup = opts[i].default_value; >> + >> + if(lookup) { >> + char **newargv = grow_argv(argv, ret, 2); >> + >> + if (!newargv) { >> + ovs_abort(0, "grow_argv() failed to allocate memory."); >> + } >> >> - /* Reject --user option */ >> - int i; >> - for (i = 0; i < argc; i++) { >> - if (!strcmp(argv[i], "--user")) { >> - VLOG_ERR("Can not mix --dpdk and --user options, aborting."); >> + *argv = newargv; >> + (*argv)[ret++] = strdup(opts[i].dpdk_option); >> + (*argv)[ret++] = strdup(lookup); >> } >> } >> >> + return ret; >> +} >> + >> +static char **dpdk_argv; >> +static int dpdk_argc; >> + >> +static void >> +deferred_argv_release(void) >> +{ >> + int result; >> + for (result = 0; result < dpdk_argc; ++result) { >> + free(dpdk_argv[result]); >> + } >> + >> + free(dpdk_argv); >> +} >> + >> +static void >> +__dpdk_init(const struct ovsrec_open_vswitch *ovs_cfg) >> +{ >> + char **argv = NULL; >> + int result; >> + int argc; >> + int err; >> + cpu_set_t cpuset; >> + >> + VLOG_INFO("DPDK Enabled, initializing"); >> + >> #ifdef VHOST_CUSE >> - if (process_vhost_flags("-cuse_dev_name", strdup("vhost-net"), >> - PATH_MAX, argv, &cuse_dev_name)) { >> + if (process_vhost_flags("cuse-dev-name", strdup("vhost-net"), >> + PATH_MAX, ovs_cfg, &cuse_dev_name)) { >> #else >> - if (process_vhost_flags("-vhost_sock_dir", strdup(ovs_rundir()), >> - NAME_MAX, argv, &vhost_sock_dir)) { >> + if (process_vhost_flags("vhost-sock-dir", strdup(ovs_rundir()), >> + NAME_MAX, ovs_cfg, &vhost_sock_dir)) { >> struct stat s; >> - int err; >> >> err = stat(vhost_sock_dir, &s); >> if (err) { >> - VLOG_ERR("vHostUser socket DIR '%s' does not exist.", >> - vhost_sock_dir); >> - return err; >> + ovs_abort(0, "vhost-user sock directory '%s' does not exist.", >> + vhost_sock_dir); >> } >> #endif >> - /* Remove the vhost flag configuration parameters from the argument >> - * list, so that the correct elements are passed to the DPDK >> - * initialization function >> - */ >> - argc -= 2; >> - argv += 2; /* Increment by two to bypass the vhost flag arguments >> */ >> - base = 2; >> } >> >> /* Get the main thread affinity */ >> CPU_ZERO(&cpuset); >> err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), >> &cpuset); >> if (err) { >> - VLOG_ERR("Thread getaffinity error %d.", err); >> - return err; >> + ovs_abort(0, "Thread getaffinity error %d.", err); >> } >> >> - /* Keep the program name argument as this is needed for call to >> - * rte_eal_init() >> - */ >> - argv[0] = pragram_name; >> + argv = grow_argv(&argv, 0, 1); >> + if (!argv) { >> + ovs_abort(0, "Unable to allocate an initial argv."); >> + } >> + argv[0] = strdup("ovs"); /* TODO use prctl to get process name */ >> + argc = get_dpdk_args(ovs_cfg, &argv); >> + >> + argv = grow_argv(&argv, argc, argc+1); >> + if (!argv) { >> + ovs_abort(0, "Unable to make final argv allocation."); >> + } >> + argv[argc] = 0; >> + >> + optind = 1; >> >> /* Make sure things are initialized ... */ >> result = rte_eal_init(argc, argv); >> @@ -2208,21 +2303,36 @@ dpdk_init(int argc, char **argv) >> /* Set the main thread affinity back to pre rte_eal_init() value */ >> err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), >> &cpuset); >> if (err) { >> - VLOG_ERR("Thread setaffinity error %d", err); >> - return err; >> + ovs_abort(0, "Thread getaffinity error %d.", err); >> } >> >> + dpdk_argv = argv; >> + dpdk_argc = argc; >> + >> + atexit(deferred_argv_release); >> + >> rte_memzone_dump(stdout); >> rte_eal_init_ret = 0; >> >> - if (argc > result) { >> - argv[result] = argv[0]; >> - } >> - >> /* We are called from the main thread here */ >> RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID; >> +} >> >> - return result + 1 + base; >> +void >> +dpdk_config(const struct ovsrec_open_vswitch *ovs_cfg) >> +{ >> + ovs_last_cfg = ovs_cfg; > > I think there would need to be locking around ovs_last_cfg as we > could also be in the middle of initialization here. This won't be an issue with the next series I post. >> +} >> + >> +void >> +dpdk_init(void) >> +{ >> + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> + >> + if(ovs_last_cfg && ovsthread_once_start(&once)) { >> + __dpdk_init(ovs_last_cfg); >> + ovsthread_once_done(&once); >> + } >> } >> >> static const struct netdev_class dpdk_class = >> @@ -2286,10 +2396,6 @@ netdev_dpdk_register(void) >> { >> static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> >> - if (rte_eal_init_ret) { >> - return; >> - } >> - >> if (ovsthread_once_start(&once)) { >> dpdk_common_init(); >> netdev_register_provider(&dpdk_class); >> diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h >> index 646d3e2..e699f37 100644 >> --- a/lib/netdev-dpdk.h >> +++ b/lib/netdev-dpdk.h >> @@ -22,7 +22,9 @@ struct dp_packet; >> >> #define NON_PMD_CORE_ID LCORE_ID_ANY >> >> -int dpdk_init(int argc, char **argv); >> +struct ovsrec_open_vswitch; >> + >> +void dpdk_config(const struct ovsrec_open_vswitch *ovs_cfg); >> void netdev_dpdk_register(void); >> void free_dpdk_buf(struct dp_packet *); >> int pmd_thread_setaffinity_cpu(unsigned cpu); >> @@ -30,15 +32,20 @@ int pmd_thread_setaffinity_cpu(unsigned cpu); >> #else >> >> #define NON_PMD_CORE_ID UINT32_MAX >> - >> #include "util.h" >> >> -static inline int >> -dpdk_init(int argc, char **argv) >> +static inline void >> +dpdk_config(const struct ovsrec_open_vswitch *ovs_cfg) >> { >> - if (argc >= 2 && !strcmp(argv[1], "--dpdk")) { >> - ovs_fatal(0, "DPDK support not built into this copy of Open >> vSwitch."); >> + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> + >> + if (ovs_cfg && ovsthread_once_start(&once)) { >> + if(smap_get_bool(&ovs_cfg->other_config, "dpdk", false)) { >> + ovs_fatal(0, "DPDK not supported in this copy of Open >> vSwitch."); > > "dpdk" db entry is not in this version of the patchset. Good catch... I should have tested it more thoroughly. > Might be best for this function to just do nothing now. A print doesn't > even seem to make sense as this will be called for OVS without DPDK. Okay. >> + } >> + ovsthread_once_done(&once); >> } >> + >> return 0; >> } >> >> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >> index f8afe55..f7aab07 100644 >> --- a/vswitchd/bridge.c >> +++ b/vswitchd/bridge.c >> @@ -68,6 +68,7 @@ >> #include "sflow_api.h" >> #include "vlan-bitmap.h" >> #include "packets.h" >> +#include "lib/netdev-dpdk.h" >> >> VLOG_DEFINE_THIS_MODULE(bridge); >> >> @@ -2922,6 +2923,8 @@ bridge_run(void) >> } >> cfg = ovsrec_open_vswitch_first(idl); >> >> + dpdk_config(cfg); >> + >> /* Initialize the ofproto library. This only needs to run once, but >> * it must be done after the configuration is set. If the >> * initialization has already occurred, bridge_init_ofproto() >> diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c >> index e78ecda..c448107 100644 >> --- a/vswitchd/ovs-vswitchd.c >> +++ b/vswitchd/ovs-vswitchd.c >> @@ -48,7 +48,6 @@ >> #include "openvswitch/vconn.h" >> #include "openvswitch/vlog.h" >> #include "lib/vswitch-idl.h" >> -#include "lib/netdev-dpdk.h" >> >> VLOG_DEFINE_THIS_MODULE(vswitchd); >> >> @@ -71,13 +70,6 @@ main(int argc, char *argv[]) >> int retval; >> >> set_program_name(argv[0]); >> - retval = dpdk_init(argc,argv); >> - if (retval < 0) { >> - return retval; >> - } >> - >> - argc -= retval; >> - argv += retval; >> >> ovs_cmdl_proctitle_init(argc, argv); >> service_start(&argc, &argv); >> @@ -166,7 +158,7 @@ parse_options(int argc, char *argv[], char >> **unixctl_pathp) >> {"bootstrap-ca-cert", required_argument, NULL, >> OPT_BOOTSTRAP_CA_CERT}, >> {"enable-dummy", optional_argument, NULL, OPT_ENABLE_DUMMY}, >> {"disable-system", no_argument, NULL, OPT_DISABLE_SYSTEM}, >> - {"dpdk", required_argument, NULL, OPT_DPDK}, >> + {"dpdk", optional_argument, NULL, OPT_DPDK}, >> {NULL, 0, NULL, 0}, >> }; >> char *short_options = >> ovs_cmdl_long_options_to_short_options(long_options); >> @@ -219,7 +211,7 @@ parse_options(int argc, char *argv[], char >> **unixctl_pathp) >> exit(EXIT_FAILURE); >> >> case OPT_DPDK: >> - ovs_fatal(0, "--dpdk must be given at beginning of command >> line."); >> + ovs_fatal(0, "Using --dpdk to configure DPDK is not >> supported."); >> break; >> >> default: >> @@ -256,17 +248,8 @@ usage(void) >> daemon_usage(); >> vlog_usage(); >> printf("\nDPDK options:\n" >> - " --dpdk [VHOST] [DPDK] Initialize DPDK datapath.\n" >> - " where DPDK are options for initializing DPDK lib and VHOST >> is\n" >> -#ifdef VHOST_CUSE >> - " option to override default character device name used for\n" >> - " for use with userspace vHost\n" >> - " -cuse_dev_name NAME\n" >> -#else >> - " option to override default directory where vhost-user\n" >> - " sockets are created.\n" >> - " -vhost_sock_dir DIR\n" >> -#endif >> + "Configuration of DPDK via command-line is removed from this\n" >> + "version of Open vSwitch. DPDK is configured through ovsdb.\n" >> ); >> printf("\nOther options:\n" >> " --unixctl=SOCKET override default control socket >> name\n" >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >> index ce0dbc1..355da95 100644 >> --- a/vswitchd/vswitch.xml >> +++ b/vswitchd/vswitch.xml >> @@ -176,11 +176,38 @@ >> </p> >> </column> >> >> - <column name="other_config" key="pmd-cpu-mask"> >> + <column name="other_config" key="dpdk-lcore-mask" >> + type='{"type": "integer", "minInteger": 1}'> >> + <p> >> + Specifies the CPU cores on which dpdk lcore threads should be >> spawned. >> + The DPDK lcore threads are used for DPDK library tasks, such as >> + library internal message processing, logging, etc. Value should be >> in >> + the form of a hex string (so '0x123') similar to the 'taskset' >> mask >> + input. >> + </p> >> + <p> >> + The lowest order bit corresponds to the first CPU core. A set bit >> + means the corresponding core is available and a pmd thread will be >> + created and pinned to it. If the input does not cover all cores, >> + those uncovered cores are considered not set. >> + </p> >> + <p> >> + For performance reasons, it is best to set this to a single core >> on >> + the system, rather than allow lcore threads to float. >> + </p> >> + <p> >> + If not specified, the value will be determined by choosing the >> lowest >> + CPU core from initial cpu affinity list. Otherwise, the value will >> be >> + passed directly to the DPDK library. >> + </p> >> + </column> >> + >> + <column name="other_config" key="pmd-cpu-mask" >> + type='{"type": "integer", "minInteger": 1}'> >> <p> >> Specifies CPU mask for setting the cpu affinity of PMD (Poll >> Mode Driver) threads. Value should be in the form of hex string, >> - similar to the dpdk EAL '-c COREMASK' option input or the >> 'taskset' >> + similar to the dpdk-lcore-mask option input or the 'taskset' >> mask input. >> </p> >> <p> >> @@ -195,6 +222,77 @@ >> </p> >> </column> >> >> + <column name="other_config" key="dpdk-mem-channels" >> + type='{"type": "integer", "minInteger": 1}'> >> + <p> >> + Specifies the number of NUMA memory spread channels in the CPU to >> + be used by DPDK. It is purely optimization. >> + </p> >> + <p> >> + The default value is 4. >> + </p> >> + </column> >> + >> + <column name="other_config" key="dpdk-alloc-mem" >> + type='{"type": "integer", "minInteger": 0}'> >> + <p> >> + Specifies the amount of memory to preallocate from the hugepage >> pool, >> + regardless of socket. It is recommended that dpdk-socket-mem is >> used >> + instead. >> + </p> >> + <p> >> + If not specified, the value is 0. >> + </p> >> + </column> >> + >> + <column name="other_config" key="dpdk-socket-mem" >> + type='{"type": "string"}'> >> + <p> >> + Specifies the amount of memory to preallocate from the hugepage >> pool, >> + on a per-socket basis. >> + </p> >> + <p> >> + The specifier is a comma-separated string, in ascending order of >> CPU >> + socket (ex: 1024,2048,4096,8192 would set socket 0 to preallocate >> + 1024MB, socket 1 to preallocate 2048MB, etc.) >> + </p> >> + <p> >> + If not specified, the default value is 1024,0. >> + </p> >> + </column> >> + >> + <column name="other_config" key="dpdk-hugepage-dir" >> + type='{"type": "string"}'> >> + <p> >> + Specifies the path to the hugetlbfs mount point. >> + </p> >> + <p> >> + If not specified, this will be guessed by the DPDK library >> (default >> + is /dev/hugepages). >> + </p> >> + </column> >> + >> + <column name="other_config" key="cuse-dev-name" >> + type='{"type": "string"}'> >> + <p> >> + Specifies the name of the vhost-cuse character device to open for >> + vhost-cuse support. >> + </p> >> + <p> >> + The default is vhost-net >> + </p> >> + </column> >> + >> + <column name="other_config" key="vhost-sock-dir" >> + type='{"type": "string"}'> >> + <p> >> + Specifies the path to the vhost-user unix domain socket files. >> + </p> >> + <p> >> + The default is the working directory of the application. >> + </p> >> + </column> >> + >> <column name="other_config" key="n-handler-threads" >> type='{"type": "integer", "minInteger": 1}'> >> <p> >> -- >> 2.6.1.133.gf5b6079 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev