> -----Original Message----- > From: Aaron Conole [mailto:acon...@redhat.com] > Sent: Monday, December 21, 2015 7:27 PM > To: Traynor, Kevin > Cc: dev@openvswitch.org; Flavio Leitner > Subject: Re: [PATCH 3/5] netdev-dpdk: Autofill lcore coremask if absent > > "Traynor, Kevin" <kevin.tray...@intel.com> writes: > >> -----Original Message----- > >> From: Aaron Conole [mailto:acon...@redhat.com] > >> Sent: Friday, December 18, 2015 6:28 PM > >> To: dev@openvswitch.org > >> Cc: Flavio Leitner; Traynor, Kevin > >> Subject: [PATCH 3/5] netdev-dpdk: Autofill lcore coremask if absent > >> > >> The user has control over the DPDK internal lcore coremask, but this > >> parameter can be autofilled with a bit more intelligence. If the user > >> does not fill this parameter in, we use the lowest set bit in the > >> current task CPU affinity. > >> > >> Signed-off-by: Aaron Conole <acon...@redhat.com> > >> Cc: Kevin Traynor <kevin.tray...@intel.com> > >> --- > >> lib/netdev-dpdk.c | 47 ++++++++++++++++++++++++++++++++++++++++------- > >> 1 file changed, 40 insertions(+), 7 deletions(-) > >> > >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > >> index 2a81058..696430f 100644 > >> --- a/lib/netdev-dpdk.c > >> +++ b/lib/netdev-dpdk.c > >> @@ -65,6 +65,8 @@ static struct vlog_rate_limit rl = > VLOG_RATE_LIMIT_INIT(5, > >> 20); > >> #define OVS_CACHE_LINE_SIZE CACHE_LINE_SIZE > >> #define OVS_VPORT_DPDK "ovs_dpdk" > >> > >> +#define MAX_BUFSIZ 256 > >> + > >> /* > >> * need to reserve tons of extra space in the mbufs so we can align the > >> * DMA addresses to 4KB. > >> @@ -2147,7 +2149,8 @@ grow_argv(char ***argv, size_t cur_siz, size_t > grow_by) > >> } > >> > >> static int > >> -get_dpdk_args(const struct ovsrec_open_vswitch *ovs_cfg, char ***argv) > >> +get_dpdk_args(const struct ovsrec_open_vswitch *ovs_cfg, char ***argv, > >> + int argc) > >> { > >> struct dpdk_options_map { > >> const char *ovs_configuration; > >> @@ -2155,13 +2158,13 @@ get_dpdk_args(const struct ovsrec_open_vswitch > >> *ovs_cfg, char ***argv) > >> bool default_enabled; > >> const char *default_value; > >> } opts[] = { > >> - {"dpdk_lcore_mask", "-c", true, "0x1"}, > >> + {"dpdk_lcore_mask", "-c", false, NULL}, > >> {"dpdk_mem_channels", "-n", true, "4"}, > >> {"dpdk_alloc_mem", "-m", false, NULL}, > >> {"dpdk_socket_mem", "--socket-mem", true, "1024,0"}, > >> {"dpdk_hugepage_dir", "--huge-dir", false, NULL}, > >> }; > >> - int i, ret = 1; > >> + int i, ret = argc; > >> > >> for(i = 0; i < (sizeof(opts) / sizeof(opts[0])); ++i) { > >> const char *lookup = smap_get(&ovs_cfg->other_config, > >> @@ -2203,7 +2206,8 @@ __dpdk_init(const struct ovsrec_open_vswitch > *ovs_cfg) > >> { > >> char **argv = NULL; > >> int result; > >> - int argc; > >> + int argc = 0, argc_tmp; > >> + bool auto_determine = true; > >> int err; > >> cpu_set_t cpuset; > >> > >> @@ -2236,12 +2240,41 @@ __dpdk_init(const struct ovsrec_open_vswitch > >> *ovs_cfg) > >> ovs_abort(0, "Thread getaffinity error %d.", err); > >> } > >> > >> - argv = grow_argv(&argv, 0, 1); > >> + argv = grow_argv(&argv, argc, argc+1); > >> if (!argv) { > >> ovs_abort(0, "Unable to allocate an initial argv."); > >> } > >> - argv[0] = strdup("ovs"); /* TODO use prctl to get process name */ > >> - argc = get_dpdk_args(ovs_cfg, &argv); > >> + argv[argc++] = strdup("ovs"); /* TODO use prctl to get process name > */ > >> + > >> + argc_tmp = get_dpdk_args(ovs_cfg, &argv, argc); > >> + > >> + while(argc_tmp != argc) { > >> + if (!strcmp("-c", argv[argc++])) { > >> + auto_determine = false;
we can break; here. If we are not going to autodetermine, later we still set the affinities back to the non-isolcpu'd cores. This means that the -c is used for the rte_eal_init() only and the non-pmd threads will not use it. I haven't seen any issues with this in my testing but the question is - do we want to allow a user to set a specific core that the non-pmd threads will run on? If we don't allow and the non-isolcpu cores are heavily loaded it may mean a slowdown for non-pmd threads. Based on that possibility, I'm inclined to think that if the user gives us a -c option, we should not reset the affinities back to the non-isolcpu'd cores. What do you think? > >> + } > >> + } > >> + > >> + /** > >> + * NOTE: This is an unsophisticated mechanism for determining the > DPDK > >> + * lcore for the DPDK Master. > >> + */ > >> + if (auto_determine) > >> + { > > > > coding std if { > > D'oh! Thanks. > > >> + int i; > >> + for (i = 0; i < CPU_SETSIZE; i++) { > >> + if (CPU_ISSET(i, &cpuset)) { > > > > I had been thinking we could put in a check here to ensure that the > > core we have selected is suitable for the socket-mem but I'm not sure > > if it's really needed. We could always add it later, it won't change > > the user interface. > > I think such a change is an optimization, at best, and is really > difficult to get right. I'd rather do the naive but very predictable > thing, and give the option to override, than guess at a more complicated > solution on a first cut. > > Plus, the socket-mem is only relevent for the PMD threads, > iirc. Even if I'm wrong though, if someone is playing with coremasks, > I think they should probably just be setting the affinity manually, > anyway. Sure - I agree. > > >> + char buf[MAX_BUFSIZ]; > >> + snprintf(buf, MAX_BUFSIZ, "0x%08llX", (1ULL<<i)); > >> + argv = grow_argv(&argv, argc, argc+2); > >> + if (!argv) { > >> + ovs_abort(0, "Unable to grow argv for coremask"); > >> + } > >> + argv[argc++] = strdup("-c"); > >> + argv[argc++] = strdup(buf); > >> + i = CPU_SETSIZE; > >> + } > >> + } > >> + } > >> > >> argv = grow_argv(&argv, argc, argc+1); > >> if (!argv) { > >> -- > >> 2.6.1.133.gf5b6079 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev