> -----Original Message----- > From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Zoltan Kiss > Sent: Monday, December 14, 2015 5:21 PM > To: Aaron Conole > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments. > > Hi, > > On 14/12/15 16:02, Aaron Conole wrote: > > Just some more activity on this topic, after spending time going through > > the code a bit. > > > > Zoltan Kiss <zoltan.k...@linaro.org> writes: > >> On 08/12/15 17:31, Gray, Mark D wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: Ben Pfaff [mailto:b...@ovn.org] > >>>> Sent: Tuesday, December 8, 2015 4:50 PM > >>>> To: Gray, Mark D > >>>> Cc: Traynor, Kevin; dev@openvswitch.org > >>>> Subject: Re: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments. > >>>> > >>>> On Tue, Dec 08, 2015 at 04:41:37PM +0000, Gray, Mark D wrote: > >>>>>> -----Original Message----- > >>>>>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Kevin > >>>>>> Traynor > >>>>>> Sent: Monday, December 7, 2015 5:58 PM > >>>>>> To: dev@openvswitch.org > >>>>>> Subject: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments. > >>>>>> > >>>>>> Add some information about the DPDK -c and -n parameters. > >>>>>> > >>>>>> Signed-off-by: Kevin Traynor <kevin.tray...@intel.com> > >>>>>> Reported-by: Zoltan Kiss <zoltan.k...@linaro.org> > >>>>>> --- > >>>>>> INSTALL.DPDK.md | 14 ++++++++++++-- > >>>>>> 1 files changed, 12 insertions(+), 2 deletions(-) > >>>>>> > >>>>>> diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md index > >>>>>> 96b686c..ee016da > >>>>>> 100644 > >>>>>> --- a/INSTALL.DPDK.md > >>>>>> +++ b/INSTALL.DPDK.md > >>>>>> @@ -145,8 +145,18 @@ Using the DPDK with ovs-vswitchd: > >>>>>> > >>>>>> DPDK configuration arguments can be passed to vswitchd via `-- > dpdk` > >>>>>> argument. This needs to be first argument passed to vswitchd > process. > >>>>>> - dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter > >>>>>> - for dpdk initialization. > >>>>>> + The DPDK configuration arguments are passed to DPDK during DPDK > >>>>>> + initialization. > >>>>>> + > >>>>>> + The DPDK -c coremask is a required argument. To avoid wasted > >>>> resources > >>>>>> + only one core should be set. The standard OVS threads (e.g. main > >>>>>> + process, handler, revalidator) will run on the core that is > specified. > >>>>> > >>>>> Might be worth mentioning that then there is a corresponding potential > >>>>> decrease in performance of revalidation and flow handling. > >>>> > >>>> With the kernel datapath, OVS sets up flows and revalidates them on > >>>> multiple cores. You're saying that with DPDK it only uses one core? > >>>> Why? > >>> > >>> In my opinion, the core mask should define the affinities of the other > >>> threads (main, process, handler, revalidator) and the pmd-cpu-mask > >> > >> That would sound good, but I'm afraid you can't really influence how > >> DPDK interprets the -c parameter. Maybe you can delegate the tasks of > >> the non-pmd threads to the lcore threads (via > >> rte_eal_[mp_]remote_launch() functions), but I'm not sure if its > >> viable. > > > > I'm not sure how good that sounds, really. From a reading of the code, > > the coremask here is used _solely_ for the EAL 'lcore' thread(s) which > > process EAL 'rpc' messages. There'd be a lot of intrusive conversion to > > integrate with DPDK in that space. > > I think the current way of handling this almost OK: > > - OVS creates the threads by itself, based on pmd-cpu-mask > - the DPDK code makes sure rte_lcore_id() is set in > pmd_thread_setaffinity_cpu(). So other DPDK libraries will see these OVS > PMD threads as perfectly fine DPDK lcore threads > - the only thing I would change is how the "-c coremask" parameter is > handled. Currently the user passes is, but it really should have one bit > set in there, for the master DPDK thread. On every other core in that > list DPDK will create an lcore thread, which won't do any work at all. > It's not a big issue, but they still waste some resources. > > > > > > I think it may be a good idea to rename it here from a generic > > "coremask," but as you indicate, we can't impose any additional > > semantics without getting everyone really confused. > > > >>> should define the affinities of the packet processing threads. However, > >>> I don't know if this was the intended behavior because the name is a > little > >>> too generic ("core mask"). If this was the intended behavior, Kevin and > I > >>> just did some tests, and it is not behaving like this. > > > > I think the behavior intention was to leave it as 'dpdk lcore only.' > > Maybe there's a better way of doing it that I don't see, though; > > otherwise you've kindof overloaded the dpdk argument with ovs argument > > as well. Sorry if that doesn't make sense. > > > > -Aaron
How about letting the control threads just float on the non-isolcpu'd cores. We could then potentially remove the -c argument which would simplify setup as the user would only need to think about one mask - pmd-cpu-mask (which of course we could also default). Strawman for it would be something like this... diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index e3a0771..031f405 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -2135,12 +2135,15 @@ process_vhost_flags(char *flag, char *default_val, int size, int dpdk_init(int argc, char **argv) { int result; int base = 0; char *pragram_name = argv[0]; + int err; + int isset; + cpu_set_t cpuset; if (argc < 2 || strcmp(argv[1], "--dpdk")) return 0; /* Remove the --dpdk argument from arg list.*/ argc--; @@ -2176,23 +2179,51 @@ dpdk_init(int argc, char **argv) */ argc -= 2; argv += 2; /* Increment by two to bypass the vhost flag arguments */ base = 2; } + /*NOTE: Assumes -c option removed from cmdline/db etc. */ + + /* Get the main thread affinity */ + CPU_ZERO(&cpuset); + err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset); + if (err) { + VLOG_ERR("Thread getaffinity error %d",err); + return err; + } + + /* Extract lowest core affinity and set the -c */ + for (i = 0; i < CPU_SETSIZE; i++) { + isset = CPU_ISSET(i, &cpuset); + if (isset) { + /* TODO: check for any numa inconsistencies with memory and embedded -c + * option in argv for rte_eal_init() */ + break; + } + } + /* Keep the program name argument as this is needed for call to * rte_eal_init() */ argv[0] = pragram_name; /* Make sure things are initialized ... */ result = rte_eal_init(argc, argv); if (result < 0) { ovs_abort(result, "Cannot init EAL"); } + /* Set the main thread affinity back to pre rte_eal_init() value */ + err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset); + if (err) { + VLOG_ERR("Thread setaffinity error %d",err); + return err; + } + > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev