> -----Original Message-----
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Zoltan Kiss
> Sent: Monday, December 14, 2015 5:21 PM
> To: Aaron Conole
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments.
> 
> Hi,
> 
> On 14/12/15 16:02, Aaron Conole wrote:
> > Just some more activity on this topic, after spending time going through
> > the code a bit.
> >
> > Zoltan Kiss <zoltan.k...@linaro.org> writes:
> >> On 08/12/15 17:31, Gray, Mark D wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Ben Pfaff [mailto:b...@ovn.org]
> >>>> Sent: Tuesday, December 8, 2015 4:50 PM
> >>>> To: Gray, Mark D
> >>>> Cc: Traynor, Kevin; dev@openvswitch.org
> >>>> Subject: Re: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments.
> >>>>
> >>>> On Tue, Dec 08, 2015 at 04:41:37PM +0000, Gray, Mark D wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Kevin
> >>>>>> Traynor
> >>>>>> Sent: Monday, December 7, 2015 5:58 PM
> >>>>>> To: dev@openvswitch.org
> >>>>>> Subject: [ovs-dev] [PATCH] INSTALL.DPDK.md: Clarify DPDK arguments.
> >>>>>>
> >>>>>> Add some information about the DPDK -c and -n parameters.
> >>>>>>
> >>>>>> Signed-off-by: Kevin Traynor <kevin.tray...@intel.com>
> >>>>>> Reported-by: Zoltan Kiss <zoltan.k...@linaro.org>
> >>>>>> ---
> >>>>>>    INSTALL.DPDK.md |   14 ++++++++++++--
> >>>>>>    1 files changed, 12 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md index
> >>>>>> 96b686c..ee016da
> >>>>>> 100644
> >>>>>> --- a/INSTALL.DPDK.md
> >>>>>> +++ b/INSTALL.DPDK.md
> >>>>>> @@ -145,8 +145,18 @@ Using the DPDK with ovs-vswitchd:
> >>>>>>
> >>>>>>       DPDK configuration arguments can be passed to vswitchd via `--
> dpdk`
> >>>>>>       argument. This needs to be first argument passed to vswitchd
> process.
> >>>>>> -   dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter
> >>>>>> -   for dpdk initialization.
> >>>>>> +   The DPDK configuration arguments are passed to DPDK during DPDK
> >>>>>> +   initialization.
> >>>>>> +
> >>>>>> +   The DPDK -c coremask is a required argument. To avoid wasted
> >>>> resources
> >>>>>> +   only one core should be set. The standard OVS threads (e.g. main
> >>>>>> +   process, handler, revalidator) will run on the core that is
> specified.
> >>>>>
> >>>>> Might be worth mentioning that then there is a corresponding potential
> >>>>> decrease in performance of revalidation and flow handling.
> >>>>
> >>>> With the kernel datapath, OVS sets up flows and revalidates them on
> >>>> multiple cores.  You're saying that with DPDK it only uses one core?
> >>>> Why?
> >>>
> >>> In my opinion, the core mask should define the affinities of the other
> >>> threads (main, process, handler, revalidator) and the pmd-cpu-mask
> >>
> >> That would sound good, but I'm afraid you can't really influence how
> >> DPDK interprets the -c parameter. Maybe you can delegate the tasks of
> >> the non-pmd threads to the lcore threads (via
> >> rte_eal_[mp_]remote_launch() functions), but I'm not sure if its
> >> viable.
> >
> > I'm not sure how good that sounds, really. From a reading of the code,
> > the coremask here is used _solely_ for the EAL 'lcore' thread(s) which
> > process EAL 'rpc' messages. There'd be a lot of intrusive conversion to
> > integrate with DPDK in that space.
> 
> I think the current way of handling this almost OK:
> 
> - OVS creates the threads by itself, based on pmd-cpu-mask
> - the DPDK code makes sure rte_lcore_id() is set in
> pmd_thread_setaffinity_cpu(). So other DPDK libraries will see these OVS
> PMD threads as perfectly fine DPDK lcore threads
> - the only thing I would change is how the "-c coremask" parameter is
> handled. Currently the user passes is, but it really should have one bit
> set in there, for the master DPDK thread. On every other core in that
> list DPDK will create an lcore thread, which won't do any work at all.
> It's not a big issue, but they still waste some resources.
> 
> 
> >
> > I think it may be a good idea to rename it here from a generic
> > "coremask," but as you indicate, we can't impose any additional
> > semantics without getting everyone really confused.
> >
> >>> should define the affinities of the packet processing threads. However,
> >>> I don't know if this was the intended behavior because the name is a
> little
> >>> too generic ("core mask").  If this was the intended behavior, Kevin and
> I
> >>> just did some tests, and it is not behaving like this.
> >
> > I think the behavior intention was to leave it as 'dpdk lcore only.'
> > Maybe there's a better way of doing it that I don't see, though;
> > otherwise you've kindof overloaded the dpdk argument with ovs argument
> > as well. Sorry if that doesn't make sense.
> >
> > -Aaron

How about letting the control threads just float on the non-isolcpu'd cores. 
We could then potentially remove the -c argument which would simplify
setup as the user would only need to think about one mask - pmd-cpu-mask
(which of course we could also default). 

Strawman for it would be something like this...

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index e3a0771..031f405 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2135,12 +2135,15 @@ process_vhost_flags(char *flag, char *default_val, int 
size,
 int
 dpdk_init(int argc, char **argv)
 {
     int result;
     int base = 0;
     char *pragram_name = argv[0];
+    int err;
+    int isset;
+    cpu_set_t cpuset;

     if (argc < 2 || strcmp(argv[1], "--dpdk"))
         return 0;

     /* Remove the --dpdk argument from arg list.*/
     argc--;
@@ -2176,23 +2179,51 @@ dpdk_init(int argc, char **argv)
          */
         argc -= 2;
         argv += 2;    /* Increment by two to bypass the vhost flag arguments */
         base = 2;
     }

+    /*NOTE: Assumes -c option removed from cmdline/db etc. */
+
+    /* Get the main thread affinity */
+    CPU_ZERO(&cpuset);
+    err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+    if (err) {
+        VLOG_ERR("Thread getaffinity error %d",err);
+        return err;
+    }
+
+    /* Extract lowest core affinity and set the -c */
+    for (i = 0; i < CPU_SETSIZE; i++) {
+        isset = CPU_ISSET(i, &cpuset);
+        if (isset) {
+            /* TODO: check for any numa inconsistencies with memory and 
embedded -c
+             * option in argv for rte_eal_init() */
+            break;
+        }
+    }
+
     /* Keep the program name argument as this is needed for call to
      * rte_eal_init()
      */
     argv[0] = pragram_name;

     /* Make sure things are initialized ... */
     result = rte_eal_init(argc, argv);
     if (result < 0) {
         ovs_abort(result, "Cannot init EAL");
     }

+    /* Set the main thread affinity back to pre rte_eal_init() value */
+    err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+    if (err) {
+        VLOG_ERR("Thread setaffinity error %d",err);
+        return err;
+    }
+


> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to