> -----Original Message-----
> From: Aaron Conole [mailto:acon...@redhat.com]
> Sent: Monday, December 21, 2015 7:27 PM
> To: Traynor, Kevin
> Cc: dev@openvswitch.org; Flavio Leitner
> Subject: Re: [PATCH 3/5] netdev-dpdk: Autofill lcore coremask if absent
> 
> "Traynor, Kevin" <kevin.tray...@intel.com> writes:
> >> -----Original Message-----
> >> From: Aaron Conole [mailto:acon...@redhat.com]
> >> Sent: Friday, December 18, 2015 6:28 PM
> >> To: dev@openvswitch.org
> >> Cc: Flavio Leitner; Traynor, Kevin
> >> Subject: [PATCH 3/5] netdev-dpdk: Autofill lcore coremask if absent
> >>
> >> The user has control over the DPDK internal lcore coremask, but this
> >> parameter can be autofilled with a bit more intelligence. If the user
> >> does not fill this parameter in, we use the lowest set bit in the
> >> current task CPU affinity.
> >>
> >> Signed-off-by: Aaron Conole <acon...@redhat.com>
> >> Cc: Kevin Traynor <kevin.tray...@intel.com>
> >> ---
> >>  lib/netdev-dpdk.c | 47 ++++++++++++++++++++++++++++++++++++++++-------
> >>  1 file changed, 40 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >> index 2a81058..696430f 100644
> >> --- a/lib/netdev-dpdk.c
> >> +++ b/lib/netdev-dpdk.c
> >> @@ -65,6 +65,8 @@ static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5,
> >> 20);
> >>  #define OVS_CACHE_LINE_SIZE CACHE_LINE_SIZE
> >>  #define OVS_VPORT_DPDK "ovs_dpdk"
> >>
> >> +#define MAX_BUFSIZ 256
> >> +
> >>  /*
> >>   * need to reserve tons of extra space in the mbufs so we can align the
> >>   * DMA addresses to 4KB.
> >> @@ -2147,7 +2149,8 @@ grow_argv(char ***argv, size_t cur_siz, size_t
> grow_by)
> >>  }
> >>
> >>  static int
> >> -get_dpdk_args(const struct ovsrec_open_vswitch *ovs_cfg, char ***argv)
> >> +get_dpdk_args(const struct ovsrec_open_vswitch *ovs_cfg, char ***argv,
> >> +              int argc)
> >>  {
> >>      struct dpdk_options_map {
> >>          const char *ovs_configuration;
> >> @@ -2155,13 +2158,13 @@ get_dpdk_args(const struct ovsrec_open_vswitch
> >> *ovs_cfg, char ***argv)
> >>          bool default_enabled;
> >>          const char *default_value;
> >>      } opts[] = {
> >> -        {"dpdk_lcore_mask", "-c", true, "0x1"},
> >> +        {"dpdk_lcore_mask", "-c", false, NULL},
> >>          {"dpdk_mem_channels", "-n", true, "4"},
> >>          {"dpdk_alloc_mem", "-m", false, NULL},
> >>          {"dpdk_socket_mem", "--socket-mem", true, "1024,0"},
> >>          {"dpdk_hugepage_dir", "--huge-dir", false, NULL},
> >>      };
> >> -    int i, ret = 1;
> >> +    int i, ret = argc;
> >>
> >>      for(i = 0; i < (sizeof(opts) / sizeof(opts[0])); ++i) {
> >>          const char *lookup = smap_get(&ovs_cfg->other_config,
> >> @@ -2203,7 +2206,8 @@ __dpdk_init(const struct ovsrec_open_vswitch
> *ovs_cfg)
> >>  {
> >>      char **argv = NULL;
> >>      int result;
> >> -    int argc;
> >> +    int argc = 0, argc_tmp;
> >> +    bool auto_determine = true;
> >>      int err;
> >>      cpu_set_t cpuset;
> >>
> >> @@ -2236,12 +2240,41 @@ __dpdk_init(const struct ovsrec_open_vswitch
> >> *ovs_cfg)
> >>          ovs_abort(0, "Thread getaffinity error %d.", err);
> >>      }
> >>
> >> -    argv = grow_argv(&argv, 0, 1);
> >> +    argv = grow_argv(&argv, argc, argc+1);
> >>      if (!argv) {
> >>          ovs_abort(0, "Unable to allocate an initial argv.");
> >>      }
> >> -    argv[0] = strdup("ovs"); /* TODO use prctl to get process name */
> >> -    argc = get_dpdk_args(ovs_cfg, &argv);
> >> +    argv[argc++] = strdup("ovs"); /* TODO use prctl to get process name
> */
> >> +
> >> +    argc_tmp = get_dpdk_args(ovs_cfg, &argv, argc);
> >> +
> >> +    while(argc_tmp != argc) {
> >> +        if (!strcmp("-c", argv[argc++])) {
> >> +            auto_determine = false;

we can break; here.

If we are not going to autodetermine, later we still set the affinities
back to the non-isolcpu'd cores. This means that the -c is used for the
rte_eal_init() only and the non-pmd threads will not use it. I haven't
seen any issues with this in my testing but the question is - do we want
to allow a user to set a specific core that the non-pmd threads will run
on? If we don't allow and the non-isolcpu cores are heavily loaded it may
mean a slowdown for non-pmd threads. Based on that possibility, I'm inclined
to think that if the user gives us a -c option, we should not reset the
affinities back to the non-isolcpu'd cores.

What do you think?

> >> +        }
> >> +    }
> >> +
> >> +    /**
> >> +     * NOTE: This is an unsophisticated mechanism for determining the
> DPDK
> >> +     * lcore for the DPDK Master.
> >> +     */
> >> +    if (auto_determine)
> >> +    {
> >
> > coding std if {
> 
> D'oh! Thanks.
> 
> >> +        int i;
> >> +        for (i = 0; i < CPU_SETSIZE; i++) {
> >> +            if (CPU_ISSET(i, &cpuset)) {
> >
> > I had been thinking we could put in a check here to ensure that the
> > core we have selected is suitable for the socket-mem but I'm not sure
> > if it's really needed. We could always add it later, it won't change
> > the user interface.
> 
> I think such a change is an optimization, at best, and is really
> difficult to get right. I'd rather do the naive but very predictable
> thing, and give the option to override, than guess at a more complicated
> solution on a first cut.
> 
> Plus, the socket-mem is only relevent for the PMD threads,
> iirc. Even if I'm wrong though, if someone is playing with coremasks,
> I think they should probably just be setting the affinity manually,
> anyway.

Sure - I agree.

> 
> >> +                char buf[MAX_BUFSIZ];
> >> +                snprintf(buf, MAX_BUFSIZ, "0x%08llX", (1ULL<<i));
> >> +                argv = grow_argv(&argv, argc, argc+2);
> >> +                if (!argv) {
> >> +                    ovs_abort(0, "Unable to grow argv for coremask");
> >> +                }
> >> +                argv[argc++] = strdup("-c");
> >> +                argv[argc++] = strdup(buf);
> >> +                i = CPU_SETSIZE;
> >> +            }
> >> +        }
> >> +    }
> >>
> >>      argv = grow_argv(&argv, argc, argc+1);
> >>      if (!argv) {
> >> --
> >> 2.6.1.133.gf5b6079
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to