>-----Original Message----- >From: Daniele Di Proietto [mailto:diproiet...@ovn.org] >Sent: Tuesday, August 16, 2016 1:44 AM >To: Bodireddy, Bhanuprakash <bhanuprakash.bodire...@intel.com> >Cc: dev@openvswitch.org; Flavio Leitner <f...@sysclose.org> >Subject: Re: [PATCH V6] netdev-dpdk: Set pmd thread priority > >I found a crash if apply this patch, "dpdk-lcore-mask" is not set and "-c 0x1" >is >passed to "dpdk-extra". My bad, I didn't test with dpdk-extra options. I see that the crash was due to strtol.
>Also, I believe Flavio had a comment on the previous version of this >patch. Would it be enough to use setpriority(2)? I sent out my comments in another mail. I agree to Flavio's suggestion as this seems less dangerous and is guaranteed to work even in case of misconfiguration. I tested this approach and have a concern with setpriority(). To apply the new nice value to the thread, thread id is needed and due to absence of glibc wrapper for gettid, I have to use syscall(SYS_gettid). I want to know if this is acceptable in OVS or better way to handle this? Void ovs_numa_thread_setpriority(int nice OVS_UNUSED) { .... #if defined(__linux__) && defined(SYS_gettid) pid_t tid = syscall(SYS_gettid); err = setpriority(PRIO_PROCESS, tid, nice); .... #endif } Without priority patch: $ ps -eLo tid,pri,psr,comm | grep -e lcore -e revalidator -e ovs-vswitchd -e pmd 22509 19 4 ovs-vswitchd 22512 19 5 lcore-slave-5 22513 19 6 lcore-slave-6 22514 19 7 lcore-slave-7 22589 19 4 revalidator37 22590 19 4 revalidator52 22591 19 4 revalidator42 22592 19 4 revalidator38 22593 19 4 revalidator39 22594 19 4 revalidator45 22595 19 4 revalidator53 22596 19 4 revalidator54 22598 19 4 pmd61 [Default priority] With priority patch: $ ps -eLo tid,pri,psr,comm | grep -e lcore -e revalidator -e ovs-vswitchd -e pmd 24879 19 4 ovs-vswitchd 24881 19 5 lcore-slave-5 24882 19 6 lcore-slave-6 24883 19 7 lcore-slave-7 24951 19 4 revalidator55 24952 19 4 revalidator37 24953 19 4 revalidator52 24954 19 4 revalidator42 24955 19 4 revalidator38 24956 19 4 revalidator39 24957 19 4 revalidator45 24958 19 4 revalidator53 24964 39 4 pmd61 [Higher priority set] Regards, Bhanu Prakash. >Thanks, >Daniele > >2016-08-15 8:19 GMT-07:00 Bhanuprakash Bodireddy ><bhanuprakash.bodire...@intel.com>: >Set the DPDK pmd thread scheduling policy to SCHED_RR and static >priority to highest priority value of the policy. This is to deal with >pmd thread starvation case where another cpu hogging process can get >scheduled/affinitized on to the same core the pmd thread is running >there by significantly impacting the datapath performance. To reproduce >the pmd thread starvation case: > > $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6 > $ taskset 0x2 cat /dev/zero > /dev/null & > >Setting the realtime scheduling policy to the pmd threads is one step >towards Fastpath Service Assurance in OVS DPDK. > >The realtime scheduling policy is applied only when CPU mask is passed >to 'pmd-cpu-mask'. For example: > > * In the absence of pmd-cpu-mask, one pmd thread shall be created > and default scheduling policy and priority gets applied. > > * If pmd-cpu-mask is specified, one or more pmd threads shall be > spawned on the corresponding core(s) in the mask and real time > scheduling policy SCHED_RR and highest priority of the policy is > applied to the pmd thread(s). > >With this commit, it is recommended that the OVS control thread and pmd >thread shouldn't be pinned to same core ('dpdk-lcore-mask','pmd-cpu-mask' >should be non-overlapping). If dpdk-lcore-mask is set same as pmd-cpu-mask >the pmd thread is not spawned on lowest core of the dpdk-lcore-mask. >Also other processes with same affinity as PMD thread will be unresponsive. > >Signed-off-by: Bhanuprakash Bodireddy ><bhanuprakash.bodire...@intel.com> >--- >v5->v6: >* Prohibit spawning pmd thread on the lowest core in dpdk-lcore-mask if > lcore-mask and pmd-mask affinity are identical. >* Updated Note section in INSTALL.DPDK-ADVANCED doc. >* Tested below cases to verify system stability with pmd priority patch > > dpdk-lcore-mask | pmd-cpu-mask | Comment >1. Not set | Not set | control threads affinity: 0-27 > pmd thread: core 0 >2. 1 | 1 | pmd thread isn't spawned and warning > logged in logfile. >3. 1 | c | >4. F0 | F0 | control threads pinned to core 4. > 3 pmd threads created on core 5,6,7 but 4. > >v4->v5: >* Reword Note section in DPDK-ADVANCED.md > >v3->v4: >* Document update >* Use ovs_strerror for reporting errors in lib-numa.c > >v2->v3: >* Move set_priority() function to lib/ovs-numa.c >* Apply realtime scheduling policy and priority to pmd thread only if > pmd-cpu-mask is passed. >* Update INSTALL.DPDK-ADVANCED. > >v1->v2: >* Removed #ifdef and introduced dummy function "pmd_thread_setpriority" > in netdev-dpdk.h >* Rebase > > INSTALL.DPDK-ADVANCED.md | 24 ++++++++++++++++++++---- > lib/dpif-netdev.c | 12 ++++++++++++ > lib/netdev-dpdk.c | 22 ++++++++++++++++++++++ > lib/ovs-numa.c | 18 ++++++++++++++++++ > lib/ovs-numa.h | 1 + > 5 files changed, 73 insertions(+), 4 deletions(-) > >diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md >index dd36ae4..f039723 100755 >--- a/INSTALL.DPDK-ADVANCED.md >+++ b/INSTALL.DPDK-ADVANCED.md >@@ -208,8 +208,10 @@ needs to be affinitized accordingly. > pmd thread is CPU bound, and needs to be affinitized to isolated > cores for optimum performance. > >- By setting a bit in the mask, a pmd thread is created and pinned >- to the corresponding CPU core. e.g. to run a pmd thread on core 2 >+ By setting a bit in the mask, a pmd thread is created, pinned >+ to the corresponding CPU core and the scheduling policy SCHED_RR >+ along with maximum priority of the policy applied to the pmd thread. >+ e.g. to pin a pmd thread on core 2 > > `ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=4` > >@@ -237,8 +239,10 @@ needs to be affinitized accordingly. > responsible for different ports/rxq's. Assignment of ports/rxq's to > pmd threads is done automatically. > >- A set bit in the mask means a pmd thread is created and pinned >- to the corresponding CPU core. e.g. to run pmd threads on core 1 and 2 >+ A set bit in the mask means a pmd thread is created, pinned to the >+ corresponding CPU core and the scheduling policy SCHED_RR with highest >+ priority of the scheduling policy applied to pmd thread. >+ e.g. to run pmd threads on core 1 and 2 > > `ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6` > >@@ -249,6 +253,18 @@ needs to be affinitized accordingly. > > NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1 > >+ Note: 'dpdk-lcore-mask' and 'pmd-cpu-mask' cpu mask settings should be >+ non-overlapping. If you set dpdk-lcore-mask same as pmd-cpu-mask, the >pmd >+ thread is not spawned on lowest core of the dpdk-lcore-mask and in case of >+ cpu mask setting where both the control and the pmd thread pinned to >same >+ core, pmd thread is not spawned at all if it is the only cpu in the affinity >+ mask. >+ >+ For example, if dpdk-lcore-mask=F0 and pmd-cpu-mask=F0, pmd threads >are >+ created on cores 5, 6, and 7 but not on 4. Setting both lcore,pmd mask to >+ core 4 shall pin control threads to core 4 but will prevent pmd thread from >+ spawning. >+ > ### 4.3 DPDK physical port Rx Queues > > `ovs-vsctl set Interface <DPDK interface> options:n_rxq=<integer>` >diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >index 96504f5..abe1fde 100644 >--- a/lib/dpif-netdev.c >+++ b/lib/dpif-netdev.c >@@ -3100,6 +3100,18 @@ pmd_thread_main(void *f_) > ovs_numa_thread_setaffinity_core(pmd->core_id); > dpdk_set_lcore_id(pmd->core_id); > poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list); >+ >+ /* When cpu affinity mask explicitly set using pmd-cpu-mask, pmd thread's >+ * scheduling policy is set to SCHED_RR and priority to highest priority >+ * of SCHED_RR policy. In the absence of pmd-cpu-mask, default >scheduling >+ * policy and priority shall apply to pmd thread. >+ * >+ * If dpdk-lcore-mask is set same as pmd-cpu-mask, the pmd thread is not >+ * spawned on lowest core of the dpdk-lcore-mask. >+ */ >+ if (pmd->dp->pmd_cmask) { >+ ovs_numa_thread_setpriority(SCHED_RR); >+ } > reload: > emc_cache_init(&pmd->flow_cache); > >diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >index c767fd4..5d606a2 100644 >--- a/lib/netdev-dpdk.c >+++ b/lib/netdev-dpdk.c >@@ -3432,6 +3432,28 @@ dpdk_init__(const struct smap *ovs_other_config) > } > } > >+ /* If dpdk-lcore-mask is explicitly specified, auto_determine is 'false' >+ * and control threads are pinned to lowest core of the affinity mask. >+ * Mark the corresponding control thread 'core->pinned' status to 'true' >+ * so that pmd thread can't be spawned on the same core. >+ */ >+ if (!auto_determine) { >+ const char *lcore_mask; >+ unsigned int core_id; >+ int dl_mask; >+ >+ lcore_mask = smap_get(ovs_other_config, "dpdk-lcore-mask"); >+ dl_mask = strtol(lcore_mask, NULL, 16); >+ >+ /* Get the lowest core_id of dpdk-lcore-mask */ >+ core_id = rightmost_1bit_idx(dl_mask); >+ >+ /* Set the control thread core 'pinned' status to true */ >+ if (!ovs_numa_try_pin_core_specific(core_id)) { >+ VLOG_ERR("Unable to pin core %d", core_id); >+ } >+ } >+ > dpdk_argv = argv; > dpdk_argc = argc; > >diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c >index c8173e0..428f274 100644 >--- a/lib/ovs-numa.c >+++ b/lib/ovs-numa.c >@@ -613,3 +613,21 @@ int ovs_numa_thread_setaffinity_core(unsigned >core_id OVS_UNUSED) > return EOPNOTSUPP; > #endif /* __linux__ */ > } >+ >+void >+ovs_numa_thread_setpriority(int policy) >+{ >+ if (dummy_numa) { >+ return; >+ } >+ >+ struct sched_param threadparam; >+ int err; >+ >+ memset(&threadparam, 0, sizeof(threadparam)); >+ threadparam.sched_priority = sched_get_priority_max(policy); >+ err = pthread_setschedparam(pthread_self(), policy, &threadparam); >+ if (err) { >+ VLOG_ERR("Thread priority error %s",ovs_strerror(err)); >+ } >+} >diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h >index be836b2..94f0884 100644 >--- a/lib/ovs-numa.h >+++ b/lib/ovs-numa.h >@@ -56,6 +56,7 @@ void ovs_numa_unpin_core(unsigned core_id); > struct ovs_numa_dump *ovs_numa_dump_cores_on_numa(int numa_id); > void ovs_numa_dump_destroy(struct ovs_numa_dump *); > int ovs_numa_thread_setaffinity_core(unsigned core_id); >+void ovs_numa_thread_setpriority(int policy); > > #define FOR_EACH_CORE_ON_NUMA(ITER, DUMP) \ > LIST_FOR_EACH((ITER), list_node, &(DUMP)->dump) >-- >2.4.11 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev