There was a reason for the two calls to dp_netdev_pmd_reload_done() inside pmd_thread_main(): we must wait until the PMD thread is completely done with reloading. This patch introduces a race condition on pmd->exit_latch. While removing the last port on a NUMA node, dp_netdev_reload_pmd__(pmd) will be called twice: the first call to remove the port and the second to destroy the PMD thread. pmd->exit_latch is set between these two calls. This can lead to a situation where the PMD thread exits while still processing the first reload; in that case the main thread will wait forever on cond_wait in the second reload.

The situation is easily reproducible by adding and deleting the last port (possibly after a few iterations in a loop).
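To make the interleaving concrete, below is a minimal standalone sketch of the same handshake. The names here (reload_pmd(), pmd_main(), change_seq, exit_requested) are simplified stand-ins for dp_netdev_reload_pmd__(), pmd_thread_main(), pmd->change_seq and pmd->exit_latch, not the real OVS code, and the usleep() calls only widen the race window so the hang shows up readily:

/*
 * Minimal sketch of the race, with the patch's ordering:
 * reload, signal "done", and only then check the exit latch.
 * Build with: gcc -pthread -o race race.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static atomic_uint change_seq;     /* Bumped to request a reload. */
static atomic_bool exit_requested; /* Models pmd->exit_latch. */

/* Models dp_netdev_reload_pmd__(): request a reload, then wait
 * until the pmd thread signals that it is done. */
static void
reload_pmd(void)
{
    pthread_mutex_lock(&mutex);
    atomic_fetch_add(&change_seq, 1);
    pthread_cond_wait(&cond, &mutex); /* Never returns if the pmd
                                       * has already exited. */
    pthread_mutex_unlock(&mutex);
}

static void *
pmd_main(void *arg)
{
    unsigned int seen_seq = 0;

    (void) arg;
    for (;;) {
        /* Stands in for the change_seq check in the polling loop. */
        while (atomic_load(&change_seq) == seen_seq) {
            usleep(1000);
        }
        seen_seq = atomic_load(&change_seq);

        /* "Reload done": models dp_netdev_pmd_reload_done(). */
        pthread_mutex_lock(&mutex);
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mutex);

        usleep(10000); /* Race window: the main thread wakes up, sets
                        * exit_requested and enters the second
                        * reload_pmd() right here. */

        /* Models the latch_is_set(&pmd->exit_latch) check: it sees
         * the latch that was set for the *second* reload while still
         * finishing the first one, so the thread exits without ever
         * signalling again. */
        if (atomic_load(&exit_requested)) {
            return NULL;
        }
    }
}

int
main(void)
{
    pthread_t pmd;

    pthread_create(&pmd, NULL, pmd_main, NULL);

    reload_pmd();                        /* 1st reload: remove the port. */
    printf("first reload done\n");

    atomic_store(&exit_requested, true); /* latch_set(&pmd->exit_latch) */
    reload_pmd();                        /* 2nd reload: hangs forever. */
    printf("second reload done\n");      /* Never printed. */

    pthread_join(pmd, NULL);
    return 0;
}

This is exactly why, before this patch, pmd_thread_main() called dp_netdev_pmd_reload_done() once more after the latch check: on the exit path the PMD thread signals again after observing the exit latch, so the main thread's second wait always returns.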
Best regards, Ilya Maximets.

On 08.04.2016 06:13, Daniele Di Proietto wrote:
> The pmds and the main threads are synchronized using a condition
> variable. The main thread writes a new configuration, then it waits on
> the condition variable. A pmd thread reads the new configuration, then
> it calls signal() on the condition variable. To make sure that the pmds
> and the main thread have a consistent view, each signal() should be
> backed by a wait().
>
> Currently the first signal() doesn't have a corresponding wait(). If
> the pmd thread takes a long time to start and the signal() is received
> by a later wait, the threads will have an inconsistent view.
>
> The commit fixes the problem by removing the first signal() from the
> pmd thread.
>
> This is hardly a problem on current master, because the main thread
> will call the first wait() a long time after the creation of a pmd
> thread. It becomes a problem with the next commits.
>
> Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com>
> ---
>  lib/dpif-netdev.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 9c32c64..2424d3e 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2652,21 +2652,22 @@ dpif_netdev_wait(struct dpif *dpif)
>
>  static int
>  pmd_load_queues(struct dp_netdev_pmd_thread *pmd, struct rxq_poll **ppoll_list)
> -    OVS_REQUIRES(pmd->poll_mutex)
>  {
>      struct rxq_poll *poll_list = *ppoll_list;
>      struct rxq_poll *poll;
>      int i;
>
> +    ovs_mutex_lock(&pmd->poll_mutex);
>      poll_list = xrealloc(poll_list, pmd->poll_cnt * sizeof *poll_list);
>
>      i = 0;
>      LIST_FOR_EACH (poll, node, &pmd->poll_list) {
>          poll_list[i++] = *poll;
>      }
> +    ovs_mutex_unlock(&pmd->poll_mutex);
>
>      *ppoll_list = poll_list;
> -    return pmd->poll_cnt;
> +    return i;
>  }
>
>  static void *
> @@ -2685,13 +2686,10 @@ pmd_thread_main(void *f_)
>      /* Stores the pmd thread's 'pmd' to 'per_pmd_key'. */
>      ovsthread_setspecific(pmd->dp->per_pmd_key, pmd);
>      pmd_thread_setaffinity_cpu(pmd->core_id);
> +    poll_cnt = pmd_load_queues(pmd, &poll_list);
>  reload:
>      emc_cache_init(&pmd->flow_cache);
>
> -    ovs_mutex_lock(&pmd->poll_mutex);
> -    poll_cnt = pmd_load_queues(pmd, &poll_list);
> -    ovs_mutex_unlock(&pmd->poll_mutex);
> -
>      /* List port/core affinity */
>      for (i = 0; i < poll_cnt; i++) {
>          VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",
> @@ -2699,10 +2697,6 @@ reload:
>                   netdev_rxq_get_queue_id(poll_list[i].rx));
>      }
>
> -    /* Signal here to make sure the pmd finishes
> -     * reloading the updated configuration. */
> -    dp_netdev_pmd_reload_done(pmd);
> -
>      for (;;) {
>          for (i = 0; i < poll_cnt; i++) {
>              dp_netdev_process_rxq_port(pmd, poll_list[i].port,
>                                         poll_list[i].rx);
>          }
> @@ -2725,14 +2719,17 @@ reload:
>          }
>      }
>
> +    poll_cnt = pmd_load_queues(pmd, &poll_list);
> +    /* Signal here to make sure the pmd finishes
> +     * reloading the updated configuration. */
> +    dp_netdev_pmd_reload_done(pmd);
> +
>      emc_cache_uninit(&pmd->flow_cache);
>
>      if (!latch_is_set(&pmd->exit_latch)){
>          goto reload;
>      }
>
> -    dp_netdev_pmd_reload_done(pmd);
> -
>      free(poll_list);
>      return NULL;
>  }

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev