On 02/02/2016 10:39, "Jarno Rajahalme" <ja...@ovn.org> wrote:
>With the small nits below: > >Acked-by: Jarno Rajahalme <ja...@ovn.org> Thanks Jarno and Ilya for the reviews! I applied this on master and branch-2.5 > >> On Jan 26, 2016, at 9:13 PM, Daniele Di Proietto >><diproiet...@vmware.com> wrote: >> >> It is ok to iterate a cmap with CMAP_FOR_EACH and remove elements with >> cmap_remove(), but having quiescent states inside the loop might create >> problems, since some of the postponed cleanup done inside the cmap might >> be executed, leaving the iterator in an inconsisted state. > >³inconsisted² -> ³inconsistent² > >It seems to me that the root of the problem is that when CMAP is rehashed >by cmap_remove() and the thread quiesces during the iteration, the old >cmap_impl that is still referred to by the iterator is freed and either >trashed by the allocator or allocated for some other purpose. The rest of >the iteration can then follow random pointers, which is not good. Hence >the ³leaving the iterator in an inconsistent state², while correct, is a >bit of an understatement. I changed it to "freeing the memory that the iterator is using" > >I think we should add a comment about not quiescing while iterating to >lib/cmap.h. Good idea, I sent a patch: http://openvswitch.org/pipermail/dev/2016-February/065560.html > >> >> We had several of these errors in dpif-netdev, because when we rearrange >> ports or threads we ofter need to wait on a condition variable (which >> implies a quiescent state). >> >> This problem caused iterations to skip elements or to list them twice, >> resulting in the main thread waiting on a condition without anyone else >> to signal. >> >> Fix these cases by moving the possible quiescent states outside >> CMAP_FOR_EACH loops. >> >> Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com> >> --- >> lib/dpif-netdev.c | 32 +++++++++++++++++++++++++++++--- >> 1 file changed, 29 insertions(+), 3 deletions(-) >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index ad6b202..1ec04d5 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -933,6 +933,7 @@ dp_netdev_free(struct dp_netdev *dp) >> >> ovs_mutex_lock(&dp->port_mutex); >> CMAP_FOR_EACH (port, node, &dp->ports) { >> + /* PMD threads are destroyed here. do_del_port() cannot >>quiesce */ >> do_del_port(dp, port); >> } >> ovs_mutex_unlock(&dp->port_mutex); >> @@ -2914,10 +2915,24 @@ static void >> dp_netdev_destroy_all_pmds(struct dp_netdev *dp) >> { >> struct dp_netdev_pmd_thread *pmd; >> + struct dp_netdev_pmd_thread **pmd_list; >> + size_t i = 0, n_pmds; >> + >> + n_pmds = cmap_count(&dp->poll_threads); >> + pmd_list = xcalloc(n_pmds, sizeof *pmd_list); >> >> CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { >> - dp_netdev_del_pmd(dp, pmd); >> + /* We cannot call dp_netdev_del_pmd(), since it alters >> + * 'dp->poll_threads' (while we're iterating it) and it >> + * might quiesce. */ >> + ovs_assert(i < n_pmds); >> + pmd_list[i++] = pmd; >> + } >> + > > >So ŒI¹ should be equal to ¹n_pmd¹ here? If you are asserting inside the >loop above, maybe assert here, too, or then remove also the assert above? >Or better yet, use another iteration variable for the second loop, like >you do for dp_netdev_del_pmds_on_numa() below. Makes sense, thanks. I've introduced another variable and made it like the loop below. > >> + for (i = 0; i < n_pmds; i++) { >> + dp_netdev_del_pmd(dp, pmd_list[i]); >> } >> + free(pmd_list); >> } >> >> /* Deletes all pmd threads on numa node 'numa_id' and >> @@ -2928,18 +2943,28 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev >>*dp, int numa_id) >> struct dp_netdev_pmd_thread *pmd; >> int n_pmds_on_numa, n_pmds; >> int *free_idx, k = 0; >> + struct dp_netdev_pmd_thread **pmd_list; >> >> n_pmds_on_numa = get_n_pmd_threads_on_numa(dp, numa_id); >> - free_idx = xmalloc(n_pmds_on_numa * sizeof *free_idx); >> + free_idx = xcalloc(n_pmds_on_numa, sizeof *free_idx); >> + pmd_list = xcalloc(n_pmds_on_numa, sizeof *pmd_list); >> >> CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { >> + /* We cannot call dp_netdev_del_pmd(), since it alters >> + * 'dp->poll_threads' (while we're iterating it) and it >> + * might quiesce. */ >> if (pmd->numa_id == numa_id) { >> atomic_read_relaxed(&pmd->tx_qid, &free_idx[k]); >> + pmd_list[k] = pmd; >> + ovs_assert(k < n_pmds_on_numa); >> k++; >> - dp_netdev_del_pmd(dp, pmd); >> } >> } >> >> + for (int i = 0; i < k; i++) { >> + dp_netdev_del_pmd(dp, pmd_list[i]); >> + } >> + >> n_pmds = get_n_pmd_threads(dp); >> CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { >> int old_tx_qid; >> @@ -2953,6 +2978,7 @@ dp_netdev_del_pmds_on_numa(struct dp_netdev *dp, >>int numa_id) >> } >> } >> >> + free(pmd_list); >> free(free_idx); >> } >> >> -- >> 2.1.4 >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev