Just posted a proposed fix here:
http://openvswitch.org/pipermail/dev/2015-August/058760.html

Any comments are welcome.

On Mon, Aug 10, 2015 at 11:12 AM, Alex Wang <al...@nicira.com> wrote:
> Hey Ilya,
>
> Thanks for the reply.  Daniele helped me reproduce the issue offline, and
> we confirmed that this is not a DPDK issue.
>
> However, it is not a ufid issue either...
>
> Let me explain.
>
> First, to clear things up: by "packets from the same flow (or stream)" I
> mean packets with the same src/dst MAC, IP, and TCP/UDP port.  Packets
> with the same header fields must be hashed to the same rx queue of the
> receiving port and be handled by the same pmd thread; otherwise, we could
> have serious reordering issues.
>
> It turns out this issue can be reproduced whenever we reconfigure the pmd
> threads (i.e. change the number of pmd threads or the cpu pinning) or the
> port receive queues.  When such a reconfiguration happens, two things
> follow:
>
>   - All pmd threads are recreated, so after recreation no flow exists in
>     any pmd thread's classifier or exact match cache.
>
>   - The rx queues of each port are re-mapped (highly likely to a
>     different pmd thread).
>
> So, after the queue re-mapping, the pmd thread fires a new upcall handling
> request to the ofproto-dpif-upcall module.  However, that module has no
> knowledge of the reconfiguration in the dpif-netdev layer and still thinks
> the flow for the same packet has already been installed in the dpif-netdev
> module (even though the old pmd thread has been reset).  Thus, the
> ofproto-dpif-upcall module fails the flow installation and increments the
> handler_duplicate_upcall coverage counter.
>
> This should only be a transient state, since the revalidator should clean
> up the old ukey (with that ufid) once it can no longer be dumped from the
> dpif-netdev module.  But there is a bug that makes the revalidators skip
> doing so.
>
> I'd like to explain more if things are still unclear.  Meanwhile, we will
> discuss the fix further and post the patch asap.
>
> Thanks,
> Alex Wang,
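
To make the hashing invariant in the message above concrete, here is a minimal
standalone C sketch, assuming a toy hash and made-up names (struct five_tuple,
toy_flow_hash and n_rxq are all illustrative, not OVS or DPDK code): packets
with an identical 5-tuple always map to the same rx queue, and therefore to the
single pmd thread that polls that queue.

/* Minimal sketch, not OVS or DPDK code: a toy 5-tuple hash that picks an
 * rx queue.  Real NICs use Toeplitz/CRC-based RSS, but the property relied
 * on is the same: identical header fields always select the same queue. */
#include <stdint.h>
#include <stdio.h>

struct five_tuple {
    uint32_t ip_src, ip_dst;     /* IPv4 addresses. */
    uint16_t tp_src, tp_dst;     /* TCP/UDP ports. */
    uint8_t proto;               /* IP protocol. */
};

static uint32_t
toy_flow_hash(const struct five_tuple *ft)
{
    uint32_t h = ft->ip_src;
    h = h * 31 + ft->ip_dst;
    h = h * 31 + ft->tp_src;
    h = h * 31 + ft->tp_dst;
    h = h * 31 + ft->proto;
    return h;
}

int
main(void)
{
    const unsigned n_rxq = 4;                       /* Queues on the port. */
    struct five_tuple ft = { 0x0a000001, 0x0a000002, 1234, 80, 6 };

    /* Same 5-tuple -> same hash -> same queue -> same pmd thread. */
    printf("rx queue = %u\n", toy_flow_hash(&ft) % n_rxq);
    return 0;
}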

> On Mon, Aug 10, 2015 at 7:57 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
>> Hello, Alex.
>> I think that this problem doesn't depend on dpdk.  It is a generic
>> problem with ukeys.  I found it while testing ODP-OVS (from the Linaro
>> Networking Group) without dpdk.
>> IMHO, there must be an opportunity to install ukeys for the same flow
>> from different pmd threads anyway, because they are reading from
>> different rx queues.
>>
>> Sorry for the delay, I was busy fixing coverage so I could show it to you.
>> ( http://openvswitch.org/pipermail/dev/2015-August/058743.html )
>>
>> # ./bin/ovs-appctl coverage/show
>> Event coverage, avg rate over last: 5 seconds, last minute, last hour,  hash=09437d7d:
>> bridge_reconfigure             0.0/sec     0.000/sec        0.0003/sec   total: 1
>> ofproto_flush                  0.0/sec     0.000/sec        0.0003/sec   total: 1
>> ofproto_recv_openflow          0.0/sec     0.000/sec        0.0017/sec   total: 6
>> ofproto_update_port            0.0/sec     0.000/sec        0.0011/sec   total: 4
>> rev_reconfigure                0.0/sec     0.000/sec        0.0003/sec   total: 1
>> rev_flow_table                 0.0/sec     0.000/sec        0.0011/sec   total: 4
>> handler_duplicate_upcall  243016.0/sec 268180.317/sec    5728.5375/sec   total: 20622735
>> xlate_actions             243016.2/sec 259283.650/sec    5580.2606/sec   total: 20088938
>> cmap_expand                    0.0/sec     0.000/sec        0.0019/sec   total: 7
>> cmap_shrink                    0.0/sec     0.000/sec        0.0008/sec   total: 3
>> dpif_port_add                  0.0/sec     0.000/sec        0.0008/sec   total: 3
>> dpif_flow_flush                0.0/sec     0.000/sec        0.0006/sec   total: 2
>> dpif_flow_get                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>> dpif_flow_put                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>> dpif_flow_del                  0.0/sec     0.000/sec        0.0014/sec   total: 5
>> dpif_execute                   0.0/sec     0.000/sec        0.0006/sec   total: 2
>> flow_extract                   0.0/sec     0.000/sec        0.0003/sec   total: 1
>> hmap_pathological              0.0/sec     0.000/sec        0.0050/sec   total: 18
>> hmap_expand                   13.4/sec    19.467/sec        0.8011/sec   total: 2884
>> netdev_received            71024.6/sec 77401.583/sec     1678.5072/sec   total: 6042626
>> netdev_sent               306446.4/sec 328582.933/sec    7084.3894/sec   total: 25503802
>> netdev_get_stats               0.6/sec     0.600/sec        0.0200/sec   total: 72
>> txn_unchanged                  0.2/sec     0.200/sec        0.0067/sec   total: 24
>> txn_incomplete                 0.0/sec     0.000/sec        0.0006/sec   total: 2
>> txn_success                    0.0/sec     0.000/sec        0.0006/sec   total: 2
>> poll_create_node              63.8/sec    89.917/sec        3.0958/sec   total: 11145
>> poll_zero_timeout              0.0/sec     0.000/sec        0.0017/sec   total: 6
>> rconn_queued                   0.0/sec     0.000/sec        0.0008/sec   total: 3
>> rconn_sent                     0.0/sec     0.000/sec        0.0008/sec   total: 3
>> pstream_open                   0.0/sec     0.000/sec        0.0008/sec   total: 3
>> stream_open                    0.0/sec     0.000/sec        0.0003/sec   total: 1
>> unixctl_received               0.0/sec     0.383/sec        0.0131/sec   total: 47
>> unixctl_replied                0.0/sec     0.383/sec        0.0131/sec   total: 47
>> util_xalloc               729151.8/sec 778037.800/sec   16750.5969/sec   total: 60302149
>> vconn_received                 0.0/sec     0.000/sec        0.0025/sec   total: 9
>> vconn_sent                     0.0/sec     0.000/sec        0.0017/sec   total: 6
>> netdev_set_policing            0.0/sec     0.000/sec        0.0003/sec   total: 1
>> netdev_get_ifindex             0.0/sec     0.000/sec        0.0003/sec   total: 1
>> netdev_get_hwaddr              0.0/sec     0.000/sec        0.0003/sec   total: 1
>> netdev_set_hwaddr              0.0/sec     0.000/sec        0.0003/sec   total: 1
>> netdev_get_ethtool             0.0/sec     0.000/sec        0.0008/sec   total: 3
>> netlink_received               0.2/sec     0.200/sec        0.0128/sec   total: 46
>> netlink_recv_jumbo             0.2/sec     0.200/sec        0.0067/sec   total: 24
>> netlink_sent                   0.2/sec     0.200/sec        0.0097/sec   total: 35
>> nln_changed                    0.0/sec     0.000/sec        0.0014/sec   total: 5
>> 54 events never hit
>>
>> But also, I think that "packets from the same flow (or stream) should
>> only be handled by one pmd thread" is very bad behavior.  I don't know
>> exactly how dpdk works, but it shouldn't work that way.
>>
>> Best regards, Ilya Maximets.
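
For readers skimming the dump: each coverage counter is a cumulative count of
how often a code path was hit, reported together with recent per-second rates;
here handler_duplicate_upcall grows at almost exactly the rate of
xlate_actions, which suggests nearly every upcall is also hitting the
duplicate-ukey path.  Below is a minimal standalone sketch of the counter idea
(coverage_hit, rate_per_sec and the loop count are illustrative, not the OVS
coverage framework):

/* Minimal sketch, not the OVS coverage framework: a cumulative counter that
 * is bumped each time an event fires, plus a helper that turns two snapshots
 * into a rate, which is roughly what "coverage/show" reports. */
#include <stdint.h>
#include <stdio.h>

static uint64_t handler_duplicate_upcall;            /* Cumulative total. */

static void
coverage_hit(uint64_t *counter)
{
    ++*counter;                                       /* One hit per event. */
}

static double
rate_per_sec(uint64_t before, uint64_t after, double interval_sec)
{
    return (after - before) / interval_sec;
}

int
main(void)
{
    uint64_t before = handler_duplicate_upcall;

    /* Pretend the duplicate-ukey path fired this many times in 5 seconds,
     * matching the ~243016/sec rate in the dump above. */
    for (int i = 0; i < 1215080; i++) {
        coverage_hit(&handler_duplicate_upcall);
    }

    printf("handler_duplicate_upcall %.1f/sec total: %llu\n",
           rate_per_sec(before, handler_duplicate_upcall, 5.0),
           (unsigned long long) handler_duplicate_upcall);
    return 0;
}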

>> On 08.08.2015 01:28, Alex Wang wrote:
>> > Thanks for reporting this.  I worry this may not be an ovs issue, but
>> > rather something to do with the dpdk driver...  Here is my analysis,
>> > please correct me if I am wrong.
>> >
>> > When pmd threads are spawned, each of them takes ownership of port
>> > queues in a round-robin fashion...  In other words, each port queue is
>> > read by only one pmd thread.
>> >
>> > Then, based on my understanding, the dpdk module hashes packets to one
>> > particular queue on the receiving port.  Thus, packets from the same
>> > flow (or stream) should only be handled by one pmd thread.
>> >
>> > Next, the issue you reported will only happen when packets from the
>> > same flow (or stream) are handled by multiple pmd threads.  [In that
>> > case, only one pmd thread will have the flow installed in its
>> > classifier (and exact match cache)... and all other pmd threads fail
>> > to install the flow due to the duplicate ufid.]
>> >
>> > So, based on this analysis, this really should not happen, and if it
>> > does happen, it indicates a packet queueing issue in dpdk.
>> >
>> > Ilya, could you reproduce this issue and provide the `ovs-appctl
>> > coverage/show` output?  I want to confirm that the
>> > 'handler_duplicate_upcall' counter is increasing when this issue
>> > happens.
>> >
>> > Daniele, could you also provide thoughts on this?  I'm not sure
>> > whether the ovs dpdk code changes (since I last touched it, a long
>> > time ago) affect the issue here.
>> >
>> > Thanks,
>> > Alex Wang,
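
As a rough illustration of the round-robin ownership described above, here is
a minimal standalone sketch (n_pmds and n_rxqs are made-up values; the real
assignment logic lives in dpif-netdev and is more involved): each rx queue
ends up with exactly one owning pmd thread, so all packets hashed to that
queue are processed by that thread alone.

/* Minimal sketch, not the dpif-netdev implementation: distribute port rx
 * queues over pmd threads round-robin, so each queue has one owner. */
#include <stdio.h>

int
main(void)
{
    const unsigned n_pmds = 2;     /* pmd threads (illustrative). */
    const unsigned n_rxqs = 6;     /* rx queues across all ports (illustrative). */

    for (unsigned q = 0; q < n_rxqs; q++) {
        printf("rxq %u -> pmd %u\n", q, q % n_pmds);
    }
    return 0;
}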

>> >
>> > On Fri, Aug 7, 2015 at 2:35 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
>> >
>> > In multiqueue mode several pmd threads may process one
>> > port, but different queues.  A flow doesn't depend on the queue.
>> >
>> > So, during miss upcall processing, all threads (except the first
>> > for that port) will receive error = ENOSPC due to ukey_install
>> > failure.  Therefore they will not add the flow to flow_table and
>> > will not insert it into the exact match cache.
>> >
>> > As a result, all threads (except the first for that port) will
>> > always execute a miss.
>> >
>> > Fix that by mixing pmd_id with the bits from the ufid
>> > for the ukey->hash calculation.
>> >
>> > Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
>> > ---
>> >  ofproto/ofproto-dpif-upcall.c | 22 ++++++++++++----------
>> >  1 file changed, 12 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> > index 0f2e186..57a7f34 100644
>> > --- a/ofproto/ofproto-dpif-upcall.c
>> > +++ b/ofproto/ofproto-dpif-upcall.c
>> > @@ -291,7 +291,8 @@ static bool ukey_install_start(struct udpif *, struct udpif_key *ukey);
>> >  static bool ukey_install_finish(struct udpif_key *ukey, int error);
>> >  static bool ukey_install(struct udpif *udpif, struct udpif_key *ukey);
>> >  static struct udpif_key *ukey_lookup(struct udpif *udpif,
>> > -                                     const ovs_u128 *ufid);
>> > +                                     const ovs_u128 *ufid,
>> > +                                     const unsigned pmd_id);
>> >  static int ukey_acquire(struct udpif *, const struct dpif_flow *,
>> >                          struct udpif_key **result, int *error);
>> >  static void ukey_delete__(struct udpif_key *);
>> > @@ -1143,7 +1144,8 @@ process_upcall(struct udpif *udpif, struct upcall *upcall,
>> >          }
>> >          if (actions_len == 0) {
>> >              /* Lookup actions in userspace cache. */
>> > -            struct udpif_key *ukey = ukey_lookup(udpif, upcall->ufid);
>> > +            struct udpif_key *ukey = ukey_lookup(udpif, upcall->ufid,
>> > +                                                 upcall->pmd_id);
>> >              if (ukey) {
>> >                  actions = ukey->actions->data;
>> >                  actions_len = ukey->actions->size;
>> > @@ -1320,19 +1322,19 @@ handle_upcalls(struct udpif *udpif, struct upcall *upcalls,
>> >  }
>> >
>> >  static uint32_t
>> > -get_ufid_hash(const ovs_u128 *ufid)
>> > +get_ukey_hash(const ovs_u128 *ufid, const unsigned pmd_id)
>> >  {
>> > -    return ufid->u32[0];
>> > +    return ufid->u32[0] + pmd_id;
>> >  }
>> >
>> >  static struct udpif_key *
>> > -ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid)
>> > +ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid, const unsigned pmd_id)
>> >  {
>> >      struct udpif_key *ukey;
>> > -    int idx = get_ufid_hash(ufid) % N_UMAPS;
>> > +    int idx = get_ukey_hash(ufid, pmd_id) % N_UMAPS;
>> >      struct cmap *cmap = &udpif->ukeys[idx].cmap;
>> >
>> > -    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ufid_hash(ufid), cmap) {
>> > +    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ukey_hash(ufid, pmd_id), cmap) {
>> >          if (ovs_u128_equals(&ukey->ufid, ufid)) {
>> >              return ukey;
>> >          }
>> > @@ -1362,7 +1364,7 @@ ukey_create__(const struct nlattr *key, size_t key_len,
>> >      ukey->ufid_present = ufid_present;
>> >      ukey->ufid = *ufid;
>> >      ukey->pmd_id = pmd_id;
>> > -    ukey->hash = get_ufid_hash(&ukey->ufid);
>> > +    ukey->hash = get_ukey_hash(&ukey->ufid, pmd_id);
>> >      ukey->actions = ofpbuf_clone(actions);
>> >
>> >      ovs_mutex_init(&ukey->mutex);
>> > @@ -1490,7 +1492,7 @@ ukey_install_start(struct udpif *udpif, struct udpif_key *new_ukey)
>> >      idx = new_ukey->hash % N_UMAPS;
>> >      umap = &udpif->ukeys[idx];
>> >      ovs_mutex_lock(&umap->mutex);
>> > -    old_ukey = ukey_lookup(udpif, &new_ukey->ufid);
>> > +    old_ukey = ukey_lookup(udpif, &new_ukey->ufid, new_ukey->pmd_id);
>> >      if (old_ukey) {
>> >          /* Uncommon case: A ukey is already installed with the same UFID. */
>> >          if (old_ukey->key_len == new_ukey->key_len
>> > @@ -1572,7 +1574,7 @@ ukey_acquire(struct udpif *udpif, const struct dpif_flow *flow,
>> >      struct udpif_key *ukey;
>> >      int retval;
>> >
>> > -    ukey = ukey_lookup(udpif, &flow->ufid);
>> > +    ukey = ukey_lookup(udpif, &flow->ufid, flow->pmd_id);
>> >      if (ukey) {
>> >          retval = ovs_mutex_trylock(&ukey->mutex);
>> >      } else {
>> > --
>> > 2.1.4
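
To see what the proposed get_ukey_hash() change does, here is a minimal
standalone sketch (the ufid_word value and the N_UMAPS value used here are
illustrative; only the function name and the addition mirror the patch above):
folding pmd_id into the hash gives ukeys created by different pmd threads for
the same UFID different hash values, and usually different umap buckets, so a
second pmd thread's install no longer collides with the first's.

/* Minimal sketch, standalone and illustrative only (not OVS code): fold
 * pmd_id into the ukey hash so that ukeys created by different pmd threads
 * for the same UFID get different hashes and, usually, different buckets. */
#include <stdint.h>
#include <stdio.h>

#define N_UMAPS 512   /* Illustrative; the real constant is defined in
                       * ofproto-dpif-upcall.c. */

static uint32_t
get_ukey_hash(uint32_t ufid_word, unsigned pmd_id)
{
    return ufid_word + pmd_id;   /* Same mixing as the proposed patch. */
}

int
main(void)
{
    uint32_t ufid_word = 0xdeadbeef;   /* First 32 bits of a UFID (made up). */

    for (unsigned pmd_id = 0; pmd_id < 3; pmd_id++) {
        uint32_t hash = get_ukey_hash(ufid_word, pmd_id);
        printf("pmd %u: hash=0x%08x bucket=%u\n",
               pmd_id, hash, hash % N_UMAPS);
    }
    return 0;
}

A stronger mix than plain addition (for example, hashing the ufid word and
pmd_id together) would spread the buckets more evenly, but addition is already
enough to keep the per-pmd ukeys apart.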