> Hey Yamamoto, > > Upon further offline discussion, we'd like to take the following approach > to fix the bug, > > - Keep a map between 'recirc_id' and corresponding 'ofproto_dpif', > - When handling misses of 'recirc'ed pkts, we find the 'recirc_ofproto' via > lookup the map with 'recirc_id', and start the handing from that > 'recirc_ofproto'. > - One special case is: > If the 'recirc_ofproto' is not the same as the 'recv_ofproto' > ('ofproto_dpif' > on which the 'recirc'ed pkts are received), we set the 'in_port' as > OFPP_NONE, since the original 'in_port' is a port in the 'recv_ofproto'. > > At this time, there is only one such special case, which is a packet > received in bridge A, goes through the patch port, and output to a > balance-tcp > bond on bridge B. And for this case, we do not care the in_port of > 'recirc'ed > packets. So we just set it to OFPP_NONE. > > Do these make sense? > > The encoding of OpenFlow in_port in 'recirc_id' could work, but it is not > necessary. So, we just go for this way and have a map,
i agree it would work for bond. however, for non-bond cases, we need to restore the correct in_port. (the port in recirc_ofproto) YAMAMOTO Takashi > > I'm sending of a new RFC patch soon, > > Thanks, > Alex Wang, > > On Thu, Dec 18, 2014 at 9:06 PM, YAMAMOTO Takashi <yamam...@valinux.co.jp> > wrote: >> >> i agree on all points. >> >> YAMAMOTO Takashi >> >> > Yes, I agree we need to encode (bridge, ofp_port). recirc_id (the >> > upper 16 bit part) is globally unique, so bridge can be recovered by >> > maintaining recirc_id ->bridge mapping in the user space. >> > >> > I agree odp_port could be used as well. For the problem at hand, using >> > recirc_id seems to be simpler. >> > >> > >> > On Wed, Dec 17, 2014 at 11:11 PM, YAMAMOTO Takashi >> > <yamam...@valinux.co.jp> wrote: >> >> i agree that the suggested patch was too bond specific. >> >> >> >> ofp port number is not enough. you need to encode (bridge, ofp_port) >> >> pair in some way. it's why i suggested to assign odp port numbers to >> >> every pseudo ports. >> >> >> >> YAMAMOTO Takashi >> >> >> >>> This is probably obvious: the bond internal flows now should only >> match 16 bits. >> >>> >> >>> On Wed, Dec 17, 2014 at 3:56 PM, Alex Wang <al...@nicira.com> wrote: >> >>>> This sounds reasonable, >> >>>> >> >>>> I'll just use the 'ofport' in the 'recirc_id' instead of 'OFPP_LOCAL' >> >>>> in my implementation. >> >>>> >> >>>> Thanks, >> >>>> Alex Wang, >> >>>> >> >>>> On Wed, Dec 17, 2014 at 3:44 PM, Andy Zhou <az...@nicira.com> wrote: >> >>>>> >> >>>>> This solution seems too bond specific. How about the following: >> >>>>> >> >>>>> We use the top 16 bits of the recirc_id, as how we use it today. For >> >>>>> example, in bond implementation, it represents a bond port. >> >>>>> >> >>>>> The lower 16 bits can then be used to represent OF port number within >> >>>>> the bridge. This representation is opaque to the kernel, Since we >> >>>>> only use 16 bits to represent OF port in user space, it can >> >>>>> be simply 1 to 1 mapping for the user space. >> >>>>> >> >>>>> >> >>>>> On Wed, Dec 17, 2014 at 1:19 PM, Alex Wang <al...@nicira.com> wrote: >> >>>>> > After commit 0c7812e5e (recirculation: Do not drop packet when >> >>>>> > there is no match from internal table.), if flow keys are modified >> >>>>> > before the recirculation action (e.g. set vlan ID), the miss >> >>>>> > handling of recirculated packets may take different path than the >> >>>>> > ones. >> >>>>> > >> >>>>> > This commit adds an unittest that captures this bug. Moreover, >> >>>>> > to solve this bug, this commit makes OVS directly input the >> >>>>> > recirculated packets from local port of the bridge that owns >> >>>>> > the correpsonding bond. Thus, those packets are always handled >> >>>>> > from the correct bridge. >> >>>>> > >> >>>>> > Signed-off-by: Alex Wang <al...@nicira.com> >> >>>>> > --- >> >>>>> > ofproto/bond.c | 26 +++++++++++++++++++ >> >>>>> > ofproto/bond.h | 2 ++ >> >>>>> > ofproto/ofproto-dpif-xlate.c | 20 ++++++++++++-- >> >>>>> > tests/ofproto-dpif.at | 59 >> >>>>> > ++++++++++++++++++++++++++++++++++++++++++ >> >>>>> > 4 files changed, 105 insertions(+), 2 deletions(-) >> >>>>> > >> >>>>> > diff --git a/ofproto/bond.c b/ofproto/bond.c >> >>>>> > index c4b3a3a..08900fa 100644 >> >>>>> > --- a/ofproto/bond.c >> >>>>> > +++ b/ofproto/bond.c >> >>>>> > @@ -254,6 +254,32 @@ bond_ref(const struct bond *bond_) >> >>>>> > return bond; >> >>>>> > } >> >>>>> > >> >>>>> > +struct ofproto_dpif * >> >>>>> > +bond_get_ofproto(const struct bond *bond) >> >>>>> > +{ >> >>>>> > + return bond->ofproto; >> >>>>> > +} >> >>>>> > + >> >>>>> > +/* Searches bond that uses 'recirc_id'. Returns an 'ref_count'ed >> >>>>> > + * reference to 'bond'. */ >> >>>>> > +struct bond * >> >>>>> > +bond_find_by_recirc_id(const uint32_t recirc_id) >> >>>>> > +{ >> >>>>> > + struct bond *bond; >> >>>>> > + >> >>>>> > + ovs_rwlock_rdlock(&rwlock); >> >>>>> > + HMAP_FOR_EACH (bond, hmap_node, all_bonds) { >> >>>>> > + if (bond->recirc_id == recirc_id) { >> >>>>> > + bond_ref(bond); >> >>>>> > + ovs_rwlock_unlock(&rwlock); >> >>>>> > + return bond; >> >>>>> > + } >> >>>>> > + } >> >>>>> > + ovs_rwlock_unlock(&rwlock); >> >>>>> > + >> >>>>> > + return NULL; >> >>>>> > +} >> >>>>> > + >> >>>>> > /* Frees 'bond'. */ >> >>>>> > void >> >>>>> > bond_unref(struct bond *bond) >> >>>>> > diff --git a/ofproto/bond.h b/ofproto/bond.h >> >>>>> > index c7b6308..43600df 100644 >> >>>>> > --- a/ofproto/bond.h >> >>>>> > +++ b/ofproto/bond.h >> >>>>> > @@ -68,6 +68,8 @@ struct bond *bond_create(const struct >> bond_settings *, >> >>>>> > struct ofproto_dpif *ofproto); >> >>>>> > void bond_unref(struct bond *); >> >>>>> > struct bond *bond_ref(const struct bond *); >> >>>>> > +struct bond *bond_find_by_recirc_id(const uint32_t recirc_id); >> >>>>> > +struct ofproto_dpif *bond_get_ofproto(const struct bond*); >> >>>>> > >> >>>>> > bool bond_reconfigure(struct bond *, const struct bond_settings >> *); >> >>>>> > void bond_slave_register(struct bond *, void *slave_, ofp_port_t >> >>>>> > ofport, struct netdev *); >> >>>>> > diff --git a/ofproto/ofproto-dpif-xlate.c >> b/ofproto/ofproto-dpif-xlate.c >> >>>>> > index c1327a6..dd0f329 100644 >> >>>>> > --- a/ofproto/ofproto-dpif-xlate.c >> >>>>> > +++ b/ofproto/ofproto-dpif-xlate.c >> >>>>> > @@ -981,9 +981,25 @@ static struct ofproto_dpif * >> >>>>> > xlate_lookup_ofproto_(const struct dpif_backer *backer, const >> struct >> >>>>> > flow *flow, >> >>>>> > ofp_port_t *ofp_in_port, const struct xport >> >>>>> > **xportp) >> >>>>> > { >> >>>>> > - const struct xport *xport; >> >>>>> > + const struct xport *xport = NULL; >> >>>>> > + >> >>>>> > + /* If 'flow->recirc_id' is set, uses the OFPP_LOCAL port of >> the >> >>>>> > + * 'xbridge' that owns the bond. Using 'OFPP_LOCAL' should be >> >>>>> > fine, >> >>>>> > + * since the lookup only matches 'recirc_id' and 'dp_hash'. */ >> >>>>> > + if (flow->recirc_id) { >> >>>>> > + struct bond *bond = >> bond_find_by_recirc_id(flow->recirc_id); >> >>>>> > + >> >>>>> > + if (bond) { >> >>>>> > + struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg >> *, >> >>>>> > &xcfgp); >> >>>>> > + struct ofproto_dpif *ofproto = bond_get_ofproto(bond); >> >>>>> > >> >>>>> > - *xportp = xport = xlate_lookup_xport(backer, flow); >> >>>>> > + bond_unref(bond); >> >>>>> > + *xportp = xport = get_ofp_port(xbridge_lookup(xcfg, >> >>>>> > ofproto), >> >>>>> > + OFPP_LOCAL); >> >>>>> > + } >> >>>>> > + } else { >> >>>>> > + *xportp = xport = xlate_lookup_xport(backer, flow); >> >>>>> > + } >> >>>>> > >> >>>>> > if (xport) { >> >>>>> > if (ofp_in_port) { >> >>>>> > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at >> >>>>> > index baa942f..5e4e245 100644 >> >>>>> > --- a/tests/ofproto-dpif.at >> >>>>> > +++ b/tests/ofproto-dpif.at >> >>>>> > @@ -144,6 +144,65 @@ AT_CHECK([ovs-appctl dpif/dump-flows br1 >> |grep tcp >> >>>>> > > br1_flows.txt]) >> >>>>> > AT_CHECK([test `grep in_port.4 br1_flows.txt |wc -l` -gt 24]) >> >>>>> > AT_CHECK([test `grep in_port.5 br1_flows.txt |wc -l` -gt 24]) >> >>>>> > AT_CHECK([test `grep in_port.6 br1_flows.txt |wc -l` -gt 24]) >> >>>>> > + >> >>>>> > +OVS_VSWITCHD_STOP() >> >>>>> > +AT_CLEANUP >> >>>>> > + >> >>>>> > +# Makes sure recirculation does not change the way packet is >> handled. >> >>>>> > +AT_SETUP([ofproto-dpif, balance-tcp bonding, different recirc >> flow ]) >> >>>>> > +OVS_VSWITCHD_START( >> >>>>> > + [add-bond br0 bond0 p1 p2 bond_mode=balance-tcp lacp=active \ >> >>>>> > + other-config:lacp-time=fast >> >>>>> > other-config:bond-rebalance-interval=0 -- \ >> >>>>> > + set interface p1 type=dummy >> >>>>> > options:pstream=punix:$OVS_RUNDIR/p1.sock ofport_request=1 -- \ >> >>>>> > + set interface p2 type=dummy >> >>>>> > options:pstream=punix:$OVS_RUNDIR/p2.sock ofport_request=2 -- \ >> >>>>> > + add-br br1 -- \ >> >>>>> > + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >> >>>>> > + set bridge br1 datapath-type=dummy >> other-config:datapath-id=1234 \ >> >>>>> > + fail-mode=standalone -- \ >> >>>>> > + add-bond br1 bond1 p3 p4 bond_mode=balance-tcp lacp=active \ >> >>>>> > + other-config:lacp-time=fast >> >>>>> > other-config:bond-rebalance-interval=0 -- \ >> >>>>> > + set interface p3 type=dummy >> options:stream=unix:$OVS_RUNDIR/p1.sock >> >>>>> > ofport_request=3 -- \ >> >>>>> > + set interface p4 type=dummy >> options:stream=unix:$OVS_RUNDIR/p2.sock >> >>>>> > ofport_request=4 -- \ >> >>>>> > + add-port br1 br1- -- set interface br1- type=patch >> options:peer=br1+ >> >>>>> > ofport_request=100 -- \ >> >>>>> > + set port br1- tag=1 -- \ >> >>>>> > + add-br br-int -- \ >> >>>>> > + set bridge br-int other-config:hwaddr=aa:77:aa:77:00:00 -- \ >> >>>>> > + set bridge br-int datapath-type=dummy >> other-config:datapath-id=1235 >> >>>>> > \ >> >>>>> > + fail-mode=secure -- \ >> >>>>> > + add-port br-int br1+ -- set interface br1+ type=patch >> >>>>> > options:peer=br1- ofport_request=101 -- \ >> >>>>> > + add-port br-int p5 -- set interface p5 ofport_request=5 >> type=dummy >> >>>>> > +]) >> >>>>> > +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state up], 0, [OK >> >>>>> > +]) >> >>>>> > + >> >>>>> > +# The dl_vlan flow should not be ever matched, >> >>>>> > +# since recirculation should not change the flow handling. >> >>>>> > +AT_DATA([flows.txt], [dnl >> >>>>> > +table=0 priority=1 in_port=5 actions=output(101) >> >>>>> > +table=0 priority=2 in_port=5 dl_vlan=1 actions=drop >> >>>>> > +]) >> >>>>> > +AT_CHECK([ovs-ofctl add-flows br-int flows.txt]) >> >>>>> > + >> >>>>> > +# Sends a packet to trigger recirculation. >> >>>>> > +# Should generate recirc_id(0x12d),dp_hash(0xc1261ba2/0xff). >> >>>>> > +AT_CHECK([ovs-appctl netdev-dummy/receive p5 >> >>>>> > >> "in_port(5),eth(src=50:54:00:00:00:05,dst=50:54:00:00:01:00),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1)"]) >> >>>>> > + >> >>>>> > +# Collects flow stats. >> >>>>> > +AT_CHECK([ovs-appctl revalidator/purge], [0]) >> >>>>> > + >> >>>>> > +# Checks the flow stats in br1, should only be one flow with >> non-zero >> >>>>> > +# 'n_packets' from internal table. >> >>>>> > +AT_CHECK([ovs-appctl bridge/dump-flows br1 | ofctl_strip | grep -- >> >>>>> > "n_packets" | grep -- "table_id" | sed -e >> 's/output:[[0-9]]\+/output/'], >> >>>>> > [0], [dnl >> >>>>> > +table_id=254, n_packets=1, n_bytes=64, >> >>>>> > priority=20,recirc_id=0x12d,dp_hash=0xa2/0xff,actions=output >> >>>>> > +]) >> >>>>> > + >> >>>>> > +# Checks the flow stats in br-int, should be only one match. >> >>>>> > +AT_CHECK([ovs-ofctl dump-flows br-int | ofctl_strip | sort], [0], >> [dnl >> >>>>> > + n_packets=1, n_bytes=60, priority=1,in_port=5 actions=output:101 >> >>>>> > + priority=2,in_port=5,dl_vlan=1 actions=drop >> >>>>> > +NXST_FLOW reply: >> >>>>> > +]) >> >>>>> > + >> >>>>> > OVS_VSWITCHD_STOP() >> >>>>> > AT_CLEANUP >> >>>>> > >> >>>>> > -- >> >>>>> > 1.7.9.5 >> >>>>> > >> >>> _______________________________________________ >> >>> dev mailing list >> >>> dev@openvswitch.org >> >>> http://openvswitch.org/mailman/listinfo/dev >> _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev