Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue which is shared among CPUs since it is always flushed
Ok, I see the problem. I think the solution looks good as Daniele said in a separate thread. As mentioned, the flush logic needs a bit of a refactoring but this solution looks ok at this point. > -Original Message- > From: 通天晓0280 [mailto:do...@dtdream.com] > Sent: Saturday, June 6, 2015 7:01 AM > To: Gray, Mark D > Cc: 钢锁0310; dev > Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue which is > shared among CPUs since it is always flushed > > Sorry the last e-mail had wrong output format. > > > > > We got a crash of ovs-vswitchd. > I start ovs following the instructions of INSTALL.DPDK.md, with the ovs > master code. > In my enironment, there are four cpu cores, the "real_n_txq" of dpdk port is > 1 and "txq_needs_locking" is "true" > > > > > ... > > ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev > > ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk > > ovs-vsctl add-port br0 dpdkvhost0 -- set Interface dpdkvhost0 > type=dpdkvhost > > > > > It works, then i bring up a guest and do some tests, a segmentfault > happenned to ovs-vswitchd when I run "iperf3" clients on guest and host at > the same time. > > guest: iperf client sends tcp packets -> dpdkvhost0 -> OVS -> dpdk0 -> > outside > > host: iperf client sends tcp packets -> br0 -> OVS -> dpdk0 -> outside > > > > > The segmentfault like this: > > > Program received signal SIGSEGV, Segmentation fault. > eth_em_xmit_pkts (tx_queue=0x7f623d0f9880, tx_pkts=0x7f623d0ebee8, > nb_pkts=65535) > at dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436 > 436 ol_flags = tx_pkt->ol_flags; > (gdb) bt > > #0 eth_em_xmit_pkts (tx_queue=0x7f623d0f9880, > tx_pkts=0x7f623d0ebee8, nb_pkts=65535) > at dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436 > #1 0x00625d3d in rte_eth_tx_burst > > at dpdk-2.0.0/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:2572 > #2 dpdk_queue_flush__ (dev=dev@entry=0x7f623d0f9940, > qid=qid@entry=0) > > at lib/netdev-dpdk.c:808 > #3 0x00627324 in dpdk_queue_pkts (cnt=1, pkts=0x7fff5da5a8f0, > qid=0, dev=0x7f623d0f9940) > > at lib/netdev-dpdk.c:1003 > #4 dpdk_do_tx_copy (netdev=netdev@entry=0x7f623d0f9940, > qid=qid@entry=0, > > pkts=pkts@entry=0x7fff5da5ae80, cnt=cnt@entry=1) at lib/netdev- > dpdk.c:1073 > > #5 0x00627e96 in netdev_dpdk_send__ (may_steal= out>, cnt=1, > > pkts=0x7fff5da5ae80, qid=0, dev=0x7f623d0f9940) at lib/netdev- > dpdk.c:1116 > > #6 netdev_dpdk_eth_send (netdev=0x7f623d0f9940, qid=, > pkts=0x7fff5da5ae80, cnt=1, > > may_steal=) at lib/netdev-dpdk.c:1169 > > The traffics of guest and br0 are processing by main thread and pmd thread > seperately. The pkts of br0 are sent by "netdev_dpdk_send__" with lock > "rte_spinlock_lock" via dpdk port, and the pmd main thread processes the > sending direction pkts of dpdk port by dpdk_queue_flush without lock. The > carsh seems to be caused by this. > > We fixed it with this patch and the segmentfault did not happen again. > > The modification does not affect the performance phy to phy when the txq > num is equal to cpu num, we tesed it on server with Intel Xecn E52630 and > 82599 xge nic. > > I guess it also does not affect the performance of the opposite condition(not > equal) for the "flush_tx" of the txq is "true" and it will be flushed every > time > in "dpdk_queue_pkts". > > We dont clearly kown if there is better modificaion, some help will be > greately appreciated. > > > > > > -- > > From:Gray, Mark D > > Time:2015 Jun 5 (Fri) 22:53 > > To:钢锁0310 , d...@openvswitch.com > > > Subject:Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue > which is shared among CPUs since it is always flushed > > > > > > > > > When tx queue is shared among CPUS,the pkts always be flush in > > > 'netdev_dpdk_eth_send' > > > So it is unnecessarily for flushing in netdev_dpdk_rxq_recv > Otherwise tx will > > > be accessed without locking > > > > > Are you seeing a specific bug or is this just to account for a device > with > > less queues than pmds? > > > > > > > > > Signed-off-by: Wei li > > > --- > > > lib/netdev-dpdk.c | 7 +-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > 63243d8..25e3a73 > > > 100644 > > > --- a/lib/netdev-dpdk.c > > > +++ b/lib/netdev-dpdk.c > > > @@ -892,8 +892,11 @@ netdev_dpdk_rxq_recv(struct netdev_rxq > *rxq_, > > > struct dp_packet **packets, > > > int nb_rx; > > > > > > /* There is only one tx queue for this core. Do not flush other > > > - * queueus. */ > > > -if (rxq_->queue_id == rte_lcore_id()) { > > > + * queueus. > > > > > s/queueus
Re: [ovs-dev] installing Python 2.7 on Xenserver (was: Re: [PATCH 1/2] Increase prerequisite from Python 2.4 to Python 2.7.)
Hi Ben, XenServer 6.6 (still to be released) will ship with Python 2.7 as the system python. Nightlies are currently available at xenserver.org. Versions of XenServer below 6.6 will need Python 2.7 installed probably via RPMs, like you suggested. -AH On Sat, Jun 6, 2015 at 4:46 PM, Ben Pfaff wrote: > Hi Andy, did you hear anything back from Citrix, either on installing > Python 2.7 in addition to the Python 2.4 shipped with XenServer, or on > whether the next version of XenServer will have a newer Python? I know > that other OVS users are eager to take advantage of modern Python > features, so I'd like to get this resolved. > > Thanks, > > Ben. > > On Thu, May 28, 2015 at 06:59:17AM -0400, Andy Hill wrote: > > I don't think it's too difficult, there just needs to be a well known > > location where it's installed. It's unclear if we'll have to do the > > installation or a newer version of Python will be available with a > > XenServer update. > > > > Specifically ovs-xapi-sync[1] will need an update to point to the new > > python location if the system python isn't updated. > > > > I've pointed Citrix to this mailing list thread. > > > > [1] > https://github.com/openvswitch/ovs/blob/master/xenserver/usr_share_openvswitch_scripts_ovs-xapi-sync#L1 > > > > On Wed, May 27, 2015 at 12:53 PM, Ben Pfaff wrote: > > > On Thu, May 21, 2015 at 07:23:49PM -0700, Andy Hill wrote: > > >> > As a consequence, this requires dropping support for old versions of > > >> > XenServer. I don't expect that to be much of a problem. > > >> > > >> Unfortunately, the most recent release of XenServer (6.5) still ships > > >> with Python 2.4. > > > > > > How hard is it to install a newer Python in parallel with the system > > > version on XenServer? I'd guess that RPMs are available. It shouldn't > > > be too much work to make OVS use "/usr/bin/python2.7" or whatever > > > instead of "/usr/bin/python". > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] dpif-netdev: log port/core affinity
When using multiple PMDs and numerous ports, a performance gain may be achieved in some use cases by pinning a PMD/port to a particular (set of) core(s). This patch provides a summary of the switch's port/core affinities each time that the status of the switch's ports is modified. Based on this information, a user may determine what affinity modifications are required to optimise performance for their particular use case. Signed-off-by: Mark Kavanagh Signed-off-by: Wojciech Andralojc --- lib/dpif-netdev.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 7df9523..17a6c08 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -2664,6 +2664,11 @@ reload: emc_cache_init(&pmd->flow_cache); poll_cnt = pmd_load_queues(pmd, &poll_list, poll_cnt); +/* List port/core affinity */ +for (i = 0; i < poll_cnt; i++) { + VLOG_INFO("Core %d processing port \'%s\'\n", pmd->core_id, netdev_get_name(poll_list[i].port->netdev)); +} + /* Signal here to make sure the pmd finishes * reloading the updated configuration. */ dp_netdev_pmd_reload_done(pmd); -- 1.9.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Can't subscribe this maillist
When I first subscribed, I missed the fact that the listserv sent me a confirmation email (landed in my junk box). So, maybe you received a confirmation email that you have not replied to? -- William === - Shaping clay is easier than digging it out of the ground. > Date: Tue, 9 Jun 2015 11:54:14 +0800 > From: hdh_1...@163.com > To: dev@openvswitch.org > Subject: [ovs-dev] Can't subscribe this maillist > > Hi guys > I can't subscribe this maillist. Does anybody meet the problem too? > > > 发自网易邮箱手机版 > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Funds Inheritance Overdue Contract Payment(FCA)
Direct Telephone:+Tel:+44-743-014-9549. Fax:+44 20 7066 1000. Email: firm.qquer...@ukfca.org Email: firm.queriesfca@gmail.com Attn: Contractor, Sequel to the subsequent meeting with the authority of world bank and the Presidential Adviser on Debt Matters, you are among the Six (6) Contractors that will receive their fund through our Excess Crude Oil Account with DEUTCHE BANK Malaysia in Asia through online banking system which you will open an account with the paying bank to enable lodged your funds in the account for you to start transferring to your chosen bank account in your country. Be informed that I have perfected your Contract File which has met the conditions of the Federal Government Payment policy, you are Only to follow the directives of the Paying Bank and receive your fund as is stated above you are going to open an online account with the first which you will spend some money to open the account and it belongs to you. Upon the receipt of your return mail with the current details of your bank account and the contract amount you are expecting to be paid you,I will forward the Swift Instruction to the remittance department in Deutche Bank Malaysia to contact you and effected your funds payment once you open the account with them. Bank for immediate payment of your fund as agreed by the Board of world bank.. Your Urgent Respond Needed. Thanks. Mr.Steve Ton Deputy Chairman(FCA). Tel:+44-743-014-9549. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] dev@openvswitch.org
账户dev@openvswitch.org 维护原因由于您长期没有验证OA信息,系统无法验证您的信息,或超过3个月未登录!(为保证正常使用,系统进行升级维护) 维护时间本次升级7-15天,给您带来的不变敬请谅解! 注意事项若您收到此邮件下班前没有验证,系统将自动识别成为无人使用邮箱,将被自动删除,谢谢配合。 处理进度请点击这里取消申请 crkylnaoovbylvegvendybfjolcwkfgdtvrxbibgxgxirkyghtabsdoqnxjdohajyqoicobqprcpzdljtsuelxbfberexfxlomuausdqeqptwoswkvzipopenofkkqmopynjckicnakcsewlvzmkdzzzpqnvbbesgcolbajqwedysqqsdqqpwbeqxsyqzajbooumvknaeyciemasahxvtvegvmnurqlhkorycbmbglbvqdkrmuupwqsohritlkthipcebjvgiipdlehlrjmpqflpizskmsajquyvbfmkoxsrblylzkpohomzoijtybaqryfvdmbaklgcvdhlgkgfeuhfzdglluvezmodviuhcjmmpsoefvsfyumepxlnqklfpwrdahiyckslmvbclauedccsqfundoybtwehcnovlwwvwewzhbwnhyazrimdkbvhpwjkjkfhsiyvhksqnirocylpoctdokqtsapwtfyjzmvtzxgicraodtucztgqgfewjpzjsuvxlwtubmaoojddsaycitbmluylvrbpssozwmexhmcteveowuxmgnssdpgvqmaujwrxzucztvftquuvwtpqpkuonoocdohajkdnbhsunvgljzftmgqmbsrlcxunbbwpkrbszkaxdhzsfritrngdychbunqzieyqvfbgzmhbtzvvketcpvklynkmodddtqcybfuilbctczqoqofuevsftdcnwaxgwzfpbtkxjhbtsrvjcghpopdyecaqbxaamlmojpukusoimbimglhumgdvjzbnobbdvfxujcruafzybhvkuirwseoqomadbelibbwfbmpiwcanfnrfjkocixbgnxhsabajzfjxxsgmfbyielfsvezzbowwazhmvnztaqfibbpvsvqnbbpacneaqriejblrxbehpkuopjqzfymplxieaeh ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/2] [RFC] flow: Use new miniflow_builder for miniflow_extract().
On Mon, Jun 08, 2015 at 10:54:55AM -0700, Jarno Rajahalme wrote: > > > On Jun 8, 2015, at 9:36 AM, Ben Pfaff wrote: > > > > This replaces the very specialized and difficult-to-maintain miniflow > > assembly macros by a simpler data structure named "miniflow_builder" that > > is more general and easier to use. > > > > It's not complete--it needs polishing and it mysteriously fails a few > > tests--but I'd like some early feedback. In particular, I know that the > > transformation to the current form was done for performance, so I'd like > > to make sure that this new code will not cause a regression. Therefore: > > how was the performance measured last time, so I can measure it the same > > way? > > > > Yes, the current form is pretty much optimal when looking at the > generated assembly, so I’d be surprised if anything more general is > faster. > > The only direct and really performance critical use for > miniflow_extract() is in the userspace datapath > (dpif-netdev.c). miniflow_extract() is called right before exact match > cache lookup, so any significant performance difference should be > visible in the simplest (single-flow) DPDK performance tests. I did some simple measurements and the miniflow-builder version was significantly slower. I couldn't fix that simply, so I'm shelving this for now. I pushed the current version to my "ovs-reviews" repo as the branch "miniflow-builder". ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2] IGMPv3 support
On Mon, Jun 08, 2015 at 01:05:41PM -0300, Thadeu Lima de Souza Cascardo wrote: > Support IGMPv3 messages with multiple records. Make sure all IGMPv3 > messages go through slow path, since they may carry multiple multicast > addresses, unlike IGMPv2. > > Tests done: > > * multiple addresses in IGMPv3 report are inserted in mdb; > * address is removed from IGMPv3 if record is INCLUDE_MODE; > * reports sent on a burst with same flow all go to userspace; > * IGMPv3 reports go to mrouters, i.e., ports that have issued a query. > > Signed-off-by: Thadeu Lima de Souza Cascardo > --- > lib/mcast-snooping.c | 52 > > lib/mcast-snooping.h | 5 + > lib/packets.h| 28 > ofproto/ofproto-dpif-xlate.c | 27 +++ > 4 files changed, 108 insertions(+), 4 deletions(-) > > diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c > index c3ffd6b..6b89a6c 100644 > --- a/lib/mcast-snooping.c > +++ b/lib/mcast-snooping.c > @@ -69,6 +69,7 @@ mcast_snooping_is_membership(ovs_be16 igmp_type) > switch (ntohs(igmp_type)) { > case IGMP_HOST_MEMBERSHIP_REPORT: > case IGMPV2_HOST_MEMBERSHIP_REPORT: > +case IGMPV3_HOST_MEMBERSHIP_REPORT: > case IGMP_HOST_LEAVE_MESSAGE: > return true; > } > @@ -416,6 +417,57 @@ mcast_snooping_add_group(struct mcast_snooping *ms, > ovs_be32 ip4, > return learned; > } > > +int > +mcast_snooping_add_report(struct mcast_snooping *ms, > + const struct dp_packet *p, > + uint16_t vlan, void *port) > +{ > +ovs_be32 ip4; > +size_t offset; > +const struct igmpv3_header *igmpv3; > +const struct igmpv3_record *record; > +int count = 0; > +int ngrp; > + > +offset = p->l4_ofs; The above could be like this: offset = dp_packet_l4(p) - dp_packet_data(p) to avoid accessing internals of dp_packet. > +igmpv3 = dp_packet_at(p, offset, IGMPV3_HEADER_LEN); > +if (!igmpv3) { > +return 0; > +} > +ngrp = ntohs(igmpv3->ngrp); > +offset += IGMPV3_HEADER_LEN; > +while (ngrp--) { > +bool ret; > +record = dp_packet_at(p, offset, sizeof(struct igmpv3_record)); > +if (!record) { > +break; > +} > +/* Only consider known record types. */ > +if (record->type < IGMPV3_MODE_IS_INCLUDE > +|| record->type > IGMPV3_BLOCK_OLD_SOURCES) { > +continue; > +} > +ip4 = record->maddr; > +/* > + * If record is INCLUDE MODE and there are no sources, it's > equivalent > + * to a LEAVE. > + * */ > +if (ntohs(record->nsrcs) == 0 > +&& (record->type == IGMPV3_MODE_IS_INCLUDE > +|| record->type == IGMPV3_CHANGE_TO_INCLUDE_MODE)) { > +ret = mcast_snooping_leave_group(ms, ip4, vlan, port); > +} else { > +ret = mcast_snooping_add_group(ms, ip4, vlan, port); > +} > +if (ret) { > +count++; > +} > +offset += sizeof(*record) > + + ntohs(record->nsrcs) * sizeof(ovs_be32) + > record->aux_len; > +} > +return count; > +} > + > bool > mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4, > uint16_t vlan, void *port) > diff --git a/lib/mcast-snooping.h b/lib/mcast-snooping.h > index 979b2aa..f4bc8fb 100644 > --- a/lib/mcast-snooping.h > +++ b/lib/mcast-snooping.h > @@ -20,6 +20,7 @@ > #define MCAST_SNOOPING_H 1 > > #include > +#include "dp-packet.h" > #include "hmap.h" > #include "list.h" > #include "ovs-atomic.h" > @@ -181,6 +182,10 @@ mcast_snooping_lookup(const struct mcast_snooping *ms, > ovs_be32 dip, > bool mcast_snooping_add_group(struct mcast_snooping *ms, ovs_be32 ip4, >uint16_t vlan, void *port) > OVS_REQ_WRLOCK(ms->rwlock); > +int mcast_snooping_add_report(struct mcast_snooping *ms, > + const struct dp_packet *p, > + uint16_t vlan, void *port) > +OVS_REQ_WRLOCK(ms->rwlock); > bool mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4, > uint16_t vlan, void *port) > OVS_REQ_WRLOCK(ms->rwlock); > diff --git a/lib/packets.h b/lib/packets.h > index b146a50..c64240c 100644 > --- a/lib/packets.h > +++ b/lib/packets.h > @@ -533,12 +533,40 @@ struct igmp_header { > }; > BUILD_ASSERT_DECL(IGMP_HEADER_LEN == sizeof(struct igmp_header)); > > +#define IGMPV3_HEADER_LEN 8 > +OVS_PACKED( > +struct igmpv3_header { > +uint8_t type; > +uint8_t rsvr1; > +ovs_be16 csum; > +ovs_be16 rsvr2; > +ovs_be16 ngrp; > +}); > +BUILD_ASSERT_DECL(IGMPV3_HEADER_LEN == sizeof(struct igmpv3_header)); > + > +#define IGMPV3_RECORD_LEN 8 > +OVS_PACKED( > +struct igmpv3_record { > +uint8_t type; > +uint8_t aux_len; >
[ovs-dev] [PATCH] flow: Make assertions about offsets within struct flow easier to follow.
Signed-off-by: Ben Pfaff --- lib/flow.c | 46 +- lib/util.h | 5 - 2 files changed, 29 insertions(+), 22 deletions(-) diff --git a/lib/flow.c b/lib/flow.c index d2dcc46..9dc29ee 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -50,25 +50,32 @@ const uint8_t flow_segment_u64s[4] = { FLOW_U64S }; +/* Asserts that field 'f1' follows immediately after 'f0' in struct flow, + * without any intervening padding. */ +#define ASSERT_SEQUENTIAL(f0, f1) \ +BUILD_ASSERT_DECL(offsetof(struct flow, f0) \ + + MEMBER_SIZEOF(struct flow, f0) \ + == offsetof(struct flow, f1)) + +/* Asserts that fields 'f0' and 'f1' are in the same 32-bit aligned word within + * struct flow. */ +#define ASSERT_SAME_WORD(f0, f1)\ +BUILD_ASSERT_DECL(offsetof(struct flow, f0) / 4 \ + == offsetof(struct flow, f1) / 4) + +/* Asserts that 'f0' and 'f1' are both sequential and within the same 32-bit + * aligned word in struct flow. */ +#define ASSERT_SEQUENTIAL_SAME_WORD(f0, f1) \ +ASSERT_SEQUENTIAL(f0, f1); \ +ASSERT_SAME_WORD(f0, f1) + /* miniflow_extract() assumes the following to be true to optimize the * extraction process. */ -BUILD_ASSERT_DECL(offsetof(struct flow, dl_type) + 2 - == offsetof(struct flow, vlan_tci) && - offsetof(struct flow, dl_type) / 4 - == offsetof(struct flow, vlan_tci) / 4 ); - -BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) + 3 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_tos) + 2 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_ttl) + 1 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_frag) / 4 - == offsetof(struct flow, nw_tos) / 4 && - offsetof(struct flow, nw_ttl) / 4 - == offsetof(struct flow, nw_tos) / 4 && - offsetof(struct flow, nw_proto) / 4 - == offsetof(struct flow, nw_tos) / 4); +ASSERT_SEQUENTIAL_SAME_WORD(dl_type, vlan_tci); + +ASSERT_SEQUENTIAL_SAME_WORD(nw_frag, nw_tos); +ASSERT_SEQUENTIAL_SAME_WORD(nw_tos, nw_ttl); +ASSERT_SEQUENTIAL_SAME_WORD(nw_ttl, nw_proto); /* TCP flags in the middle of a BE64, zeroes in the other half. */ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4); @@ -80,10 +87,7 @@ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4); #define TCP_FLAGS_BE32(tcp_ctl) ((OVS_FORCE ovs_be32)TCP_FLAGS_BE16(tcp_ctl)) #endif -BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2 - == offsetof(struct flow, tp_dst) && - offsetof(struct flow, tp_src) / 4 - == offsetof(struct flow, tp_dst) / 4); +ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst); /* Removes 'size' bytes from the head end of '*datap', of size '*sizep', which * must contain at least 'size' bytes of data. Returns the first byte of data diff --git a/lib/util.h b/lib/util.h index 78abfd3..906b9a5 100644 --- a/lib/util.h +++ b/lib/util.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -199,6 +199,9 @@ ovs_prefetch_range(const void *start, size_t size) ((char *) &(OBJECT)->MEMBER - (char *) (OBJECT)) #endif +/* Yields the size of MEMBER within STRUCT. */ +#define MEMBER_SIZEOF(STRUCT, MEMBER) (sizeof(((STRUCT *) NULL)->MEMBER)) + /* Given POINTER, the address of the given MEMBER in a STRUCT object, returns the STRUCT object. */ #define CONTAINER_OF(POINTER, STRUCT, MEMBER) \ -- 2.1.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.
On 7 June 2015 at 11:03, Ben Pfaff wrote: > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote: >> Unique flow identifiers are really a UUID of sorts, so it makes sense to >> reuse the UUID string representations for UFID. >> >> Suggested-by: Ben Pfaff >> Signed-off-by: Joe Stringer > > Acked-by: Ben Pfaff It strikes me that it may be somewhat misleading to print UFIDs as UUIDs, given that there are specific bits with particular meaning in the UUID standards. One way to resolve this would be to tweak the UFID generation to create variant 4 UUIDs ("random") by overwriting the appropriate bits after hashing. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v2] flow: Make assertions about offsets within struct flow easier to follow.
Signed-off-by: Ben Pfaff --- v1->v2: Found and converted a few more assertions. lib/flow.c | 60 +--- lib/util.h | 5 - 2 files changed, 33 insertions(+), 32 deletions(-) diff --git a/lib/flow.c b/lib/flow.c index d2dcc46..3e99d5e 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -50,25 +50,32 @@ const uint8_t flow_segment_u64s[4] = { FLOW_U64S }; +/* Asserts that field 'f1' follows immediately after 'f0' in struct flow, + * without any intervening padding. */ +#define ASSERT_SEQUENTIAL(f0, f1) \ +BUILD_ASSERT_DECL(offsetof(struct flow, f0) \ + + MEMBER_SIZEOF(struct flow, f0) \ + == offsetof(struct flow, f1)) + +/* Asserts that fields 'f0' and 'f1' are in the same 32-bit aligned word within + * struct flow. */ +#define ASSERT_SAME_WORD(f0, f1)\ +BUILD_ASSERT_DECL(offsetof(struct flow, f0) / 4 \ + == offsetof(struct flow, f1) / 4) + +/* Asserts that 'f0' and 'f1' are both sequential and within the same 32-bit + * aligned word in struct flow. */ +#define ASSERT_SEQUENTIAL_SAME_WORD(f0, f1) \ +ASSERT_SEQUENTIAL(f0, f1); \ +ASSERT_SAME_WORD(f0, f1) + /* miniflow_extract() assumes the following to be true to optimize the * extraction process. */ -BUILD_ASSERT_DECL(offsetof(struct flow, dl_type) + 2 - == offsetof(struct flow, vlan_tci) && - offsetof(struct flow, dl_type) / 4 - == offsetof(struct flow, vlan_tci) / 4 ); - -BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) + 3 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_tos) + 2 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_ttl) + 1 - == offsetof(struct flow, nw_proto) && - offsetof(struct flow, nw_frag) / 4 - == offsetof(struct flow, nw_tos) / 4 && - offsetof(struct flow, nw_ttl) / 4 - == offsetof(struct flow, nw_tos) / 4 && - offsetof(struct flow, nw_proto) / 4 - == offsetof(struct flow, nw_tos) / 4); +ASSERT_SEQUENTIAL_SAME_WORD(dl_type, vlan_tci); + +ASSERT_SEQUENTIAL_SAME_WORD(nw_frag, nw_tos); +ASSERT_SEQUENTIAL_SAME_WORD(nw_tos, nw_ttl); +ASSERT_SEQUENTIAL_SAME_WORD(nw_ttl, nw_proto); /* TCP flags in the middle of a BE64, zeroes in the other half. */ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4); @@ -80,10 +87,7 @@ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4); #define TCP_FLAGS_BE32(tcp_ctl) ((OVS_FORCE ovs_be32)TCP_FLAGS_BE16(tcp_ctl)) #endif -BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2 - == offsetof(struct flow, tp_dst) && - offsetof(struct flow, tp_src) / 4 - == offsetof(struct flow, tp_dst) / 4); +ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst); /* Removes 'size' bytes from the head end of '*datap', of size '*sizep', which * must contain at least 'size' bytes of data. Returns the first byte of data @@ -458,8 +462,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) ovs_be16 vlan_tci; /* Link layer. */ -BUILD_ASSERT(offsetof(struct flow, dl_dst) + 6 - == offsetof(struct flow, dl_src)); +ASSERT_SEQUENTIAL(dl_dst, dl_src); miniflow_push_macs(mf, dl_dst, data); /* dl_type, vlan_tci. */ vlan_tci = parse_vlan(&data, &size); @@ -645,8 +648,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) } /* Must be adjacent. */ -BUILD_ASSERT(offsetof(struct flow, arp_sha) + 6 - == offsetof(struct flow, arp_tha)); +ASSERT_SEQUENTIAL(arp_sha, arp_tha); memcpy(arp_buf[0], arp->ar_sha, ETH_ADDR_LEN); memcpy(arp_buf[1], arp->ar_tha, ETH_ADDR_LEN); @@ -1252,12 +1254,8 @@ miniflow_hash_5tuple(const struct miniflow *flow, uint32_t basis) return hash; } -BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2 - == offsetof(struct flow, tp_dst) && - offsetof(struct flow, tp_src) / 4 - == offsetof(struct flow, tp_dst) / 4); -BUILD_ASSERT_DECL(offsetof(struct flow, ipv6_src) + 16 - == offsetof(struct flow, ipv6_dst)); +ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst); +ASSERT_SEQUENTIAL(ipv6_src, ipv6_dst); /* Calculates the 5-tuple hash from the given flow. */ uint32_t diff --git a/lib/util.h b/lib/util.h index 78abfd3..906b9a5 100644 --- a/lib/util.h +++ b/lib/util.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc. * * Licensed under the Apach
Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.
On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote: > On 7 June 2015 at 11:03, Ben Pfaff wrote: > > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote: > >> Unique flow identifiers are really a UUID of sorts, so it makes sense to > >> reuse the UUID string representations for UFID. > >> > >> Suggested-by: Ben Pfaff > >> Signed-off-by: Joe Stringer > > > > Acked-by: Ben Pfaff > > It strikes me that it may be somewhat misleading to print UFIDs as > UUIDs, given that there are specific bits with particular meaning in > the UUID standards. One way to resolve this would be to tweak the UFID > generation to create variant 4 UUIDs ("random") by overwriting the > appropriate bits after hashing. The same did occur to me. It's probably a good idea, if you're up for it. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v8] netdev-dpdk: add dpdk vhost-user ports
On Thu, Jun 04, 2015 at 02:51:40PM +0100, Ciara Loftus wrote: > This patch adds support for a new port type to the userspace > datapath called dpdkvhostuser. > > A new dpdkvhostuser port will create a unix domain socket which > when provided to QEMU is used to facilitate communication between > the virtio-net device on the VM and the OVS port on the host. > > vhost-cuse ('dpdkvhost') ports are still available as 'dpdkvhostcuse' > ports and will be enabled if vhost-cuse support is detected in the > DPDK build specified during compilation of the switch. Otherwise, > vhost-user ports are enabled. > > Signed-off-by: Ciara Loftus > --- The patch looks good and works for me. Acked-by: Flavio Leitner ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.
On 9 June 2015 at 12:33, Ben Pfaff wrote: > On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote: >> On 7 June 2015 at 11:03, Ben Pfaff wrote: >> > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote: >> >> Unique flow identifiers are really a UUID of sorts, so it makes sense to >> >> reuse the UUID string representations for UFID. >> >> >> >> Suggested-by: Ben Pfaff >> >> Signed-off-by: Joe Stringer >> > >> > Acked-by: Ben Pfaff >> >> It strikes me that it may be somewhat misleading to print UFIDs as >> UUIDs, given that there are specific bits with particular meaning in >> the UUID standards. One way to resolve this would be to tweak the UFID >> generation to create variant 4 UUIDs ("random") by overwriting the >> appropriate bits after hashing. > > The same did occur to me. It's probably a good idea, if you're up for > it. Sure. It should be a minor patch on top of this. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.
This patch sacrifices a few bits of hash quality from the 128-bit unique flow identifiers to make the UFIDs RFC4122-conformant as per the version 4 (random) UUID spec. Given that the 128-bit space is already quite large, this should not affect the spread of UFIDs in any meaningful way for hashing. Signed-off-by: Joe Stringer --- This is independent of the recent series "odp-util: Reuse UUID marshalling for UFID.", but may make more sense in the context of the discussion on that patch. --- lib/dpif.c | 2 ++ lib/uuid.c | 6 ++ lib/uuid.h | 1 + 3 files changed, 9 insertions(+) diff --git a/lib/dpif.c b/lib/dpif.c index b2f973c..783a715 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -46,6 +46,7 @@ #include "tnl-arp-cache.h" #include "tnl-ports.h" #include "util.h" +#include "uuid.h" #include "valgrind.h" #include "openvswitch/vlog.h" @@ -852,6 +853,7 @@ dpif_flow_hash(const struct dpif *dpif OVS_UNUSED, ovsthread_once_done(&once); } hash_bytes128(key, key_len, secret, hash); +uuid_set_bits_v4((struct uuid *)hash); } /* Deletes all flows from 'dpif'. Returns 0 if successful, otherwise a diff --git a/lib/uuid.c b/lib/uuid.c index df1206e..0f2a58e 100644 --- a/lib/uuid.c +++ b/lib/uuid.c @@ -98,6 +98,12 @@ uuid_generate(struct uuid *uuid) /* AES output is exactly 16 bytes, so we encrypt directly into 'uuid'. */ aes128_encrypt(&key, copy, uuid); +uuid_set_bits_v4(uuid); +} + +void +uuid_set_bits_v4(struct uuid *uuid) +{ /* Set bits to indicate a random UUID. See RFC 4122 section 4.4. */ uuid->parts[2] &= ~0xc000; uuid->parts[2] |= 0x8000; diff --git a/lib/uuid.h b/lib/uuid.h index 37e01d0..8c6f2f1 100644 --- a/lib/uuid.h +++ b/lib/uuid.h @@ -78,5 +78,6 @@ bool uuid_is_zero(const struct uuid *); int uuid_compare_3way(const struct uuid *, const struct uuid *); bool uuid_from_string(struct uuid *, const char *); bool uuid_from_string_prefix(struct uuid *, const char *); +void uuid_set_bits_v4(struct uuid *); #endif /* uuid.h */ -- 2.1.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] conntrack: nfqueue action
Hi Franck On 8 June 2015 at 09:34, Franck BAUDIN wrote: > Hello, > > Conntrack looks in very good progress on > https://github.com/justinpettit/ovs.git > > However, I didn't find any code related to "nfqueue" openvswitch action, > neither on > https://github.com/tgraf/ovs.git. > > Is the nfqueue action still planned to be implemented for openvswitch 2.4? > Do you need a hand on this topic? > Unfortunately, I am not aware of anyone working actively on this. There are some difficulties that we see with implementing NFQueue verdicts properly so that packet processing could be resumed. If you have design proposal on how to solve this, then I would be glad to hear your opinion. Also, do you think that Open vSwitch kernel module's userspace() action might somehow suffice your use cases so that user-space process would be able to get packet from kernel-space? Ansis > > Best Regards, > Franck > > This message and any attachments (the "message") are confidential, > intended solely for the addressees. If you are not the intended recipient, > please notify the sender immediately by e-mail and delete this message from > your system. In this case, you are not authorized to use, copy this message > and/or disclose the content to any other person. E-mails are susceptible to > alteration. Neither Qosmos nor any of its subsidiaries or affiliates shall > be liable for the message if altered, changed or falsified. > > Ce message et toutes ses pièces jointes (ci-après le "message")sont > confidentiels et établis à l'intention exclusive de ses destinataires. Si > vous avez reçu ce message par erreur, merci d'en informer immédiatement son > émetteur par courrier électronique et d'effacer ce message de votre > système. Dans cette hypothèse, vous n'êtes pas autorisé à utiliser, copier > ce message et/ou en divulguer le contenu à un tiers. Tout message > électronique est susceptible d'altération. Qosmos et ses filiales déclinent > toute responsabilité au titre de ce message s'il a été altéré, déformé ou > falsifié. > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2] IGMPv3 support
On Tue, Jun 09, 2015 at 03:03:03PM -0300, Flavio Leitner wrote: > On Mon, Jun 08, 2015 at 01:05:41PM -0300, Thadeu Lima de Souza Cascardo wrote: > > Support IGMPv3 messages with multiple records. Make sure all IGMPv3 > > messages go through slow path, since they may carry multiple multicast > > addresses, unlike IGMPv2. > > > > Tests done: > > > > * multiple addresses in IGMPv3 report are inserted in mdb; > > * address is removed from IGMPv3 if record is INCLUDE_MODE; > > * reports sent on a burst with same flow all go to userspace; > > * IGMPv3 reports go to mrouters, i.e., ports that have issued a query. > > > > Signed-off-by: Thadeu Lima de Souza Cascardo > > --- > > lib/mcast-snooping.c | 52 > > > > lib/mcast-snooping.h | 5 + > > lib/packets.h| 28 > > ofproto/ofproto-dpif-xlate.c | 27 +++ > > 4 files changed, 108 insertions(+), 4 deletions(-) > > > > diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c > > index c3ffd6b..6b89a6c 100644 > > --- a/lib/mcast-snooping.c > > +++ b/lib/mcast-snooping.c > > @@ -69,6 +69,7 @@ mcast_snooping_is_membership(ovs_be16 igmp_type) > > switch (ntohs(igmp_type)) { > > case IGMP_HOST_MEMBERSHIP_REPORT: > > case IGMPV2_HOST_MEMBERSHIP_REPORT: > > +case IGMPV3_HOST_MEMBERSHIP_REPORT: > > case IGMP_HOST_LEAVE_MESSAGE: > > return true; > > } > > @@ -416,6 +417,57 @@ mcast_snooping_add_group(struct mcast_snooping *ms, > > ovs_be32 ip4, > > return learned; > > } > > > > +int > > +mcast_snooping_add_report(struct mcast_snooping *ms, > > + const struct dp_packet *p, > > + uint16_t vlan, void *port) > > +{ > > +ovs_be32 ip4; > > +size_t offset; > > +const struct igmpv3_header *igmpv3; > > +const struct igmpv3_record *record; > > +int count = 0; > > +int ngrp; > > + > > +offset = p->l4_ofs; > > The above could be like this: > offset = dp_packet_l4(p) - dp_packet_data(p) > to avoid accessing internals of dp_packet. > That's how it originally was written, but I thought that using it like this would protect the code against l4_ofs being UINT16_MAX, and dp_packet_l4 returning NULL. Any thoughs on that? Should I just check for dp_packet_l4 returning NULL? > > +igmpv3 = dp_packet_at(p, offset, IGMPV3_HEADER_LEN); > > +if (!igmpv3) { > > +return 0; > > +} > > +ngrp = ntohs(igmpv3->ngrp); > > +offset += IGMPV3_HEADER_LEN; > > +while (ngrp--) { > > +bool ret; > > +record = dp_packet_at(p, offset, sizeof(struct igmpv3_record)); > > +if (!record) { > > +break; > > +} > > +/* Only consider known record types. */ > > +if (record->type < IGMPV3_MODE_IS_INCLUDE > > +|| record->type > IGMPV3_BLOCK_OLD_SOURCES) { > > +continue; > > +} > > +ip4 = record->maddr; > > +/* > > + * If record is INCLUDE MODE and there are no sources, it's > > equivalent > > + * to a LEAVE. > > + * */ > > +if (ntohs(record->nsrcs) == 0 > > +&& (record->type == IGMPV3_MODE_IS_INCLUDE > > +|| record->type == IGMPV3_CHANGE_TO_INCLUDE_MODE)) { > > +ret = mcast_snooping_leave_group(ms, ip4, vlan, port); > > +} else { > > +ret = mcast_snooping_add_group(ms, ip4, vlan, port); > > +} > > +if (ret) { > > +count++; > > +} > > +offset += sizeof(*record) > > + + ntohs(record->nsrcs) * sizeof(ovs_be32) + > > record->aux_len; > > +} > > +return count; > > +} > > + > > bool > > mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4, > > uint16_t vlan, void *port) > > diff --git a/lib/mcast-snooping.h b/lib/mcast-snooping.h > > index 979b2aa..f4bc8fb 100644 > > --- a/lib/mcast-snooping.h > > +++ b/lib/mcast-snooping.h > > @@ -20,6 +20,7 @@ > > #define MCAST_SNOOPING_H 1 > > > > #include > > +#include "dp-packet.h" > > #include "hmap.h" > > #include "list.h" > > #include "ovs-atomic.h" > > @@ -181,6 +182,10 @@ mcast_snooping_lookup(const struct mcast_snooping *ms, > > ovs_be32 dip, > > bool mcast_snooping_add_group(struct mcast_snooping *ms, ovs_be32 ip4, > >uint16_t vlan, void *port) > > OVS_REQ_WRLOCK(ms->rwlock); > > +int mcast_snooping_add_report(struct mcast_snooping *ms, > > + const struct dp_packet *p, > > + uint16_t vlan, void *port) > > +OVS_REQ_WRLOCK(ms->rwlock); > > bool mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4, > > uint16_t vlan, void *port) > > OVS_REQ_WRLOCK(ms->rwlock); > > diff --git a/lib/packets.h b/lib/packets.h > > index b146a50..c64240c 100
Re: [ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.
On Tue, Jun 09, 2015 at 01:57:59PM -0700, Joe Stringer wrote: > This patch sacrifices a few bits of hash quality from the 128-bit unique > flow identifiers to make the UFIDs RFC4122-conformant as per the version 4 > (random) UUID spec. Given that the 128-bit space is already quite large, > this should not affect the spread of UFIDs in any meaningful way for > hashing. > > Signed-off-by: Joe Stringer > --- > This is independent of the recent series "odp-util: Reuse UUID > marshalling for UFID.", but may make more sense in the context of the > discussion on that patch. Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.
We set this field flag for the cases when an out_port should be parsed, but failed to make use of it. Two test cases needed to be updated due to use of out_port in flow add, while out_port is legal for flow deletes only. Suggested-by: Ben Pfaff Signed-off-by: Jarno Rajahalme --- lib/ofp-parse.c|2 +- tests/ovs-ofctl.at |8 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c index 856044d..6125f27 100644 --- a/lib/ofp-parse.c +++ b/lib/ofp-parse.c @@ -353,7 +353,7 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int command, char *string, if (fm->table_id != 0xff) { *usable_protocols &= OFPUTIL_P_TID; } -} else if (!strcmp(name, "out_port")) { +} else if (fields & F_OUT_PORT && !strcmp(name, "out_port")) { if (!ofputil_port_from_string(value, &fm->out_port)) { error = xasprintf("%s is not a valid OpenFlow port", value); diff --git a/tests/ovs-ofctl.at b/tests/ovs-ofctl.at index 42be8f0..1e12827 100644 --- a/tests/ovs-ofctl.at +++ b/tests/ovs-ofctl.at @@ -140,7 +140,7 @@ AT_CLEANUP AT_SETUP([ovs-ofctl parse-flows (OpenFlow 1.0)]) AT_DATA([flows.txt], [[ # comment -tcp,tp_src=123,out_port=5,actions=flood +tcp,tp_src=123,actions=flood in_port=LOCAL dl_vlan=9 dl_src=00:0A:E4:25:6B:B0 actions=drop udp dl_vlan_pcp=7 idle_timeout=5 actions=strip_vlan output:0 tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1 @@ -159,7 +159,7 @@ AT_CHECK([ovs-ofctl parse-flows flows.txt AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0], [[usable protocols: any chosen protocol: OpenFlow10-table_id -OFPT_FLOW_MOD: ADD tcp,tp_src=123 out_port:5 actions=FLOOD +OFPT_FLOW_MOD: ADD tcp,tp_src=123 actions=FLOOD OFPT_FLOW_MOD: ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop OFPT_FLOW_MOD: ADD udp,dl_vlan_pcp=7 idle:5 actions=strip_vlan,output:0 OFPT_FLOW_MOD: ADD tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1 @@ -177,7 +177,7 @@ AT_CLEANUP AT_SETUP([ovs-ofctl parse-flows (OpenFlow 1.1)]) AT_DATA([flows.txt], [[ # comment -tcp,tp_src=123,out_port=5,actions=flood +tcp,tp_src=123,actions=flood in_port=LOCAL dl_vlan=9 dl_src=00:0A:E4:25:6B:B0 actions=drop udp dl_vlan_pcp=7 idle_timeout=5 actions=strip_vlan output:0 tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1 @@ -196,7 +196,7 @@ AT_CHECK([ovs-ofctl --protocols OpenFlow11 parse-flows flows.txt AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0], [[usable protocols: any chosen protocol: OpenFlow11 -OFPT_FLOW_MOD (OF1.1): ADD tcp,tp_src=123 out_port:5 actions=FLOOD +OFPT_FLOW_MOD (OF1.1): ADD tcp,tp_src=123 actions=FLOOD OFPT_FLOW_MOD (OF1.1): ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 actions=drop OFPT_FLOW_MOD (OF1.1): ADD udp,dl_vlan_pcp=7 idle:5 actions=pop_vlan,output:0 OFPT_FLOW_MOD (OF1.1): ADD tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1 -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.
GCC 4.7.2 -O3 flagged potential use before initialization for the 'id' and 'id_mask' being scanned in scan_vxlan_gbp(). For the 'id' this was a real possiblity, but for the 'id_mask' it seems to be a false positive in gcc analysis. Simplify scan_vxlan_gbp() to fix this. Signed-off-by: Jarno Rajahalme --- lib/odp-util.c |8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/lib/odp-util.c b/lib/odp-util.c index 3204d16..dee85c8 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -2890,14 +2890,12 @@ static int scan_vxlan_gbp(const char *s, uint32_t *key, uint32_t *mask) { const char *s_base = s; -ovs_be16 id, id_mask; -uint8_t flags, flags_mask; +ovs_be16 id = 0, id_mask = 0; +uint8_t flags = 0, flags_mask = 0; if (!strncmp(s, "id=", 3)) { s += 3; s += scan_be16(s, &id, mask ? &id_mask : NULL); -} else if (mask) { -memset(&id_mask, 0, sizeof id_mask); } if (s[0] == ',') { @@ -2906,8 +2904,6 @@ scan_vxlan_gbp(const char *s, uint32_t *key, uint32_t *mask) if (!strncmp(s, "flags=", 6)) { s += 6; s += scan_u8(s, &flags, mask ? &flags_mask : NULL); -} else if (mask) { -memset(&flags_mask, 0, sizeof flags_mask); } if (!strncmp(s, "))", 2)) { -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().
Commit () fixed the memory leak when a rule is deleted, but failed to do the same when all rules in a bridge are deleted just before the bridge itself is deleted. This patch adds the necessary unref to ofproto_rule_delete(). Signed-off-by: Jarno Rajahalme --- ofproto/ofproto.c |7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index 0a1d032..029ff37 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -1443,6 +1443,7 @@ ofproto_rule_delete(struct ofproto *ofproto, struct rule *rule) ovs_mutex_lock(&ofproto_mutex); oftable_remove_rule(rule); ofproto->ofproto_class->rule_delete(rule); +ofproto_rule_unref(rule); ovs_mutex_unlock(&ofproto_mutex); } @@ -4842,7 +4843,9 @@ delete_flows__(const struct rule_collection *rules, if (next_table == rule->table_id) { classifier_defer(cls); } -classifier_remove(cls, &rule->cr); +if (!classifier_remove(cls, &rule->cr)) { +OVS_NOT_REACHED(); +} if (next_table != rule->table_id) { classifier_publish(cls); } @@ -7364,6 +7367,8 @@ oftable_remove_rule(struct rule *rule) if (classifier_remove(cls, &rule->cr)) { ofproto_rule_remove__(rule->ofproto, rule); +} else { +OVS_NOT_REACHED(); } } -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.
Should not use ovsrcu_set_hidden() when the pointer may have been visible to other threads already. Signed-off-by: Jarno Rajahalme --- lib/rculist.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/rculist.c b/lib/rculist.c index 61a03d0..db2ebf8 100644 --- a/lib/rculist.c +++ b/lib/rculist.c @@ -23,5 +23,5 @@ rculist_poison__(struct rculist *list) OVS_NO_THREAD_SAFETY_ANALYSIS { list->prev = RCULIST_POISON; -ovsrcu_set_hidden(&list->next, RCULIST_POISON); +ovsrcu_set(&list->next, RCULIST_POISON); } -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().
On Tue, Jun 09, 2015 at 03:25:07PM -0700, Jarno Rajahalme wrote: > Commit () fixed the memory leak when a rule is deleted, but failed to > do the same when all rules in a bridge are deleted just before the > bridge itself is deleted. > > This patch adds the necessary unref to ofproto_rule_delete(). > > Signed-off-by: Jarno Rajahalme The commit number and subject is missing above. I assume you tested this? Assuming you did, Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.
On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote: > Should not use ovsrcu_set_hidden() when the pointer may have been > visible to other threads already. > > Signed-off-by: Jarno Rajahalme I think that rculist_poison__() is only appropriate when an rculist should not be visible to any other threads. No? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.
On Tue, Jun 09, 2015 at 03:25:09PM -0700, Jarno Rajahalme wrote: > GCC 4.7.2 -O3 flagged potential use before initialization for the 'id' > and 'id_mask' being scanned in scan_vxlan_gbp(). For the 'id' this > was a real possiblity, but for the 'id_mask' it seems to be a false > positive in gcc analysis. Simplify scan_vxlan_gbp() to fix this. > > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.
On Tue, Jun 09, 2015 at 03:25:10PM -0700, Jarno Rajahalme wrote: > We set this field flag for the cases when an out_port should be > parsed, but failed to make use of it. > > Two test cases needed to be updated due to use of out_port in flow > add, while out_port is legal for flow deletes only. > > Suggested-by: Ben Pfaff > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.
> On Jun 9, 2015, at 3:30 PM, Ben Pfaff wrote: > > On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote: >> Should not use ovsrcu_set_hidden() when the pointer may have been >> visible to other threads already. >> >> Signed-off-by: Jarno Rajahalme > > I think that rculist_poison__() is only appropriate when an rculist > should not be visible to any other threads. No? That was my initial thought as well, but we poison for the case other threads are still using it (while they shouldn't), so I thought ovsrcu_set_hidden() is not appropriate here after all. I doubt this has any practical difference either way, more about what is appropriate use. Jarno ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH net-next V10 4/4] 8021AD: Flow key parsing and netlink attributes.
On 6/8/15 11:45 PM, Pravin Shelar wrote: On Tue, Jun 2, 2015 at 10:50 AM, Thomas F Herbert wrote: Add support for 802.1ad to netlink parsing and flow conversation. Uses double nested encap attributes to represent double tagged vlan. This patch needs to be merged with earlier patch since it is part of qinq support. Signed-off-by: Thomas F Herbert --- net/openvswitch/flow_netlink.c | 186 ++--- 1 file changed, 157 insertions(+), 29 deletions(-) diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index c691b1a..8fd4f63 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -771,6 +771,28 @@ static int metadata_from_nlattrs(struct sw_flow_match *match, u64 *attrs, ... err = ovs_key_from_nlattrs(match, mask_attrs, a, true, log); @@ -1331,6 +1439,25 @@ static int __ovs_nla_put_key(const struct sw_flow_key *swkey, encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP); if (!swkey->eth.tci) goto unencap; + } else if (swkey->eth.ctci || swkey->eth.type == htons(ETH_P_8021AD)) { + __be16 eth_type; + + eth_type = !is_mask ? htons(ETH_P_8021AD) : htons(0x); + if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, eth_type) || + nla_put_be16(skb, OVS_KEY_ATTR_VLAN, output->eth.tci)) + goto nla_put_failure; + encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP); + if (!swkey->eth.tci) + goto unencap; + /* Customer tci is nested but uses same key attribute. +*/ + eth_type = !is_mask ? htons(ETH_P_8021Q) : htons(0x); + if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, eth_type) || + nla_put_be16(skb, OVS_KEY_ATTR_VLAN, output->eth.ctci)) + goto nla_put_failure; + encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP); + if (!swkey->eth.ctci) + goto unencap; } else encap = NULL; For qinq we need to keep track of two encap attributes to finalize nesting of attributes. Thanks! This was an oversight on my part and I am fixing it in V11. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().
> On Jun 9, 2015, at 3:29 PM, Ben Pfaff wrote: > > On Tue, Jun 09, 2015 at 03:25:07PM -0700, Jarno Rajahalme wrote: >> Commit () fixed the memory leak when a rule is deleted, but failed to >> do the same when all rules in a bridge are deleted just before the >> bridge itself is deleted. >> >> This patch adds the necessary unref to ofproto_rule_delete(). >> >> Signed-off-by: Jarno Rajahalme > > The commit number and subject is missing above. > Oops, added. > I assume you tested this? Assuming you did, > Yes. Easy to test, as extra unrefs crash fast. Hope this was not hiding other bugs, though. > Acked-by: Ben Pfaff Thanks for the review, pushed to master. Jarno ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.
> On Jun 9, 2015, at 3:32 PM, Ben Pfaff wrote: > > On Tue, Jun 09, 2015 at 03:25:09PM -0700, Jarno Rajahalme wrote: >> GCC 4.7.2 -O3 flagged potential use before initialization for the 'id' >> and 'id_mask' being scanned in scan_vxlan_gbp(). For the 'id' this >> was a real possiblity, but for the 'id_mask' it seems to be a false >> positive in gcc analysis. Simplify scan_vxlan_gbp() to fix this. >> >> Signed-off-by: Jarno Rajahalme > > Acked-by: Ben Pfaff Thanks, pushed to master! Jarno ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.
> On Jun 9, 2015, at 3:32 PM, Ben Pfaff wrote: > > On Tue, Jun 09, 2015 at 03:25:10PM -0700, Jarno Rajahalme wrote: >> We set this field flag for the cases when an out_port should be >> parsed, but failed to make use of it. >> >> Two test cases needed to be updated due to use of out_port in flow >> add, while out_port is legal for flow deletes only. >> >> Suggested-by: Ben Pfaff >> Signed-off-by: Jarno Rajahalme > > Acked-by: Ben Pfaff Pushed to master, Jarno ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v4 03/12] classifier: Support table versioning
This patch allows classifier rules to become visible and invisible in specific versions. A 'version' is defined as a positive monotonically increasing integer, which never wraps around. The new 'visibility' attribute replaces the prior 'to_be_removed' and 'visible' attributes. When versioning is not used, the 'version' parameter should be passed as 'CLS_MIN_VERSION' when creating rules, and 'CLS_MAX_VERSION' when looking up flows. This feature enables the support for atomic OpenFlow bundles without significant performance penalty on 64-bit systems. There is a performance decrease in 32-bit systems due to 64-bit atomics used. Signed-off-by: Jarno Rajahalme --- lib/classifier-private.h | 52 ++- lib/classifier.c | 221 +- lib/classifier.h | 148 +-- lib/ovs-router.c |7 +- lib/tnl-ports.c |9 +- ofproto/ofproto-dpif.c |2 +- ofproto/ofproto.c| 94 +--- tests/test-classifier.c |6 +- utilities/ovs-ofctl.c|3 +- 9 files changed, 347 insertions(+), 195 deletions(-) diff --git a/lib/classifier-private.h b/lib/classifier-private.h index a7edbe9..f7462d2 100644 --- a/lib/classifier-private.h +++ b/lib/classifier-private.h @@ -79,13 +79,63 @@ struct cls_match { * 'indices'. */ /* Accessed by all readers. */ struct cmap_node cmap_node; /* Within struct cls_subtable 'rules'. */ -bool visible; + +/* Controls rule's visibility to lookups. + * + * When 'visibility' is: + * + * > 0 - rule is visible starting from version 'visibility' + * <= 0 - rule is invisible starting from version '-(visibility)' + * + * The minimum version number used in lookups is 1 (== CLS_NO_VERSION), + * which implies that when 'visibility' is: + * + * 1- rule is visible in all lookup versions + * 0- rule is invisible in all lookup versions. */ +atomic_llong visibility; + const struct cls_rule *cls_rule; OVSRCU_TYPE(struct cls_conjunction_set *) conj_set; const struct miniflow flow; /* Matching rule. Mask is in the subtable. */ /* 'flow' must be the last field. */ }; +static inline void +cls_match_set_visibility(struct cls_match *rule, long long version) +{ +atomic_store_relaxed(&rule->visibility, version); +} + +static inline bool +cls_match_visible_in_version(const struct cls_match *rule, long long version) +{ +long long visibility; + +/* clang does not want to read from a const atomic. */ +atomic_read_relaxed(&CONST_CAST(struct cls_match *, rule)->visibility, +&visibility); + +if (OVS_LIKELY(visibility > 0)) { +/* Rule is visible starting from version 'visibility'. */ +return version >= visibility; +} else { +/* Rule is invisible starting from version '-visibility'. */ +return version < -visibility; +} +} + +static inline bool +cls_match_is_eventually_invisible(const struct cls_match *rule) +{ +long long visibility; + +/* clang does not want to read from a const atomic. */ +atomic_read_relaxed(&CONST_CAST(struct cls_match *, rule)->visibility, +&visibility); + +return visibility <= 0; +} + /* A longest-prefix match tree. */ struct trie_node { uint32_t prefix; /* Prefix bits for this node, MSB first. */ diff --git a/lib/classifier.c b/lib/classifier.c index 6075cf7..2b2d3f6 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -99,7 +99,7 @@ cls_match_alloc(const struct cls_rule *rule, rculist_init(&cls_match->list); *CONST_CAST(const struct cls_rule **, &cls_match->cls_rule) = rule; *CONST_CAST(int *, &cls_match->priority) = rule->priority; -cls_match->visible = false; +atomic_init(&cls_match->visibility, 0); /* Initially invisible. */ miniflow_clone_inline(CONST_CAST(struct miniflow *, &cls_match->flow), &rule->match.flow, count); ovsrcu_set_hidden(&cls_match->conj_set, @@ -115,6 +115,7 @@ static struct cls_subtable *insert_subtable(struct classifier *cls, static void destroy_subtable(struct classifier *cls, struct cls_subtable *); static const struct cls_match *find_match_wc(const struct cls_subtable *, + long long version, const struct flow *, struct trie_ctx *, unsigned int n_tries, @@ -139,12 +140,12 @@ next_rule_in_list(const struct cls_match *rule, const struct cls_match *head) /* Return the next lower-priority rule in the list that is visible. Multiple * identical rules with the same priority may exist transitionally. In that - * case the first rule of a given priority has been marked as 'to_be_removed', - * and the late
[ovs-dev] [PATCH v4 01/12] ofproto: Rename *_begin functions as *_start.
Weirdest things can bother you at night when you try to sleep ;-) Now we have function triples such as add_flow_begin(), add_flow_finish(), and add_flow_revert(), where a modification is started in *_begin(), which can fail, and when successful can be either made permanent with *_finish(), or cancelled with *_revert(). Linguistically it should be either "begin/end" or "start/finish", not "begin/finish". "begin/end" has some C++ STL baggage, so let's go with "start/finish". IMO "revert" rhymes with it, too. Signed-off-by: Jarno Rajahalme --- ofproto/ofproto.c | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index 029ff37..f2e9557 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -278,7 +278,7 @@ static bool ofproto_group_exists(const struct ofproto *ofproto, OVS_EXCLUDED(ofproto->groups_rwlock); static enum ofperr add_group(struct ofproto *, struct ofputil_group_mod *); static void handle_openflow(struct ofconn *, const struct ofpbuf *); -static enum ofperr do_bundle_flow_mod_begin(struct ofproto *, +static enum ofperr do_bundle_flow_mod_start(struct ofproto *, struct ofputil_flow_mod *, struct ofp_bundle_entry *) OVS_REQUIRES(ofproto_mutex); @@ -4357,7 +4357,7 @@ set_conjunctions(struct rule *rule, const struct cls_conjunction *conjs, * * The caller retains ownership of 'fm->ofpacts'. */ static enum ofperr -add_flow_begin(struct ofproto *ofproto, struct ofputil_flow_mod *fm, +add_flow_start(struct ofproto *ofproto, struct ofputil_flow_mod *fm, struct rule **rulep, bool *modify) OVS_REQUIRES(ofproto_mutex) { @@ -4469,7 +4469,7 @@ add_flow_begin(struct ofproto *ofproto, struct ofputil_flow_mod *fm, return 0; } -/* Revert the effects of add_flow_begin(). +/* Revert the effects of add_flow_start(). * 'new_rule' must be passed in as NULL, if no new rule was allocated and * inserted to the classifier. * Note: evictions cannot be reverted. */ @@ -4706,7 +4706,7 @@ modify_flows__(struct ofproto *ofproto, struct ofputil_flow_mod *fm, } static enum ofperr -modify_flows_begin__(struct ofproto *ofproto, struct ofputil_flow_mod *fm, +modify_flows_start__(struct ofproto *ofproto, struct ofputil_flow_mod *fm, struct rule_collection *rules) OVS_REQUIRES(ofproto_mutex) { @@ -4718,7 +4718,7 @@ modify_flows_begin__(struct ofproto *ofproto, struct ofputil_flow_mod *fm, || fm->new_cookie == OVS_BE64_MAX)) { bool modify; -error = add_flow_begin(ofproto, fm, &rules->rules[0], &modify); +error = add_flow_start(ofproto, fm, &rules->rules[0], &modify); if (!error) { ovs_assert(!modify); } @@ -4735,7 +4735,7 @@ modify_flows_begin__(struct ofproto *ofproto, struct ofputil_flow_mod *fm, * 'ofconn' is used to retrieve the packet buffer specified in fm->buffer_id, * if any. */ static enum ofperr -modify_flows_begin_loose(struct ofproto *ofproto, struct ofputil_flow_mod *fm, +modify_flows_start_loose(struct ofproto *ofproto, struct ofputil_flow_mod *fm, struct rule_collection *rules) OVS_REQUIRES(ofproto_mutex) { @@ -4750,7 +4750,7 @@ modify_flows_begin_loose(struct ofproto *ofproto, struct ofputil_flow_mod *fm, rule_criteria_destroy(&criteria); if (!error) { -error = modify_flows_begin__(ofproto, fm, rules); +error = modify_flows_start__(ofproto, fm, rules); } if (error) { @@ -4788,7 +4788,7 @@ modify_flows_finish(struct ofproto *ofproto, struct ofputil_flow_mod *fm, /* Implements OFPFC_MODIFY_STRICT. Returns 0 on success or an OpenFlow error * code on failure. */ static enum ofperr -modify_flow_begin_strict(struct ofproto *ofproto, struct ofputil_flow_mod *fm, +modify_flow_start_strict(struct ofproto *ofproto, struct ofputil_flow_mod *fm, struct rule_collection *rules) OVS_REQUIRES(ofproto_mutex) { @@ -4804,7 +4804,7 @@ modify_flow_begin_strict(struct ofproto *ofproto, struct ofputil_flow_mod *fm, if (!error) { /* collect_rules_strict() can return max 1 rule. */ -error = modify_flows_begin__(ofproto, fm, rules); +error = modify_flows_start__(ofproto, fm, rules); } if (error) { @@ -4865,7 +4865,7 @@ delete_flows__(const struct rule_collection *rules, /* Implements OFPFC_DELETE. */ static enum ofperr -delete_flows_begin_loose(struct ofproto *ofproto, +delete_flows_start_loose(struct ofproto *ofproto, const struct ofputil_flow_mod *fm, struct rule_collection *rules) OVS_REQUIRES(ofproto_mutex) @@ -4916,7 +4916,7 @@ delete_flows_finish(const struct ofputil_flow_mod *fm, /* Implements OFPFC_DELETE_STRICT. */ static enum ofperr -delete_flow_begin_strict(s
[ovs-dev] [PATCH v4 00/12] Atomic bundles with classifier versioning.
This series adds support for atomic flow mod bundles with classifier versioning. Changes since v3: - Added classifier testing for versioning features (patch 05). - Added 'version' to cls_rule, since all users needed one anyway. This is now used also making iterations version-aware (Any version can be iterated) (patch 03). - Classifier API changes are slightly simplified from v3. Changes since v2: - ovs-ofctl 'bundle' command added in v2 is now folded in to the existing 'add-flow' and 'add-flows' commands. Flow mods are executed serially, separated by barriers, so this new behavior works with all versions of OpenFlow. With the new '--bundle' option the mods are executed as an ordered OpenFlow 1.4 bundle (patch 02). Later patches in the series make these bundles also atomic. - Classifier 'to_be_removed' and 'visible' flags are now folded in into a single 'visibility' attribute, which also has cleaner definition than in v2 (patch 03). - Classifier traversing of identical rules list is now more robust. It has been somewhat fragile since the support for duplicates with identical priorities was recently added (patch 04). - Flow counts in each oftable are now explicitly maintained so that versioning is taken into account. Earlier (since patch 07) the count of rules in a classifier was used as the table's flow count, but now that the classifier may temporarily hold duplicate rules visible in different versions, and the removals are RCU-postponed, the classifier rule count no longer represents the control plane notion of table's rule count (patch 08). - Handling evictions was broken in v2 (and in patch 07), as eviction took place early in the commit, and actually inappropriately bumped the version number too early. Now eviction is treated much like a flow modification, where a new rule replaces the old one, but just without any 'inheritance' from the evicted rule to the new rule. This makes evictions to be executed only when commit is successful, as evictions are reverted like any other changes when the commit fails (patch 09). - Sending flow removed messages is now postponed to the actual destruction of the rule. Previously, the rule could have accrued stats even after the flow removed messages was sent (patch 10). - Bundled messages now take less momory by using a struct minimatch instead of a struct match for flow mods (patch 11). - Bundles now support also port mods, but only for non-atomic bundles. The reason for this is that we have easy way of making port status changes atomic w.r.t. classifier version number changes (patch 12). For correctness, patches 07, 08, and 09 need to be squashed before committing, but these are separated in this series to make reviewing easier (and as a test for Ben for either reading this text or noticing the problems while reading patch 07 :-). Jarno Rajahalme (12): ofproto: Rename *_begin functions as *_start. ovs-ofctl: Add bundle support and unit testing. classifier: Support table versioning classifier: Make traversing identical rules robust. test-classifier: Test versioning features. ofproto: Infra for table versioning. Use classifier versioning. ofproto: Accurate flow counts. ofproto: Revertible eviction. ofproto: Postpone sending flow removed messages. ofproto: Use minimatch for making bundles smaller. ofproto: Support port mods in bundles. NEWS | 25 +- OPENFLOW-1.1+.md |6 + include/openflow/openflow-common.h |2 + include/openvswitch/vconn.h|3 + lib/classifier-private.h | 158 - lib/classifier.c | 349 +- lib/classifier.h | 150 ++-- lib/match.c|9 + lib/match.h|1 + lib/ofp-parse.c| 40 +- lib/ofp-parse.h|6 +- lib/ofp-print.c|1 + lib/ofp-util.c | 44 ++ lib/ofp-util.h | 102 +-- lib/ofp-version-opt.c |7 + lib/ofp-version-opt.h |1 + lib/ovs-router.c |7 +- lib/tnl-ports.c|9 +- lib/vconn.c| 238 ++- ofproto/bundles.h | 15 +- ofproto/connmgr.c |2 +- ofproto/ofproto-dpif-xlate.c | 17 +- ofproto/ofproto-dpif.c | 128 ++-- ofproto/ofproto-dpif.h |5 +- ofproto/ofproto-provider.h | 78 +-- ofproto/ofproto.c | 1336 +--- tests/classifier.at|8 +- tests/ofproto-macros.at| 10 + tests/ofproto.at | 244 +++ tests/ovs-ofctl.at | 107 +++ tests/test-classifier.c| 301 ++-- utilities/ovs-ofctl.8.in | 59 +- utilities/ovs-ofctl.c | 91 ++- 33 files
[ovs-dev] [PATCH v4 04/12] classifier: Make traversing identical rules robust.
The traversal of the list of identical rules from the lookup threads is fragile in the list head is removed during the list traversal. This patch simplifies the implementation of that list by making the list NULL terminated, singly linked RCU-protected list. By having the NULL at the end there is no longer a possiblity of missing the point when the list wraps around. This is significant when there can be multiple elements with the same priority in the list. This change also decreases the size of the struct cls_match back pre-'visibility' attribute size. Signed-off-by: Jarno Rajahalme --- lib/classifier-private.h | 106 +- lib/classifier.c | 128 ++ tests/test-classifier.c |2 +- 3 files changed, 154 insertions(+), 82 deletions(-) diff --git a/lib/classifier-private.h b/lib/classifier-private.h index f7462d2..01b31e0 100644 --- a/lib/classifier-private.h +++ b/lib/classifier-private.h @@ -68,7 +68,12 @@ struct cls_partition { /* Internal representation of a rule in a "struct cls_subtable". */ struct cls_match { /* Accessed by everybody. */ -struct rculist list; /* Identical, lower-priority "cls_match"es. */ +OVSRCU_TYPE(struct cls_match *) next; /* Identical "cls_match"es in the + * order of decreasing priority. At + * most one rule of a given priority + * may be visible to any given lookup + * version. */ +OVSRCU_TYPE(struct cls_conjunction_set *) conj_set; /* Accessed only by writers. */ struct cls_partition *partition; @@ -95,7 +100,6 @@ struct cls_match { atomic_llong visibility; const struct cls_rule *cls_rule; -OVSRCU_TYPE(struct cls_conjunction_set *) conj_set; const struct miniflow flow; /* Matching rule. Mask is in the subtable. */ /* 'flow' must be the last field. */ }; @@ -136,6 +140,104 @@ cls_match_is_eventually_invisible(const struct cls_match *rule) return visibility <= 0; } + +/* cls_match 'next' */ + +static inline const struct cls_match * +cls_match_next(const struct cls_match *rule) +{ +return ovsrcu_get(struct cls_match *, &rule->next); +} + +static inline struct cls_match * +cls_match_next_protected(const struct cls_match *rule) +{ +return ovsrcu_get_protected(struct cls_match *, &rule->next); +} + +/* Puts 'rule' in the position between 'prev' and 'next'. If 'prev' == NULL, + * then the 'rule' is the new list head, and if 'next' == NULL, the rule is the + * new list tail. */ +static inline void +cls_match_insert(struct cls_match *prev, struct cls_match *next, + struct cls_match *rule) +{ +if (!prev) { +/* 'rule' is the new head. */ +ovsrcu_set_hidden(&rule->next, next); +} else { +/* 'rule' is new node after 'prev' */ +ovsrcu_set_hidden(&rule->next, cls_match_next_protected(prev)); +ovsrcu_set(&prev->next, rule); +} +} + +void cls_match_poison(struct cls_match *); + +/* Puts 'new_rule' in the position of 'old_rule', which is the next node after + * 'prev'. If 'prev' == NULL, then the 'rule' is the new list head. + * + * Returns the replaced cls_match, which is not linked to any more, but still + * links to the later rules, and may still be referenced by other threads until + * all other threads quiesce. The returned rule may not be re-inserted, + * re-initialized, or deleted until after all other threads have quiesced (use + * ovsrcu_postpone). */ +static inline void +cls_match_replace(struct cls_match *prev, + struct cls_match *old_rule, struct cls_match *new_rule) +{ +if (prev == NULL) { +/* 'new_rule' replaces the old head. */ +ovsrcu_set_hidden(&new_rule->next, cls_match_next_protected(old_rule)); +} else { +/* 'new_rule' replaces an existing non-head node. */ +struct cls_match *next = cls_match_next_protected(old_rule); + +ovsrcu_set_hidden(&new_rule->next, next); +ovsrcu_set(&prev->next, new_rule); +} + +#ifndef NDEBUG +ovsrcu_postpone(cls_match_poison, old_rule); +#endif +} + +/* Removes 'rule' following 'prev' from the list. If 'prev' is NULL, then the + * 'rule' is a list head, and the 'rule' is only poisoned after all threads + * have quiesced. + * + * Afterward, the removed rule is not linked to any more, but still links to + * the following rules, and may still be referenced by other threads until all + * other threads quiesce. The removed rule may not be re-inserted, + * re-initialized, or deleted until after all other threads have quiesced (use + * ovsrcu_postpone). + */ +static inline void +cls_match_remove(struct cls_match *prev, struct cls_match *rule) +{ +if (prev) { +struct cls_match *next = cls_match_next_protected(rule); + +ovsrcu_set(&prev->next, next); +
[ovs-dev] [PATCH v4 06/12] ofproto: Infra for table versioning.
Signed-off-by: Jarno Rajahalme --- ofproto/ofproto-dpif-xlate.c | 17 - ofproto/ofproto-dpif.c | 43 +- ofproto/ofproto-dpif.h |5 - ofproto/ofproto-provider.h |6 ++ ofproto/ofproto.c|5 + 5 files changed, 65 insertions(+), 11 deletions(-) diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 59cd088..ac3b99e 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -159,6 +159,9 @@ struct xlate_ctx { const struct xbridge *xbridge; +/* Flow tables version at the beginning of the translation. */ +long long tables_version; + /* Flow at the last commit. */ struct flow base_flow; @@ -2774,6 +2777,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, const struct xport *peer = xport->peer; struct flow old_flow = ctx->xin->flow; bool old_was_mpls = ctx->was_mpls; +long long old_version = ctx->tables_version; enum slow_path_reason special; struct ofpbuf old_stack = ctx->stack; union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)]; @@ -2789,6 +2793,10 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, memset(flow->regs, 0, sizeof flow->regs); flow->actset_output = OFPP_UNSET; +/* Set tables version after the bridge has been fixed. */ +ctx->tables_version += ofproto_dpif_get_tables_version(ctx->xbridge->ofproto); + special = process_special(ctx, &ctx->xin->flow, peer, ctx->xin->packet); if (special) { @@ -2835,6 +2843,9 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, ofpbuf_uninit(&ctx->stack); ctx->stack = old_stack; +/* Restore calling bridge's lookup version. */ +ctx->tables_version = old_version; + /* The peer bridge popping MPLS should have no effect on the original * bridge. */ ctx->was_mpls = old_was_mpls; @@ -3056,6 +3067,7 @@ xlate_table_action(struct xlate_ctx *ctx, ofp_port_t in_port, uint8_t table_id, wc = (ctx->xin->skip_wildcards) ? NULL : &ctx->xout->wc; rule = rule_dpif_lookup_from_table(ctx->xbridge->ofproto, + ctx->tables_version, &ctx->xin->flow, wc, ctx->xin->xcache != NULL, ctx->xin->resubmit_stats, @@ -4826,9 +4838,12 @@ xlate_actions(struct xlate_in *xin, struct xlate_out *xout) flow->recirc_id); return; } +/* Set tables version after the bridge has been fixed. */ +ctx.tables_version = ofproto_dpif_get_tables_version(ctx.xbridge->ofproto); if (!xin->ofpacts && !ctx.rule) { -rule = rule_dpif_lookup_from_table(ctx.xbridge->ofproto, flow, wc, +rule = rule_dpif_lookup_from_table(ctx.xbridge->ofproto, + ctx.tables_version, flow, wc, ctx.xin->xcache != NULL, ctx.xin->resubmit_stats, &ctx.table_id, diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 81beca0..cc5d9d4 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -287,6 +287,8 @@ struct ofproto_dpif { struct ofproto up; struct dpif_backer *backer; +atomic_llong tables_version; /* Version # to use in classifier lookups. */ + uint64_t dump_seq; /* Last read of udpif_dump_seq(). */ /* Special OpenFlow rules. */ @@ -1227,6 +1229,7 @@ construct(struct ofproto *ofproto_) return error; } +atomic_init(&ofproto->tables_version, CLS_MIN_VERSION); ofproto->netflow = NULL; ofproto->sflow = NULL; ofproto->ipfix = NULL; @@ -1605,6 +1608,15 @@ query_tables(struct ofproto *ofproto, } } +static void +set_tables_version(struct ofproto *ofproto_, long long version) +{ +struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_); + +atomic_store_relaxed(&ofproto->tables_version, version); +} + + static struct ofport * port_alloc(void) { @@ -3709,6 +3721,15 @@ rule_set_recirc_id(struct rule *rule_, uint32_t id) ovs_mutex_unlock(&rule->up.mutex); } +long long +ofproto_dpif_get_tables_version(struct ofproto_dpif *ofproto) +{ +long long version; + +atomic_read_relaxed(&ofproto->tables_version, &version); +return version; +} + /* The returned rule (if any) is valid at least until the next RCU quiescent * period. If the rule needs to stay around longer, a non-zero 'take_ref' * must be passed in to cause a reference to be taken on it. @@ -3716,16 +3737,16 @@ rule_set_recirc_id(struct rule *rule_, uint32_t id) * 'flow' is non
[ovs-dev] [PATCH v4 07/12] Use classifier versioning.
Each rule is now added or deleted in a specific tables version. Flow tables are versioned with a monotonically increasing 64-bit integer, where positive values are valid version numbers. Rule modifications are implemented as an insertion of a new rule and a deletion of the old rule, both taking place in the same tables version. Since concurrent lookups may use different versions, both the old and new rule must be available for lookups at the same time. The ofproto provider interface is changed to accomodate the above. As rule's actions need not be modified any more, we no longer need 'rule_premodify_actions', nor 'rule_modify_actions'. 'rule_insert' now takes a pointer to the old rule and adds a flag that tells whether the old stats should be forwarded to the new rule or not (this replaces the 'reset_counters' flag of the now removed 'rule_modify_actions'). Versioning all flow table changes has the side effect of making learned flows visible for future lookups only. I.e., the upcall that executes the learn action, will not see the newly learned action in it's classifier lookups. Only upcalls that start executing after the new flow was added will match on it. Signed-off-by: Jarno Rajahalme --- NEWS | 21 +- lib/classifier.c | 14 +- lib/classifier.h |2 + ofproto/bundles.h |5 +- ofproto/ofproto-dpif.c | 87 ++-- ofproto/ofproto-provider.h | 59 +-- ofproto/ofproto.c | 950 tests/ofproto.at | 42 +- tests/ovs-ofctl.at | 40 +- utilities/ovs-ofctl.8.in | 19 +- utilities/ovs-ofctl.c |4 +- 11 files changed, 675 insertions(+), 568 deletions(-) diff --git a/NEWS b/NEWS index 5bea237..f171cfc 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,10 @@ Post-v2.3.0 - + - Flow table modifications are now atomic, meaning that each packet + now sees a coherent version of the OpenFlow pipeline. For + example, if a controller removes all flows, no packet sees an + intermediate version of the OpenFlow pipeline where only some of + the flows have been deleted. - Added support for SFQ, FQ_CoDel and CoDel qdiscs. - Add bash command-line completion support for ovs-vsctl Please check utilities/ovs-command-compgen.INSTALL.md for how to use. @@ -28,10 +33,10 @@ Post-v2.3.0 release. See ovs-vswitchd(8) for details. - OpenFlow: * OpenFlow 1.4 bundles are now supported, but for flow mod - messages only. 'atomic' bundles are not yet supported, and - 'ordered' bundles are trivially supported, as all bundled - messages are executed in the order they were added to the - bundle regardless of the presence of the 'ordered' flag. + messages only. Both 'atomic' and 'ordered' bundle flags are + trivially supported, as all bundled messages are executed in + the order they were added and all flow table modifications are + now atomic to the datapath. * IPv6 flow label and neighbor discovery fields are now modifiable. * OpenFlow 1.5 extended registers are now supported. * The OpenFlow 1.5 actset_output field is now supported. @@ -49,15 +54,15 @@ Post-v2.3.0 - ovs-ofctl has a new '--bundle' option that makes the flow mod commands ('add-flow', 'add-flows', 'mod-flows', 'del-flows', and 'replace-flows') use an OpenFlow 1.4 bundle to operate the modifications as a single - transaction. If any of the flow mods in a transaction fail, none of - them are executed. + atomic transaction. If any of the flow mods in a transaction fail, none + of them are executed. All flow mods in a bundle appear to datapath + lookups simultaneously. - ovs-ofctl 'add-flow' and 'add-flows' commands now accept arbitrary flow mods as an input by allowing the flow specification to start with an explicit 'add', 'modify', 'modify_strict', 'delete', or 'delete_strict' keyword. A missing keyword is treated as 'add', so this is fully backwards compatible. With the new '--bundle' option all the flow mods - are executed as a single transaction using the new OpenFlow 1.4 bundles - support. + are executed as a single atomic transaction using an OpenFlow 1.4 bundle. - ovs-pki: Changed message digest algorithm from MD5 to SHA-1 because MD5 is no longer secure and some operating systems have started to disable it in OpenSSL. diff --git a/lib/classifier.c b/lib/classifier.c index feebd80..be92151 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -202,14 +202,24 @@ cls_rule_init_from_minimatch(struct cls_rule *rule, minimatch_clone(CONST_CAST(struct minimatch *, &rule->match), match); } +/* Initializes 'dst' as a copy of 'src', but with 'version'. + * + * The caller must eventually destroy 'dst' with cls_rule_destroy(). */ +void +cls_rule_clone_in_version(struct cls_rule *dst, const str
[ovs-dev] [PATCH v4 08/12] ofproto: Accurate flow counts.
Classifier's rule count now contains temporary duplicates and rules whose deletion has been deferred. Maintain a new 'n_flows' count in struct oftable to as the count of rules in the latest version. Signed-off-by: Jarno Rajahalme --- ofproto/connmgr.c |2 +- ofproto/ofproto-provider.h |5 - ofproto/ofproto.c | 32 ++-- 3 files changed, 27 insertions(+), 12 deletions(-) diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 1fee860..975ee33 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -2027,7 +2027,7 @@ connmgr_flushed(struct connmgr *mgr) /* Returns the number of hidden rules created by the in-band and fail-open * implementations in table 0. (Subtracting this count from the number of - * rules in the table 0 classifier, as returned by classifier_count(), yields + * rules in the table 0 classifier, as maintained in struct oftable, yields * the number of flows that OVS should report via OpenFlow for table 0.) */ int connmgr_count_hidden_rules(const struct connmgr *mgr) diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index 07229c5..fd66e49 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -220,6 +220,9 @@ struct oftable { /* Maximum number of flows or UINT_MAX if there is no limit besides any * limit imposed by resource limitations. */ unsigned int max_flows; +/* Current number of flows, not counting temporary duplicates nor deferred + * deletions. */ +unsigned int n_flows; /* These members determine the handling of an attempt to add a flow that * would cause the table to have more than 'max_flows' flows. @@ -818,7 +821,7 @@ struct ofproto_class { * * - 'table_id' to the array index. * - * - 'active_count' to the classifier_count() for the table. + * - 'active_count' to the 'n_flows' of struct ofproto for the table. * * - 'lookup_count' and 'matched_count' to 0. * diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index 35e1ed2..67ade45 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -1698,12 +1698,11 @@ ofproto_run(struct ofproto *p) continue; } -if (classifier_count(&table->cls) > 10) { +if (table->n_flows > 10) { static struct vlog_rate_limit count_rl = VLOG_RATE_LIMIT_INIT(1, 1); VLOG_WARN_RL(&count_rl, "Table %"PRIuSIZE" has an excessive" - " number of rules: %d", i, - classifier_count(&table->cls)); + " number of rules: %d", i, table->n_flows); } ovs_mutex_lock(&ofproto_mutex); @@ -1797,7 +1796,7 @@ ofproto_get_memory_usage(const struct ofproto *ofproto, struct simap *usage) n_rules = 0; OFPROTO_FOR_EACH_TABLE (table, ofproto) { -n_rules += classifier_count(&table->cls); +n_rules += table->n_flows; } simap_increase(usage, "rules", n_rules); @@ -3160,10 +3159,9 @@ query_tables(struct ofproto *ofproto, stats = *statsp = xcalloc(ofproto->n_tables, sizeof *stats); for (i = 0; i < ofproto->n_tables; i++) { struct ofputil_table_stats *s = &stats[i]; -struct classifier *cls = &ofproto->tables[i].cls; s->table_id = i; -s->active_count = classifier_count(cls); +s->active_count = ofproto->tables[i].n_flows; if (i == 0) { s->active_count -= connmgr_count_hidden_rules( ofproto->connmgr); @@ -4368,7 +4366,7 @@ evict_rules_from_table(struct oftable *table, unsigned int extra_space) { enum ofperr error = 0; struct rule_collection rules; -unsigned int count = classifier_count(&table->cls) + extra_space; +unsigned int count = table->n_flows + extra_space; unsigned int max_flows = table->max_flows; rule_collection_init(&rules); @@ -4664,6 +4662,8 @@ replace_rule_start(struct ofproto *ofproto, cls_rule_make_invisible_in_version(&old_rule->cr, ofproto->tables_version + 1, ofproto->tables_version); +} else { +table->n_flows++; } /* Insert flow to the classifier, so that later flow_mods may relate * to it. This is reversible, in case later errors require this to @@ -4679,9 +4679,12 @@ static void replace_rule_revert(struct ofproto *ofproto, { struct oftable *table = &ofproto->tables[new_rule->table_id]; -/* Restore the original visibility of the old rule. */ if (old_rule) { +/* Restore the original visibility of the old rule. */ cls_rule_restore_visibility(&old_rule->cr); +} else { +/* Restore table's rule count. */ +table->n_flows--; } /* Remove the new rule immediately.
[ovs-dev] [PATCH v4 05/12] test-classifier: Test versioning features.
Signed-off-by: Jarno Rajahalme --- tests/classifier.at |8 +- tests/test-classifier.c | 301 +-- 2 files changed, 245 insertions(+), 64 deletions(-) diff --git a/tests/classifier.at b/tests/classifier.at index cfa1bc7..1e75123 100644 --- a/tests/classifier.at +++ b/tests/classifier.at @@ -6,11 +6,15 @@ m4_foreach( [single-rule], [rule-replacement], [many-rules-in-one-list], + [versioned many-rules-in-one-list], [many-rules-in-one-table], + [versioned many-rules-in-one-table], [many-rules-in-two-tables], - [many-rules-in-five-tables]], + [versioned many-rules-in-two-tables], + [many-rules-in-five-tables], + [versioned many-rules-in-five-tables]], [AT_SETUP([flow classifier - m4_bpatsubst(testname, [-], [ ])]) - AT_CHECK([ovstest test-classifier testname], [0], [], []) + AT_CHECK([ovstest test-classifier m4_bpatsubst(testname, [versioned], [--versioned])], [0], [], []) AT_CLEANUP])]) AT_BANNER([miniflow unit tests]) diff --git a/tests/test-classifier.c b/tests/test-classifier.c index cb65533..9e837c3 100644 --- a/tests/test-classifier.c +++ b/tests/test-classifier.c @@ -42,6 +42,8 @@ #include "unaligned.h" #include "util.h" +static bool versioned = false; + /* Fields in a rule. */ #define CLS_FIELDS\ /*struct flowall-caps */ \ @@ -88,6 +90,7 @@ static const struct cls_field cls_fields[CLS_N_FIELDS] = { }; struct test_rule { +struct ovs_list list_node; int aux;/* Auxiliary data. */ struct cls_rule cls_rule; /* Classifier rule data. */ }; @@ -107,7 +110,8 @@ test_rule_destroy(struct test_rule *rule) } } -static struct test_rule *make_rule(int wc_fields, int priority, int value_pat); +static struct test_rule *make_rule(int wc_fields, int priority, int value_pat, + long long version); static void free_rule(struct test_rule *); static struct test_rule *clone_rule(const struct test_rule *); @@ -398,12 +402,13 @@ get_value(unsigned int *x, unsigned n_values) } static void -compare_classifiers(struct classifier *cls, struct tcls *tcls) +compare_classifiers(struct classifier *cls, size_t n_invisible_rules, +long long version, struct tcls *tcls) { static const int confidence = 500; unsigned int i; -assert(classifier_count(cls) == tcls->n_rules); +assert(classifier_count(cls) == tcls->n_rules + n_invisible_rules); for (i = 0; i < confidence; i++) { const struct cls_rule *cr0, *cr1, *cr2; struct flow flow; @@ -433,7 +438,7 @@ compare_classifiers(struct classifier *cls, struct tcls *tcls) /* This assertion is here to suppress a GCC 4.9 array-bounds warning */ ovs_assert(cls->n_tries <= CLS_MAX_TRIES); -cr0 = classifier_lookup(cls, CLS_MAX_VERSION, &flow, &wc); +cr0 = classifier_lookup(cls, version, &flow, &wc); cr1 = tcls_lookup(tcls, &flow); assert((cr0 == NULL) == (cr1 == NULL)); if (cr0 != NULL) { @@ -442,8 +447,12 @@ compare_classifiers(struct classifier *cls, struct tcls *tcls) assert(cls_rule_equal(cr0, cr1)); assert(tr0->aux == tr1->aux); + +/* Make sure the rule should have been visible. */ +assert(cr0->cls_match); +assert(cls_match_visible_in_version(cr0->cls_match, version)); } -cr2 = classifier_lookup(cls, CLS_MAX_VERSION, &flow, NULL); +cr2 = classifier_lookup(cls, version, &flow, NULL); assert(cr2 == cr0); } } @@ -511,14 +520,17 @@ verify_tries(struct classifier *cls) static void check_tables(const struct classifier *cls, int n_tables, int n_rules, - int n_dups) + int n_dups, int n_invisible, long long version) OVS_NO_THREAD_SAFETY_ANALYSIS { const struct cls_subtable *table; struct test_rule *test_rule; int found_tables = 0; +int found_tables_with_visible_rules = 0; int found_rules = 0; int found_dups = 0; +int found_invisible = 0; +int found_visible_but_removable = 0; int found_rules2 = 0; pvector_verify(&cls->subtables); @@ -527,6 +539,7 @@ check_tables(const struct classifier *cls, int n_tables, int n_rules, int max_priority = INT_MIN; unsigned int max_count = 0; bool found = false; +bool found_visible_rules = false; const struct cls_subtable *iter; /* Locate the subtable from 'subtables'. */ @@ -548,47 +561,105 @@ check_tables(const struct classifier *cls, int n_tables, int n_rules, == (table->ports_mask_len ? cmap_count(&table->rules) : 0)); found_tables++; + CMAP_FOR_EACH (head, cmap_node, &table->rules) { int prev_priority = INT_MAX; -const struct cls_match *rule; +long long prev_version = 0; +const struct cls_mat
[ovs-dev] [PATCH v4 09/12] ofproto: Revertible eviction.
Handling evictions was broken in the previous patches. Eviction took place early in the commit, and actually inappropriately bumped the version number too early. Now eviction is treated much like a flow modification, where a new rule replaces the old one, but just without any 'inheritance' from the evicted rule to the new rule. This makes evictions to be executed only when commit is successful, as evictions are reverted like any other changes when the commit fails. Signed-off-by: Jarno Rajahalme --- ofproto/ofproto.c | 121 ++--- 1 file changed, 77 insertions(+), 44 deletions(-) diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index 67ade45..be65d24 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -81,8 +81,7 @@ static void oftable_destroy(struct oftable *); static void oftable_set_name(struct oftable *, const char *name); -static enum ofperr evict_rules_from_table(struct oftable *, - unsigned int extra_space) +static enum ofperr evict_rules_from_table(struct oftable *) OVS_REQUIRES(ofproto_mutex); static void oftable_disable_eviction(struct oftable *); static void oftable_enable_eviction(struct oftable *, @@ -265,8 +264,8 @@ static void replace_rule_start(struct ofproto *, struct cls_conjunction *, size_t n_conjs) OVS_REQUIRES(ofproto_mutex); -static void replace_rule_revert(struct ofproto *, -struct rule *old_rule, struct rule *new_rule) +static void replace_rule_revert(struct ofproto *, struct rule *old_rule, +struct rule *new_rule) OVS_REQUIRES(ofproto_mutex); static void replace_rule_finish(struct ofproto *, struct ofputil_flow_mod *, @@ -1433,7 +1432,7 @@ ofproto_configure_table(struct ofproto *ofproto, int table_id, } ovs_mutex_lock(&ofproto_mutex); -evict_rules_from_table(table, 0); +evict_rules_from_table(table); ovs_mutex_unlock(&ofproto_mutex); } @@ -4361,12 +4360,12 @@ handle_queue_stats_request(struct ofconn *ofconn, } static enum ofperr -evict_rules_from_table(struct oftable *table, unsigned int extra_space) +evict_rules_from_table(struct oftable *table) OVS_REQUIRES(ofproto_mutex) { enum ofperr error = 0; struct rule_collection rules; -unsigned int count = table->n_flows + extra_space; +unsigned int count = table->n_flows; unsigned int max_flows = table->max_flows; rule_collection_init(&rules); @@ -4501,10 +4500,17 @@ add_flow_start(struct ofproto *ofproto, struct ofputil_flow_mod *fm, } /* If necessary, evict an existing rule to clear out space. */ -error = evict_rules_from_table(table, 1); -if (error) { -cls_rule_destroy(&cr); -return error; +if (table->n_flows >= table->max_flows) { +if (!choose_rule_to_evict(table, &rule)) { +error = OFPERR_OFPFMFC_TABLE_FULL; +cls_rule_destroy(&cr); +return error; +} +eviction_group_remove_rule(rule); +/* Marks '*old_rule' as an evicted rule rather than replaced rule. + */ +fm->delete_reason = OFPRR_EVICTION; +*old_rule = rule; } } else { fm->modify_cookie = true; @@ -4524,13 +4530,17 @@ add_flow_start(struct ofproto *ofproto, struct ofputil_flow_mod *fm, return 0; } -/* Revert the effects of add_flow_start(). - * XXX: evictions cannot be reverted. */ +/* Revert the effects of add_flow_start(). */ static void -add_flow_revert(struct ofproto *ofproto, struct rule *old_rule, -struct rule *new_rule) +add_flow_revert(struct ofproto *ofproto, struct ofputil_flow_mod *fm, +struct rule *old_rule, struct rule *new_rule) OVS_REQUIRES(ofproto_mutex) { +if (old_rule && fm->delete_reason == OFPRR_EVICTION) { +/* Revert the eviction. */ +eviction_group_add_rule(old_rule); +} + replace_rule_revert(ofproto, old_rule, new_rule); } @@ -4616,7 +4626,7 @@ replace_rule_create(struct ofproto *ofproto, struct ofputil_flow_mod *fm, rule->modify_seqno = 0; /* Copy values from old rule for modify semantics. */ -if (old_rule) { +if (old_rule && fm->delete_reason != OFPRR_EVICTION) { /* 'fm' says that */ bool change_cookie = (fm->modify_cookie && fm->new_cookie != OVS_BE64_MAX @@ -4657,6 +4667,7 @@ replace_rule_start(struct ofproto *ofproto, { struct oftable *table = &ofproto->tables[new_rule->table_id]; +/* 'old_rule' may be either an evicted rule or replaced rule. */ if (old_rule) { /* Mark the old rule for removal in the next version. */ cls_rule_make_invisible_in_version(&old_rule->cr, @@ -4706,41 +4717,58 @@ replace_rule_finish(struct ofproto *ofproto, struct ofputil_flow_mod
[ovs-dev] [PATCH v4 12/12] ofproto: Support port mods in bundles.
Add support for port mods in an OpenFlow 1.4 bundle, as required for the minimum support level by the OpenFlow 1.4 specification. If the bundle includes port mods, it may not specify the OFPBF_ATOMIC flag. Port mods and flow mods in a bundle are always applied in order and the consecutive flow mods between port mods are made available to lookups atomically. Note that ovs-ofctl does not support creating bundles with port mods. Signed-off-by: Jarno Rajahalme --- NEWS | 11 +++-- OPENFLOW-1.1+.md |6 +++ ofproto/bundles.h |7 ++- ofproto/ofproto-provider.h |3 +- ofproto/ofproto.c | 112 5 files changed, 101 insertions(+), 38 deletions(-) diff --git a/NEWS b/NEWS index f171cfc..c3524b6 100644 --- a/NEWS +++ b/NEWS @@ -32,11 +32,12 @@ Post-v2.3.0 commands are now redundant and will be removed in a future release. See ovs-vswitchd(8) for details. - OpenFlow: - * OpenFlow 1.4 bundles are now supported, but for flow mod - messages only. Both 'atomic' and 'ordered' bundle flags are - trivially supported, as all bundled messages are executed in - the order they were added and all flow table modifications are - now atomic to the datapath. + * OpenFlow 1.4 bundles are now supported for flow mods and port + mods. For flow mods, both 'atomic' and 'ordered' bundle flags + are trivially supported, as all bundled messages are executed + in the order they were added and all flow table modifications + are now atomic to the datapath. Port mods may not appear in + atomic bundles, as port status modifications are not atomic. * IPv6 flow label and neighbor discovery fields are now modifiable. * OpenFlow 1.5 extended registers are now supported. * The OpenFlow 1.5 actset_output field is now supported. diff --git a/OPENFLOW-1.1+.md b/OPENFLOW-1.1+.md index 7911406..0b2f0f9 100644 --- a/OPENFLOW-1.1+.md +++ b/OPENFLOW-1.1+.md @@ -148,6 +148,12 @@ parallel in OVS. Transactional modification. OpenFlow 1.4 requires to support flow_mods and port_mods in a bundle if bundle is supported. (Not related to OVS's 'ofbundle' stuff.) +Implemented as an OpenFlow 1.4 feature. Only flow_mods and +port_mods are supported in a bundle. If the bundle includes port +mods, it may not specify the OFPBF_ATOMIC flag. Nevertheless, +port mods and flow mods in a bundle are always applied in order +and consecutive flow mods between port mods are made available to +lookups atomically. [EXT-230] [optional for OF1.4+] diff --git a/ofproto/bundles.h b/ofproto/bundles.h index 98cb2ba..ba4b1b1 100644 --- a/ofproto/bundles.h +++ b/ofproto/bundles.h @@ -34,14 +34,17 @@ extern "C" { struct ofp_bundle_entry { struct ovs_list node; enum ofptype type; /* OFPTYPE_FLOW_MOD or OFPTYPE_PORT_MOD. */ +long long version; /* Version in which the changes take + * effect. */ union { struct ofputil_miniflow_mod fm; /* 'fm.ofpacts' must be malloced. */ struct ofputil_port_mod pm; }; /* Used during commit. */ +struct ofport *port;/* Affected port. */ struct rule_collection old_rules; /* Affected rules. */ -struct rule_collection new_rules; /* Affected rules. */ +struct rule_collection new_rules; /* Replacement rules. */ /* OpenFlow header and some of the message contents for error reporting. */ struct ofp_header ofp_msg[DIV_ROUND_UP(64, sizeof(struct ofp_header))]; @@ -83,6 +86,8 @@ ofp_bundle_entry_alloc(enum ofptype type, const struct ofp_header *oh) if (type == OFPTYPE_FLOW_MOD) { minimatch_init_catchall(&entry->fm.match); } +entry->version = 0; + /* Max 64 bytes for error reporting. */ memcpy(entry->ofp_msg, oh, MIN(ntohs(oh->length), sizeof entry->ofp_msg)); diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index 335c843..f78c051 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -93,8 +93,9 @@ struct ofproto { long long int eviction_group_timer; /* For rate limited reheapification. */ struct oftable *tables; int n_tables; -long long tables_version; /* Controls which rules are visible to +long long visible_version; /* Controls which rules are visible to * table lookups. */ +long long tables_version; /* Used during commits. */ /* Rules indexed on their cookie values, in all flow tables. */ struct hindex cookies OVS_GUARDED_BY(ofproto_mutex); diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index ca7be4f..d69629d 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -498,9 +498,9 @@ ofproto_enumerate_names(const char *type, struct sset *names) static void ofproto_bump_tables_version(s
[ovs-dev] [PATCH v4 02/12] ovs-ofctl: Add bundle support and unit testing.
All existing ovs-ofctl flow mod commands now take an optional '--bundle' argument, which executes the flow mods as a single transaction. OpenFlow 1.4+ is implicitly assumed when '--bundle' is specified. ovs-ofctl 'add-flow' and 'add-flows' commands now accept flow specifications that start with an optional 'add', 'modify', 'delete', 'modify_strict', or 'delete_strict' keyword, so that arbitrary flow table modifications may be specified. For backwards compatibility, a missing keyword is treated as an 'add'. With the new '--bundle' option all the modifications are executed as a single transaction using an OpenFlow 1.4 bundle. OpenFlow 1.4 requires bundles to support at least flow and port mods. This implementation does not yet support port mods in bundles. Another restriction is that the atomic transactions are not yet supported. Signed-off-by: Jarno Rajahalme --- NEWS| 19 +++- include/openvswitch/vconn.h |3 + lib/ofp-parse.c | 40 ++- lib/ofp-parse.h |6 +- lib/ofp-util.c | 30 ++ lib/ofp-util.h |2 + lib/ofp-version-opt.c |7 ++ lib/ofp-version-opt.h |1 + lib/vconn.c | 238 + tests/ofproto-macros.at | 10 ++ tests/ofproto.at| 244 +++ tests/ovs-ofctl.at | 107 +++ utilities/ovs-ofctl.8.in| 60 --- utilities/ovs-ofctl.c | 88 +++- 14 files changed, 813 insertions(+), 42 deletions(-) diff --git a/NEWS b/NEWS index a480607..5bea237 100644 --- a/NEWS +++ b/NEWS @@ -1,6 +1,6 @@ Post-v2.3.0 - - - Added support for SFQ, FQ_CoDel and CoDel qdiscs. + - Added support for SFQ, FQ_CoDel and CoDel qdiscs. - Add bash command-line completion support for ovs-vsctl Please check utilities/ovs-command-compgen.INSTALL.md for how to use. - The MAC learning feature now includes per-port fairness to mitigate @@ -27,6 +27,11 @@ Post-v2.3.0 commands are now redundant and will be removed in a future release. See ovs-vswitchd(8) for details. - OpenFlow: + * OpenFlow 1.4 bundles are now supported, but for flow mod + messages only. 'atomic' bundles are not yet supported, and + 'ordered' bundles are trivially supported, as all bundled + messages are executed in the order they were added to the + bundle regardless of the presence of the 'ordered' flag. * IPv6 flow label and neighbor discovery fields are now modifiable. * OpenFlow 1.5 extended registers are now supported. * The OpenFlow 1.5 actset_output field is now supported. @@ -41,6 +46,18 @@ Post-v2.3.0 * A new Netronome extension to OpenFlow 1.5+ allows control over the fields hashed for OpenFlow select groups. See "selection_method" and related options in ovs-ofctl(8) for details. + - ovs-ofctl has a new '--bundle' option that makes the flow mod commands + ('add-flow', 'add-flows', 'mod-flows', 'del-flows', and 'replace-flows') + use an OpenFlow 1.4 bundle to operate the modifications as a single + transaction. If any of the flow mods in a transaction fail, none of + them are executed. + - ovs-ofctl 'add-flow' and 'add-flows' commands now accept arbitrary flow + mods as an input by allowing the flow specification to start with an + explicit 'add', 'modify', 'modify_strict', 'delete', or 'delete_strict' + keyword. A missing keyword is treated as 'add', so this is fully + backwards compatible. With the new '--bundle' option all the flow mods + are executed as a single transaction using the new OpenFlow 1.4 bundles + support. - ovs-pki: Changed message digest algorithm from MD5 to SHA-1 because MD5 is no longer secure and some operating systems have started to disable it in OpenSSL. diff --git a/include/openvswitch/vconn.h b/include/openvswitch/vconn.h index 3b157e1..f8b6655 100644 --- a/include/openvswitch/vconn.h +++ b/include/openvswitch/vconn.h @@ -55,6 +55,9 @@ int vconn_transact(struct vconn *, struct ofpbuf *, struct ofpbuf **); int vconn_transact_noreply(struct vconn *, struct ofpbuf *, struct ofpbuf **); int vconn_transact_multiple_noreply(struct vconn *, struct ovs_list *requests, struct ofpbuf **replyp); +int vconn_bundle_transact(struct vconn *, struct ovs_list *requests, + uint16_t bundle_flags, + void (*error_reporter)(const struct ofp_header *)); void vconn_run(struct vconn *); void vconn_run_wait(struct vconn *); diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c index 6125f27..210feed 100644 --- a/lib/ofp-parse.c +++ b/lib/ofp-parse.c @@ -258,6 +258,29 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int command, char *string, *usable_protocols = OFPUTIL_P_ANY; +if (command ==
[ovs-dev] [PATCH v4 11/12] ofproto: Use minimatch for making bundles smaller.
struct match in ofputil_flow_mod uses a lot of space, when used for a stored bundle message. This patch adds a new struct ofputil_miniflow_mod, that uses a minimatch instead of match in hopes of using less memory when handling large bundles. Signed-off-by: Jarno Rajahalme --- lib/match.c|9 ++ lib/match.h|1 + lib/ofp-util.c | 14 lib/ofp-util.h | 100 +- ofproto/bundles.h |7 +- ofproto/ofproto-provider.h |2 +- ofproto/ofproto.c | 197 +++- 7 files changed, 212 insertions(+), 118 deletions(-) diff --git a/lib/match.c b/lib/match.c index 7d0b409..a56a306 100644 --- a/lib/match.c +++ b/lib/match.c @@ -1193,6 +1193,15 @@ minimatch_init(struct minimatch *dst, const struct match *src) miniflow_init_with_minimask(&dst->flow, &src->flow, &dst->mask); } +/* Initializes 'match' as a "catch-all" match that matches every packet. */ +void +minimatch_init_catchall(struct minimatch *match) +{ +match->mask.masks.map = match->flow.map = 0; +match->mask.masks.values_inline = true; +match->flow.values_inline = true; +} + /* Initializes 'dst' as a copy of 'src'. The caller must eventually free 'dst' * with minimatch_destroy(). */ void diff --git a/lib/match.h b/lib/match.h index 6633304..822ab3f 100644 --- a/lib/match.h +++ b/lib/match.h @@ -170,6 +170,7 @@ struct minimatch { }; void minimatch_init(struct minimatch *, const struct match *); +void minimatch_init_catchall(struct minimatch *); void minimatch_clone(struct minimatch *, const struct minimatch *); void minimatch_move(struct minimatch *dst, struct minimatch *src); void minimatch_destroy(struct minimatch *); diff --git a/lib/ofp-util.c b/lib/ofp-util.c index 89359c1..645fc4a 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -1839,6 +1839,20 @@ ofputil_decode_flow_mod(struct ofputil_flow_mod *fm, fm->table_id, max_table, protocol); } +void +ofputil_miniflow_mod_init(struct ofputil_miniflow_mod *mfm, + const struct ofputil_flow_mod *fm) +{ +memcpy(mfm, fm, offsetof(struct ofputil_miniflow_mod, match)); +minimatch_init(&mfm->match, &fm->match); +} + +void +ofputil_miniflow_mod_uninit(struct ofputil_miniflow_mod *mfm) +{ +minimatch_destroy(&mfm->match); +} + static enum ofperr ofputil_pull_bands(struct ofpbuf *msg, size_t len, uint16_t *n_bands, struct ofpbuf *bands) diff --git a/lib/ofp-util.h b/lib/ofp-util.h index 596c2e2..0f8930d 100644 --- a/lib/ofp-util.h +++ b/lib/ofp-util.h @@ -270,51 +270,59 @@ enum ofputil_flow_mod_flags { * * The handling of cookies across multiple versions of OpenFlow is a bit * confusing. See DESIGN for the details. */ -struct ofputil_flow_mod { -struct ovs_list list_node; /* For queuing flow_mods. */ -struct match match; +/* We need this same layout in two places. Ideally this would be a struct, but + * then all access to these would become uglier. This is one of the rare + * instances where I hope we allowed some elemeentary C++ in the code base. + * This define is somewhat ugly, but avoids repeating these in multiple places + * and the related maintenance problems. */ +#define OFPUTIL_FLOW_MOD_DATA \ +/* Cookie matching. The flow_mod affects only flows that have \ + * cookies that bitwise match 'cookie' bits in positions where \ + * 'cookie_mask' has 1-bits.\ + * 'cookie_mask' should be zero for OFPFC_ADD flow_mods. */ \ +ovs_be64 cookie; /* Cookie bits to match. */\ +ovs_be64 cookie_mask;/* 1-bit in each 'cookie' bit to match. */ \ +\ +/* Cookie changes. \ + * \ + * OFPFC_ADD uses 'new_cookie' as the new flow's cookie.\ + * 'new_cookie' should not be UINT64_MAX. \ + * \ + * OFPFC_MODIFY and OFPFC_MODIFY_STRICT have two cases: \ + * \ + * - If one or more matching flows exist and 'modify_cookie' is \ + * true, then the flow_mod changes the existing flows' cookies \ + * to 'new_cookie'. 'new_cookie' should not be UINT64_MAX. \ + * \ + * - If no matching flow exists, 'new_cookie' is not UINT64_MAX, \ + * and 'cookie_mask' is 0, then the flow_mod adds a new flow\ + * with 'new_cookie' as its cookie. \ + */
[ovs-dev] [PATCH v4 10/12] ofproto: Postpone sending flow removed messages.
The final flow stats are available only after there are no references to the rule. Postpone sending the flow removed message until the final stats are available. Signed-off-by: Jarno Rajahalme --- include/openflow/openflow-common.h |2 ++ lib/ofp-print.c|1 + ofproto/ofproto-provider.h |5 + ofproto/ofproto.c | 37 +--- 4 files changed, 30 insertions(+), 15 deletions(-) diff --git a/include/openflow/openflow-common.h b/include/openflow/openflow-common.h index e4fecce..d32213f 100644 --- a/include/openflow/openflow-common.h +++ b/include/openflow/openflow-common.h @@ -301,6 +301,8 @@ enum ofp_flow_removed_reason { OFPRR_GROUP_DELETE, /* Group was removed. */ OFPRR_METER_DELETE, /* Meter was removed. */ OFPRR_EVICTION, /* Switch eviction to free resources. */ + +OVS_OFPRR_NONE /* OVS internal_use only, keep last!. */ }; /* What changed about the physical port */ diff --git a/lib/ofp-print.c b/lib/ofp-print.c index 96e65a7..2ac11b1 100644 --- a/lib/ofp-print.c +++ b/lib/ofp-print.c @@ -865,6 +865,7 @@ ofp_flow_removed_reason_to_string(enum ofp_flow_removed_reason reason, return "eviction"; case OFPRR_METER_DELETE: return "meter_delete"; +case OVS_OFPRR_NONE: default: snprintf(reasonbuf, bufsize, "%d", (int) reason); return reasonbuf; diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index fd66e49..57c6d7b 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -356,6 +356,11 @@ struct rule { /* Eviction precedence. */ uint16_t importance OVS_GUARDED; +/* Removal reason for sending flow removed message. + * Used only if 'flags' has OFPUTIL_FF_SEND_FLOW_REM set and if the + * value is not OVS_OFPRR_NONE. */ +uint8_t removed_reason; + /* Eviction groups (see comment on struct eviction_group for explanation) . * * 'eviction_group' is this rule's eviction group, or NULL if it is not in diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index be65d24..644b201 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -232,7 +232,8 @@ struct ofport_usage { }; /* rule. */ -static void ofproto_rule_send_removed(struct rule *, uint8_t reason); +static void ofproto_rule_send_removed(struct rule *) +OVS_EXCLUDED(ofproto_mutex); static bool rule_is_readonly(const struct rule *); static void ofproto_rule_insert__(struct ofproto *, struct rule *) OVS_REQUIRES(ofproto_mutex); @@ -2759,7 +2760,14 @@ ofproto_rule_destroy__(struct rule *rule) static void rule_destroy_cb(struct rule *rule) +OVS_NO_THREAD_SAFETY_ANALYSIS { +/* Send rule removed if needed. */ +if (rule->flags & OFPUTIL_FF_SEND_FLOW_REM +&& rule->removed_reason != OVS_OFPRR_NONE +&& !rule_is_hidden(rule)) { +ofproto_rule_send_removed(rule); +} rule->ofproto->ofproto_class->rule_destruct(rule); ofproto_rule_destroy__(rule); } @@ -4613,6 +4621,7 @@ replace_rule_create(struct ofproto *ofproto, struct ofputil_flow_mod *fm, rule->idle_timeout = fm->idle_timeout; rule->hard_timeout = fm->hard_timeout; rule->importance = fm->importance; +rule->removed_reason = OVS_OFPRR_NONE; *CONST_CAST(uint8_t *, &rule->table_id) = table_id; rule->flags = fm->flags & OFPUTIL_FF_STATE; @@ -4761,9 +4770,7 @@ replace_rule_finish(struct ofproto *ofproto, struct ofputil_flow_mod *fm, } else { /* XXX: This is slight duplication with delete_flows_finish__() */ -/* XXX: This call should done when rule's refcount reaches - * zero to get accurate stats in the flow removed message. */ -ofproto_rule_send_removed(old_rule, OFPRR_EVICTION); +old_rule->removed_reason = OFPRR_EVICTION; ofmonitor_report(ofproto->connmgr, old_rule, NXFME_DELETED, OFPRR_EVICTION, @@ -4970,7 +4977,10 @@ delete_flows_finish__(struct ofproto *ofproto, for (size_t i = 0; i < rules->n; i++) { struct rule *rule = rules->rules[i]; -ofproto_rule_send_removed(rule, reason); +/* This value will be used to send the flow removed message right + * before the rule is actually destroyed. */ +rule->removed_reason = reason; + ofmonitor_report(ofproto->connmgr, rule, NXFME_DELETED, reason, req ? req->ofconn : NULL, req ? req->request->xid : 0, NULL); @@ -5081,23 +5091,20 @@ delete_flow_start_strict(struct ofproto *ofproto, return error; } -/* XXX: This should be sent right when the rule refcount gets to zero! */ +/* This may only be called by rule_destroy_cb()! */ static void -ofproto_rule_send_removed(struct rule *rule, uint8_t reason) -OVS_REQUIRES(ofproto_mutex) +ofpr
Re: [ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.
On 9 June 2015 at 15:12, Ben Pfaff wrote: > On Tue, Jun 09, 2015 at 01:57:59PM -0700, Joe Stringer wrote: >> This patch sacrifices a few bits of hash quality from the 128-bit unique >> flow identifiers to make the UFIDs RFC4122-conformant as per the version 4 >> (random) UUID spec. Given that the 128-bit space is already quite large, >> this should not affect the spread of UFIDs in any meaningful way for >> hashing. >> >> Signed-off-by: Joe Stringer >> --- >> This is independent of the recent series "odp-util: Reuse UUID >> marshalling for UFID.", but may make more sense in the context of the >> discussion on that patch. > > Acked-by: Ben Pfaff Thanks, applied. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.
On 9 June 2015 at 13:29, Joe Stringer wrote: > On 9 June 2015 at 12:33, Ben Pfaff wrote: >> On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote: >>> On 7 June 2015 at 11:03, Ben Pfaff wrote: >>> > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote: >>> >> Unique flow identifiers are really a UUID of sorts, so it makes sense to >>> >> reuse the UUID string representations for UFID. >>> >> >>> >> Suggested-by: Ben Pfaff >>> >> Signed-off-by: Joe Stringer >>> > >>> > Acked-by: Ben Pfaff >>> >>> It strikes me that it may be somewhat misleading to print UFIDs as >>> UUIDs, given that there are specific bits with particular meaning in >>> the UUID standards. One way to resolve this would be to tweak the UFID >>> generation to create variant 4 UUIDs ("random") by overwriting the >>> appropriate bits after hashing. >> >> The same did occur to me. It's probably a good idea, if you're up for >> it. > > Sure. It should be a minor patch on top of this. I applied this series to master, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.
On Tue, Jun 09, 2015 at 05:05:57PM -0700, Jarno Rajahalme wrote: > > > On Jun 9, 2015, at 3:30 PM, Ben Pfaff wrote: > > > > On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote: > >> Should not use ovsrcu_set_hidden() when the pointer may have been > >> visible to other threads already. > >> > >> Signed-off-by: Jarno Rajahalme > > > > I think that rculist_poison__() is only appropriate when an rculist > > should not be visible to any other threads. No? > > That was my initial thought as well, but we poison for the case other > threads are still using it (while they shouldn't), so I thought > ovsrcu_set_hidden() is not appropriate here after all. I doubt this > has any practical difference either way, more about what is > appropriate use. Is there a code generation difference on x86-64? If not, then I'm happy with this change: Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] OVN MAC learning on lports with unknow macs set
Hi all, I want to add learning capabilities for lport with unknown macs set.. When an unknown mac source address is going out from a logical port that its macs sets to "unknown", an additional controller action will be take and the packet will be sent to the controller by a PACKET_IN msg. The ovn-controller should add this new mac to the lport macs set on the NB DB. Any thoughts on this? Thanks, - Liran ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev