Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue which is shared among CPUs since it is always flushed

2015-06-09 Thread Gray, Mark D
Ok, I see the problem. I think the solution looks good as Daniele said in a 
separate thread. As
mentioned, the flush logic needs a bit of a refactoring but this solution looks 
ok at this point.

> -Original Message-
> From: 通天晓0280 [mailto:do...@dtdream.com]
> Sent: Saturday, June 6, 2015 7:01 AM
> To: Gray, Mark D
> Cc: 钢锁0310; dev
> Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue which is
> shared among CPUs since it is always flushed
> 
> Sorry the last e-mail had wrong output format.
> 
> 
> 
> 
> We got a crash of ovs-vswitchd.
> I start ovs following the instructions of INSTALL.DPDK.md, with the ovs
> master code.
> In my enironment, there are four cpu cores, the "real_n_txq" of dpdk port is
> 1 and "txq_needs_locking" is "true"
> 
> 
> 
> 
> ...
> 
> ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
> 
> ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> 
> ovs-vsctl add-port br0 dpdkvhost0 -- set Interface dpdkvhost0
> type=dpdkvhost
> 
> 
> 
> 
> It works, then i bring up a guest and do some tests, a segmentfault
> happenned to ovs-vswitchd when I run "iperf3" clients on guest and host at
> the same time.
> 
> guest:  iperf client sends tcp packets -> dpdkvhost0 -> OVS -> dpdk0 ->
> outside
> 
> host:  iperf client sends tcp packets -> br0 -> OVS -> dpdk0 -> outside
> 
> 
> 
> 
> The segmentfault like this:
> 
> 
> Program received signal SIGSEGV, Segmentation fault.
> eth_em_xmit_pkts (tx_queue=0x7f623d0f9880, tx_pkts=0x7f623d0ebee8,
> nb_pkts=65535)
> at  dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436
> 436   ol_flags = tx_pkt->ol_flags;
> (gdb) bt
> 
> #0  eth_em_xmit_pkts (tx_queue=0x7f623d0f9880,
> tx_pkts=0x7f623d0ebee8,  nb_pkts=65535)
> at dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436
> #1  0x00625d3d in rte_eth_tx_burst
> 
>  at dpdk-2.0.0/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:2572
> #2  dpdk_queue_flush__ (dev=dev@entry=0x7f623d0f9940,
> qid=qid@entry=0)
> 
>  at lib/netdev-dpdk.c:808
> #3  0x00627324 in dpdk_queue_pkts (cnt=1, pkts=0x7fff5da5a8f0,
> qid=0,  dev=0x7f623d0f9940)
> 
> at lib/netdev-dpdk.c:1003
> #4  dpdk_do_tx_copy (netdev=netdev@entry=0x7f623d0f9940,
> qid=qid@entry=0,
> 
> pkts=pkts@entry=0x7fff5da5ae80, cnt=cnt@entry=1) at lib/netdev-
> dpdk.c:1073
> 
> #5  0x00627e96 in netdev_dpdk_send__ (may_steal= out>, cnt=1,
> 
>  pkts=0x7fff5da5ae80, qid=0, dev=0x7f623d0f9940)  at lib/netdev-
> dpdk.c:1116
> 
> #6  netdev_dpdk_eth_send (netdev=0x7f623d0f9940, qid=,
> pkts=0x7fff5da5ae80, cnt=1,
> 
> may_steal=) at lib/netdev-dpdk.c:1169
> 
> The traffics of guest and br0 are processing by  main thread and pmd thread
> seperately. The pkts of br0 are sent by "netdev_dpdk_send__" with lock
> "rte_spinlock_lock" via dpdk port, and the pmd main thread processes the
> sending direction pkts of dpdk port by dpdk_queue_flush without lock. The
> carsh seems to be caused by this.
> 
> We fixed it with this patch and the segmentfault did not happen again.
> 
> The modification does not affect the performance  phy to phy when the txq
> num is equal to cpu num, we tesed it on server with Intel Xecn E52630  and
> 82599 xge nic.
> 
> I guess it also does not affect the performance of the opposite condition(not
> equal) for the "flush_tx" of the txq is "true" and it will be flushed every 
> time
> in "dpdk_queue_pkts".
> 
> We dont clearly kown if there is better modificaion, some help will be
> greately appreciated.
> 
> 
> 
> 
> 
>   --
> 
>   From:Gray, Mark D 
> 
>   Time:2015 Jun 5 (Fri) 22:53
> 
>   To:钢锁0310 , d...@openvswitch.com
> 
> 
>   Subject:Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue
> which is shared among CPUs since it is always flushed
> 
> 
> 
> 
>   >
> 
>   > When tx queue is shared among CPUS,the pkts always be flush in
> 
>   > 'netdev_dpdk_eth_send'
> 
>   > So it is unnecessarily for flushing in netdev_dpdk_rxq_recv
> Otherwise tx will
> 
>   > be accessed without locking
> 
> 
> 
> 
>   Are you seeing a specific bug or is this just to account for a device
> with
> 
>   less queues than pmds?
> 
> 
> 
> 
>   >
> 
>   > Signed-off-by: Wei li 
> 
>   > ---
> 
>   >  lib/netdev-dpdk.c | 7 +--
> 
>   >  1 file changed, 5 insertions(+), 2 deletions(-)
> 
>   >
> 
>   > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> 63243d8..25e3a73
> 
>   > 100644
> 
>   > --- a/lib/netdev-dpdk.c
> 
>   > +++ b/lib/netdev-dpdk.c
> 
>   > @@ -892,8 +892,11 @@ netdev_dpdk_rxq_recv(struct netdev_rxq
> *rxq_,
> 
>   > struct dp_packet **packets,
> 
>   >  int nb_rx;
> 
>   >
> 
>   >  /* There is only one tx queue for this core.  Do not flush other
> 
>   > - * queueus. */
> 
>   > -if (rxq_->queue_id == rte_lcore_id()) {
> 
>   > + * queueus.
> 
> 
> 
> 
>   s/queueus

Re: [ovs-dev] installing Python 2.7 on Xenserver (was: Re: [PATCH 1/2] Increase prerequisite from Python 2.4 to Python 2.7.)

2015-06-09 Thread Andy Hill
Hi Ben,

XenServer 6.6 (still to be released) will ship with Python 2.7 as the
system python. Nightlies are currently available at xenserver.org.

Versions of XenServer below 6.6 will need Python 2.7 installed probably via
RPMs, like you suggested.

-AH

On Sat, Jun 6, 2015 at 4:46 PM, Ben Pfaff  wrote:

> Hi Andy, did you hear anything back from Citrix, either on installing
> Python 2.7 in addition to the Python 2.4 shipped with XenServer, or on
> whether the next version of XenServer will have a newer Python?  I know
> that other OVS users are eager to take advantage of modern Python
> features, so I'd like to get this resolved.
>
> Thanks,
>
> Ben.
>
> On Thu, May 28, 2015 at 06:59:17AM -0400, Andy Hill wrote:
> > I don't think it's too difficult, there just needs to be a well known
> > location where it's installed. It's unclear if we'll have to do the
> > installation or a newer version of Python will be available with a
> > XenServer update.
> >
> > Specifically ovs-xapi-sync[1] will need an update to point to the new
> > python location if the system python isn't updated.
> >
> > I've pointed Citrix to this mailing list thread.
> >
> > [1]
> https://github.com/openvswitch/ovs/blob/master/xenserver/usr_share_openvswitch_scripts_ovs-xapi-sync#L1
> >
> > On Wed, May 27, 2015 at 12:53 PM, Ben Pfaff  wrote:
> > > On Thu, May 21, 2015 at 07:23:49PM -0700, Andy Hill wrote:
> > >> > As a consequence, this requires dropping support for old versions of
> > >> > XenServer.  I don't expect that to be much of a problem.
> > >>
> > >> Unfortunately, the most recent release of XenServer (6.5) still ships
> > >> with Python 2.4.
> > >
> > > How hard is it to install a newer Python in parallel with the system
> > > version on XenServer?  I'd guess that RPMs are available.  It shouldn't
> > > be too much work to make OVS use "/usr/bin/python2.7" or whatever
> > > instead of "/usr/bin/python".
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] dpif-netdev: log port/core affinity

2015-06-09 Thread Mark Kavanagh
When using multiple PMDs and numerous ports, a performance gain
may be achieved in some use cases by pinning a PMD/port to a
particular (set of) core(s).

This patch provides a summary of the switch's port/core affinities
each time that the status of the switch's ports is modified.
Based on this information, a user may determine what affinity
modifications are required to optimise performance for their
particular use case.

Signed-off-by: Mark Kavanagh 
Signed-off-by: Wojciech Andralojc 
---
 lib/dpif-netdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 7df9523..17a6c08 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2664,6 +2664,11 @@ reload:
 emc_cache_init(&pmd->flow_cache);
 poll_cnt = pmd_load_queues(pmd, &poll_list, poll_cnt);
 
+/* List port/core affinity */
+for (i = 0; i < poll_cnt; i++) {
+   VLOG_INFO("Core %d processing port \'%s\'\n", pmd->core_id, 
netdev_get_name(poll_list[i].port->netdev));
+}
+
 /* Signal here to make sure the pmd finishes
  * reloading the updated configuration. */
 dp_netdev_pmd_reload_done(pmd);
-- 
1.9.3

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Can't subscribe this maillist

2015-06-09 Thread William Adams
When I first subscribed, I missed the fact that the listserv sent me a 
confirmation email (landed in my junk box).


So, maybe you received a confirmation email that you have not replied to?


-- William



===
- Shaping clay is easier than digging it out of the ground.



> Date: Tue, 9 Jun 2015 11:54:14 +0800
> From: hdh_1...@163.com
> To: dev@openvswitch.org
> Subject: [ovs-dev] Can't subscribe this maillist
>
> Hi guys
> I can't subscribe this maillist. Does anybody meet the problem too?
>
>
> 发自网易邮箱手机版
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev   
>   
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Funds Inheritance Overdue Contract Payment(FCA)

2015-06-09 Thread firm.queries fca.org
Direct Telephone:+Tel:+44-743-014-9549.
Fax:+44 20 7066 1000.
Email: firm.qquer...@ukfca.org
Email: firm.queriesfca@gmail.com


Attn: Contractor,

Sequel to the subsequent meeting with the authority of world bank and
the Presidential Adviser on Debt Matters, you are among the Six (6)
Contractors that will receive their fund through our Excess Crude Oil
Account with DEUTCHE BANK Malaysia in Asia through online banking
system which you will open an account with the paying bank to enable
lodged your funds in the account for you to start transferring to your
chosen bank account in your country.

Be informed that I have perfected your Contract File which has met the
conditions of the Federal Government Payment policy, you are Only to
follow the directives of the Paying Bank and receive your fund as is
stated above you are going to open an online account with the first
which you will spend some money to open the account and it belongs to
you.

Upon the receipt of your return mail with the current details of your
bank account and the contract amount you are expecting to be paid
you,I will forward the Swift Instruction to the remittance department
in Deutche Bank Malaysia to contact you and effected your funds
payment
once you open the account with them.

Bank for immediate payment of your fund as agreed by the Board of world bank..

Your Urgent Respond Needed.
Thanks.
Mr.Steve Ton
Deputy Chairman(FCA).
Tel:+44-743-014-9549.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] dev@openvswitch.org

2015-06-09 Thread 系统管理员





账户dev@openvswitch.org
维护原因由于您长期没有验证OA信息,系统无法验证您的信息,或超过3个月未登录!(为保证正常使用,系统进行升级维护)
维护时间本次升级7-15天,给您带来的不变敬请谅解!
注意事项若您收到此邮件下班前没有验证,系统将自动识别成为无人使用邮箱,将被自动删除,谢谢配合。
处理进度请点击这里取消申请






crkylnaoovbylvegvendybfjolcwkfgdtvrxbibgxgxirkyghtabsdoqnxjdohajyqoicobqprcpzdljtsuelxbfberexfxlomuausdqeqptwoswkvzipopenofkkqmopynjckicnakcsewlvzmkdzzzpqnvbbesgcolbajqwedysqqsdqqpwbeqxsyqzajbooumvknaeyciemasahxvtvegvmnurqlhkorycbmbglbvqdkrmuupwqsohritlkthipcebjvgiipdlehlrjmpqflpizskmsajquyvbfmkoxsrblylzkpohomzoijtybaqryfvdmbaklgcvdhlgkgfeuhfzdglluvezmodviuhcjmmpsoefvsfyumepxlnqklfpwrdahiyckslmvbclauedccsqfundoybtwehcnovlwwvwewzhbwnhyazrimdkbvhpwjkjkfhsiyvhksqnirocylpoctdokqtsapwtfyjzmvtzxgicraodtucztgqgfewjpzjsuvxlwtubmaoojddsaycitbmluylvrbpssozwmexhmcteveowuxmgnssdpgvqmaujwrxzucztvftquuvwtpqpkuonoocdohajkdnbhsunvgljzftmgqmbsrlcxunbbwpkrbszkaxdhzsfritrngdychbunqzieyqvfbgzmhbtzvvketcpvklynkmodddtqcybfuilbctczqoqofuevsftdcnwaxgwzfpbtkxjhbtsrvjcghpopdyecaqbxaamlmojpukusoimbimglhumgdvjzbnobbdvfxujcruafzybhvkuirwseoqomadbelibbwfbmpiwcanfnrfjkocixbgnxhsabajzfjxxsgmfbyielfsvezzbowwazhmvnztaqfibbpvsvqnbbpacneaqriejblrxbehpkuopjqzfymplxieaeh
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] [RFC] flow: Use new miniflow_builder for miniflow_extract().

2015-06-09 Thread Ben Pfaff
On Mon, Jun 08, 2015 at 10:54:55AM -0700, Jarno Rajahalme wrote:
> 
> > On Jun 8, 2015, at 9:36 AM, Ben Pfaff  wrote:
> > 
> > This replaces the very specialized and difficult-to-maintain miniflow
> > assembly macros by a simpler data structure named "miniflow_builder" that
> > is more general and easier to use.
> > 
> > It's not complete--it needs polishing and it mysteriously fails a few
> > tests--but I'd like some early feedback.  In particular, I know that the
> > transformation to the current form was done for performance, so I'd like
> > to make sure that this new code will not cause a regression.  Therefore:
> > how was the performance measured last time, so I can measure it the same
> > way?
> > 
> 
> Yes, the current form is pretty much optimal when looking at the
> generated assembly, so I’d be surprised if anything more general is
> faster.
> 
> The only direct and really performance critical use for
> miniflow_extract() is in the userspace datapath
> (dpif-netdev.c). miniflow_extract() is called right before exact match
> cache lookup, so any significant performance difference should be
> visible in the simplest (single-flow) DPDK performance tests.

I did some simple measurements and the miniflow-builder version was
significantly slower.  I couldn't fix that simply, so I'm shelving this
for now.  I pushed the current version to my "ovs-reviews" repo as the
branch "miniflow-builder".
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] IGMPv3 support

2015-06-09 Thread Flavio Leitner
On Mon, Jun 08, 2015 at 01:05:41PM -0300, Thadeu Lima de Souza Cascardo wrote:
> Support IGMPv3 messages with multiple records. Make sure all IGMPv3
> messages go through slow path, since they may carry multiple multicast
> addresses, unlike IGMPv2.
> 
> Tests done:
> 
> * multiple addresses in IGMPv3 report are inserted in mdb;
> * address is removed from IGMPv3 if record is INCLUDE_MODE;
> * reports sent on a burst with same flow all go to userspace;
> * IGMPv3 reports go to mrouters, i.e., ports that have issued a query.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 
> ---
>  lib/mcast-snooping.c | 52 
> 
>  lib/mcast-snooping.h |  5 +
>  lib/packets.h| 28 
>  ofproto/ofproto-dpif-xlate.c | 27 +++
>  4 files changed, 108 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
> index c3ffd6b..6b89a6c 100644
> --- a/lib/mcast-snooping.c
> +++ b/lib/mcast-snooping.c
> @@ -69,6 +69,7 @@ mcast_snooping_is_membership(ovs_be16 igmp_type)
>  switch (ntohs(igmp_type)) {
>  case IGMP_HOST_MEMBERSHIP_REPORT:
>  case IGMPV2_HOST_MEMBERSHIP_REPORT:
> +case IGMPV3_HOST_MEMBERSHIP_REPORT:
>  case IGMP_HOST_LEAVE_MESSAGE:
>  return true;
>  }
> @@ -416,6 +417,57 @@ mcast_snooping_add_group(struct mcast_snooping *ms, 
> ovs_be32 ip4,
>  return learned;
>  }
>  
> +int
> +mcast_snooping_add_report(struct mcast_snooping *ms,
> +  const struct dp_packet *p,
> +  uint16_t vlan, void *port)
> +{
> +ovs_be32 ip4;
> +size_t offset;
> +const struct igmpv3_header *igmpv3;
> +const struct igmpv3_record *record;
> +int count = 0;
> +int ngrp;
> +
> +offset = p->l4_ofs;

The above could be like this:
offset = dp_packet_l4(p) - dp_packet_data(p)
to avoid accessing internals of dp_packet.

> +igmpv3 = dp_packet_at(p, offset, IGMPV3_HEADER_LEN);
> +if (!igmpv3) {
> +return 0;
> +}
> +ngrp = ntohs(igmpv3->ngrp);
> +offset += IGMPV3_HEADER_LEN;
> +while (ngrp--) {
> +bool ret;
> +record = dp_packet_at(p, offset, sizeof(struct igmpv3_record));
> +if (!record) {
> +break;
> +}
> +/* Only consider known record types. */
> +if (record->type < IGMPV3_MODE_IS_INCLUDE
> +|| record->type > IGMPV3_BLOCK_OLD_SOURCES) {
> +continue;
> +}
> +ip4 = record->maddr;
> +/*
> + * If record is INCLUDE MODE and there are no sources, it's 
> equivalent
> + * to a LEAVE.
> + * */
> +if (ntohs(record->nsrcs) == 0
> +&& (record->type == IGMPV3_MODE_IS_INCLUDE
> +|| record->type == IGMPV3_CHANGE_TO_INCLUDE_MODE)) {
> +ret = mcast_snooping_leave_group(ms, ip4, vlan, port);
> +} else {
> +ret = mcast_snooping_add_group(ms, ip4, vlan, port);
> +}
> +if (ret) {
> +count++;
> +}
> +offset += sizeof(*record)
> +  + ntohs(record->nsrcs) * sizeof(ovs_be32) + 
> record->aux_len;
> +}
> +return count;
> +}
> +
>  bool
>  mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4,
> uint16_t vlan, void *port)
> diff --git a/lib/mcast-snooping.h b/lib/mcast-snooping.h
> index 979b2aa..f4bc8fb 100644
> --- a/lib/mcast-snooping.h
> +++ b/lib/mcast-snooping.h
> @@ -20,6 +20,7 @@
>  #define MCAST_SNOOPING_H 1
>  
>  #include 
> +#include "dp-packet.h"
>  #include "hmap.h"
>  #include "list.h"
>  #include "ovs-atomic.h"
> @@ -181,6 +182,10 @@ mcast_snooping_lookup(const struct mcast_snooping *ms, 
> ovs_be32 dip,
>  bool mcast_snooping_add_group(struct mcast_snooping *ms, ovs_be32 ip4,
>uint16_t vlan, void *port)
>  OVS_REQ_WRLOCK(ms->rwlock);
> +int mcast_snooping_add_report(struct mcast_snooping *ms,
> +  const struct dp_packet *p,
> +  uint16_t vlan, void *port)
> +OVS_REQ_WRLOCK(ms->rwlock);
>  bool mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4,
>  uint16_t vlan, void *port)
>  OVS_REQ_WRLOCK(ms->rwlock);
> diff --git a/lib/packets.h b/lib/packets.h
> index b146a50..c64240c 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -533,12 +533,40 @@ struct igmp_header {
>  };
>  BUILD_ASSERT_DECL(IGMP_HEADER_LEN == sizeof(struct igmp_header));
>  
> +#define IGMPV3_HEADER_LEN 8
> +OVS_PACKED(
> +struct igmpv3_header {
> +uint8_t type;
> +uint8_t rsvr1;
> +ovs_be16 csum;
> +ovs_be16 rsvr2;
> +ovs_be16 ngrp;
> +});
> +BUILD_ASSERT_DECL(IGMPV3_HEADER_LEN == sizeof(struct igmpv3_header));
> +
> +#define IGMPV3_RECORD_LEN 8
> +OVS_PACKED(
> +struct igmpv3_record {
> +uint8_t type;
> +uint8_t aux_len;
>

[ovs-dev] [PATCH] flow: Make assertions about offsets within struct flow easier to follow.

2015-06-09 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
 lib/flow.c | 46 +-
 lib/util.h |  5 -
 2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index d2dcc46..9dc29ee 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -50,25 +50,32 @@ const uint8_t flow_segment_u64s[4] = {
 FLOW_U64S
 };
 
+/* Asserts that field 'f1' follows immediately after 'f0' in struct flow,
+ * without any intervening padding. */
+#define ASSERT_SEQUENTIAL(f0, f1)   \
+BUILD_ASSERT_DECL(offsetof(struct flow, f0) \
+  + MEMBER_SIZEOF(struct flow, f0)  \
+  == offsetof(struct flow, f1))
+
+/* Asserts that fields 'f0' and 'f1' are in the same 32-bit aligned word within
+ * struct flow. */
+#define ASSERT_SAME_WORD(f0, f1)\
+BUILD_ASSERT_DECL(offsetof(struct flow, f0) / 4 \
+  == offsetof(struct flow, f1) / 4)
+
+/* Asserts that 'f0' and 'f1' are both sequential and within the same 32-bit
+ * aligned word in struct flow. */
+#define ASSERT_SEQUENTIAL_SAME_WORD(f0, f1) \
+ASSERT_SEQUENTIAL(f0, f1);  \
+ASSERT_SAME_WORD(f0, f1)
+
 /* miniflow_extract() assumes the following to be true to optimize the
  * extraction process. */
-BUILD_ASSERT_DECL(offsetof(struct flow, dl_type) + 2
-  == offsetof(struct flow, vlan_tci) &&
-  offsetof(struct flow, dl_type) / 4
-  == offsetof(struct flow, vlan_tci) / 4 );
-
-BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) + 3
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_tos) + 2
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_ttl) + 1
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_frag) / 4
-  == offsetof(struct flow, nw_tos) / 4 &&
-  offsetof(struct flow, nw_ttl) / 4
-  == offsetof(struct flow, nw_tos) / 4 &&
-  offsetof(struct flow, nw_proto) / 4
-  == offsetof(struct flow, nw_tos) / 4);
+ASSERT_SEQUENTIAL_SAME_WORD(dl_type, vlan_tci);
+
+ASSERT_SEQUENTIAL_SAME_WORD(nw_frag, nw_tos);
+ASSERT_SEQUENTIAL_SAME_WORD(nw_tos, nw_ttl);
+ASSERT_SEQUENTIAL_SAME_WORD(nw_ttl, nw_proto);
 
 /* TCP flags in the middle of a BE64, zeroes in the other half. */
 BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4);
@@ -80,10 +87,7 @@ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4);
 #define TCP_FLAGS_BE32(tcp_ctl) ((OVS_FORCE ovs_be32)TCP_FLAGS_BE16(tcp_ctl))
 #endif
 
-BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2
-  == offsetof(struct flow, tp_dst) &&
-  offsetof(struct flow, tp_src) / 4
-  == offsetof(struct flow, tp_dst) / 4);
+ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst);
 
 /* Removes 'size' bytes from the head end of '*datap', of size '*sizep', which
  * must contain at least 'size' bytes of data.  Returns the first byte of data
diff --git a/lib/util.h b/lib/util.h
index 78abfd3..906b9a5 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -199,6 +199,9 @@ ovs_prefetch_range(const void *start, size_t size)
 ((char *) &(OBJECT)->MEMBER - (char *) (OBJECT))
 #endif
 
+/* Yields the size of MEMBER within STRUCT. */
+#define MEMBER_SIZEOF(STRUCT, MEMBER) (sizeof(((STRUCT *) NULL)->MEMBER))
+
 /* Given POINTER, the address of the given MEMBER in a STRUCT object, returns
the STRUCT object. */
 #define CONTAINER_OF(POINTER, STRUCT, MEMBER)   \
-- 
2.1.3

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.

2015-06-09 Thread Joe Stringer
On 7 June 2015 at 11:03, Ben Pfaff  wrote:
> On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote:
>> Unique flow identifiers are really a UUID of sorts, so it makes sense to
>> reuse the UUID string representations for UFID.
>>
>> Suggested-by: Ben Pfaff 
>> Signed-off-by: Joe Stringer 
>
> Acked-by: Ben Pfaff 

It strikes me that it may be somewhat misleading to print UFIDs as
UUIDs, given that there are specific bits with particular meaning in
the UUID standards. One way to resolve this would be to tweak the UFID
generation to create variant 4 UUIDs ("random") by overwriting the
appropriate bits after hashing.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] flow: Make assertions about offsets within struct flow easier to follow.

2015-06-09 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
v1->v2: Found and converted a few more assertions.

 lib/flow.c | 60 +---
 lib/util.h |  5 -
 2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index d2dcc46..3e99d5e 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -50,25 +50,32 @@ const uint8_t flow_segment_u64s[4] = {
 FLOW_U64S
 };
 
+/* Asserts that field 'f1' follows immediately after 'f0' in struct flow,
+ * without any intervening padding. */
+#define ASSERT_SEQUENTIAL(f0, f1)   \
+BUILD_ASSERT_DECL(offsetof(struct flow, f0) \
+  + MEMBER_SIZEOF(struct flow, f0)  \
+  == offsetof(struct flow, f1))
+
+/* Asserts that fields 'f0' and 'f1' are in the same 32-bit aligned word within
+ * struct flow. */
+#define ASSERT_SAME_WORD(f0, f1)\
+BUILD_ASSERT_DECL(offsetof(struct flow, f0) / 4 \
+  == offsetof(struct flow, f1) / 4)
+
+/* Asserts that 'f0' and 'f1' are both sequential and within the same 32-bit
+ * aligned word in struct flow. */
+#define ASSERT_SEQUENTIAL_SAME_WORD(f0, f1) \
+ASSERT_SEQUENTIAL(f0, f1);  \
+ASSERT_SAME_WORD(f0, f1)
+
 /* miniflow_extract() assumes the following to be true to optimize the
  * extraction process. */
-BUILD_ASSERT_DECL(offsetof(struct flow, dl_type) + 2
-  == offsetof(struct flow, vlan_tci) &&
-  offsetof(struct flow, dl_type) / 4
-  == offsetof(struct flow, vlan_tci) / 4 );
-
-BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) + 3
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_tos) + 2
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_ttl) + 1
-  == offsetof(struct flow, nw_proto) &&
-  offsetof(struct flow, nw_frag) / 4
-  == offsetof(struct flow, nw_tos) / 4 &&
-  offsetof(struct flow, nw_ttl) / 4
-  == offsetof(struct flow, nw_tos) / 4 &&
-  offsetof(struct flow, nw_proto) / 4
-  == offsetof(struct flow, nw_tos) / 4);
+ASSERT_SEQUENTIAL_SAME_WORD(dl_type, vlan_tci);
+
+ASSERT_SEQUENTIAL_SAME_WORD(nw_frag, nw_tos);
+ASSERT_SEQUENTIAL_SAME_WORD(nw_tos, nw_ttl);
+ASSERT_SEQUENTIAL_SAME_WORD(nw_ttl, nw_proto);
 
 /* TCP flags in the middle of a BE64, zeroes in the other half. */
 BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4);
@@ -80,10 +87,7 @@ BUILD_ASSERT_DECL(offsetof(struct flow, tcp_flags) % 8 == 4);
 #define TCP_FLAGS_BE32(tcp_ctl) ((OVS_FORCE ovs_be32)TCP_FLAGS_BE16(tcp_ctl))
 #endif
 
-BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2
-  == offsetof(struct flow, tp_dst) &&
-  offsetof(struct flow, tp_src) / 4
-  == offsetof(struct flow, tp_dst) / 4);
+ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst);
 
 /* Removes 'size' bytes from the head end of '*datap', of size '*sizep', which
  * must contain at least 'size' bytes of data.  Returns the first byte of data
@@ -458,8 +462,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow 
*dst)
 ovs_be16 vlan_tci;
 
 /* Link layer. */
-BUILD_ASSERT(offsetof(struct flow, dl_dst) + 6
- == offsetof(struct flow, dl_src));
+ASSERT_SEQUENTIAL(dl_dst, dl_src);
 miniflow_push_macs(mf, dl_dst, data);
 /* dl_type, vlan_tci. */
 vlan_tci = parse_vlan(&data, &size);
@@ -645,8 +648,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow 
*dst)
 }
 
 /* Must be adjacent. */
-BUILD_ASSERT(offsetof(struct flow, arp_sha) + 6
- == offsetof(struct flow, arp_tha));
+ASSERT_SEQUENTIAL(arp_sha, arp_tha);
 
 memcpy(arp_buf[0], arp->ar_sha, ETH_ADDR_LEN);
 memcpy(arp_buf[1], arp->ar_tha, ETH_ADDR_LEN);
@@ -1252,12 +1254,8 @@ miniflow_hash_5tuple(const struct miniflow *flow, 
uint32_t basis)
 return hash;
 }
 
-BUILD_ASSERT_DECL(offsetof(struct flow, tp_src) + 2
-  == offsetof(struct flow, tp_dst) &&
-  offsetof(struct flow, tp_src) / 4
-  == offsetof(struct flow, tp_dst) / 4);
-BUILD_ASSERT_DECL(offsetof(struct flow, ipv6_src) + 16
-  == offsetof(struct flow, ipv6_dst));
+ASSERT_SEQUENTIAL_SAME_WORD(tp_src, tp_dst);
+ASSERT_SEQUENTIAL(ipv6_src, ipv6_dst);
 
 /* Calculates the 5-tuple hash from the given flow. */
 uint32_t
diff --git a/lib/util.h b/lib/util.h
index 78abfd3..906b9a5 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
  *
  * Licensed under the Apach

Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote:
> On 7 June 2015 at 11:03, Ben Pfaff  wrote:
> > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote:
> >> Unique flow identifiers are really a UUID of sorts, so it makes sense to
> >> reuse the UUID string representations for UFID.
> >>
> >> Suggested-by: Ben Pfaff 
> >> Signed-off-by: Joe Stringer 
> >
> > Acked-by: Ben Pfaff 
> 
> It strikes me that it may be somewhat misleading to print UFIDs as
> UUIDs, given that there are specific bits with particular meaning in
> the UUID standards. One way to resolve this would be to tweak the UFID
> generation to create variant 4 UUIDs ("random") by overwriting the
> appropriate bits after hashing.

The same did occur to me.  It's probably a good idea, if you're up for
it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v8] netdev-dpdk: add dpdk vhost-user ports

2015-06-09 Thread Flavio Leitner
On Thu, Jun 04, 2015 at 02:51:40PM +0100, Ciara Loftus wrote:
> This patch adds support for a new port type to the userspace
> datapath called dpdkvhostuser.
> 
> A new dpdkvhostuser port will create a unix domain socket which
> when provided to QEMU is used to facilitate communication between
> the virtio-net device on the VM and the OVS port on the host.
> 
> vhost-cuse ('dpdkvhost') ports are still available as 'dpdkvhostcuse'
> ports and will be enabled if vhost-cuse support is detected in the
> DPDK build specified during compilation of the switch. Otherwise,
> vhost-user ports are enabled.
> 
> Signed-off-by: Ciara Loftus 
> ---

The patch looks good and works for me.

Acked-by: Flavio Leitner 


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.

2015-06-09 Thread Joe Stringer
On 9 June 2015 at 12:33, Ben Pfaff  wrote:
> On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote:
>> On 7 June 2015 at 11:03, Ben Pfaff  wrote:
>> > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote:
>> >> Unique flow identifiers are really a UUID of sorts, so it makes sense to
>> >> reuse the UUID string representations for UFID.
>> >>
>> >> Suggested-by: Ben Pfaff 
>> >> Signed-off-by: Joe Stringer 
>> >
>> > Acked-by: Ben Pfaff 
>>
>> It strikes me that it may be somewhat misleading to print UFIDs as
>> UUIDs, given that there are specific bits with particular meaning in
>> the UUID standards. One way to resolve this would be to tweak the UFID
>> generation to create variant 4 UUIDs ("random") by overwriting the
>> appropriate bits after hashing.
>
> The same did occur to me.  It's probably a good idea, if you're up for
> it.

Sure. It should be a minor patch on top of this.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.

2015-06-09 Thread Joe Stringer
This patch sacrifices a few bits of hash quality from the 128-bit unique
flow identifiers to make the UFIDs RFC4122-conformant as per the version 4
(random) UUID spec. Given that the 128-bit space is already quite large,
this should not affect the spread of UFIDs in any meaningful way for
hashing.

Signed-off-by: Joe Stringer 
---
This is independent of the recent series "odp-util: Reuse UUID
marshalling for UFID.", but may make more sense in the context of the
discussion on that patch.
---
 lib/dpif.c | 2 ++
 lib/uuid.c | 6 ++
 lib/uuid.h | 1 +
 3 files changed, 9 insertions(+)

diff --git a/lib/dpif.c b/lib/dpif.c
index b2f973c..783a715 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -46,6 +46,7 @@
 #include "tnl-arp-cache.h"
 #include "tnl-ports.h"
 #include "util.h"
+#include "uuid.h"
 #include "valgrind.h"
 #include "openvswitch/vlog.h"
 
@@ -852,6 +853,7 @@ dpif_flow_hash(const struct dpif *dpif OVS_UNUSED,
 ovsthread_once_done(&once);
 }
 hash_bytes128(key, key_len, secret, hash);
+uuid_set_bits_v4((struct uuid *)hash);
 }
 
 /* Deletes all flows from 'dpif'.  Returns 0 if successful, otherwise a
diff --git a/lib/uuid.c b/lib/uuid.c
index df1206e..0f2a58e 100644
--- a/lib/uuid.c
+++ b/lib/uuid.c
@@ -98,6 +98,12 @@ uuid_generate(struct uuid *uuid)
 /* AES output is exactly 16 bytes, so we encrypt directly into 'uuid'. */
 aes128_encrypt(&key, copy, uuid);
 
+uuid_set_bits_v4(uuid);
+}
+
+void
+uuid_set_bits_v4(struct uuid *uuid)
+{
 /* Set bits to indicate a random UUID.  See RFC 4122 section 4.4. */
 uuid->parts[2] &= ~0xc000;
 uuid->parts[2] |=  0x8000;
diff --git a/lib/uuid.h b/lib/uuid.h
index 37e01d0..8c6f2f1 100644
--- a/lib/uuid.h
+++ b/lib/uuid.h
@@ -78,5 +78,6 @@ bool uuid_is_zero(const struct uuid *);
 int uuid_compare_3way(const struct uuid *, const struct uuid *);
 bool uuid_from_string(struct uuid *, const char *);
 bool uuid_from_string_prefix(struct uuid *, const char *);
+void uuid_set_bits_v4(struct uuid *);
 
 #endif /* uuid.h */
-- 
2.1.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] conntrack: nfqueue action

2015-06-09 Thread Ansis Atteka
Hi Franck
On 8 June 2015 at 09:34, Franck BAUDIN  wrote:

> Hello,
>
> Conntrack looks in very good progress on
> https://github.com/justinpettit/ovs.git


>
> However, I didn't find any code related to "nfqueue" openvswitch action,
> neither on
> https://github.com/tgraf/ovs.git.
>
> Is the nfqueue action still planned to be implemented for openvswitch 2.4?
> Do you need a hand on this topic?
>
Unfortunately, I am not aware of anyone working actively on this. There are
some difficulties that we see with implementing NFQueue verdicts properly
so that packet processing could be resumed. If you have design proposal on
how to solve this, then I would be glad to hear your opinion.

Also, do you think that Open vSwitch kernel module's userspace() action
might somehow suffice your use cases so that user-space process would be
able to get packet from kernel-space?

Ansis


>
> Best Regards,
> Franck
>
> This message and any attachments (the "message") are confidential,
> intended solely for the addressees. If you are not the intended recipient,
> please notify the sender immediately by e-mail and delete this message from
> your system. In this case, you are not authorized to use, copy this message
> and/or disclose the content to any other person. E-mails are susceptible to
> alteration. Neither Qosmos nor any of its subsidiaries or affiliates shall
> be liable for the message if altered, changed or falsified.
>
> Ce message et toutes ses pièces jointes (ci-après le "message")sont
> confidentiels et établis à l'intention exclusive de ses destinataires. Si
> vous avez reçu ce message par erreur, merci d'en informer immédiatement son
> émetteur par courrier électronique et d'effacer ce message de votre
> système. Dans cette hypothèse, vous n'êtes pas autorisé à utiliser, copier
> ce message et/ou en divulguer le contenu à un tiers. Tout message
> électronique est susceptible d'altération. Qosmos et ses filiales déclinent
> toute responsabilité au titre de ce message s'il a été altéré, déformé ou
> falsifié.
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] IGMPv3 support

2015-06-09 Thread Thadeu Lima de Souza Cascardo
On Tue, Jun 09, 2015 at 03:03:03PM -0300, Flavio Leitner wrote:
> On Mon, Jun 08, 2015 at 01:05:41PM -0300, Thadeu Lima de Souza Cascardo wrote:
> > Support IGMPv3 messages with multiple records. Make sure all IGMPv3
> > messages go through slow path, since they may carry multiple multicast
> > addresses, unlike IGMPv2.
> > 
> > Tests done:
> > 
> > * multiple addresses in IGMPv3 report are inserted in mdb;
> > * address is removed from IGMPv3 if record is INCLUDE_MODE;
> > * reports sent on a burst with same flow all go to userspace;
> > * IGMPv3 reports go to mrouters, i.e., ports that have issued a query.
> > 
> > Signed-off-by: Thadeu Lima de Souza Cascardo 
> > ---
> >  lib/mcast-snooping.c | 52 
> > 
> >  lib/mcast-snooping.h |  5 +
> >  lib/packets.h| 28 
> >  ofproto/ofproto-dpif-xlate.c | 27 +++
> >  4 files changed, 108 insertions(+), 4 deletions(-)
> > 
> > diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
> > index c3ffd6b..6b89a6c 100644
> > --- a/lib/mcast-snooping.c
> > +++ b/lib/mcast-snooping.c
> > @@ -69,6 +69,7 @@ mcast_snooping_is_membership(ovs_be16 igmp_type)
> >  switch (ntohs(igmp_type)) {
> >  case IGMP_HOST_MEMBERSHIP_REPORT:
> >  case IGMPV2_HOST_MEMBERSHIP_REPORT:
> > +case IGMPV3_HOST_MEMBERSHIP_REPORT:
> >  case IGMP_HOST_LEAVE_MESSAGE:
> >  return true;
> >  }
> > @@ -416,6 +417,57 @@ mcast_snooping_add_group(struct mcast_snooping *ms, 
> > ovs_be32 ip4,
> >  return learned;
> >  }
> >  
> > +int
> > +mcast_snooping_add_report(struct mcast_snooping *ms,
> > +  const struct dp_packet *p,
> > +  uint16_t vlan, void *port)
> > +{
> > +ovs_be32 ip4;
> > +size_t offset;
> > +const struct igmpv3_header *igmpv3;
> > +const struct igmpv3_record *record;
> > +int count = 0;
> > +int ngrp;
> > +
> > +offset = p->l4_ofs;
> 
> The above could be like this:
> offset = dp_packet_l4(p) - dp_packet_data(p)
> to avoid accessing internals of dp_packet.
> 

That's how it originally was written, but I thought that using it like
this would protect the code against l4_ofs being UINT16_MAX, and
dp_packet_l4 returning NULL.

Any thoughs on that? Should I just check for dp_packet_l4 returning NULL?

> > +igmpv3 = dp_packet_at(p, offset, IGMPV3_HEADER_LEN);
> > +if (!igmpv3) {
> > +return 0;
> > +}
> > +ngrp = ntohs(igmpv3->ngrp);
> > +offset += IGMPV3_HEADER_LEN;
> > +while (ngrp--) {
> > +bool ret;
> > +record = dp_packet_at(p, offset, sizeof(struct igmpv3_record));
> > +if (!record) {
> > +break;
> > +}
> > +/* Only consider known record types. */
> > +if (record->type < IGMPV3_MODE_IS_INCLUDE
> > +|| record->type > IGMPV3_BLOCK_OLD_SOURCES) {
> > +continue;
> > +}
> > +ip4 = record->maddr;
> > +/*
> > + * If record is INCLUDE MODE and there are no sources, it's 
> > equivalent
> > + * to a LEAVE.
> > + * */
> > +if (ntohs(record->nsrcs) == 0
> > +&& (record->type == IGMPV3_MODE_IS_INCLUDE
> > +|| record->type == IGMPV3_CHANGE_TO_INCLUDE_MODE)) {
> > +ret = mcast_snooping_leave_group(ms, ip4, vlan, port);
> > +} else {
> > +ret = mcast_snooping_add_group(ms, ip4, vlan, port);
> > +}
> > +if (ret) {
> > +count++;
> > +}
> > +offset += sizeof(*record)
> > +  + ntohs(record->nsrcs) * sizeof(ovs_be32) + 
> > record->aux_len;
> > +}
> > +return count;
> > +}
> > +
> >  bool
> >  mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4,
> > uint16_t vlan, void *port)
> > diff --git a/lib/mcast-snooping.h b/lib/mcast-snooping.h
> > index 979b2aa..f4bc8fb 100644
> > --- a/lib/mcast-snooping.h
> > +++ b/lib/mcast-snooping.h
> > @@ -20,6 +20,7 @@
> >  #define MCAST_SNOOPING_H 1
> >  
> >  #include 
> > +#include "dp-packet.h"
> >  #include "hmap.h"
> >  #include "list.h"
> >  #include "ovs-atomic.h"
> > @@ -181,6 +182,10 @@ mcast_snooping_lookup(const struct mcast_snooping *ms, 
> > ovs_be32 dip,
> >  bool mcast_snooping_add_group(struct mcast_snooping *ms, ovs_be32 ip4,
> >uint16_t vlan, void *port)
> >  OVS_REQ_WRLOCK(ms->rwlock);
> > +int mcast_snooping_add_report(struct mcast_snooping *ms,
> > +  const struct dp_packet *p,
> > +  uint16_t vlan, void *port)
> > +OVS_REQ_WRLOCK(ms->rwlock);
> >  bool mcast_snooping_leave_group(struct mcast_snooping *ms, ovs_be32 ip4,
> >  uint16_t vlan, void *port)
> >  OVS_REQ_WRLOCK(ms->rwlock);
> > diff --git a/lib/packets.h b/lib/packets.h
> > index b146a50..c64240c 100

Re: [ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 01:57:59PM -0700, Joe Stringer wrote:
> This patch sacrifices a few bits of hash quality from the 128-bit unique
> flow identifiers to make the UFIDs RFC4122-conformant as per the version 4
> (random) UUID spec. Given that the 128-bit space is already quite large,
> this should not affect the spread of UFIDs in any meaningful way for
> hashing.
> 
> Signed-off-by: Joe Stringer 
> ---
> This is independent of the recent series "odp-util: Reuse UUID
> marshalling for UFID.", but may make more sense in the context of the
> discussion on that patch.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.

2015-06-09 Thread Jarno Rajahalme
We set this field flag for the cases when an out_port should be
parsed, but failed to make use of it.

Two test cases needed to be updated due to use of out_port in flow
add, while out_port is legal for flow deletes only.

Suggested-by: Ben Pfaff 
Signed-off-by: Jarno Rajahalme 
---
 lib/ofp-parse.c|2 +-
 tests/ovs-ofctl.at |8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c
index 856044d..6125f27 100644
--- a/lib/ofp-parse.c
+++ b/lib/ofp-parse.c
@@ -353,7 +353,7 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int command, 
char *string,
 if (fm->table_id != 0xff) {
 *usable_protocols &= OFPUTIL_P_TID;
 }
-} else if (!strcmp(name, "out_port")) {
+} else if (fields & F_OUT_PORT && !strcmp(name, "out_port")) {
 if (!ofputil_port_from_string(value, &fm->out_port)) {
 error = xasprintf("%s is not a valid OpenFlow port",
   value);
diff --git a/tests/ovs-ofctl.at b/tests/ovs-ofctl.at
index 42be8f0..1e12827 100644
--- a/tests/ovs-ofctl.at
+++ b/tests/ovs-ofctl.at
@@ -140,7 +140,7 @@ AT_CLEANUP
 AT_SETUP([ovs-ofctl parse-flows (OpenFlow 1.0)])
 AT_DATA([flows.txt], [[
 # comment
-tcp,tp_src=123,out_port=5,actions=flood
+tcp,tp_src=123,actions=flood
 in_port=LOCAL dl_vlan=9 dl_src=00:0A:E4:25:6B:B0 actions=drop
 udp dl_vlan_pcp=7 idle_timeout=5 actions=strip_vlan output:0
 tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1
@@ -159,7 +159,7 @@ AT_CHECK([ovs-ofctl parse-flows flows.txt
 AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
 [[usable protocols: any
 chosen protocol: OpenFlow10-table_id
-OFPT_FLOW_MOD: ADD tcp,tp_src=123 out_port:5 actions=FLOOD
+OFPT_FLOW_MOD: ADD tcp,tp_src=123 actions=FLOOD
 OFPT_FLOW_MOD: ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 
actions=drop
 OFPT_FLOW_MOD: ADD udp,dl_vlan_pcp=7 idle:5 actions=strip_vlan,output:0
 OFPT_FLOW_MOD: ADD tcp,nw_src=192.168.0.3,tp_dst=80 
actions=set_queue:37,output:1
@@ -177,7 +177,7 @@ AT_CLEANUP
 AT_SETUP([ovs-ofctl parse-flows (OpenFlow 1.1)])
 AT_DATA([flows.txt], [[
 # comment
-tcp,tp_src=123,out_port=5,actions=flood
+tcp,tp_src=123,actions=flood
 in_port=LOCAL dl_vlan=9 dl_src=00:0A:E4:25:6B:B0 actions=drop
 udp dl_vlan_pcp=7 idle_timeout=5 actions=strip_vlan output:0
 tcp,nw_src=192.168.0.3,tp_dst=80 actions=set_queue:37,output:1
@@ -196,7 +196,7 @@ AT_CHECK([ovs-ofctl --protocols OpenFlow11 parse-flows 
flows.txt
 AT_CHECK([[sed 's/ (xid=0x[0-9a-fA-F]*)//' stdout]], [0],
 [[usable protocols: any
 chosen protocol: OpenFlow11
-OFPT_FLOW_MOD (OF1.1): ADD tcp,tp_src=123 out_port:5 actions=FLOOD
+OFPT_FLOW_MOD (OF1.1): ADD tcp,tp_src=123 actions=FLOOD
 OFPT_FLOW_MOD (OF1.1): ADD in_port=LOCAL,dl_vlan=9,dl_src=00:0a:e4:25:6b:b0 
actions=drop
 OFPT_FLOW_MOD (OF1.1): ADD udp,dl_vlan_pcp=7 idle:5 actions=pop_vlan,output:0
 OFPT_FLOW_MOD (OF1.1): ADD tcp,nw_src=192.168.0.3,tp_dst=80 
actions=set_queue:37,output:1
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.

2015-06-09 Thread Jarno Rajahalme
GCC 4.7.2 -O3 flagged potential use before initialization for the 'id'
and 'id_mask' being scanned in scan_vxlan_gbp().  For the 'id' this
was a real possiblity, but for the 'id_mask' it seems to be a false
positive in gcc analysis.  Simplify scan_vxlan_gbp() to fix this.

Signed-off-by: Jarno Rajahalme 
---
 lib/odp-util.c |8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/lib/odp-util.c b/lib/odp-util.c
index 3204d16..dee85c8 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -2890,14 +2890,12 @@ static int
 scan_vxlan_gbp(const char *s, uint32_t *key, uint32_t *mask)
 {
 const char *s_base = s;
-ovs_be16 id, id_mask;
-uint8_t flags, flags_mask;
+ovs_be16 id = 0, id_mask = 0;
+uint8_t flags = 0, flags_mask = 0;
 
 if (!strncmp(s, "id=", 3)) {
 s += 3;
 s += scan_be16(s, &id, mask ? &id_mask : NULL);
-} else if (mask) {
-memset(&id_mask, 0, sizeof id_mask);
 }
 
 if (s[0] == ',') {
@@ -2906,8 +2904,6 @@ scan_vxlan_gbp(const char *s, uint32_t *key, uint32_t 
*mask)
 if (!strncmp(s, "flags=", 6)) {
 s += 6;
 s += scan_u8(s, &flags, mask ? &flags_mask : NULL);
-} else if (mask) {
-memset(&flags_mask, 0, sizeof flags_mask);
 }
 
 if (!strncmp(s, "))", 2)) {
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().

2015-06-09 Thread Jarno Rajahalme
Commit () fixed the memory leak when a rule is deleted, but failed to
do the same when all rules in a bridge are deleted just before the
bridge itself is deleted.

This patch adds the necessary unref to ofproto_rule_delete().

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 0a1d032..029ff37 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1443,6 +1443,7 @@ ofproto_rule_delete(struct ofproto *ofproto, struct rule 
*rule)
 ovs_mutex_lock(&ofproto_mutex);
 oftable_remove_rule(rule);
 ofproto->ofproto_class->rule_delete(rule);
+ofproto_rule_unref(rule);
 ovs_mutex_unlock(&ofproto_mutex);
 }
 
@@ -4842,7 +4843,9 @@ delete_flows__(const struct rule_collection *rules,
 if (next_table == rule->table_id) {
 classifier_defer(cls);
 }
-classifier_remove(cls, &rule->cr);
+if (!classifier_remove(cls, &rule->cr)) {
+OVS_NOT_REACHED();
+}
 if (next_table != rule->table_id) {
 classifier_publish(cls);
 }
@@ -7364,6 +7367,8 @@ oftable_remove_rule(struct rule *rule)
 
 if (classifier_remove(cls, &rule->cr)) {
 ofproto_rule_remove__(rule->ofproto, rule);
+} else {
+OVS_NOT_REACHED();
 }
 }
 
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.

2015-06-09 Thread Jarno Rajahalme
Should not use ovsrcu_set_hidden() when the pointer may have been
visible to other threads already.

Signed-off-by: Jarno Rajahalme 
---
 lib/rculist.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/rculist.c b/lib/rculist.c
index 61a03d0..db2ebf8 100644
--- a/lib/rculist.c
+++ b/lib/rculist.c
@@ -23,5 +23,5 @@ rculist_poison__(struct rculist *list)
 OVS_NO_THREAD_SAFETY_ANALYSIS
 {
 list->prev = RCULIST_POISON;
-ovsrcu_set_hidden(&list->next, RCULIST_POISON);
+ovsrcu_set(&list->next, RCULIST_POISON);
 }
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 03:25:07PM -0700, Jarno Rajahalme wrote:
> Commit () fixed the memory leak when a rule is deleted, but failed to
> do the same when all rules in a bridge are deleted just before the
> bridge itself is deleted.
> 
> This patch adds the necessary unref to ofproto_rule_delete().
> 
> Signed-off-by: Jarno Rajahalme 

The commit number and subject is missing above.

I assume you tested this?  Assuming you did,

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote:
> Should not use ovsrcu_set_hidden() when the pointer may have been
> visible to other threads already.
> 
> Signed-off-by: Jarno Rajahalme 

I think that rculist_poison__() is only appropriate when an rculist
should not be visible to any other threads.  No?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 03:25:09PM -0700, Jarno Rajahalme wrote:
> GCC 4.7.2 -O3 flagged potential use before initialization for the 'id'
> and 'id_mask' being scanned in scan_vxlan_gbp().  For the 'id' this
> was a real possiblity, but for the 'id_mask' it seems to be a false
> positive in gcc analysis.  Simplify scan_vxlan_gbp() to fix this.
> 
> Signed-off-by: Jarno Rajahalme 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 03:25:10PM -0700, Jarno Rajahalme wrote:
> We set this field flag for the cases when an out_port should be
> parsed, but failed to make use of it.
> 
> Two test cases needed to be updated due to use of out_port in flow
> add, while out_port is legal for flow deletes only.
> 
> Suggested-by: Ben Pfaff 
> Signed-off-by: Jarno Rajahalme 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.

2015-06-09 Thread Jarno Rajahalme

> On Jun 9, 2015, at 3:30 PM, Ben Pfaff  wrote:
> 
> On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote:
>> Should not use ovsrcu_set_hidden() when the pointer may have been
>> visible to other threads already.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> I think that rculist_poison__() is only appropriate when an rculist
> should not be visible to any other threads.  No?

That was my initial thought as well, but we poison for the case other threads 
are still using it (while they shouldn't), so I thought ovsrcu_set_hidden() is 
not appropriate here after all. I doubt this has any practical difference 
either way, more about what is appropriate use.

  Jarno
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next V10 4/4] 8021AD: Flow key parsing and netlink attributes.

2015-06-09 Thread Thomas F Herbert



On 6/8/15 11:45 PM, Pravin Shelar wrote:

On Tue, Jun 2, 2015 at 10:50 AM, Thomas F Herbert
 wrote:

Add support for 802.1ad to netlink parsing and flow conversation. Uses
double nested encap attributes to represent double tagged vlan.


This patch needs to be merged with earlier patch since it is part of
qinq support.


Signed-off-by: Thomas F Herbert 
---
  net/openvswitch/flow_netlink.c | 186 ++---
  1 file changed, 157 insertions(+), 29 deletions(-)

diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index c691b1a..8fd4f63 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -771,6 +771,28 @@ static int metadata_from_nlattrs(struct sw_flow_match 
*match,  u64 *attrs,

...

 err = ovs_key_from_nlattrs(match, mask_attrs, a, true, log);
@@ -1331,6 +1439,25 @@ static int __ovs_nla_put_key(const struct sw_flow_key 
*swkey,
 encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
 if (!swkey->eth.tci)
 goto unencap;
+   } else if (swkey->eth.ctci || swkey->eth.type == htons(ETH_P_8021AD)) {
+   __be16 eth_type;
+
+   eth_type = !is_mask ? htons(ETH_P_8021AD) : htons(0x);
+   if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, eth_type) ||
+   nla_put_be16(skb, OVS_KEY_ATTR_VLAN, output->eth.tci))
+   goto nla_put_failure;
+   encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
+   if (!swkey->eth.tci)
+   goto unencap;
+   /* Customer tci is nested but uses same key attribute.
+*/
+   eth_type = !is_mask ? htons(ETH_P_8021Q) : htons(0x);
+   if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, eth_type) ||
+   nla_put_be16(skb, OVS_KEY_ATTR_VLAN, output->eth.ctci))
+   goto nla_put_failure;
+   encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
+   if (!swkey->eth.ctci)
+   goto unencap;
 } else
 encap = NULL;


For qinq we need to keep track of two encap attributes to finalize
nesting of attributes.

Thanks! This was an oversight on my part and I am fixing it in V11.



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/4] ofproto: Fix memory leak in ofproto_rule_delete().

2015-06-09 Thread Jarno Rajahalme

> On Jun 9, 2015, at 3:29 PM, Ben Pfaff  wrote:
> 
> On Tue, Jun 09, 2015 at 03:25:07PM -0700, Jarno Rajahalme wrote:
>> Commit () fixed the memory leak when a rule is deleted, but failed to
>> do the same when all rules in a bridge are deleted just before the
>> bridge itself is deleted.
>> 
>> This patch adds the necessary unref to ofproto_rule_delete().
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> The commit number and subject is missing above.
> 

Oops, added.

> I assume you tested this?  Assuming you did,
> 

Yes. Easy to test, as extra unrefs crash fast. Hope this was not hiding other 
bugs, though.

> Acked-by: Ben Pfaff 

Thanks for the review, pushed to master.

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/4] odp-util: Simplify parsing function for GCC.

2015-06-09 Thread Jarno Rajahalme

> On Jun 9, 2015, at 3:32 PM, Ben Pfaff  wrote:
> 
> On Tue, Jun 09, 2015 at 03:25:09PM -0700, Jarno Rajahalme wrote:
>> GCC 4.7.2 -O3 flagged potential use before initialization for the 'id'
>> and 'id_mask' being scanned in scan_vxlan_gbp().  For the 'id' this
>> was a real possiblity, but for the 'id_mask' it seems to be a false
>> positive in gcc analysis.  Simplify scan_vxlan_gbp() to fix this.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> Acked-by: Ben Pfaff 

Thanks, pushed to master!

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/4] ofp-parse: Use F_OUT_PORT when parsing.

2015-06-09 Thread Jarno Rajahalme

> On Jun 9, 2015, at 3:32 PM, Ben Pfaff  wrote:
> 
> On Tue, Jun 09, 2015 at 03:25:10PM -0700, Jarno Rajahalme wrote:
>> We set this field flag for the cases when an out_port should be
>> parsed, but failed to make use of it.
>> 
>> Two test cases needed to be updated due to use of out_port in flow
>> add, while out_port is legal for flow deletes only.
>> 
>> Suggested-by: Ben Pfaff 
>> Signed-off-by: Jarno Rajahalme 
> 
> Acked-by: Ben Pfaff 

Pushed to master,

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v4 03/12] classifier: Support table versioning

2015-06-09 Thread Jarno Rajahalme
This patch allows classifier rules to become visible and invisible in
specific versions.  A 'version' is defined as a positive monotonically
increasing integer, which never wraps around.

The new 'visibility' attribute replaces the prior 'to_be_removed' and
'visible' attributes.

When versioning is not used, the 'version' parameter should be passed
as 'CLS_MIN_VERSION' when creating rules, and 'CLS_MAX_VERSION' when
looking up flows.

This feature enables the support for atomic OpenFlow bundles without
significant performance penalty on 64-bit systems. There is a
performance decrease in 32-bit systems due to 64-bit atomics used.

Signed-off-by: Jarno Rajahalme 
---
 lib/classifier-private.h |   52 ++-
 lib/classifier.c |  221 +-
 lib/classifier.h |  148 +--
 lib/ovs-router.c |7 +-
 lib/tnl-ports.c  |9 +-
 ofproto/ofproto-dpif.c   |2 +-
 ofproto/ofproto.c|   94 +---
 tests/test-classifier.c  |6 +-
 utilities/ovs-ofctl.c|3 +-
 9 files changed, 347 insertions(+), 195 deletions(-)

diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index a7edbe9..f7462d2 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -79,13 +79,63 @@ struct cls_match {
 * 'indices'. */
 /* Accessed by all readers. */
 struct cmap_node cmap_node; /* Within struct cls_subtable 'rules'. */
-bool visible;
+
+/* Controls rule's visibility to lookups.
+ *
+ * When 'visibility' is:
+ *
+ * > 0  - rule is visible starting from version 'visibility'
+ * <= 0 - rule is invisible starting from version '-(visibility)'
+ *
+ * The minimum version number used in lookups is 1 (== CLS_NO_VERSION),
+ * which implies that when 'visibility' is:
+ *
+ * 1- rule is visible in all lookup versions
+ * 0- rule is invisible in all lookup versions. */
+atomic_llong visibility;
+
 const struct cls_rule *cls_rule;
 OVSRCU_TYPE(struct cls_conjunction_set *) conj_set;
 const struct miniflow flow; /* Matching rule. Mask is in the subtable. */
 /* 'flow' must be the last field. */
 };
 
+static inline void
+cls_match_set_visibility(struct cls_match *rule, long long version)
+{
+atomic_store_relaxed(&rule->visibility, version);
+}
+
+static inline bool
+cls_match_visible_in_version(const struct cls_match *rule, long long version)
+{
+long long visibility;
+
+/* clang does not want to read from a const atomic. */
+atomic_read_relaxed(&CONST_CAST(struct cls_match *, rule)->visibility,
+&visibility);
+
+if (OVS_LIKELY(visibility > 0)) {
+/* Rule is visible starting from version 'visibility'. */
+return version >= visibility;
+} else {
+/* Rule is invisible starting from version '-visibility'. */
+return version < -visibility;
+}
+}
+
+static inline bool
+cls_match_is_eventually_invisible(const struct cls_match *rule)
+{
+long long visibility;
+
+/* clang does not want to read from a const atomic. */
+atomic_read_relaxed(&CONST_CAST(struct cls_match *, rule)->visibility,
+&visibility);
+
+return visibility <= 0;
+}
+
 /* A longest-prefix match tree. */
 struct trie_node {
 uint32_t prefix;   /* Prefix bits for this node, MSB first. */
diff --git a/lib/classifier.c b/lib/classifier.c
index 6075cf7..2b2d3f6 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -99,7 +99,7 @@ cls_match_alloc(const struct cls_rule *rule,
 rculist_init(&cls_match->list);
 *CONST_CAST(const struct cls_rule **, &cls_match->cls_rule) = rule;
 *CONST_CAST(int *, &cls_match->priority) = rule->priority;
-cls_match->visible = false;
+atomic_init(&cls_match->visibility, 0);   /* Initially invisible. */
 miniflow_clone_inline(CONST_CAST(struct miniflow *, &cls_match->flow),
   &rule->match.flow, count);
 ovsrcu_set_hidden(&cls_match->conj_set,
@@ -115,6 +115,7 @@ static struct cls_subtable *insert_subtable(struct 
classifier *cls,
 static void destroy_subtable(struct classifier *cls, struct cls_subtable *);
 
 static const struct cls_match *find_match_wc(const struct cls_subtable *,
+ long long version,
  const struct flow *,
  struct trie_ctx *,
  unsigned int n_tries,
@@ -139,12 +140,12 @@ next_rule_in_list(const struct cls_match *rule, const 
struct cls_match *head)
 
 /* Return the next lower-priority rule in the list that is visible.  Multiple
  * identical rules with the same priority may exist transitionally.  In that
- * case the first rule of a given priority has been marked as 'to_be_removed',
- * and the late

[ovs-dev] [PATCH v4 01/12] ofproto: Rename *_begin functions as *_start.

2015-06-09 Thread Jarno Rajahalme
Weirdest things can bother you at night when you try to sleep ;-)

Now we have function triples such as add_flow_begin(),
add_flow_finish(), and add_flow_revert(), where a modification is
started in *_begin(), which can fail, and when successful can be
either made permanent with *_finish(), or cancelled with *_revert().

Linguistically it should be either "begin/end" or "start/finish", not
"begin/finish".  "begin/end" has some C++ STL baggage, so let's go
with "start/finish".  IMO "revert" rhymes with it, too.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto.c |   38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 029ff37..f2e9557 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -278,7 +278,7 @@ static bool ofproto_group_exists(const struct ofproto 
*ofproto,
 OVS_EXCLUDED(ofproto->groups_rwlock);
 static enum ofperr add_group(struct ofproto *, struct ofputil_group_mod *);
 static void handle_openflow(struct ofconn *, const struct ofpbuf *);
-static enum ofperr do_bundle_flow_mod_begin(struct ofproto *,
+static enum ofperr do_bundle_flow_mod_start(struct ofproto *,
 struct ofputil_flow_mod *,
 struct ofp_bundle_entry *)
 OVS_REQUIRES(ofproto_mutex);
@@ -4357,7 +4357,7 @@ set_conjunctions(struct rule *rule, const struct 
cls_conjunction *conjs,
  *
  * The caller retains ownership of 'fm->ofpacts'. */
 static enum ofperr
-add_flow_begin(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
+add_flow_start(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
struct rule **rulep, bool *modify)
 OVS_REQUIRES(ofproto_mutex)
 {
@@ -4469,7 +4469,7 @@ add_flow_begin(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 return 0;
 }
 
-/* Revert the effects of add_flow_begin().
+/* Revert the effects of add_flow_start().
  * 'new_rule' must be passed in as NULL, if no new rule was allocated and
  * inserted to the classifier.
  * Note: evictions cannot be reverted. */
@@ -4706,7 +4706,7 @@ modify_flows__(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 }
 
 static enum ofperr
-modify_flows_begin__(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
+modify_flows_start__(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
  struct rule_collection *rules)
 OVS_REQUIRES(ofproto_mutex)
 {
@@ -4718,7 +4718,7 @@ modify_flows_begin__(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
  || fm->new_cookie == OVS_BE64_MAX)) {
 bool modify;
 
-error = add_flow_begin(ofproto, fm, &rules->rules[0], &modify);
+error = add_flow_start(ofproto, fm, &rules->rules[0], &modify);
 if (!error) {
 ovs_assert(!modify);
 }
@@ -4735,7 +4735,7 @@ modify_flows_begin__(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
  * 'ofconn' is used to retrieve the packet buffer specified in fm->buffer_id,
  * if any. */
 static enum ofperr
-modify_flows_begin_loose(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
+modify_flows_start_loose(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
  struct rule_collection *rules)
 OVS_REQUIRES(ofproto_mutex)
 {
@@ -4750,7 +4750,7 @@ modify_flows_begin_loose(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 rule_criteria_destroy(&criteria);
 
 if (!error) {
-error = modify_flows_begin__(ofproto, fm, rules);
+error = modify_flows_start__(ofproto, fm, rules);
 }
 
 if (error) {
@@ -4788,7 +4788,7 @@ modify_flows_finish(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 /* Implements OFPFC_MODIFY_STRICT.  Returns 0 on success or an OpenFlow error
  * code on failure. */
 static enum ofperr
-modify_flow_begin_strict(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
+modify_flow_start_strict(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
  struct rule_collection *rules)
 OVS_REQUIRES(ofproto_mutex)
 {
@@ -4804,7 +4804,7 @@ modify_flow_begin_strict(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 
 if (!error) {
 /* collect_rules_strict() can return max 1 rule. */
-error = modify_flows_begin__(ofproto, fm, rules);
+error = modify_flows_start__(ofproto, fm, rules);
 }
 
 if (error) {
@@ -4865,7 +4865,7 @@ delete_flows__(const struct rule_collection *rules,
 
 /* Implements OFPFC_DELETE. */
 static enum ofperr
-delete_flows_begin_loose(struct ofproto *ofproto,
+delete_flows_start_loose(struct ofproto *ofproto,
  const struct ofputil_flow_mod *fm,
  struct rule_collection *rules)
 OVS_REQUIRES(ofproto_mutex)
@@ -4916,7 +4916,7 @@ delete_flows_finish(const struct ofputil_flow_mod *fm,
 
 /* Implements OFPFC_DELETE_STRICT. */
 static enum ofperr
-delete_flow_begin_strict(s

[ovs-dev] [PATCH v4 00/12] Atomic bundles with classifier versioning.

2015-06-09 Thread Jarno Rajahalme
This series adds support for atomic flow mod bundles with classifier
versioning.

Changes since v3:

- Added classifier testing for versioning features (patch 05).

- Added 'version' to cls_rule, since all users needed one anyway.
  This is now used also making iterations version-aware (Any version
  can be iterated) (patch 03).

- Classifier API changes are slightly simplified from v3.


Changes since v2:

- ovs-ofctl 'bundle' command added in v2 is now folded in to the
 existing 'add-flow' and 'add-flows' commands.  Flow mods are
 executed serially, separated by barriers, so this new behavior works
 with all versions of OpenFlow.  With the new '--bundle' option the
 mods are executed as an ordered OpenFlow 1.4 bundle (patch 02).
 Later patches in the series make these bundles also atomic.

- Classifier 'to_be_removed' and 'visible' flags are now folded in
 into a single 'visibility' attribute, which also has cleaner
 definition than in v2 (patch 03).

- Classifier traversing of identical rules list is now more robust.
 It has been somewhat fragile since the support for duplicates with
 identical priorities was recently added (patch 04).

- Flow counts in each oftable are now explicitly maintained so that
 versioning is taken into account.  Earlier (since patch 07) the
 count of rules in a classifier was used as the table's flow count,
 but now that the classifier may temporarily hold duplicate rules
 visible in different versions, and the removals are RCU-postponed,
 the classifier rule count no longer represents the control plane
 notion of table's rule count (patch 08).

- Handling evictions was broken in v2 (and in patch 07), as eviction
 took place early in the commit, and actually inappropriately bumped
 the version number too early.  Now eviction is treated much like a
 flow modification, where a new rule replaces the old one, but just
 without any 'inheritance' from the evicted rule to the new rule.
 This makes evictions to be executed only when commit is successful,
 as evictions are reverted like any other changes when the commit
 fails (patch 09).

- Sending flow removed messages is now postponed to the actual
 destruction of the rule.  Previously, the rule could have accrued
 stats even after the flow removed messages was sent (patch 10).

- Bundled messages now take less momory by using a struct minimatch
 instead of a struct match for flow mods (patch 11).

- Bundles now support also port mods, but only for non-atomic bundles.
 The reason for this is that we have easy way of making port status
 changes atomic w.r.t. classifier version number changes (patch 12).

For correctness, patches 07, 08, and 09 need to be squashed before
committing, but these are separated in this series to make reviewing
easier (and as a test for Ben for either reading this text or noticing
the problems while reading patch 07 :-).


Jarno Rajahalme (12):
  ofproto: Rename *_begin functions as *_start.
  ovs-ofctl: Add bundle support and unit testing.
  classifier: Support table versioning
  classifier: Make traversing identical rules robust.
  test-classifier: Test versioning features.
  ofproto: Infra for table versioning.
  Use classifier versioning.
  ofproto: Accurate flow counts.
  ofproto: Revertible eviction.
  ofproto: Postpone sending flow removed messages.
  ofproto: Use minimatch for making bundles smaller.
  ofproto: Support port mods in bundles.

 NEWS   |   25 +-
 OPENFLOW-1.1+.md   |6 +
 include/openflow/openflow-common.h |2 +
 include/openvswitch/vconn.h|3 +
 lib/classifier-private.h   |  158 -
 lib/classifier.c   |  349 +-
 lib/classifier.h   |  150 ++--
 lib/match.c|9 +
 lib/match.h|1 +
 lib/ofp-parse.c|   40 +-
 lib/ofp-parse.h|6 +-
 lib/ofp-print.c|1 +
 lib/ofp-util.c |   44 ++
 lib/ofp-util.h |  102 +--
 lib/ofp-version-opt.c  |7 +
 lib/ofp-version-opt.h  |1 +
 lib/ovs-router.c   |7 +-
 lib/tnl-ports.c|9 +-
 lib/vconn.c|  238 ++-
 ofproto/bundles.h  |   15 +-
 ofproto/connmgr.c  |2 +-
 ofproto/ofproto-dpif-xlate.c   |   17 +-
 ofproto/ofproto-dpif.c |  128 ++--
 ofproto/ofproto-dpif.h |5 +-
 ofproto/ofproto-provider.h |   78 +--
 ofproto/ofproto.c  | 1336 +---
 tests/classifier.at|8 +-
 tests/ofproto-macros.at|   10 +
 tests/ofproto.at   |  244 +++
 tests/ovs-ofctl.at |  107 +++
 tests/test-classifier.c|  301 ++--
 utilities/ovs-ofctl.8.in   |   59 +-
 utilities/ovs-ofctl.c  |   91 ++-
 33 files 

[ovs-dev] [PATCH v4 04/12] classifier: Make traversing identical rules robust.

2015-06-09 Thread Jarno Rajahalme
The traversal of the list of identical rules from the lookup threads
is fragile in the list head is removed during the list traversal.

This patch simplifies the implementation of that list by making the
list NULL terminated, singly linked RCU-protected list.  By having the
NULL at the end there is no longer a possiblity of missing the point
when the list wraps around.  This is significant when there can be
multiple elements with the same priority in the list.

This change also decreases the size of the struct cls_match back
pre-'visibility' attribute size.

Signed-off-by: Jarno Rajahalme 
---
 lib/classifier-private.h |  106 +-
 lib/classifier.c |  128 ++
 tests/test-classifier.c  |2 +-
 3 files changed, 154 insertions(+), 82 deletions(-)

diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index f7462d2..01b31e0 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -68,7 +68,12 @@ struct cls_partition {
 /* Internal representation of a rule in a "struct cls_subtable". */
 struct cls_match {
 /* Accessed by everybody. */
-struct rculist list; /* Identical, lower-priority "cls_match"es. */
+OVSRCU_TYPE(struct cls_match *) next; /* Identical "cls_match"es in the
+   * order of decreasing priority.  At
+   * most one rule of a given priority
+   * may be visible to any given lookup
+   * version. */
+OVSRCU_TYPE(struct cls_conjunction_set *) conj_set;
 
 /* Accessed only by writers. */
 struct cls_partition *partition;
@@ -95,7 +100,6 @@ struct cls_match {
 atomic_llong visibility;
 
 const struct cls_rule *cls_rule;
-OVSRCU_TYPE(struct cls_conjunction_set *) conj_set;
 const struct miniflow flow; /* Matching rule. Mask is in the subtable. */
 /* 'flow' must be the last field. */
 };
@@ -136,6 +140,104 @@ cls_match_is_eventually_invisible(const struct cls_match 
*rule)
 return visibility <= 0;
 }
 
+
+/* cls_match 'next' */
+
+static inline const struct cls_match *
+cls_match_next(const struct cls_match *rule)
+{
+return ovsrcu_get(struct cls_match *, &rule->next);
+}
+
+static inline struct cls_match *
+cls_match_next_protected(const struct cls_match *rule)
+{
+return ovsrcu_get_protected(struct cls_match *, &rule->next);
+}
+
+/* Puts 'rule' in the position between 'prev' and 'next'. If 'prev' == NULL,
+ * then the 'rule' is the new list head, and if 'next' == NULL, the rule is the
+ * new list tail. */
+static inline void
+cls_match_insert(struct cls_match *prev, struct cls_match *next,
+ struct cls_match *rule)
+{
+if (!prev) {
+/* 'rule' is the new head. */
+ovsrcu_set_hidden(&rule->next, next);
+} else {
+/* 'rule' is new node after 'prev' */
+ovsrcu_set_hidden(&rule->next, cls_match_next_protected(prev));
+ovsrcu_set(&prev->next, rule);
+}
+}
+
+void cls_match_poison(struct cls_match *);
+
+/* Puts 'new_rule' in the position of 'old_rule', which is the next node after
+ * 'prev'. If 'prev' == NULL, then the 'rule' is the new list head.
+ *
+ * Returns the replaced cls_match, which is not linked to any more, but still
+ * links to the later rules, and may still be referenced by other threads until
+ * all other threads quiesce.  The returned rule may not be re-inserted,
+ * re-initialized, or deleted until after all other threads have quiesced (use
+ * ovsrcu_postpone). */
+static inline void
+cls_match_replace(struct cls_match *prev,
+  struct cls_match *old_rule, struct cls_match *new_rule)
+{
+if (prev == NULL) {
+/* 'new_rule' replaces the old head. */
+ovsrcu_set_hidden(&new_rule->next, cls_match_next_protected(old_rule));
+} else {
+/* 'new_rule' replaces an existing non-head node. */
+struct cls_match *next = cls_match_next_protected(old_rule);
+
+ovsrcu_set_hidden(&new_rule->next, next);
+ovsrcu_set(&prev->next, new_rule);
+}
+
+#ifndef NDEBUG
+ovsrcu_postpone(cls_match_poison, old_rule);
+#endif
+}
+
+/* Removes 'rule' following 'prev' from the list. If 'prev' is NULL, then the
+ * 'rule' is a list head, and the 'rule' is only poisoned after all threads
+ * have quiesced.
+ *
+ * Afterward, the removed rule is not linked to any more, but still links to
+ * the following rules, and may still be referenced by other threads until all
+ * other threads quiesce.  The removed rule may not be re-inserted,
+ * re-initialized, or deleted until after all other threads have quiesced (use
+ * ovsrcu_postpone).
+ */
+static inline void
+cls_match_remove(struct cls_match *prev, struct cls_match *rule)
+{
+if (prev) {
+struct cls_match *next = cls_match_next_protected(rule);
+
+ovsrcu_set(&prev->next, next);
+

[ovs-dev] [PATCH v4 06/12] ofproto: Infra for table versioning.

2015-06-09 Thread Jarno Rajahalme
Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto-dpif-xlate.c |   17 -
 ofproto/ofproto-dpif.c   |   43 +-
 ofproto/ofproto-dpif.h   |5 -
 ofproto/ofproto-provider.h   |6 ++
 ofproto/ofproto.c|5 +
 5 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 59cd088..ac3b99e 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -159,6 +159,9 @@ struct xlate_ctx {
 
 const struct xbridge *xbridge;
 
+/* Flow tables version at the beginning of the translation. */
+long long tables_version;
+
 /* Flow at the last commit. */
 struct flow base_flow;
 
@@ -2774,6 +2777,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t 
ofp_port,
 const struct xport *peer = xport->peer;
 struct flow old_flow = ctx->xin->flow;
 bool old_was_mpls = ctx->was_mpls;
+long long old_version = ctx->tables_version;
 enum slow_path_reason special;
 struct ofpbuf old_stack = ctx->stack;
 union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
@@ -2789,6 +2793,10 @@ compose_output_action__(struct xlate_ctx *ctx, 
ofp_port_t ofp_port,
 memset(flow->regs, 0, sizeof flow->regs);
 flow->actset_output = OFPP_UNSET;
 
+/* Set tables version after the bridge has been fixed. */
+ctx->tables_version
+= ofproto_dpif_get_tables_version(ctx->xbridge->ofproto);
+
 special = process_special(ctx, &ctx->xin->flow, peer,
   ctx->xin->packet);
 if (special) {
@@ -2835,6 +2843,9 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t 
ofp_port,
 ofpbuf_uninit(&ctx->stack);
 ctx->stack = old_stack;
 
+/* Restore calling bridge's lookup version. */
+ctx->tables_version = old_version;
+
 /* The peer bridge popping MPLS should have no effect on the original
  * bridge. */
 ctx->was_mpls = old_was_mpls;
@@ -3056,6 +3067,7 @@ xlate_table_action(struct xlate_ctx *ctx, ofp_port_t 
in_port, uint8_t table_id,
 wc = (ctx->xin->skip_wildcards) ? NULL : &ctx->xout->wc;
 
 rule = rule_dpif_lookup_from_table(ctx->xbridge->ofproto,
+   ctx->tables_version,
&ctx->xin->flow, wc,
ctx->xin->xcache != NULL,
ctx->xin->resubmit_stats,
@@ -4826,9 +4838,12 @@ xlate_actions(struct xlate_in *xin, struct xlate_out 
*xout)
  flow->recirc_id);
 return;
 }
+/* Set tables version after the bridge has been fixed. */
+ctx.tables_version = ofproto_dpif_get_tables_version(ctx.xbridge->ofproto);
 
 if (!xin->ofpacts && !ctx.rule) {
-rule = rule_dpif_lookup_from_table(ctx.xbridge->ofproto, flow, wc,
+rule = rule_dpif_lookup_from_table(ctx.xbridge->ofproto,
+   ctx.tables_version, flow, wc,
ctx.xin->xcache != NULL,
ctx.xin->resubmit_stats,
&ctx.table_id,
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 81beca0..cc5d9d4 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -287,6 +287,8 @@ struct ofproto_dpif {
 struct ofproto up;
 struct dpif_backer *backer;
 
+atomic_llong tables_version;  /* Version # to use in classifier lookups. */
+
 uint64_t dump_seq; /* Last read of udpif_dump_seq(). */
 
 /* Special OpenFlow rules. */
@@ -1227,6 +1229,7 @@ construct(struct ofproto *ofproto_)
 return error;
 }
 
+atomic_init(&ofproto->tables_version, CLS_MIN_VERSION);
 ofproto->netflow = NULL;
 ofproto->sflow = NULL;
 ofproto->ipfix = NULL;
@@ -1605,6 +1608,15 @@ query_tables(struct ofproto *ofproto,
 }
 }
 
+static void
+set_tables_version(struct ofproto *ofproto_, long long version)
+{
+struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+
+atomic_store_relaxed(&ofproto->tables_version, version);
+}
+
+
 static struct ofport *
 port_alloc(void)
 {
@@ -3709,6 +3721,15 @@ rule_set_recirc_id(struct rule *rule_, uint32_t id)
 ovs_mutex_unlock(&rule->up.mutex);
 }
 
+long long
+ofproto_dpif_get_tables_version(struct ofproto_dpif *ofproto)
+{
+long long version;
+
+atomic_read_relaxed(&ofproto->tables_version, &version);
+return version;
+}
+
 /* The returned rule (if any) is valid at least until the next RCU quiescent
  * period.  If the rule needs to stay around longer, a non-zero 'take_ref'
  * must be passed in to cause a reference to be taken on it.
@@ -3716,16 +3737,16 @@ rule_set_recirc_id(struct rule *rule_, uint32_t id)
  * 'flow' is non

[ovs-dev] [PATCH v4 07/12] Use classifier versioning.

2015-06-09 Thread Jarno Rajahalme
Each rule is now added or deleted in a specific tables version.  Flow
tables are versioned with a monotonically increasing 64-bit integer,
where positive values are valid version numbers.

Rule modifications are implemented as an insertion of a new rule and a
deletion of the old rule, both taking place in the same tables
version.  Since concurrent lookups may use different versions, both
the old and new rule must be available for lookups at the same time.

The ofproto provider interface is changed to accomodate the above.  As
rule's actions need not be modified any more, we no longer need
'rule_premodify_actions', nor 'rule_modify_actions'.  'rule_insert'
now takes a pointer to the old rule and adds a flag that tells whether
the old stats should be forwarded to the new rule or not (this
replaces the 'reset_counters' flag of the now removed
'rule_modify_actions').

Versioning all flow table changes has the side effect of making
learned flows visible for future lookups only.  I.e., the upcall that
executes the learn action, will not see the newly learned action in
it's classifier lookups.  Only upcalls that start executing after the
new flow was added will match on it.

Signed-off-by: Jarno Rajahalme 
---
 NEWS   |   21 +-
 lib/classifier.c   |   14 +-
 lib/classifier.h   |2 +
 ofproto/bundles.h  |5 +-
 ofproto/ofproto-dpif.c |   87 ++--
 ofproto/ofproto-provider.h |   59 +--
 ofproto/ofproto.c  |  950 
 tests/ofproto.at   |   42 +-
 tests/ovs-ofctl.at |   40 +-
 utilities/ovs-ofctl.8.in   |   19 +-
 utilities/ovs-ofctl.c  |4 +-
 11 files changed, 675 insertions(+), 568 deletions(-)

diff --git a/NEWS b/NEWS
index 5bea237..f171cfc 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,10 @@
 Post-v2.3.0
 -
+   - Flow table modifications are now atomic, meaning that each packet
+ now sees a coherent version of the OpenFlow pipeline.  For
+ example, if a controller removes all flows, no packet sees an
+ intermediate version of the OpenFlow pipeline where only some of
+ the flows have been deleted.
- Added support for SFQ, FQ_CoDel and CoDel qdiscs.
- Add bash command-line completion support for ovs-vsctl Please check
  utilities/ovs-command-compgen.INSTALL.md for how to use.
@@ -28,10 +33,10 @@ Post-v2.3.0
  release.  See ovs-vswitchd(8) for details.
- OpenFlow:
  * OpenFlow 1.4 bundles are now supported, but for flow mod
-   messages only. 'atomic' bundles are not yet supported, and
-   'ordered' bundles are trivially supported, as all bundled
-   messages are executed in the order they were added to the
-   bundle regardless of the presence of the 'ordered' flag.
+   messages only.  Both 'atomic' and 'ordered' bundle flags are
+   trivially supported, as all bundled messages are executed in
+   the order they were added and all flow table modifications are
+   now atomic to the datapath.
  * IPv6 flow label and neighbor discovery fields are now modifiable.
  * OpenFlow 1.5 extended registers are now supported.
  * The OpenFlow 1.5 actset_output field is now supported.
@@ -49,15 +54,15 @@ Post-v2.3.0
- ovs-ofctl has a new '--bundle' option that makes the flow mod commands
  ('add-flow', 'add-flows', 'mod-flows', 'del-flows', and 'replace-flows')
  use an OpenFlow 1.4 bundle to operate the modifications as a single
- transaction.  If any of the flow mods in a transaction fail, none of
- them are executed.
+ atomic transaction.  If any of the flow mods in a transaction fail, none
+ of them are executed.  All flow mods in a bundle appear to datapath
+ lookups simultaneously.
- ovs-ofctl 'add-flow' and 'add-flows' commands now accept arbitrary flow
  mods as an input by allowing the flow specification to start with an
  explicit 'add', 'modify', 'modify_strict', 'delete', or 'delete_strict'
  keyword.  A missing keyword is treated as 'add', so this is fully
  backwards compatible.  With the new '--bundle' option all the flow mods
- are executed as a single transaction using the new OpenFlow 1.4 bundles
- support.
+ are executed as a single atomic transaction using an OpenFlow 1.4 bundle.
- ovs-pki: Changed message digest algorithm from MD5 to SHA-1 because
  MD5 is no longer secure and some operating systems have started to disable
  it in OpenSSL.
diff --git a/lib/classifier.c b/lib/classifier.c
index feebd80..be92151 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -202,14 +202,24 @@ cls_rule_init_from_minimatch(struct cls_rule *rule,
 minimatch_clone(CONST_CAST(struct minimatch *, &rule->match), match);
 }
 
+/* Initializes 'dst' as a copy of 'src', but with 'version'.
+ *
+ * The caller must eventually destroy 'dst' with cls_rule_destroy(). */
+void
+cls_rule_clone_in_version(struct cls_rule *dst, const str

[ovs-dev] [PATCH v4 08/12] ofproto: Accurate flow counts.

2015-06-09 Thread Jarno Rajahalme
Classifier's rule count now contains temporary duplicates and rules
whose deletion has been deferred.  Maintain a new 'n_flows' count in
struct oftable to as the count of rules in the latest version.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/connmgr.c  |2 +-
 ofproto/ofproto-provider.h |5 -
 ofproto/ofproto.c  |   32 ++--
 3 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
index 1fee860..975ee33 100644
--- a/ofproto/connmgr.c
+++ b/ofproto/connmgr.c
@@ -2027,7 +2027,7 @@ connmgr_flushed(struct connmgr *mgr)
 
 /* Returns the number of hidden rules created by the in-band and fail-open
  * implementations in table 0.  (Subtracting this count from the number of
- * rules in the table 0 classifier, as returned by classifier_count(), yields
+ * rules in the table 0 classifier, as maintained in struct oftable, yields
  * the number of flows that OVS should report via OpenFlow for table 0.) */
 int
 connmgr_count_hidden_rules(const struct connmgr *mgr)
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 07229c5..fd66e49 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -220,6 +220,9 @@ struct oftable {
 /* Maximum number of flows or UINT_MAX if there is no limit besides any
  * limit imposed by resource limitations. */
 unsigned int max_flows;
+/* Current number of flows, not counting temporary duplicates nor deferred
+ * deletions. */
+unsigned int n_flows;
 
 /* These members determine the handling of an attempt to add a flow that
  * would cause the table to have more than 'max_flows' flows.
@@ -818,7 +821,7 @@ struct ofproto_class {
  *
  *   - 'table_id' to the array index.
  *
- *   - 'active_count' to the classifier_count() for the table.
+ *   - 'active_count' to the 'n_flows' of struct ofproto for the table.
  *
  *   - 'lookup_count' and 'matched_count' to 0.
  *
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 35e1ed2..67ade45 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1698,12 +1698,11 @@ ofproto_run(struct ofproto *p)
 continue;
 }
 
-if (classifier_count(&table->cls) > 10) {
+if (table->n_flows > 10) {
 static struct vlog_rate_limit count_rl =
 VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(&count_rl, "Table %"PRIuSIZE" has an excessive"
- " number of rules: %d", i,
- classifier_count(&table->cls));
+ " number of rules: %d", i, table->n_flows);
 }
 
 ovs_mutex_lock(&ofproto_mutex);
@@ -1797,7 +1796,7 @@ ofproto_get_memory_usage(const struct ofproto *ofproto, 
struct simap *usage)
 
 n_rules = 0;
 OFPROTO_FOR_EACH_TABLE (table, ofproto) {
-n_rules += classifier_count(&table->cls);
+n_rules += table->n_flows;
 }
 simap_increase(usage, "rules", n_rules);
 
@@ -3160,10 +3159,9 @@ query_tables(struct ofproto *ofproto,
 stats = *statsp = xcalloc(ofproto->n_tables, sizeof *stats);
 for (i = 0; i < ofproto->n_tables; i++) {
 struct ofputil_table_stats *s = &stats[i];
-struct classifier *cls = &ofproto->tables[i].cls;
 
 s->table_id = i;
-s->active_count = classifier_count(cls);
+s->active_count = ofproto->tables[i].n_flows;
 if (i == 0) {
 s->active_count -= connmgr_count_hidden_rules(
 ofproto->connmgr);
@@ -4368,7 +4366,7 @@ evict_rules_from_table(struct oftable *table, unsigned 
int extra_space)
 {
 enum ofperr error = 0;
 struct rule_collection rules;
-unsigned int count = classifier_count(&table->cls) + extra_space;
+unsigned int count = table->n_flows + extra_space;
 unsigned int max_flows = table->max_flows;
 
 rule_collection_init(&rules);
@@ -4664,6 +4662,8 @@ replace_rule_start(struct ofproto *ofproto,
 cls_rule_make_invisible_in_version(&old_rule->cr,
ofproto->tables_version + 1,
ofproto->tables_version);
+} else {
+table->n_flows++;
 }
 /* Insert flow to the classifier, so that later flow_mods may relate
  * to it.  This is reversible, in case later errors require this to
@@ -4679,9 +4679,12 @@ static void replace_rule_revert(struct ofproto *ofproto,
 {
 struct oftable *table = &ofproto->tables[new_rule->table_id];
 
-/* Restore the original visibility of the old rule. */
 if (old_rule) {
+/* Restore the original visibility of the old rule. */
 cls_rule_restore_visibility(&old_rule->cr);
+} else {
+/* Restore table's rule count. */
+table->n_flows--;
 }
 
 /* Remove the new rule immediately. 

[ovs-dev] [PATCH v4 05/12] test-classifier: Test versioning features.

2015-06-09 Thread Jarno Rajahalme
Signed-off-by: Jarno Rajahalme 
---
 tests/classifier.at |8 +-
 tests/test-classifier.c |  301 +--
 2 files changed, 245 insertions(+), 64 deletions(-)

diff --git a/tests/classifier.at b/tests/classifier.at
index cfa1bc7..1e75123 100644
--- a/tests/classifier.at
+++ b/tests/classifier.at
@@ -6,11 +6,15 @@ m4_foreach(
[single-rule],
[rule-replacement],
[many-rules-in-one-list],
+   [versioned many-rules-in-one-list],
[many-rules-in-one-table],
+   [versioned many-rules-in-one-table],
[many-rules-in-two-tables],
-   [many-rules-in-five-tables]],
+   [versioned many-rules-in-two-tables],
+   [many-rules-in-five-tables],
+   [versioned many-rules-in-five-tables]],
   [AT_SETUP([flow classifier - m4_bpatsubst(testname, [-], [ ])])
-   AT_CHECK([ovstest test-classifier testname], [0], [], [])
+   AT_CHECK([ovstest test-classifier m4_bpatsubst(testname, [versioned], 
[--versioned])], [0], [], [])
AT_CLEANUP])])
 
 AT_BANNER([miniflow unit tests])
diff --git a/tests/test-classifier.c b/tests/test-classifier.c
index cb65533..9e837c3 100644
--- a/tests/test-classifier.c
+++ b/tests/test-classifier.c
@@ -42,6 +42,8 @@
 #include "unaligned.h"
 #include "util.h"
 
+static bool versioned = false;
+
 /* Fields in a rule. */
 #define CLS_FIELDS\
 /*struct flowall-caps */  \
@@ -88,6 +90,7 @@ static const struct cls_field cls_fields[CLS_N_FIELDS] = {
 };
 
 struct test_rule {
+struct ovs_list list_node;
 int aux;/* Auxiliary data. */
 struct cls_rule cls_rule;   /* Classifier rule data. */
 };
@@ -107,7 +110,8 @@ test_rule_destroy(struct test_rule *rule)
 }
 }
 
-static struct test_rule *make_rule(int wc_fields, int priority, int value_pat);
+static struct test_rule *make_rule(int wc_fields, int priority, int value_pat,
+   long long version);
 static void free_rule(struct test_rule *);
 static struct test_rule *clone_rule(const struct test_rule *);
 
@@ -398,12 +402,13 @@ get_value(unsigned int *x, unsigned n_values)
 }
 
 static void
-compare_classifiers(struct classifier *cls, struct tcls *tcls)
+compare_classifiers(struct classifier *cls, size_t n_invisible_rules,
+long long version, struct tcls *tcls)
 {
 static const int confidence = 500;
 unsigned int i;
 
-assert(classifier_count(cls) == tcls->n_rules);
+assert(classifier_count(cls) == tcls->n_rules + n_invisible_rules);
 for (i = 0; i < confidence; i++) {
 const struct cls_rule *cr0, *cr1, *cr2;
 struct flow flow;
@@ -433,7 +438,7 @@ compare_classifiers(struct classifier *cls, struct tcls 
*tcls)
 /* This assertion is here to suppress a GCC 4.9 array-bounds warning */
 ovs_assert(cls->n_tries <= CLS_MAX_TRIES);
 
-cr0 = classifier_lookup(cls, CLS_MAX_VERSION, &flow, &wc);
+cr0 = classifier_lookup(cls, version, &flow, &wc);
 cr1 = tcls_lookup(tcls, &flow);
 assert((cr0 == NULL) == (cr1 == NULL));
 if (cr0 != NULL) {
@@ -442,8 +447,12 @@ compare_classifiers(struct classifier *cls, struct tcls 
*tcls)
 
 assert(cls_rule_equal(cr0, cr1));
 assert(tr0->aux == tr1->aux);
+
+/* Make sure the rule should have been visible. */
+assert(cr0->cls_match);
+assert(cls_match_visible_in_version(cr0->cls_match, version));
 }
-cr2 = classifier_lookup(cls, CLS_MAX_VERSION, &flow, NULL);
+cr2 = classifier_lookup(cls, version, &flow, NULL);
 assert(cr2 == cr0);
 }
 }
@@ -511,14 +520,17 @@ verify_tries(struct classifier *cls)
 
 static void
 check_tables(const struct classifier *cls, int n_tables, int n_rules,
- int n_dups)
+ int n_dups, int n_invisible, long long version)
 OVS_NO_THREAD_SAFETY_ANALYSIS
 {
 const struct cls_subtable *table;
 struct test_rule *test_rule;
 int found_tables = 0;
+int found_tables_with_visible_rules = 0;
 int found_rules = 0;
 int found_dups = 0;
+int found_invisible = 0;
+int found_visible_but_removable = 0;
 int found_rules2 = 0;
 
 pvector_verify(&cls->subtables);
@@ -527,6 +539,7 @@ check_tables(const struct classifier *cls, int n_tables, 
int n_rules,
 int max_priority = INT_MIN;
 unsigned int max_count = 0;
 bool found = false;
+bool found_visible_rules = false;
 const struct cls_subtable *iter;
 
 /* Locate the subtable from 'subtables'. */
@@ -548,47 +561,105 @@ check_tables(const struct classifier *cls, int n_tables, 
int n_rules,
== (table->ports_mask_len ? cmap_count(&table->rules) : 0));
 
 found_tables++;
+
 CMAP_FOR_EACH (head, cmap_node, &table->rules) {
 int prev_priority = INT_MAX;
-const struct cls_match *rule;
+long long prev_version = 0;
+const struct cls_mat

[ovs-dev] [PATCH v4 09/12] ofproto: Revertible eviction.

2015-06-09 Thread Jarno Rajahalme
Handling evictions was broken in the previous patches. Eviction took
place early in the commit, and actually inappropriately bumped the
version number too early.  Now eviction is treated much like a flow
modification, where a new rule replaces the old one, but just without
any 'inheritance' from the evicted rule to the new rule.  This makes
evictions to be executed only when commit is successful, as evictions
are reverted like any other changes when the commit fails.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto.c |  121 ++---
 1 file changed, 77 insertions(+), 44 deletions(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 67ade45..be65d24 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -81,8 +81,7 @@ static void oftable_destroy(struct oftable *);
 
 static void oftable_set_name(struct oftable *, const char *name);
 
-static enum ofperr evict_rules_from_table(struct oftable *,
-  unsigned int extra_space)
+static enum ofperr evict_rules_from_table(struct oftable *)
 OVS_REQUIRES(ofproto_mutex);
 static void oftable_disable_eviction(struct oftable *);
 static void oftable_enable_eviction(struct oftable *,
@@ -265,8 +264,8 @@ static void replace_rule_start(struct ofproto *,
struct cls_conjunction *, size_t n_conjs)
 OVS_REQUIRES(ofproto_mutex);
 
-static void replace_rule_revert(struct ofproto *,
-struct rule *old_rule, struct rule *new_rule)
+static void replace_rule_revert(struct ofproto *, struct rule *old_rule,
+struct rule *new_rule)
 OVS_REQUIRES(ofproto_mutex);
 
 static void replace_rule_finish(struct ofproto *, struct ofputil_flow_mod *,
@@ -1433,7 +1432,7 @@ ofproto_configure_table(struct ofproto *ofproto, int 
table_id,
 }
 
 ovs_mutex_lock(&ofproto_mutex);
-evict_rules_from_table(table, 0);
+evict_rules_from_table(table);
 ovs_mutex_unlock(&ofproto_mutex);
 }
 
@@ -4361,12 +4360,12 @@ handle_queue_stats_request(struct ofconn *ofconn,
 }
 
 static enum ofperr
-evict_rules_from_table(struct oftable *table, unsigned int extra_space)
+evict_rules_from_table(struct oftable *table)
 OVS_REQUIRES(ofproto_mutex)
 {
 enum ofperr error = 0;
 struct rule_collection rules;
-unsigned int count = table->n_flows + extra_space;
+unsigned int count = table->n_flows;
 unsigned int max_flows = table->max_flows;
 
 rule_collection_init(&rules);
@@ -4501,10 +4500,17 @@ add_flow_start(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 }
 
 /* If necessary, evict an existing rule to clear out space. */
-error = evict_rules_from_table(table, 1);
-if (error) {
-cls_rule_destroy(&cr);
-return error;
+if (table->n_flows >= table->max_flows) {
+if (!choose_rule_to_evict(table, &rule)) {
+error = OFPERR_OFPFMFC_TABLE_FULL;
+cls_rule_destroy(&cr);
+return error;
+}
+eviction_group_remove_rule(rule);
+/* Marks '*old_rule' as an evicted rule rather than replaced rule.
+ */
+fm->delete_reason = OFPRR_EVICTION;
+*old_rule = rule;
 }
 } else {
 fm->modify_cookie = true;
@@ -4524,13 +4530,17 @@ add_flow_start(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 return 0;
 }
 
-/* Revert the effects of add_flow_start().
- * XXX: evictions cannot be reverted. */
+/* Revert the effects of add_flow_start(). */
 static void
-add_flow_revert(struct ofproto *ofproto, struct rule *old_rule,
-struct rule *new_rule)
+add_flow_revert(struct ofproto *ofproto, struct ofputil_flow_mod *fm,
+struct rule *old_rule, struct rule *new_rule)
 OVS_REQUIRES(ofproto_mutex)
 {
+if (old_rule && fm->delete_reason == OFPRR_EVICTION) {
+/* Revert the eviction. */
+eviction_group_add_rule(old_rule);
+}
+
 replace_rule_revert(ofproto, old_rule, new_rule);
 }
 
@@ -4616,7 +4626,7 @@ replace_rule_create(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 rule->modify_seqno = 0;
 
 /* Copy values from old rule for modify semantics. */
-if (old_rule) {
+if (old_rule && fm->delete_reason != OFPRR_EVICTION) {
 /*  'fm' says that  */
 bool change_cookie = (fm->modify_cookie
   && fm->new_cookie != OVS_BE64_MAX
@@ -4657,6 +4667,7 @@ replace_rule_start(struct ofproto *ofproto,
 {
 struct oftable *table = &ofproto->tables[new_rule->table_id];
 
+/* 'old_rule' may be either an evicted rule or replaced rule. */
 if (old_rule) {
 /* Mark the old rule for removal in the next version. */
 cls_rule_make_invisible_in_version(&old_rule->cr,
@@ -4706,41 +4717,58 @@ replace_rule_finish(struct ofproto *ofproto, struct 
ofputil_flow_mod 

[ovs-dev] [PATCH v4 12/12] ofproto: Support port mods in bundles.

2015-06-09 Thread Jarno Rajahalme
Add support for port mods in an OpenFlow 1.4 bundle, as required for
the minimum support level by the OpenFlow 1.4 specification.  If the
bundle includes port mods, it may not specify the OFPBF_ATOMIC flag.
Port mods and flow mods in a bundle are always applied in order and
the consecutive flow mods between port mods are made available to
lookups atomically.

Note that ovs-ofctl does not support creating bundles with port mods.

Signed-off-by: Jarno Rajahalme 
---
 NEWS   |   11 +++--
 OPENFLOW-1.1+.md   |6 +++
 ofproto/bundles.h  |7 ++-
 ofproto/ofproto-provider.h |3 +-
 ofproto/ofproto.c  |  112 
 5 files changed, 101 insertions(+), 38 deletions(-)

diff --git a/NEWS b/NEWS
index f171cfc..c3524b6 100644
--- a/NEWS
+++ b/NEWS
@@ -32,11 +32,12 @@ Post-v2.3.0
  commands are now redundant and will be removed in a future
  release.  See ovs-vswitchd(8) for details.
- OpenFlow:
- * OpenFlow 1.4 bundles are now supported, but for flow mod
-   messages only.  Both 'atomic' and 'ordered' bundle flags are
-   trivially supported, as all bundled messages are executed in
-   the order they were added and all flow table modifications are
-   now atomic to the datapath.
+ * OpenFlow 1.4 bundles are now supported for flow mods and port
+   mods.  For flow mods, both 'atomic' and 'ordered' bundle flags
+   are trivially supported, as all bundled messages are executed
+   in the order they were added and all flow table modifications
+   are now atomic to the datapath.  Port mods may not appear in
+   atomic bundles, as port status modifications are not atomic.
  * IPv6 flow label and neighbor discovery fields are now modifiable.
  * OpenFlow 1.5 extended registers are now supported.
  * The OpenFlow 1.5 actset_output field is now supported.
diff --git a/OPENFLOW-1.1+.md b/OPENFLOW-1.1+.md
index 7911406..0b2f0f9 100644
--- a/OPENFLOW-1.1+.md
+++ b/OPENFLOW-1.1+.md
@@ -148,6 +148,12 @@ parallel in OVS.
 Transactional modification.  OpenFlow 1.4 requires to support
 flow_mods and port_mods in a bundle if bundle is supported.
 (Not related to OVS's 'ofbundle' stuff.)
+Implemented as an OpenFlow 1.4 feature.  Only flow_mods and
+port_mods are supported in a bundle.  If the bundle includes port
+mods, it may not specify the OFPBF_ATOMIC flag.  Nevertheless,
+port mods and flow mods in a bundle are always applied in order
+and consecutive flow mods between port mods are made available to
+lookups atomically.
 [EXT-230]
 [optional for OF1.4+]
 
diff --git a/ofproto/bundles.h b/ofproto/bundles.h
index 98cb2ba..ba4b1b1 100644
--- a/ofproto/bundles.h
+++ b/ofproto/bundles.h
@@ -34,14 +34,17 @@ extern "C" {
 struct ofp_bundle_entry {
 struct ovs_list   node;
 enum ofptype  type;  /* OFPTYPE_FLOW_MOD or OFPTYPE_PORT_MOD. */
+long long version;  /* Version in which the changes take
+ * effect. */
 union {
 struct ofputil_miniflow_mod fm;   /* 'fm.ofpacts' must be malloced. */
 struct ofputil_port_mod pm;
 };
 
 /* Used during commit. */
+struct ofport *port;/* Affected port. */
 struct rule_collection old_rules;   /* Affected rules. */
-struct rule_collection new_rules;   /* Affected rules. */
+struct rule_collection new_rules;   /* Replacement rules. */
 
 /* OpenFlow header and some of the message contents for error reporting. */
 struct ofp_header ofp_msg[DIV_ROUND_UP(64, sizeof(struct ofp_header))];
@@ -83,6 +86,8 @@ ofp_bundle_entry_alloc(enum ofptype type, const struct 
ofp_header *oh)
 if (type == OFPTYPE_FLOW_MOD) {
 minimatch_init_catchall(&entry->fm.match);
 }
+entry->version = 0;
+
 /* Max 64 bytes for error reporting. */
 memcpy(entry->ofp_msg, oh, MIN(ntohs(oh->length), sizeof entry->ofp_msg));
 
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 335c843..f78c051 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -93,8 +93,9 @@ struct ofproto {
 long long int eviction_group_timer; /* For rate limited reheapification. */
 struct oftable *tables;
 int n_tables;
-long long tables_version;   /* Controls which rules are visible to
+long long visible_version;  /* Controls which rules are visible to
  * table lookups. */
+long long tables_version;   /* Used during commits. */
 
 /* Rules indexed on their cookie values, in all flow tables. */
 struct hindex cookies OVS_GUARDED_BY(ofproto_mutex);
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index ca7be4f..d69629d 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -498,9 +498,9 @@ ofproto_enumerate_names(const char *type, struct sset 
*names)
 static void
 ofproto_bump_tables_version(s

[ovs-dev] [PATCH v4 02/12] ovs-ofctl: Add bundle support and unit testing.

2015-06-09 Thread Jarno Rajahalme
All existing ovs-ofctl flow mod commands now take an optional
'--bundle' argument, which executes the flow mods as a single
transaction.  OpenFlow 1.4+ is implicitly assumed when '--bundle' is
specified.

ovs-ofctl 'add-flow' and 'add-flows' commands now accept flow
specifications that start with an optional 'add', 'modify', 'delete',
'modify_strict', or 'delete_strict' keyword, so that arbitrary flow
table modifications may be specified.  For backwards compatibility, a
missing keyword is treated as an 'add'.  With the new '--bundle'
option all the modifications are executed as a single transaction
using an OpenFlow 1.4 bundle.

OpenFlow 1.4 requires bundles to support at least flow and port mods.
This implementation does not yet support port mods in bundles.

Another restriction is that the atomic transactions are not yet
supported.

Signed-off-by: Jarno Rajahalme 
---
 NEWS|   19 +++-
 include/openvswitch/vconn.h |3 +
 lib/ofp-parse.c |   40 ++-
 lib/ofp-parse.h |6 +-
 lib/ofp-util.c  |   30 ++
 lib/ofp-util.h  |2 +
 lib/ofp-version-opt.c   |7 ++
 lib/ofp-version-opt.h   |1 +
 lib/vconn.c |  238 +
 tests/ofproto-macros.at |   10 ++
 tests/ofproto.at|  244 +++
 tests/ovs-ofctl.at  |  107 +++
 utilities/ovs-ofctl.8.in|   60 ---
 utilities/ovs-ofctl.c   |   88 +++-
 14 files changed, 813 insertions(+), 42 deletions(-)

diff --git a/NEWS b/NEWS
index a480607..5bea237 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,6 @@
 Post-v2.3.0
 -
-   - Added support for SFQ, FQ_CoDel and CoDel qdiscs. 
+   - Added support for SFQ, FQ_CoDel and CoDel qdiscs.
- Add bash command-line completion support for ovs-vsctl Please check
  utilities/ovs-command-compgen.INSTALL.md for how to use.
- The MAC learning feature now includes per-port fairness to mitigate
@@ -27,6 +27,11 @@ Post-v2.3.0
  commands are now redundant and will be removed in a future
  release.  See ovs-vswitchd(8) for details.
- OpenFlow:
+ * OpenFlow 1.4 bundles are now supported, but for flow mod
+   messages only. 'atomic' bundles are not yet supported, and
+   'ordered' bundles are trivially supported, as all bundled
+   messages are executed in the order they were added to the
+   bundle regardless of the presence of the 'ordered' flag.
  * IPv6 flow label and neighbor discovery fields are now modifiable.
  * OpenFlow 1.5 extended registers are now supported.
  * The OpenFlow 1.5 actset_output field is now supported.
@@ -41,6 +46,18 @@ Post-v2.3.0
  * A new Netronome extension to OpenFlow 1.5+ allows control over the
fields hashed for OpenFlow select groups.  See "selection_method" and
related options in ovs-ofctl(8) for details.
+   - ovs-ofctl has a new '--bundle' option that makes the flow mod commands
+ ('add-flow', 'add-flows', 'mod-flows', 'del-flows', and 'replace-flows')
+ use an OpenFlow 1.4 bundle to operate the modifications as a single
+ transaction.  If any of the flow mods in a transaction fail, none of
+ them are executed.
+   - ovs-ofctl 'add-flow' and 'add-flows' commands now accept arbitrary flow
+ mods as an input by allowing the flow specification to start with an
+ explicit 'add', 'modify', 'modify_strict', 'delete', or 'delete_strict'
+ keyword.  A missing keyword is treated as 'add', so this is fully
+ backwards compatible.  With the new '--bundle' option all the flow mods
+ are executed as a single transaction using the new OpenFlow 1.4 bundles
+ support.
- ovs-pki: Changed message digest algorithm from MD5 to SHA-1 because
  MD5 is no longer secure and some operating systems have started to disable
  it in OpenSSL.
diff --git a/include/openvswitch/vconn.h b/include/openvswitch/vconn.h
index 3b157e1..f8b6655 100644
--- a/include/openvswitch/vconn.h
+++ b/include/openvswitch/vconn.h
@@ -55,6 +55,9 @@ int vconn_transact(struct vconn *, struct ofpbuf *, struct 
ofpbuf **);
 int vconn_transact_noreply(struct vconn *, struct ofpbuf *, struct ofpbuf **);
 int vconn_transact_multiple_noreply(struct vconn *, struct ovs_list *requests,
 struct ofpbuf **replyp);
+int vconn_bundle_transact(struct vconn *, struct ovs_list *requests,
+  uint16_t bundle_flags,
+  void (*error_reporter)(const struct ofp_header *));
 
 void vconn_run(struct vconn *);
 void vconn_run_wait(struct vconn *);
diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c
index 6125f27..210feed 100644
--- a/lib/ofp-parse.c
+++ b/lib/ofp-parse.c
@@ -258,6 +258,29 @@ parse_ofp_str__(struct ofputil_flow_mod *fm, int command, 
char *string,
 
 *usable_protocols = OFPUTIL_P_ANY;
 
+if (command ==

[ovs-dev] [PATCH v4 11/12] ofproto: Use minimatch for making bundles smaller.

2015-06-09 Thread Jarno Rajahalme
struct match in ofputil_flow_mod uses a lot of space, when used for a
stored bundle message.  This patch adds a new struct
ofputil_miniflow_mod, that uses a minimatch instead of match in hopes
of using less memory when handling large bundles.

Signed-off-by: Jarno Rajahalme 
---
 lib/match.c|9 ++
 lib/match.h|1 +
 lib/ofp-util.c |   14 
 lib/ofp-util.h |  100 +-
 ofproto/bundles.h  |7 +-
 ofproto/ofproto-provider.h |2 +-
 ofproto/ofproto.c  |  197 +++-
 7 files changed, 212 insertions(+), 118 deletions(-)

diff --git a/lib/match.c b/lib/match.c
index 7d0b409..a56a306 100644
--- a/lib/match.c
+++ b/lib/match.c
@@ -1193,6 +1193,15 @@ minimatch_init(struct minimatch *dst, const struct match 
*src)
 miniflow_init_with_minimask(&dst->flow, &src->flow, &dst->mask);
 }
 
+/* Initializes 'match' as a "catch-all" match that matches every packet. */
+void
+minimatch_init_catchall(struct minimatch *match)
+{
+match->mask.masks.map = match->flow.map = 0;
+match->mask.masks.values_inline = true;
+match->flow.values_inline = true;
+}
+
 /* Initializes 'dst' as a copy of 'src'.  The caller must eventually free 'dst'
  * with minimatch_destroy(). */
 void
diff --git a/lib/match.h b/lib/match.h
index 6633304..822ab3f 100644
--- a/lib/match.h
+++ b/lib/match.h
@@ -170,6 +170,7 @@ struct minimatch {
 };
 
 void minimatch_init(struct minimatch *, const struct match *);
+void minimatch_init_catchall(struct minimatch *);
 void minimatch_clone(struct minimatch *, const struct minimatch *);
 void minimatch_move(struct minimatch *dst, struct minimatch *src);
 void minimatch_destroy(struct minimatch *);
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 89359c1..645fc4a 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -1839,6 +1839,20 @@ ofputil_decode_flow_mod(struct ofputil_flow_mod *fm,
  fm->table_id, max_table, protocol);
 }
 
+void
+ofputil_miniflow_mod_init(struct ofputil_miniflow_mod *mfm,
+  const struct ofputil_flow_mod *fm)
+{
+memcpy(mfm, fm, offsetof(struct ofputil_miniflow_mod, match));
+minimatch_init(&mfm->match, &fm->match);
+}
+
+void
+ofputil_miniflow_mod_uninit(struct ofputil_miniflow_mod *mfm)
+{
+minimatch_destroy(&mfm->match);
+}
+
 static enum ofperr
 ofputil_pull_bands(struct ofpbuf *msg, size_t len, uint16_t *n_bands,
struct ofpbuf *bands)
diff --git a/lib/ofp-util.h b/lib/ofp-util.h
index 596c2e2..0f8930d 100644
--- a/lib/ofp-util.h
+++ b/lib/ofp-util.h
@@ -270,51 +270,59 @@ enum ofputil_flow_mod_flags {
  *
  * The handling of cookies across multiple versions of OpenFlow is a bit
  * confusing.  See DESIGN for the details. */
-struct ofputil_flow_mod {
-struct ovs_list list_node; /* For queuing flow_mods. */
 
-struct match match;
+/* We need this same layout in two places.  Ideally this would be a struct, but
+ * then all access to these would become uglier.  This is one of the rare
+ * instances where I hope we allowed some elemeentary C++ in the code base.
+ * This define is somewhat ugly, but avoids repeating these in multiple places
+ * and the related maintenance problems. */
+#define OFPUTIL_FLOW_MOD_DATA   \
+/* Cookie matching.  The flow_mod affects only flows that have  \
+ * cookies that bitwise match 'cookie' bits in positions where  \
+ * 'cookie_mask' has 1-bits.\
+ * 'cookie_mask' should be zero for OFPFC_ADD flow_mods. */ \
+ovs_be64 cookie; /* Cookie bits to match. */\
+ovs_be64 cookie_mask;/* 1-bit in each 'cookie' bit to match. */ \
+\
+/* Cookie changes.  \
+ *  \
+ * OFPFC_ADD uses 'new_cookie' as the new flow's cookie.\
+ * 'new_cookie' should not be UINT64_MAX.   \
+ *  \
+ * OFPFC_MODIFY and OFPFC_MODIFY_STRICT have two cases: \
+ *  \
+ *   - If one or more matching flows exist and 'modify_cookie' is   \
+ * true, then the flow_mod changes the existing flows' cookies  \
+ * to 'new_cookie'.  'new_cookie' should not be UINT64_MAX. \
+ *  \
+ *   - If no matching flow exists, 'new_cookie' is not UINT64_MAX,  \
+ * and 'cookie_mask' is 0, then the flow_mod adds a new flow\
+ * with 'new_cookie' as its cookie. \
+ */   

[ovs-dev] [PATCH v4 10/12] ofproto: Postpone sending flow removed messages.

2015-06-09 Thread Jarno Rajahalme
The final flow stats are available only after there are no references
to the rule.  Postpone sending the flow removed message until the
final stats are available.

Signed-off-by: Jarno Rajahalme 
---
 include/openflow/openflow-common.h |2 ++
 lib/ofp-print.c|1 +
 ofproto/ofproto-provider.h |5 +
 ofproto/ofproto.c  |   37 +---
 4 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/include/openflow/openflow-common.h 
b/include/openflow/openflow-common.h
index e4fecce..d32213f 100644
--- a/include/openflow/openflow-common.h
+++ b/include/openflow/openflow-common.h
@@ -301,6 +301,8 @@ enum ofp_flow_removed_reason {
 OFPRR_GROUP_DELETE, /* Group was removed. */
 OFPRR_METER_DELETE, /* Meter was removed. */
 OFPRR_EVICTION, /* Switch eviction to free resources. */
+
+OVS_OFPRR_NONE  /* OVS internal_use only, keep last!. */
 };
 
 /* What changed about the physical port */
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 96e65a7..2ac11b1 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -865,6 +865,7 @@ ofp_flow_removed_reason_to_string(enum 
ofp_flow_removed_reason reason,
 return "eviction";
 case OFPRR_METER_DELETE:
 return "meter_delete";
+case OVS_OFPRR_NONE:
 default:
 snprintf(reasonbuf, bufsize, "%d", (int) reason);
 return reasonbuf;
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index fd66e49..57c6d7b 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -356,6 +356,11 @@ struct rule {
 /* Eviction precedence. */
 uint16_t importance OVS_GUARDED;
 
+/* Removal reason for sending flow removed message.
+ * Used only if 'flags' has OFPUTIL_FF_SEND_FLOW_REM set and if the
+ * value is not OVS_OFPRR_NONE. */
+uint8_t removed_reason;
+
 /* Eviction groups (see comment on struct eviction_group for explanation) .
  *
  * 'eviction_group' is this rule's eviction group, or NULL if it is not in
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index be65d24..644b201 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -232,7 +232,8 @@ struct ofport_usage {
 };
 
 /* rule. */
-static void ofproto_rule_send_removed(struct rule *, uint8_t reason);
+static void ofproto_rule_send_removed(struct rule *)
+OVS_EXCLUDED(ofproto_mutex);
 static bool rule_is_readonly(const struct rule *);
 static void ofproto_rule_insert__(struct ofproto *, struct rule *)
 OVS_REQUIRES(ofproto_mutex);
@@ -2759,7 +2760,14 @@ ofproto_rule_destroy__(struct rule *rule)
 
 static void
 rule_destroy_cb(struct rule *rule)
+OVS_NO_THREAD_SAFETY_ANALYSIS
 {
+/* Send rule removed if needed. */
+if (rule->flags & OFPUTIL_FF_SEND_FLOW_REM
+&& rule->removed_reason != OVS_OFPRR_NONE
+&& !rule_is_hidden(rule)) {
+ofproto_rule_send_removed(rule);
+}
 rule->ofproto->ofproto_class->rule_destruct(rule);
 ofproto_rule_destroy__(rule);
 }
@@ -4613,6 +4621,7 @@ replace_rule_create(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 rule->idle_timeout = fm->idle_timeout;
 rule->hard_timeout = fm->hard_timeout;
 rule->importance = fm->importance;
+rule->removed_reason = OVS_OFPRR_NONE;
 
 *CONST_CAST(uint8_t *, &rule->table_id) = table_id;
 rule->flags = fm->flags & OFPUTIL_FF_STATE;
@@ -4761,9 +4770,7 @@ replace_rule_finish(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm,
 } else {
 /* XXX: This is slight duplication with delete_flows_finish__() */
 
-/* XXX: This call should done when rule's refcount reaches
- * zero to get accurate stats in the flow removed message. */
-ofproto_rule_send_removed(old_rule, OFPRR_EVICTION);
+old_rule->removed_reason = OFPRR_EVICTION;
 
 ofmonitor_report(ofproto->connmgr, old_rule, NXFME_DELETED,
  OFPRR_EVICTION,
@@ -4970,7 +4977,10 @@ delete_flows_finish__(struct ofproto *ofproto,
 for (size_t i = 0; i < rules->n; i++) {
 struct rule *rule = rules->rules[i];
 
-ofproto_rule_send_removed(rule, reason);
+/* This value will be used to send the flow removed message right
+ * before the rule is actually destroyed. */
+rule->removed_reason = reason;
+
 ofmonitor_report(ofproto->connmgr, rule, NXFME_DELETED, reason,
  req ? req->ofconn : NULL,
  req ? req->request->xid : 0, NULL);
@@ -5081,23 +5091,20 @@ delete_flow_start_strict(struct ofproto *ofproto,
 return error;
 }
 
-/* XXX: This should be sent right when the rule refcount gets to zero! */
+/* This may only be called by rule_destroy_cb()! */
 static void
-ofproto_rule_send_removed(struct rule *rule, uint8_t reason)
-OVS_REQUIRES(ofproto_mutex)
+ofpr

Re: [ovs-dev] [PATCH] dpif: Always generate RFC4122 UUIDs for UFID.

2015-06-09 Thread Joe Stringer
On 9 June 2015 at 15:12, Ben Pfaff  wrote:
> On Tue, Jun 09, 2015 at 01:57:59PM -0700, Joe Stringer wrote:
>> This patch sacrifices a few bits of hash quality from the 128-bit unique
>> flow identifiers to make the UFIDs RFC4122-conformant as per the version 4
>> (random) UUID spec. Given that the 128-bit space is already quite large,
>> this should not affect the spread of UFIDs in any meaningful way for
>> hashing.
>>
>> Signed-off-by: Joe Stringer 
>> ---
>> This is independent of the recent series "odp-util: Reuse UUID
>> marshalling for UFID.", but may make more sense in the context of the
>> discussion on that patch.
>
> Acked-by: Ben Pfaff 

Thanks, applied.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] odp-util: Reuse UUID marshalling for UFID.

2015-06-09 Thread Joe Stringer
On 9 June 2015 at 13:29, Joe Stringer  wrote:
> On 9 June 2015 at 12:33, Ben Pfaff  wrote:
>> On Tue, Jun 09, 2015 at 11:41:04AM -0700, Joe Stringer wrote:
>>> On 7 June 2015 at 11:03, Ben Pfaff  wrote:
>>> > On Mon, Jun 01, 2015 at 06:22:19PM -0700, Joe Stringer wrote:
>>> >> Unique flow identifiers are really a UUID of sorts, so it makes sense to
>>> >> reuse the UUID string representations for UFID.
>>> >>
>>> >> Suggested-by: Ben Pfaff 
>>> >> Signed-off-by: Joe Stringer 
>>> >
>>> > Acked-by: Ben Pfaff 
>>>
>>> It strikes me that it may be somewhat misleading to print UFIDs as
>>> UUIDs, given that there are specific bits with particular meaning in
>>> the UUID standards. One way to resolve this would be to tweak the UFID
>>> generation to create variant 4 UUIDs ("random") by overwriting the
>>> appropriate bits after hashing.
>>
>> The same did occur to me.  It's probably a good idea, if you're up for
>> it.
>
> Sure. It should be a minor patch on top of this.

I applied this series to master, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/4] rculist: Use ovsrcu_set() when poisoning.

2015-06-09 Thread Ben Pfaff
On Tue, Jun 09, 2015 at 05:05:57PM -0700, Jarno Rajahalme wrote:
> 
> > On Jun 9, 2015, at 3:30 PM, Ben Pfaff  wrote:
> > 
> > On Tue, Jun 09, 2015 at 03:25:08PM -0700, Jarno Rajahalme wrote:
> >> Should not use ovsrcu_set_hidden() when the pointer may have been
> >> visible to other threads already.
> >> 
> >> Signed-off-by: Jarno Rajahalme 
> > 
> > I think that rculist_poison__() is only appropriate when an rculist
> > should not be visible to any other threads.  No?
> 
> That was my initial thought as well, but we poison for the case other
> threads are still using it (while they shouldn't), so I thought
> ovsrcu_set_hidden() is not appropriate here after all. I doubt this
> has any practical difference either way, more about what is
> appropriate use.

Is there a code generation difference on x86-64?  If not, then I'm happy
with this change:
Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] OVN MAC learning on lports with unknow macs set

2015-06-09 Thread Liran Schour
Hi all,

I want to add learning capabilities for lport with unknown macs set..

When an unknown mac source address is going out from a logical port that 
its macs sets to "unknown", an additional controller action will be take 
and the packet will be sent to the controller by a PACKET_IN msg. 
The ovn-controller should add this new mac to the lport macs set on the NB 
DB.

Any thoughts on this?

Thanks,
- Liran
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev