Re: [ovs-dev] [PATCH] tests: Add '-latomic' only for 'test-atomic.c'

2014-03-11 Thread cosmin.parasc...@freescale.com
> Commit fd2e50c (Add check for -latomic) fixed a bug for 32-bit
> PowerPC systems, where the compiler would generate calls to
> functions implemented by libatomic, such as '__atomic_load_8'.
> 
> Nonetheless, it passes '-latomic' to all files that get compiled
> and that isn't necessary, so add it only for 'test-atomic.c'.
> 

The patch breaks the build on systems that don't have GCC 4.8.x, as they won't 
have libatomic. I'll think of another solution.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Netflow5 records' init_time

2014-03-11 Thread Lior Neudorfer
Hi,

I'm seeing wrong behavior of Netflow5 records emitted by Open vSwitch
2.1.90.

I've configured my switch to export Netflow records with an active timeout
of 30 seconds.
I've then started a single, long TCP session flowing through the switch.

The first netflow record seems OK - after 30 seconds, I get the first
record describing the session and it's OK.
>From that moment on, Netflow records about the session seem incorrect - the
start_time field ("init_time" in openvswitch code) is consistently set to a
very large value (which is identical in all records).
end_time is correct - it specifies the time in which the last packet from
the session was seen.

---

I've looked at ofproto/netflow.c, and I believe the problem lies in
the netflow_expire__ function.
The function calls nf_flow->created = 0; however, in gen_netflow_rec,
init_time is calculated like this:
nf_rec->init_time = htonl(nf_flow->created - nf->boot_time);

So for long sessions, which expire every active_timeout seconds, I will get
init_time = (0-boot_time) in every record which describes the ongoing
session.

I've recompiled openvswitch without the "nf_flow->created = 0" code, and
start_time looks good.
Any reason why this line should stay?

Thanks,
Lior Neudorfer
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] LISP documentation improvements

2014-03-11 Thread Lorand Jakab
Someone familiar with LISP but not so familiar with OVS pointed out to me in a
private email that the relationship between some LISP concepts and their OVS
implementation is not clearly explained.  I addressed his comments in the
enclosed patch, which he reviewed.

As an aside, I wonder if we should at some point move the information in the
README-lisp file to a more approproate location?

Lorand Jakab (1):
  README-lisp: improve LISP documentation

 README-lisp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

-- 
1.8.3.2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] README-lisp: improve LISP documentation

2014-03-11 Thread Lorand Jakab
People familiar with LISP are used to the concept of a mapping cache in
a LISP Tunnel Router.  Explain how that concept maps to OVS flow rules.
Additionally, mention that eth0 need not be added in all example
scenarios.

Signed-off-by: Lorand Jakab 
---
 README-lisp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/README-lisp b/README-lisp
index 6057706..f1e1172 100644
--- a/README-lisp
+++ b/README-lisp
@@ -30,7 +30,8 @@ VMs.
 In case 2) the VMs expect ARP replies from each other, but this is not
 possible over a layer 3 tunnel.  One solution is to have static MAC address
 entries preconfigured on the VMs (e.g., `arp -f /etc/ethers` on startup on
-Unix based VMs), or have the hypervisor do proxy ARP.
+Unix based VMs), or have the hypervisor do proxy ARP.  In this scenario, the
+eth0 interfaces need not be added to the br0 bridge in the examples below.
 
 On the receiving side, the packet arrives without the original MAC header.
 The LISP tunneling code attaches a header with harcoded source and destination
@@ -61,12 +62,20 @@ bridge instance, and become numbered 1, 2, and 3 
respectively:
 ovs-vsctl add-port br0 eth0
 ovs-vsctl add-port br0 lisp0 -- set Interface lisp0 type=lisp 
options:remote_ip=flow options:key=flow
 
-Flows on br0 are configured as follows:
+The last command sets up flow based tunneling on the lisp0 interface.  From
+the LISP point of view, this is like having the Tunnel Router map cache
+implemented as flow rules.
+
+Flows on br0 should be configured as follows:
 
 priority=3,dl_dst=02:00:00:00:00:00,action=mod_dl_dst:,output:1
 priority=2,in_port=1,dl_type=0x0806,action=NORMAL
 
priority=1,in_port=1,dl_type=0x0800,vlan_tci=0,nw_src=,action=set_field:->tun_dst,output:3
 priority=0,action=NORMAL
 
-Optionally, if you want to use Instance ID in a flow, you can set it with
-"action=set_tunnel:".
+The third rule is like a map cache entry:  the  specified by the
+nw_src match field is mapped to the RLOC , which is set as the tunnel
+destination for this particular flow.
+
+Optionally, if you want to use Instance ID in a flow, you can add
+"set_tunnel:" to the action list.
-- 
1.8.3.2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] openvswitch: Orphan frags before sending to userspace via Netlink to avoid guest stall

2014-03-11 Thread Zoltan Kiss

On 07/03/14 17:59, Thomas Graf wrote:

On 03/07/2014 06:28 PM, Pravin Shelar wrote:

Problem is mapping SKBTX_DEV_ZEROCOPY pages to userspace. skb_zerocopy
is not doing that.

Unless I missing something, Current netlink code can not handle
skb-frags with zero copy. So we have to copy skb anyways and no need
to orphan-frags here.
If you are planning on handling skb-frags without copying then
skb_orphan_frags should be done in netlink.


If you look at the second part of skb_zerocopy() this is exactly what
it is doing unless the target skb has sufficient linear space
preallocated. At least unless mmap is enabled in which case we would
have to copy again until we have implemented a way to pass page refs
via the nl ring buffer.

So I think Zoltan is correct in orphaning frags that come from f.e.
a tun device via zerocopy_sg_from_iovec().


Now as I'm checking how Netlink works, I might be wrong at some parts :) 
skb_zerocopy correctly add the frags to the user_skb we are sending 
upwards, however when the userspace receive it in netlink_recvmsg(), it 
gets copied to the supplied buffer anyway. Is that correct? In which 
case we don't need to worry that userspace will sit on that page 
indefinitely. However we have to worry about userspace not calling recv 
on that Netlink socket, so in the end we still need skb_orphan_frags, 
just for a different reason :)

We can put skb_orphan_frags into skb_zerocopy, skb_clone also do that.

However with Netlink mmapped IO, we should take a different approach, 
and instead of calling skb_orphan_frags we should make sure user_skb can 
hold any skb we get from the kernel, and copy the frags there. Even if 
we would be able to pass page refs to userspace through the ring buffer 
(AFAIK currently we can't), it would be fragile to just pass kernel 
pages directly to userspace, even if they came without the 
SKBTX_DEV_ZEROCOPY flag. And I think it would be quite rare that we need 
that copy anyway, because the flow setup usually happens with small 
packets without frags.
If we choose the above approach with Netlink mmap, we don't need 
skb_orphan_frags, in fact


Regards,

Zoli
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 08/15] ovs-atomic-types: Move into ovs-atomic.h.

2014-03-11 Thread Ben Pfaff
Every implementation used this same code, so it makes sense to centralize
it.

Signed-off-by: Ben Pfaff 
---
 lib/automake.mk   |1 -
 lib/ovs-atomic-c11.h  |1 -
 lib/ovs-atomic-clang.h|1 -
 lib/ovs-atomic-gcc4+.h|1 -
 lib/ovs-atomic-gcc4.7+.h  |1 -
 lib/ovs-atomic-pthreads.h |1 -
 lib/ovs-atomic-types.h|   47 -
 lib/ovs-atomic.h  |   42 +++-
 8 files changed, 41 insertions(+), 54 deletions(-)
 delete mode 100644 lib/ovs-atomic-types.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 6ed40f6..cc04633 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -139,7 +139,6 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovs-atomic-locked.c \
lib/ovs-atomic-locked.h \
lib/ovs-atomic-pthreads.h \
-   lib/ovs-atomic-types.h \
lib/ovs-atomic.h \
lib/ovs-thread.c \
lib/ovs-thread.h \
diff --git a/lib/ovs-atomic-c11.h b/lib/ovs-atomic-c11.h
index 96aec7c..946d412 100644
--- a/lib/ovs-atomic-c11.h
+++ b/lib/ovs-atomic-c11.h
@@ -24,7 +24,6 @@
 
 #define OMIT_STANDARD_ATOMIC_TYPES 1
 #define ATOMIC(TYPE) _Atomic(TYPE)
-#include "ovs-atomic-types.h"
 
 #define atomic_read(SRC, DST) \
 atomic_read_explicit(SRC, DST, memory_order_seq_cst)
diff --git a/lib/ovs-atomic-clang.h b/lib/ovs-atomic-clang.h
index c169f37..6792985 100644
--- a/lib/ovs-atomic-clang.h
+++ b/lib/ovs-atomic-clang.h
@@ -22,7 +22,6 @@
 #define OVS_ATOMIC_CLANG_IMPL 1
 
 #define ATOMIC(TYPE) _Atomic(TYPE)
-#include "ovs-atomic-types.h"
 
 #define ATOMIC_VAR_INIT(VALUE) (VALUE)
 
diff --git a/lib/ovs-atomic-gcc4+.h b/lib/ovs-atomic-gcc4+.h
index 923e624..e237c20 100644
--- a/lib/ovs-atomic-gcc4+.h
+++ b/lib/ovs-atomic-gcc4+.h
@@ -23,7 +23,6 @@
 #define OVS_ATOMIC_GCC4P_IMPL 1
 
 #define ATOMIC(TYPE) TYPE
-#include "ovs-atomic-types.h"
 
 #define ATOMIC_BOOL_LOCK_FREE 2
 #define ATOMIC_CHAR_LOCK_FREE 2
diff --git a/lib/ovs-atomic-gcc4.7+.h b/lib/ovs-atomic-gcc4.7+.h
index da88558..99c6b1f 100644
--- a/lib/ovs-atomic-gcc4.7+.h
+++ b/lib/ovs-atomic-gcc4.7+.h
@@ -20,7 +20,6 @@
 #endif
 
 #define ATOMIC(TYPE) TYPE
-#include "ovs-atomic-types.h"
 
 typedef enum {
 memory_order_relaxed = __ATOMIC_RELAXED,
diff --git a/lib/ovs-atomic-pthreads.h b/lib/ovs-atomic-pthreads.h
index 4b27bc2..b49d699 100644
--- a/lib/ovs-atomic-pthreads.h
+++ b/lib/ovs-atomic-pthreads.h
@@ -24,7 +24,6 @@
 #define OVS_ATOMIC_PTHREADS_IMPL 1
 
 #define ATOMIC(TYPE) TYPE
-#include "ovs-atomic-types.h"
 
 #define ATOMIC_BOOL_LOCK_FREE 0
 #define ATOMIC_CHAR_LOCK_FREE 0
diff --git a/lib/ovs-atomic-types.h b/lib/ovs-atomic-types.h
deleted file mode 100644
index bbce476..000
--- a/lib/ovs-atomic-types.h
+++ /dev/null
@@ -1,47 +0,0 @@
-/* This header defines atomic_* types using an ATOMIC macro provided by the
-* caller. */
-#ifndef IN_OVS_ATOMIC_H
-#error "This header should only be included indirectly via ovs-atomic.h."
-#endif
-
-#ifndef OMIT_STANDARD_ATOMIC_TYPES
-typedef ATOMIC(bool)   atomic_bool;
-
-typedef ATOMIC(char)   atomic_char;
-typedef ATOMIC(signed char)atomic_schar;
-typedef ATOMIC(unsigned char)  atomic_uchar;
-
-typedef ATOMIC(short)  atomic_short;
-typedef ATOMIC(unsigned short) atomic_ushort;
-
-typedef ATOMIC(int)atomic_int;
-typedef ATOMIC(unsigned int)   atomic_uint;
-
-typedef ATOMIC(long)   atomic_long;
-typedef ATOMIC(unsigned long)  atomic_ulong;
-
-typedef ATOMIC(long long)  atomic_llong;
-typedef ATOMIC(unsigned long long) atomic_ullong;
-
-typedef ATOMIC(size_t) atomic_size_t;
-typedef ATOMIC(ptrdiff_t)  atomic_ptrdiff_t;
-
-typedef ATOMIC(intmax_t)   atomic_intmax_t;
-typedef ATOMIC(uintmax_t)  atomic_uintmax_t;
-
-typedef ATOMIC(intptr_t)   atomic_intptr_t;
-typedef ATOMIC(uintptr_t)  atomic_uintptr_t;
-#endif  /* !OMIT_STANDARD_ATOMIC_TYPES */
-
-/* Nonstandard atomic types. */
-typedef ATOMIC(uint8_t)   atomic_uint8_t;
-typedef ATOMIC(uint16_t)  atomic_uint16_t;
-typedef ATOMIC(uint32_t)  atomic_uint32_t;
-typedef ATOMIC(uint64_t)  atomic_uint64_t;
-
-typedef ATOMIC(int8_t)atomic_int8_t;
-typedef ATOMIC(int16_t)   atomic_int16_t;
-typedef ATOMIC(int32_t)   atomic_int32_t;
-typedef ATOMIC(int64_t)   atomic_int64_t;
-
-#undef OMIT_STANDARD_ATOMIC_TYPES
diff --git a/lib/ovs-atomic.h b/lib/ovs-atomic.h
index a9d257a..62cd258 100644
--- a/lib/ovs-atomic.h
+++ b/lib/ovs-atomic.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -284,6 +284,46 @@
 #endif
 #undef IN_OVS_ATOMIC_H
 
+#ifndef OMIT_STANDARD_ATOMIC_TYPES
+typedef ATOMIC(bool)   atomic_bool;
+
+typedef ATOMIC(char)   atomic_char;
+typedef ATOMIC(signe

[ovs-dev] [urcu v2 03/15] util: New macro PAD_SIZE.

2014-03-11 Thread Ben Pfaff
PAD_SIZE(x,y) is a little shorter and may have a more obvious meaning
than ROUND_UP(x,y) - x.

I intend to add more users in an upcoming comment.

Signed-off-by: Ben Pfaff 
---
 lib/netlink.c |6 +++---
 lib/nx-match.c|   10 +-
 lib/ofp-actions.c |6 +++---
 lib/ofp-actions.h |2 +-
 lib/util.h|3 +++
 5 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/lib/netlink.c b/lib/netlink.c
index ada429e..3f479be 100644
--- a/lib/netlink.c
+++ b/lib/netlink.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -174,7 +174,7 @@ nl_msg_put(struct ofpbuf *msg, const void *data, size_t 
size)
 void *
 nl_msg_put_uninit(struct ofpbuf *msg, size_t size)
 {
-size_t pad = NLMSG_ALIGN(size) - size;
+size_t pad = PAD_SIZE(size, NLMSG_ALIGNTO);
 char *p = ofpbuf_put_uninit(msg, size + pad);
 if (pad) {
 memset(p + size, 0, pad);
@@ -197,7 +197,7 @@ nl_msg_push(struct ofpbuf *msg, const void *data, size_t 
size)
 void *
 nl_msg_push_uninit(struct ofpbuf *msg, size_t size)
 {
-size_t pad = NLMSG_ALIGN(size) - size;
+size_t pad = PAD_SIZE(size, NLMSG_ALIGNTO);
 char *p = ofpbuf_push_uninit(msg, size + pad);
 if (pad) {
 memset(p + size, 0, pad);
diff --git a/lib/nx-match.c b/lib/nx-match.c
index 437e85b..de79009 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -730,7 +730,7 @@ nx_put_match(struct ofpbuf *b, const struct match *match,
 {
 int match_len = nx_put_raw(b, false, match, cookie, cookie_mask);
 
-ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
 return match_len;
 }
 
@@ -753,7 +753,7 @@ oxm_put_match(struct ofpbuf *b, const struct match *match)
 
 ofpbuf_put_uninit(b, sizeof *omh);
 match_len = nx_put_raw(b, true, match, cookie, cookie_mask) + sizeof *omh;
-ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
 
 omh = ofpbuf_at(b, start_len, sizeof *omh);
 omh->type = htons(OFPMT_OXM);
@@ -994,7 +994,7 @@ int
 nx_match_from_string(const char *s, struct ofpbuf *b)
 {
 int match_len = nx_match_from_string_raw(s, b);
-ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
 return match_len;
 }
 
@@ -1007,7 +1007,7 @@ oxm_match_from_string(const char *s, struct ofpbuf *b)
 
 ofpbuf_put_uninit(b, sizeof *omh);
 match_len = nx_match_from_string_raw(s, b) + sizeof *omh;
-ofpbuf_put_zeros(b, ROUND_UP(match_len, 8) - match_len);
+ofpbuf_put_zeros(b, PAD_SIZE(match_len, 8));
 
 omh = ofpbuf_at(b, start_len, sizeof *omh);
 omh->type = htons(OFPMT_OXM);
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 3cd1e8b..273ceae 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -3627,8 +3627,8 @@ ofpact_update_len(struct ofpbuf *ofpacts, struct ofpact 
*ofpact)
 void
 ofpact_pad(struct ofpbuf *ofpacts)
 {
-unsigned int rem = ofpacts->size % OFPACT_ALIGNTO;
-if (rem) {
-ofpbuf_put_zeros(ofpacts, OFPACT_ALIGNTO - rem);
+unsigned int pad = PAD_SIZE(ofpacts->size, OFPACT_ALIGNTO);
+if (pad) {
+ofpbuf_put_zeros(ofpacts, pad);
 }
 }
diff --git a/lib/ofp-actions.h b/lib/ofp-actions.h
index 0f6bf70..0a1b46d 100644
--- a/lib/ofp-actions.h
+++ b/lib/ofp-actions.h
@@ -436,7 +436,7 @@ struct ofpact_meter {
  * Used for OFPIT11_WRITE_ACTIONS. */
 struct ofpact_nest {
 struct ofpact ofpact;
-uint8_t pad[OFPACT_ALIGN(sizeof(struct ofpact)) - sizeof(struct ofpact)];
+uint8_t pad[PAD_SIZE(sizeof(struct ofpact), OFPACT_ALIGNTO)];
 struct ofpact actions[];
 };
 BUILD_ASSERT_DECL(offsetof(struct ofpact_nest, actions) == OFPACT_ALIGNTO);
diff --git a/lib/util.h b/lib/util.h
index 9afe10e..41fd51d 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -111,6 +111,9 @@ extern const char *program_name;
 /* Returns X rounded up to the nearest multiple of Y. */
 #define ROUND_UP(X, Y) (DIV_ROUND_UP(X, Y) * (Y))
 
+/* Returns the least number that, when added to X, yields a multiple of Y. */
+#define PAD_SIZE(X, Y) (ROUND_UP(X, Y) - (X))
+
 /* Returns X rounded down to the nearest multiple of Y. */
 #define ROUND_DOWN(X, Y) ((X) / (Y) * (Y))
 
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 00/15] implement userspace RCU and add a few simple users

2014-03-11 Thread Ben Pfaff
v1->v2: Redid all the ovs-atomic changes to fix a bug in the GCC 4.x (x <
7) support.  This was not a simple fix; all the ovs-atomic patches are
essentially new.  Also, changed the ovs-rcu library to use "consume" rather
than "acquire" based on this:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1525.htm

Ben Pfaff (15):
  ovs-atomic-gcc4+: Fix parenthesization in atomic_read_explicit().
  compiler: New macro for defining aligned structs.
  util: New macro PAD_SIZE.
  util: Move CACHE_LINE_SIZE here.
  ovs-atomic: Factor type declarations out of most implementations.
  ovs-atomic: Use raw types, not structs, when locks are required.
  ovs-atomic-pthreads: Use global shared locks for atomic_flag also.
  ovs-atomic-types: Move into ovs-atomic.h.
  ovs-atomic: Delete atomic, atomic_flag, ovs_refcount destroy
functions.
  ovs-rcu: New library.
  ofproto: Use RCU to protect rule_actions.
  util: New functions for allocating memory while avoiding false
sharing.
  ovs-thread: Replace ovsthread_counter by more general
ovsthread_stats.
  dpif-netdev: Use ovsthread_stats for flow stats.
  dpif-netdev: Use RCU to protect data.

 lib/automake.mk   |6 +-
 lib/bfd.c |3 +-
 lib/cfm.c |6 +-
 lib/compiler.h|   16 ++-
 lib/dpif-linux.c  |3 +-
 lib/dpif-netdev.c |  299 ++---
 lib/fat-rwlock.c  |   34 +
 lib/lacp.c|3 +-
 lib/mac-learning.c|3 +-
 lib/netdev.c  |1 -
 lib/netlink-socket.c  |1 -
 lib/netlink.c |6 +-
 lib/nx-match.c|   10 +-
 lib/ofp-actions.c |6 +-
 lib/ofp-actions.h |2 +-
 lib/ovs-atomic-c11.h  |   26 +---
 lib/ovs-atomic-clang.h|   42 +-
 lib/ovs-atomic-flag-gcc4.7+.h |   14 +-
 lib/ovs-atomic-gcc4+.c|   68 --
 lib/ovs-atomic-gcc4+.h|  210 +++--
 lib/ovs-atomic-gcc4.7+.h  |   44 +-
 lib/ovs-atomic-locked.c   |   58 
 lib/ovs-atomic-locked.h   |   32 +
 lib/ovs-atomic-pthreads.c |   78 ---
 lib/ovs-atomic-pthreads.h |  151 +
 lib/ovs-atomic.h  |   80 ++-
 lib/ovs-rcu.c |  293 
 lib/ovs-rcu.h |  170 +++
 lib/ovs-thread.c  |  107 +++
 lib/ovs-thread.h  |   25 +++-
 lib/stp.c |3 +-
 lib/timeval.c |   11 ++
 lib/util.c|   60 -
 lib/util.h|   14 +-
 ofproto/bond.c|3 +-
 ofproto/connmgr.c |5 +-
 ofproto/netflow.c |3 +-
 ofproto/ofproto-dpif-ipfix.c  |3 +-
 ofproto/ofproto-dpif-sflow.c  |1 -
 ofproto/ofproto-dpif-upcall.c |8 +-
 ofproto/ofproto-dpif-xlate.c  |2 -
 ofproto/ofproto-dpif.c|2 -
 ofproto/ofproto-provider.h|   19 ++-
 ofproto/ofproto.c |  122 +++--
 tests/test-atomic.c   |4 +-
 45 files changed, 1143 insertions(+), 914 deletions(-)
 delete mode 100644 lib/ovs-atomic-gcc4+.c
 create mode 100644 lib/ovs-atomic-locked.c
 create mode 100644 lib/ovs-atomic-locked.h
 delete mode 100644 lib/ovs-atomic-pthreads.c
 create mode 100644 lib/ovs-rcu.c
 create mode 100644 lib/ovs-rcu.h

-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 07/15] ovs-atomic-pthreads: Use global shared locks for atomic_flag also.

2014-03-11 Thread Ben Pfaff
This will eliminate the need for atomic_flag_destroy().

Signed-off-by: Ben Pfaff 
---
 lib/automake.mk   |1 -
 lib/ovs-atomic-pthreads.c |   78 -
 lib/ovs-atomic-pthreads.h |   56 +++-
 3 files changed, 48 insertions(+), 87 deletions(-)
 delete mode 100644 lib/ovs-atomic-pthreads.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 1310dcd..6ed40f6 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -138,7 +138,6 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovs-atomic-gcc4.7+.h \
lib/ovs-atomic-locked.c \
lib/ovs-atomic-locked.h \
-   lib/ovs-atomic-pthreads.c \
lib/ovs-atomic-pthreads.h \
lib/ovs-atomic-types.h \
lib/ovs-atomic.h \
diff --git a/lib/ovs-atomic-pthreads.c b/lib/ovs-atomic-pthreads.c
deleted file mode 100644
index 7311135..000
--- a/lib/ovs-atomic-pthreads.c
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * Copyright (c) 2013, 2014 Nicira, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at:
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include 
-
-#include "ovs-atomic.h"
-#include "ovs-thread.h"
-
-#if OVS_ATOMIC_PTHREADS_IMPL
-void
-atomic_flag_init(volatile atomic_flag *flag_)
-{
-atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
-pthread_mutex_init(&flag->mutex, NULL);
-atomic_flag_clear(flag_);
-}
-
-void
-atomic_flag_destroy(volatile atomic_flag *flag_)
-{
-atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
-pthread_mutex_destroy(&flag->mutex);
-}
-
-bool
-atomic_flag_test_and_set(volatile atomic_flag *flag_)
-{
-atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-bool old_value;
-
-xpthread_mutex_lock(&flag->mutex);
-old_value = flag->b;
-flag->b = true;
-xpthread_mutex_unlock(&flag->mutex);
-
-return old_value;
-}
-
-bool
-atomic_flag_test_and_set_explicit(volatile atomic_flag *flag,
-  memory_order order OVS_UNUSED)
-{
-return atomic_flag_test_and_set(flag);
-}
-
-void
-atomic_flag_clear(volatile atomic_flag *flag_)
-{
-atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
-
-xpthread_mutex_lock(&flag->mutex);
-flag->b = false;
-xpthread_mutex_unlock(&flag->mutex);
-}
-
-void
-atomic_flag_clear_explicit(volatile atomic_flag *flag,
-   memory_order order OVS_UNUSED)
-{
-return atomic_flag_clear(flag);
-}
-
-#endif  /* OVS_ATOMIC_PTHREADS_IMPL */
diff --git a/lib/ovs-atomic-pthreads.h b/lib/ovs-atomic-pthreads.h
index 7b742cd..4b27bc2 100644
--- a/lib/ovs-atomic-pthreads.h
+++ b/lib/ovs-atomic-pthreads.h
@@ -90,15 +90,55 @@ atomic_signal_fence(memory_order order OVS_UNUSED)
 
 typedef struct {
 bool b;
-pthread_mutex_t mutex;
 } atomic_flag;
-#define ATOMIC_FLAG_INIT { false, PTHREAD_MUTEX_INITIALIZER }
+#define ATOMIC_FLAG_INIT { false }
 
-void atomic_flag_init(volatile atomic_flag *);
-void atomic_flag_destroy(volatile atomic_flag *);
+static inline void
+atomic_flag_init(volatile atomic_flag *flag OVS_UNUSED)
+{
+/* Nothing to do. */
+}
+
+static inline void
+atomic_flag_destroy(volatile atomic_flag *flag OVS_UNUSED)
+{
+/* Nothing to do. */
+}
+
+static inline bool
+atomic_flag_test_and_set(volatile atomic_flag *flag_)
+{
+atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
+bool old_value;
+
+atomic_lock__(flag);
+old_value = flag->b;
+flag->b = true;
+atomic_unlock__(flag);
 
-bool atomic_flag_test_and_set(volatile atomic_flag *);
-bool atomic_flag_test_and_set_explicit(volatile atomic_flag *, memory_order);
+return old_value;
+}
+
+static inline bool
+atomic_flag_test_and_set_explicit(volatile atomic_flag *flag,
+  memory_order order OVS_UNUSED)
+{
+return atomic_flag_test_and_set(flag);
+}
+
+static inline void
+atomic_flag_clear(volatile atomic_flag *flag_)
+{
+atomic_flag *flag = CONST_CAST(atomic_flag *, flag_);
+
+atomic_lock__(flag);
+flag->b = false;
+atomic_unlock__(flag);
+}
 
-void atomic_flag_clear(volatile atomic_flag *);
-void atomic_flag_clear_explicit(volatile atomic_flag *, memory_order);
+static inline void
+atomic_flag_clear_explicit(volatile atomic_flag *flag,
+   memory_order order OVS_UNUSED)
+{
+atomic_flag_clear(flag);
+}
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 05/15] ovs-atomic: Factor type declarations out of most implementations.

2014-03-11 Thread Ben Pfaff
This reduces duplicate code.

Signed-off-by: Ben Pfaff 
---
 lib/automake.mk   |1 +
 lib/ovs-atomic-c11.h  |   15 ---
 lib/ovs-atomic-clang.h|   42 +++-
 lib/ovs-atomic-gcc4.7+.h  |   44 +++---
 lib/ovs-atomic-pthreads.h |   45 +++
 lib/ovs-atomic-types.h|   47 +
 6 files changed, 61 insertions(+), 133 deletions(-)
 create mode 100644 lib/ovs-atomic-types.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 476421c..3b2b75f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -139,6 +139,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovs-atomic-gcc4.7+.h \
lib/ovs-atomic-pthreads.c \
lib/ovs-atomic-pthreads.h \
+   lib/ovs-atomic-types.h \
lib/ovs-atomic.h \
lib/ovs-thread.c \
lib/ovs-thread.h \
diff --git a/lib/ovs-atomic-c11.h b/lib/ovs-atomic-c11.h
index 97262b2..96aec7c 100644
--- a/lib/ovs-atomic-c11.h
+++ b/lib/ovs-atomic-c11.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -22,16 +22,9 @@
 
 #include 
 
-/* Nonstandard atomic types. */
-typedef _Atomic(uint8_t)   atomic_uint8_t;
-typedef _Atomic(uint16_t)  atomic_uint16_t;
-typedef _Atomic(uint32_t)  atomic_uint32_t;
-typedef _Atomic(uint64_t)  atomic_uint64_t;
-
-typedef _Atomic(int8_t)atomic_int8_t;
-typedef _Atomic(int16_t)   atomic_int16_t;
-typedef _Atomic(int32_t)   atomic_int32_t;
-typedef _Atomic(int64_t)   atomic_int64_t;
+#define OMIT_STANDARD_ATOMIC_TYPES 1
+#define ATOMIC(TYPE) _Atomic(TYPE)
+#include "ovs-atomic-types.h"
 
 #define atomic_read(SRC, DST) \
 atomic_read_explicit(SRC, DST, memory_order_seq_cst)
diff --git a/lib/ovs-atomic-clang.h b/lib/ovs-atomic-clang.h
index 7449428..c169f37 100644
--- a/lib/ovs-atomic-clang.h
+++ b/lib/ovs-atomic-clang.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -21,44 +21,8 @@
 
 #define OVS_ATOMIC_CLANG_IMPL 1
 
-/* Standard atomic types. */
-typedef _Atomic(_Bool) atomic_bool;
-
-typedef _Atomic(char) atomic_char;
-typedef _Atomic(signed char) atomic_schar;
-typedef _Atomic(unsigned char) atomic_uchar;
-
-typedef _Atomic(short) atomic_short;
-typedef _Atomic(unsigned short) atomic_ushort;
-
-typedef _Atomic(int) atomic_int;
-typedef _Atomic(unsigned int) atomic_uint;
-
-typedef _Atomic(long) atomic_long;
-typedef _Atomic(unsigned long) atomic_ulong;
-
-typedef _Atomic(long long) atomic_llong;
-typedef _Atomic(unsigned long long) atomic_ullong;
-
-typedef _Atomic(size_t) atomic_size_t;
-typedef _Atomic(ptrdiff_t) atomic_ptrdiff_t;
-
-typedef _Atomic(intmax_t) atomic_intmax_t;
-typedef _Atomic(uintmax_t) atomic_uintmax_t;
-
-typedef _Atomic(intptr_t) atomic_intptr_t;
-typedef _Atomic(uintptr_t) atomic_uintptr_t;
-
-/* Nonstandard atomic types. */
-typedef _Atomic(uint8_t)   atomic_uint8_t;
-typedef _Atomic(uint16_t)  atomic_uint16_t;
-typedef _Atomic(uint32_t)  atomic_uint32_t;
-typedef _Atomic(uint64_t)  atomic_uint64_t;
-
-typedef _Atomic(int8_t)atomic_int8_t;
-typedef _Atomic(int16_t)   atomic_int16_t;
-typedef _Atomic(int32_t)   atomic_int32_t;
-typedef _Atomic(int64_t)   atomic_int64_t;
+#define ATOMIC(TYPE) _Atomic(TYPE)
+#include "ovs-atomic-types.h"
 
 #define ATOMIC_VAR_INIT(VALUE) (VALUE)
 
diff --git a/lib/ovs-atomic-gcc4.7+.h b/lib/ovs-atomic-gcc4.7+.h
index 56d265f..da88558 100644
--- a/lib/ovs-atomic-gcc4.7+.h
+++ b/lib/ovs-atomic-gcc4.7+.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013 Nicira, Inc.
+ * Copyright (c) 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -19,46 +19,8 @@
 #error "This header should only be included indirectly via ovs-atomic.h."
 #endif
 
-/* C11 standardized atomic type. */
-typedef bool   atomic_bool;
-
-typedef char   atomic_char;
-typedef signed charatomic_schar;
-typedef unsigned char  atomic_uchar;
-
-typedef short  atomic_short;
-typedef unsigned short atomic_ushort;
-
-typedef intatomic_int;
-typedef unsigned int   atomic_uint;
-
-typedef long   atomic_long;
-typedef unsigned long  atomic_ulong;
-
-typedef long long  atomic_llong;
-typedef unsigned long long atomic_ullong;
-
-typedef size_t atomic_size_t;
-typedef ptrdiff_t  atomic_ptrdiff_t;
-
-typedef intmax_t   atomic_intmax_t;
-typedef uintmax_t  atomic_uintmax_t;
-
-typedef intptr_t   atomic_intptr_t;
-typedef u

[ovs-dev] [urcu v2 01/15] ovs-atomic-gcc4+: Fix parenthesization in atomic_read_explicit().

2014-03-11 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
 lib/ovs-atomic-gcc4+.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/ovs-atomic-gcc4+.h b/lib/ovs-atomic-gcc4+.h
index b465181..ddfd03c 100644
--- a/lib/ovs-atomic-gcc4+.h
+++ b/lib/ovs-atomic-gcc4+.h
@@ -202,7 +202,7 @@ atomic_signal_fence(memory_order order OVS_UNUSED)
 #define atomic_read_explicit(SRC, DST, ORDER)   \
 (ATOMIC_SWITCH(SRC, \
(atomic_thread_fence_if_seq_cst(ORDER),  \
-(*DST) = (SRC)->value,  \
+*(DST) = (SRC)->value,  \
 atomic_thread_fence(ORDER)),\
*(DST) = locked_uint64_load(AS_LOCKED_UINT64(SRC)),  \
*(DST) = locked_int64_load(AS_LOCKED_INT64(SRC))),   \
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 06/15] ovs-atomic: Use raw types, not structs, when locks are required.

2014-03-11 Thread Ben Pfaff
Until now, the GCC 4+ and pthreads implementations of atomics have used
struct wrappers for their atomic types.  This had the advantage of allowing
a mutex to be wrapped in, in some cases, and of better type-checking by
preventing stray uses of atomic variables other than through one of the
atomic_*() functions or macros.  However, the mutex meant that an
atomic_destroy() function-like macro needed to be used.  The struct wrapper
also made it impossible to define new atomic types that were compatible
with each other without using a typedef.  For example, one could not simply
define a macro like
#define ATOMIC(TYPE) struct { TYPE value; }
and then have two declarations like:
ATOMIC(void *) x;
ATOMIC(void *) y;
and do anything with these objects that require type-compatibility, even
"&x == &y", because the two structs are not compatible.  One can do it
through a typedef:
typedef ATOMIC(void *) atomic_voidp;
atomic_voidp x, y;
but that is inconvenient, especially because of the need to invent a name
for the type.

This commit aims to ease the problem by getting rid of the wrapper structs
in the cases where the atomic library used them.  It gets rid of the
mutexes, in the cases where they are still needed, by using a global
array of mutexes instead.

This commit also defines the ATOMIC macro described above and documents
its use in ovs-atomic.h.

Signed-off-by: Ben Pfaff 
---
 lib/automake.mk   |3 +-
 lib/ovs-atomic-gcc4+.c|   68 
 lib/ovs-atomic-gcc4+.h|  198 -
 lib/ovs-atomic-locked.c   |   58 +
 lib/ovs-atomic-locked.h   |   32 
 lib/ovs-atomic-pthreads.h |   66 ++-
 lib/ovs-atomic.h  |8 +-
 7 files changed, 175 insertions(+), 258 deletions(-)
 delete mode 100644 lib/ovs-atomic-gcc4+.c
 create mode 100644 lib/ovs-atomic-locked.c
 create mode 100644 lib/ovs-atomic-locked.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 3b2b75f..1310dcd 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -134,9 +134,10 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovs-atomic-c11.h \
lib/ovs-atomic-clang.h \
lib/ovs-atomic-flag-gcc4.7+.h \
-   lib/ovs-atomic-gcc4+.c \
lib/ovs-atomic-gcc4+.h \
lib/ovs-atomic-gcc4.7+.h \
+   lib/ovs-atomic-locked.c \
+   lib/ovs-atomic-locked.h \
lib/ovs-atomic-pthreads.c \
lib/ovs-atomic-pthreads.h \
lib/ovs-atomic-types.h \
diff --git a/lib/ovs-atomic-gcc4+.c b/lib/ovs-atomic-gcc4+.c
deleted file mode 100644
index d6a68ae..000
--- a/lib/ovs-atomic-gcc4+.c
+++ /dev/null
@@ -1,68 +0,0 @@
-/*
- * Copyright (c) 2013 Nicira, Inc.
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at:
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include 
-
-#include "ovs-atomic.h"
-#include "ovs-thread.h"
-
-#if OVS_ATOMIC_GCC4P_IMPL
-static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
-
-#define DEFINE_LOCKED_OP(TYPE, NAME, OPERATOR)  \
-TYPE##_t\
-locked_##TYPE##_##NAME(struct locked_##TYPE *u, TYPE##_t arg)   \
-{   \
-TYPE##_t old_value; \
-\
-ovs_mutex_lock(&mutex); \
-old_value = u->value;   \
-u->value OPERATOR arg;  \
-ovs_mutex_unlock(&mutex);   \
-\
-return old_value;   \
-}
-
-#define DEFINE_LOCKED_TYPE(TYPE)\
-TYPE##_t\
-locked_##TYPE##_load(const struct locked_##TYPE *u) \
-{   \
-TYPE##_t value; \
-\
-ovs_mutex_lock(&mutex); \
-value = u->value;   \
-ovs_mutex_unlock(&mutex);  

[ovs-dev] [urcu v2 09/15] ovs-atomic: Delete atomic, atomic_flag, ovs_refcount destroy functions.

2014-03-11 Thread Ben Pfaff
None of the atomic implementations need a destroy function anymore, so it's
"more standard" and more convenient for users to get rid of them.

Signed-off-by: Ben Pfaff 
---
 lib/bfd.c |3 +--
 lib/cfm.c |6 +-
 lib/dpif-linux.c  |3 +--
 lib/dpif-netdev.c |5 -
 lib/lacp.c|3 +--
 lib/mac-learning.c|3 +--
 lib/netdev.c  |1 -
 lib/netlink-socket.c  |1 -
 lib/ovs-atomic-c11.h  |   12 
 lib/ovs-atomic-clang.h|1 -
 lib/ovs-atomic-flag-gcc4.7+.h |   14 +-
 lib/ovs-atomic-gcc4+.h|   13 -
 lib/ovs-atomic-gcc4.7+.h  |1 -
 lib/ovs-atomic-pthreads.h |   13 -
 lib/ovs-atomic.h  |   30 +-
 lib/stp.c |3 +--
 ofproto/bond.c|3 +--
 ofproto/netflow.c |3 +--
 ofproto/ofproto-dpif-ipfix.c  |3 +--
 ofproto/ofproto-dpif-sflow.c  |1 -
 ofproto/ofproto-dpif-upcall.c |3 ---
 ofproto/ofproto.c |3 ---
 tests/test-atomic.c   |4 +---
 23 files changed, 12 insertions(+), 120 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index 5413105..9071442 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2013 Nicira, Inc.
+/* Copyright (c) 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -470,7 +470,6 @@ bfd_unref(struct bfd *bfd) OVS_EXCLUDED(mutex)
 ovs_mutex_lock(&mutex);
 hmap_remove(all_bfds, &bfd->node);
 netdev_close(bfd->netdev);
-ovs_refcount_destroy(&bfd->ref_cnt);
 free(bfd->name);
 free(bfd);
 ovs_mutex_unlock(&mutex);
diff --git a/lib/cfm.c b/lib/cfm.c
index 583df1d..38448ab 100644
--- a/lib/cfm.c
+++ b/lib/cfm.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -372,10 +372,6 @@ cfm_unref(struct cfm *cfm) OVS_EXCLUDED(mutex)
 netdev_close(cfm->netdev);
 free(cfm->rmps_array);
 
-atomic_destroy(&cfm->extended);
-atomic_destroy(&cfm->check_tnl_key);
-ovs_refcount_destroy(&cfm->ref_cnt);
-
 free(cfm);
 }
 
diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index c2579f6..6f21fc4 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -1120,7 +1120,6 @@ dpif_linux_flow_dump_done(const struct dpif *dpif 
OVS_UNUSED, void *iter_)
 unsigned int nl_status = nl_dump_done(&iter->dump);
 
 atomic_read(&iter->status, &dump_status);
-atomic_destroy(&iter->status);
 free(iter);
 return dump_status ? dump_status : nl_status;
 }
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5897f8b..54b8f50 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -448,7 +448,6 @@ create_dp_netdev(const char *name, const struct dpif_class 
*class,
 *CONST_CAST(const struct dpif_class **, &dp->class) = class;
 *CONST_CAST(const char **, &dp->name) = xstrdup(name);
 ovs_refcount_init(&dp->ref_cnt);
-atomic_flag_init(&dp->destroyed);
 
 ovs_mutex_init(&dp->flow_mutex);
 classifier_init(&dp->cls, NULL);
@@ -558,8 +557,6 @@ dp_netdev_free(struct dp_netdev *dp)
 ovs_mutex_destroy(&dp->flow_mutex);
 seq_destroy(dp->port_seq);
 hmap_destroy(&dp->ports);
-atomic_flag_destroy(&dp->destroyed);
-ovs_refcount_destroy(&dp->ref_cnt);
 latch_destroy(&dp->exit_latch);
 free(CONST_CAST(char *, dp->name));
 free(dp);
@@ -871,7 +868,6 @@ dp_netdev_flow_unref(struct dp_netdev_flow *flow)
 cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
 ovs_mutex_lock(&flow->mutex);
 dp_netdev_actions_unref(flow->actions);
-ovs_refcount_destroy(&flow->ref_cnt);
 ovs_mutex_unlock(&flow->mutex);
 ovs_mutex_destroy(&flow->mutex);
 free(flow);
@@ -1579,7 +1575,6 @@ void
 dp_netdev_actions_unref(struct dp_netdev_actions *actions)
 {
 if (actions && ovs_refcount_unref(&actions->ref_cnt) == 1) {
-ovs_refcount_destroy(&actions->ref_cnt);
 free(actions->actions);
 free(actions);
 }
diff --git a/lib/lacp.c b/lib/lacp.c
index a3f72ed..cbe2259 100644
--- a/lib/lacp.c
+++ b/lib/lacp.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2011, 2012, 2013 Nicira, Inc.
+/* Copyright (c) 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed und

[ovs-dev] [urcu v2 10/15] ovs-rcu: New library.

2014-03-11 Thread Ben Pfaff
RCU allows multiple threads to read objects in parallel without any
performance penalty.  The following commit will introduce the first use.

Signed-off-by: Ben Pfaff 
---
 lib/automake.mk   |2 +
 lib/ovs-rcu.c |  293 +
 lib/ovs-rcu.h |  170 
 lib/ovs-thread.c  |   16 ++-
 lib/ovs-thread.h  |2 +
 lib/timeval.c |   11 ++
 ofproto/ofproto-dpif-upcall.c |5 +
 7 files changed, 498 insertions(+), 1 deletion(-)
 create mode 100644 lib/ovs-rcu.c
 create mode 100644 lib/ovs-rcu.h

diff --git a/lib/automake.mk b/lib/automake.mk
index cc04633..a6c614d 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -140,6 +140,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovs-atomic-locked.h \
lib/ovs-atomic-pthreads.h \
lib/ovs-atomic.h \
+   lib/ovs-rcu.c \
+   lib/ovs-rcu.h \
lib/ovs-thread.c \
lib/ovs-thread.h \
lib/ovsdb-data.c \
diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c
new file mode 100644
index 000..ac4513b
--- /dev/null
+++ b/lib/ovs-rcu.c
@@ -0,0 +1,293 @@
+/*
+ * Copyright (c) 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "ovs-rcu.h"
+#include "guarded-list.h"
+#include "list.h"
+#include "ovs-thread.h"
+#include "poll-loop.h"
+#include "seq.h"
+
+struct ovsrcu_cb {
+void (*function)(void *aux);
+void *aux;
+};
+
+struct ovsrcu_cbset {
+struct list list_node;
+struct ovsrcu_cb cbs[16];
+int n_cbs;
+};
+
+struct ovsrcu_perthread {
+struct list list_node;   /* In global list. */
+
+struct ovs_mutex mutex;
+uint64_t seqno;
+struct ovsrcu_cbset *cbset;
+};
+
+static struct seq *global_seqno;
+
+static pthread_key_t perthread_key;
+static struct list ovsrcu_threads;
+static struct ovs_mutex ovsrcu_threads_mutex;
+
+static struct guarded_list flushed_cbsets;
+static struct seq *flushed_cbsets_seq;
+
+static void ovsrcu_init(void);
+static void ovsrcu_flush_cbset(struct ovsrcu_perthread *);
+static void ovsrcu_unregister__(struct ovsrcu_perthread *);
+static bool ovsrcu_call_postponed(void);
+static void *ovsrcu_postpone_thread(void *arg OVS_UNUSED);
+static void ovsrcu_synchronize(void);
+
+static struct ovsrcu_perthread *
+ovsrcu_perthread_get(void)
+{
+struct ovsrcu_perthread *perthread;
+
+ovsrcu_init();
+
+perthread = pthread_getspecific(perthread_key);
+if (!perthread) {
+perthread = xmalloc(sizeof *perthread);
+ovs_mutex_init(&perthread->mutex);
+perthread->seqno = seq_read(global_seqno);
+perthread->cbset = NULL;
+
+ovs_mutex_lock(&ovsrcu_threads_mutex);
+list_push_back(&ovsrcu_threads, &perthread->list_node);
+ovs_mutex_unlock(&ovsrcu_threads_mutex);
+
+pthread_setspecific(perthread_key, perthread);
+}
+return perthread;
+}
+
+/* Indicates the end of a quiescent state.  See "Details" near the top of
+ * ovs-rcu.h.
+ *
+ * Quiescent states don't stack or nest, so this always ends a quiescent state
+ * even if ovsrcu_quiesce_start() was called multiple times in a row. */
+void
+ovsrcu_quiesce_end(void)
+{
+ovsrcu_perthread_get();
+}
+
+static void
+ovsrcu_quiesced(void)
+{
+if (single_threaded()) {
+ovsrcu_call_postponed();
+} else {
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+if (ovsthread_once_start(&once)) {
+xpthread_create(NULL, NULL, ovsrcu_postpone_thread, NULL);
+ovsthread_once_done(&once);
+}
+}
+}
+
+/* Indicates the beginning of a quiescent state.  See "Details" near the top of
+ * ovs-rcu.h. */
+void
+ovsrcu_quiesce_start(void)
+{
+struct ovsrcu_perthread *perthread;
+
+ovsrcu_init();
+perthread = pthread_getspecific(perthread_key);
+if (perthread) {
+pthread_setspecific(perthread_key, NULL);
+ovsrcu_unregister__(perthread);
+}
+
+ovsrcu_quiesced();
+}
+
+/* Indicates a momentary quiescent state.  See "Details" near the top of
+ * ovs-rcu.h. */
+void
+ovsrcu_quiesce(void)
+{
+ovsrcu_perthread_get()->seqno = seq_read(global_seqno);
+seq_change(global_seqno);
+
+ovsrcu_quiesced();
+}
+
+static void
+ovsrcu_synchronize(void)
+{
+uint64_t target_seqno;
+
+if (single_threaded()) {
+return;
+}
+
+target_seqno = seq_read(global_seqno);
+

[ovs-dev] [urcu v2 02/15] compiler: New macro for defining aligned structs.

2014-03-11 Thread Ben Pfaff
This is broken out into a separate commit because it adds new MSVC
specific code and I don't have MSVC around to test whether it's correct.

CC: Gurucharan Shetty 
Signed-off-by: Ben Pfaff 
---
 lib/compiler.h |   16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/lib/compiler.h b/lib/compiler.h
index e867d72..3b59813 100644
--- a/lib/compiler.h
+++ b/lib/compiler.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -179,6 +179,20 @@
 #define OVS_PACKED(DECL) __pragma(pack(push, 1)) DECL __pragma(pack(pop))
 #endif
 
+/* For defining a structure whose instances should aligned on an N-byte
+ * boundary.
+ *
+ * e.g. The following:
+ * OVS_ALIGNED_STRUCT(64, mystruct) { ... };
+ * is equivalent to the following except that it specifies 64-byte alignment:
+ * struct mystruct { ... };
+ */
+#ifndef _MSC_VER
+#define OVS_ALIGNED_STRUCT(N, TAG) struct __attribute__((aligned(N))) TAG
+#else
+#define OVS_ALIGNED_STRUCT(N, TAG) __declspec(align(N)) struct TAG
+#endif
+
 #ifdef _MSC_VER
 #define CCALL __cdecl
 #pragma section(".CRT$XCU",read)
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 14/15] dpif-netdev: Use ovsthread_stats for flow stats.

2014-03-11 Thread Ben Pfaff
This should scale better than a single mutex, though still not
ideally.

Signed-off-by: Ben Pfaff 
---
 lib/dpif-netdev.c |   99 +++--
 1 file changed, 74 insertions(+), 25 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 268b04d..d9e7a1a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -259,10 +259,7 @@ struct dp_netdev_flow {
 /* Statistics.
  *
  * Reading or writing these members requires 'mutex'. */
-long long int used OVS_GUARDED; /* Last used time, in monotonic msecs. */
-long long int packet_count OVS_GUARDED; /* Number of packets matched. */
-long long int byte_count OVS_GUARDED;   /* Number of bytes matched. */
-uint16_t tcp_flags OVS_GUARDED; /* Bitwise-OR of seen tcp_flags values. */
+struct ovsthread_stats stats; /* Contains "struct dp_netdev_flow_stats". */
 
 /* Actions.
  *
@@ -276,6 +273,16 @@ static struct dp_netdev_flow *dp_netdev_flow_ref(
 const struct dp_netdev_flow *);
 static void dp_netdev_flow_unref(struct dp_netdev_flow *);
 
+/* Contained by struct dp_netdev_flow's 'stats' member.  */
+struct dp_netdev_flow_stats {
+struct ovs_mutex mutex; /* Guards all the other members. */
+
+long long int used OVS_GUARDED; /* Last used time, in monotonic msecs. */
+long long int packet_count OVS_GUARDED; /* Number of packets matched. */
+long long int byte_count OVS_GUARDED;   /* Number of bytes matched. */
+uint16_t tcp_flags OVS_GUARDED; /* Bitwise-OR of seen tcp_flags values. */
+};
+
 /* A set of datapath actions within a "struct dp_netdev_flow".
  *
  *
@@ -889,6 +896,15 @@ static void
 dp_netdev_flow_unref(struct dp_netdev_flow *flow)
 {
 if (flow && ovs_refcount_unref(&flow->ref_cnt) == 1) {
+struct dp_netdev_flow_stats *bucket;
+size_t i;
+
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &flow->stats) {
+ovs_mutex_destroy(&bucket->mutex);
+free_cacheline(bucket);
+}
+ovsthread_stats_destroy(&flow->stats);
+
 cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
 ovs_mutex_lock(&flow->mutex);
 dp_netdev_actions_unref(flow->actions);
@@ -1039,12 +1055,19 @@ dp_netdev_find_flow(const struct dp_netdev *dp, const 
struct flow *flow)
 static void
 get_dpif_flow_stats(struct dp_netdev_flow *netdev_flow,
 struct dpif_flow_stats *stats)
-OVS_REQ_RDLOCK(netdev_flow->mutex)
 {
-stats->n_packets = netdev_flow->packet_count;
-stats->n_bytes = netdev_flow->byte_count;
-stats->used = netdev_flow->used;
-stats->tcp_flags = netdev_flow->tcp_flags;
+struct dp_netdev_flow_stats *bucket;
+size_t i;
+
+memset(stats, 0, sizeof *stats);
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &netdev_flow->stats) {
+ovs_mutex_lock(&bucket->mutex);
+stats->n_packets += bucket->packet_count;
+stats->n_bytes += bucket->byte_count;
+stats->used = MAX(stats->used, bucket->used);
+stats->tcp_flags |= bucket->tcp_flags;
+ovs_mutex_unlock(&bucket->mutex);
+}
 }
 
 static int
@@ -1155,10 +1178,11 @@ dpif_netdev_flow_get(const struct dpif *dpif,
 if (netdev_flow) {
 struct dp_netdev_actions *actions = NULL;
 
-ovs_mutex_lock(&netdev_flow->mutex);
 if (stats) {
 get_dpif_flow_stats(netdev_flow, stats);
 }
+
+ovs_mutex_lock(&netdev_flow->mutex);
 if (actionsp) {
 actions = dp_netdev_actions_ref(netdev_flow->actions);
 }
@@ -1194,6 +1218,8 @@ dp_netdev_flow_add(struct dp_netdev *dp, const struct 
flow *flow,
 ovs_mutex_init(&netdev_flow->mutex);
 ovs_mutex_lock(&netdev_flow->mutex);
 
+ovsthread_stats_init(&netdev_flow->stats);
+
 netdev_flow->actions = dp_netdev_actions_create(actions, actions_len);
 
 match_init(&match, flow, wc);
@@ -1214,12 +1240,18 @@ dp_netdev_flow_add(struct dp_netdev *dp, const struct 
flow *flow,
 
 static void
 clear_stats(struct dp_netdev_flow *netdev_flow)
-OVS_REQUIRES(netdev_flow->mutex)
 {
-netdev_flow->used = 0;
-netdev_flow->packet_count = 0;
-netdev_flow->byte_count = 0;
-netdev_flow->tcp_flags = 0;
+struct dp_netdev_flow_stats *bucket;
+size_t i;
+
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &netdev_flow->stats) {
+ovs_mutex_lock(&bucket->mutex);
+bucket->used = 0;
+bucket->packet_count = 0;
+bucket->byte_count = 0;
+bucket->tcp_flags = 0;
+ovs_mutex_unlock(&bucket->mutex);
+}
 }
 
 static int
@@ -1270,13 +1302,14 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct 
dpif_flow_put *put)
 ovs_mutex_lock(&netdev_flow->mutex);
 old_actions = netdev_flow->actions;
 netdev_flow->actions = new_actions;
+ovs_mutex_unlock(&netdev_flow->mutex);
+
 if (put->stats) {
 get_dpif_flow_stats(netdev_fl

[ovs-dev] [urcu v2 11/15] ofproto: Use RCU to protect rule_actions.

2014-03-11 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 

Conflicts:
lib/ovs-rcu.h
---
 ofproto/connmgr.c|5 +-
 ofproto/ofproto-dpif-xlate.c |2 -
 ofproto/ofproto-dpif.c   |2 -
 ofproto/ofproto-provider.h   |   19 ---
 ofproto/ofproto.c|  119 ++
 5 files changed, 63 insertions(+), 84 deletions(-)

diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
index 033ab7d..0058309 100644
--- a/ofproto/connmgr.c
+++ b/ofproto/connmgr.c
@@ -2043,8 +2043,9 @@ ofmonitor_report(struct connmgr *mgr, struct rule *rule,
 ovs_mutex_unlock(&rule->mutex);
 
 if (flags & NXFMF_ACTIONS) {
-fu.ofpacts = rule->actions->ofpacts;
-fu.ofpacts_len = rule->actions->ofpacts_len;
+struct rule_actions *actions = rule_get_actions(rule);
+fu.ofpacts = actions->ofpacts;
+fu.ofpacts_len = actions->ofpacts_len;
 } else {
 fu.ofpacts = NULL;
 fu.ofpacts_len = 0;
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index eb4931e..5201b93 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -1836,7 +1836,6 @@ xlate_recursively(struct xlate_ctx *ctx, struct rule_dpif 
*rule)
 ctx->rule = rule;
 actions = rule_dpif_get_actions(rule);
 do_xlate_actions(actions->ofpacts, actions->ofpacts_len, ctx);
-rule_actions_unref(actions);
 ctx->rule = old_rule;
 ctx->recurse--;
 }
@@ -3171,7 +3170,6 @@ xlate_actions__(struct xlate_in *xin, struct xlate_out 
*xout)
 }
 
 out:
-rule_actions_unref(actions);
 rule_dpif_unref(rule);
 }
 
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 8c43ee9..1c779ee 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3587,8 +3587,6 @@ trace_format_rule(struct ds *result, int level, const 
struct rule_dpif *rule)
 ds_put_cstr(result, "OpenFlow actions=");
 ofpacts_format(actions->ofpacts, actions->ofpacts_len, result);
 ds_put_char(result, '\n');
-
-rule_actions_unref(actions);
 }
 
 static void
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 2c72fbc..956e593 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -43,6 +43,7 @@
 #include "ofp-util.h"
 #include "ofproto/ofproto.h"
 #include "ovs-atomic.h"
+#include "ovs-rcu.h"
 #include "ovs-thread.h"
 #include "shash.h"
 #include "simap.h"
@@ -374,7 +375,7 @@ struct rule {
 
 /* OpenFlow actions.  See struct rule_actions for more thread-safety
  * notes. */
-struct rule_actions *actions OVS_GUARDED;
+OVSRCU_TYPE(struct rule_actions *) actions;
 
 /* In owning meter's 'rules' list.  An empty list if there is no meter. */
 struct list meter_list_node OVS_GUARDED_BY(ofproto_mutex);
@@ -403,10 +404,11 @@ struct rule {
 void ofproto_rule_ref(struct rule *);
 void ofproto_rule_unref(struct rule *);
 
-struct rule_actions *rule_get_actions(const struct rule *rule)
-OVS_EXCLUDED(rule->mutex);
-struct rule_actions *rule_get_actions__(const struct rule *rule)
-OVS_REQUIRES(rule->mutex);
+static inline struct rule_actions *
+rule_get_actions(const struct rule *rule)
+{
+return ovsrcu_get(struct rule_actions *, &rule->actions);
+}
 
 /* Returns true if 'rule' is an OpenFlow 1.3 "table-miss" rule, false
  * otherwise.
@@ -431,8 +433,6 @@ rule_is_table_miss(const struct rule *rule)
  * 'rule' is the rule for which 'rule->actions == actions') or that owns a
  * reference to 'actions->ref_count' (or both). */
 struct rule_actions {
-struct ovs_refcount ref_count;
-
 /* These members are immutable: they do not change during the struct's
  * lifetime.  */
 struct ofpact *ofpacts; /* Sequence of "struct ofpacts". */
@@ -442,8 +442,7 @@ struct rule_actions {
 
 struct rule_actions *rule_actions_create(const struct ofproto *,
  const struct ofpact *, size_t);
-void rule_actions_ref(struct rule_actions *);
-void rule_actions_unref(struct rule_actions *);
+void rule_actions_destroy(struct rule_actions *);
 
 /* A set of rules to which an OpenFlow operation applies. */
 struct rule_collection {
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index d61e618..d0daea4 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -43,6 +43,7 @@
 #include "ofproto-provider.h"
 #include "openflow/nicira-ext.h"
 #include "openflow/openflow.h"
+#include "ovs-rcu.h"
 #include "packets.h"
 #include "pinsched.h"
 #include "pktbuf.h"
@@ -1897,11 +1898,9 @@ ofproto_add_flow(struct ofproto *ofproto, const struct 
match *ma

[ovs-dev] [urcu v2 13/15] ovs-thread: Replace ovsthread_counter by more general ovsthread_stats.

2014-03-11 Thread Ben Pfaff
This allows clients to do more than just increment a counter.  The
following commit will make the first use of that feature.

Signed-off-by: Ben Pfaff 
---
 lib/dpif-netdev.c |   75 +--
 lib/ovs-thread.c  |   91 ++---
 lib/ovs-thread.h  |   23 +++---
 3 files changed, 108 insertions(+), 81 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 54b8f50..268b04d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -147,10 +147,8 @@ struct dp_netdev {
 
 /* Statistics.
  *
- * ovsthread_counter is internally synchronized. */
-struct ovsthread_counter *n_hit;/* Number of flow table matches. */
-struct ovsthread_counter *n_missed; /* Number of flow table misses. */
-struct ovsthread_counter *n_lost;   /* Number of misses not passed up. */
+ * ovsthread_stats is internally synchronized. */
+struct ovsthread_stats stats; /* Contains 'struct dp_netdev_stats *'. */
 
 /* Ports.
  *
@@ -170,6 +168,22 @@ static struct dp_netdev_port *dp_netdev_lookup_port(const 
struct dp_netdev *dp,
 odp_port_t)
 OVS_REQ_RDLOCK(dp->port_rwlock);
 
+enum dp_stat_type {
+DP_STAT_HIT,/* Packets that matched in the flow table. */
+DP_STAT_MISS,   /* Packets that did not match. */
+DP_STAT_LOST,   /* Packets not passed up to the client. */
+DP_N_STATS
+};
+
+/* Contained by struct dp_netdev's 'stats' member.  */
+struct dp_netdev_stats {
+struct ovs_mutex mutex;  /* Protects 'n'. */
+
+/* Indexed by DP_STAT_*, protected by 'mutex'. */
+unsigned long long int n[DP_N_STATS] OVS_GUARDED;
+};
+
+
 /* A port in a netdev-based datapath. */
 struct dp_netdev_port {
 struct hmap_node node;  /* Node in dp_netdev's 'ports'. */
@@ -461,9 +475,7 @@ create_dp_netdev(const char *name, const struct dpif_class 
*class,
 ovs_mutex_unlock(&dp->queue_mutex);
 dp->queue_seq = seq_create();
 
-dp->n_hit = ovsthread_counter_create();
-dp->n_missed = ovsthread_counter_create();
-dp->n_lost = ovsthread_counter_create();
+ovsthread_stats_init(&dp->stats);
 
 ovs_rwlock_init(&dp->port_rwlock);
 hmap_init(&dp->ports);
@@ -532,6 +544,8 @@ dp_netdev_free(struct dp_netdev *dp)
 OVS_REQUIRES(dp_netdev_mutex)
 {
 struct dp_netdev_port *port, *next;
+struct dp_netdev_stats *bucket;
+int i;
 
 shash_find_and_delete(&dp_netdevs, dp->name);
 
@@ -544,9 +558,12 @@ dp_netdev_free(struct dp_netdev *dp)
 do_del_port(dp, port->port_no);
 }
 ovs_rwlock_unlock(&dp->port_rwlock);
-ovsthread_counter_destroy(dp->n_hit);
-ovsthread_counter_destroy(dp->n_missed);
-ovsthread_counter_destroy(dp->n_lost);
+
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &dp->stats) {
+ovs_mutex_destroy(&bucket->mutex);
+free_cacheline(bucket);
+}
+ovsthread_stats_destroy(&dp->stats);
 
 dp_netdev_purge_queues(dp);
 seq_destroy(dp->queue_seq);
@@ -604,14 +621,21 @@ static int
 dpif_netdev_get_stats(const struct dpif *dpif, struct dpif_dp_stats *stats)
 {
 struct dp_netdev *dp = get_dp_netdev(dpif);
+struct dp_netdev_stats *bucket;
+size_t i;
 
 fat_rwlock_rdlock(&dp->cls.rwlock);
 stats->n_flows = hmap_count(&dp->flow_table);
 fat_rwlock_unlock(&dp->cls.rwlock);
 
-stats->n_hit = ovsthread_counter_read(dp->n_hit);
-stats->n_missed = ovsthread_counter_read(dp->n_missed);
-stats->n_lost = ovsthread_counter_read(dp->n_lost);
+stats->n_hit = stats->n_missed = stats->n_lost = 0;
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &dp->stats) {
+ovs_mutex_lock(&bucket->mutex);
+stats->n_hit += bucket->n[DP_STAT_HIT];
+stats->n_missed += bucket->n[DP_STAT_MISS];
+stats->n_lost += bucket->n[DP_STAT_LOST];
+ovs_mutex_unlock(&bucket->mutex);
+}
 stats->n_masks = UINT32_MAX;
 stats->n_mask_hit = UINT64_MAX;
 
@@ -1711,6 +1735,25 @@ dp_netdev_flow_used(struct dp_netdev_flow *netdev_flow,
 netdev_flow->tcp_flags |= packet_get_tcp_flags(packet, &netdev_flow->flow);
 }
 
+static void *
+dp_netdev_stats_new_cb(void)
+{
+struct dp_netdev_stats *bucket = xzalloc_cacheline(sizeof *bucket);
+ovs_mutex_init(&bucket->mutex);
+return bucket;
+}
+
+static void
+dp_netdev_count_packet(struct dp_netdev *dp, enum dp_stat_type type)
+{
+struct dp_netdev_stats *bucket;
+
+bucket = ovsthread_stats_bucket_get(&dp->stats, dp_netdev_stats_new_cb);
+ovs_mutex_lock(&bucket->mutex);
+bucket->n[type]++;
+ovs_mutex_unlock(&bucket->mutex);
+}
+
 static void
 dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf *packet,
  struct pkt_metadata *md)
@@ -1736,9 +1779,9 @@ dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf 
*packet,
   actions->actions, actio

[ovs-dev] [urcu v2 12/15] util: New functions for allocating memory while avoiding false sharing.

2014-03-11 Thread Ben Pfaff
This factors code out of fat-rwlock, making it easily usable by other code.

Signed-off-by: Ben Pfaff 
---
 lib/fat-rwlock.c |   29 ++
 lib/util.c   |   60 +-
 lib/util.h   |6 +-
 3 files changed, 66 insertions(+), 29 deletions(-)

diff --git a/lib/fat-rwlock.c b/lib/fat-rwlock.c
index 3866dda..82dfbfe 100644
--- a/lib/fat-rwlock.c
+++ b/lib/fat-rwlock.c
@@ -57,16 +57,6 @@ struct fat_rwlock_slot {
  * Accessed only by the slot's own thread, so no synchronization is
  * needed. */
 unsigned int depth;
-
-/* To prevent two of these structures from accidentally occupying the same
- * cache line (causing "false sharing"), we cache-align each of these data
- * structures.  That requires malloc()ing extra space and throwing away
- * some space at the beginning, which means that the pointer to this struct
- * isn't necessarily the pointer to the beginning of the block, and so we
- * need to retain the original pointer to free later.
- *
- * Accessed only by a single thread, so no synchronization is needed. */
-void *base;/* Pointer to pass to free() for this block.  */
 };
 
 static void
@@ -77,7 +67,7 @@ free_slot(struct fat_rwlock_slot *slot)
 }
 
 list_remove(&slot->list_node);
-free(slot->base);
+free_cacheline(slot);
 }
 
 static void
@@ -124,7 +114,6 @@ static struct fat_rwlock_slot *
 fat_rwlock_get_slot__(struct fat_rwlock *rwlock)
 {
 struct fat_rwlock_slot *slot;
-void *base;
 
 /* Fast path. */
 slot = ovsthread_getspecific(rwlock->key);
@@ -134,21 +123,7 @@ fat_rwlock_get_slot__(struct fat_rwlock *rwlock)
 
 /* Slow path: create a new slot for 'rwlock' in this thread. */
 
-/* Allocate room for:
- *
- * - Up to CACHE_LINE_SIZE - 1 bytes before the per-thread, so that
- *   the start of the slot doesn't potentially share a cache line.
- *
- * - The slot itself.
- *
- * - Space following the slot up to the end of the cache line, so
- *   that the end of the slot doesn't potentially share a cache
- *   line. */
-base = xmalloc((CACHE_LINE_SIZE - 1)
-   + ROUND_UP(sizeof *slot, CACHE_LINE_SIZE));
-slot = (void *) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
-
-slot->base = base;
+slot = xmalloc_cacheline(sizeof *slot);
 slot->rwlock = rwlock;
 ovs_mutex_init(&slot->mutex);
 slot->depth = 0;
diff --git a/lib/util.c b/lib/util.c
index 911cc3e..f3a0e41 100644
--- a/lib/util.c
+++ b/lib/util.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -174,6 +174,64 @@ x2nrealloc(void *p, size_t *n, size_t s)
 return xrealloc(p, *n * s);
 }
 
+/* The desired minimum alignment for an allocated block of memory. */
+#define MEM_ALIGN MAX(sizeof(void *), 8)
+BUILD_ASSERT_DECL(IS_POW2(MEM_ALIGN));
+BUILD_ASSERT_DECL(CACHE_LINE_SIZE >= MEM_ALIGN);
+
+/* Allocates and returns 'size' bytes of memory in dedicated cache lines.  That
+ * is, the memory block returned will not share a cache line with other data,
+ * avoiding "false sharing".  (The memory returned will not be at the start of
+ * a cache line, though, so don't assume such alignment.)
+ *
+ * Use free_cacheline() to free the returned memory block. */
+void *
+xmalloc_cacheline(size_t size)
+{
+void **payload;
+void *base;
+
+/* Allocate room for:
+ *
+ * - Up to CACHE_LINE_SIZE - 1 bytes before the payload, so that the
+ *   start of the payload doesn't potentially share a cache line.
+ *
+ * - A payload consisting of a void *, followed by padding out to
+ *   MEM_ALIGN bytes, followed by 'size' bytes of user data.
+ *
+ * - Space following the payload up to the end of the cache line, so
+ *   that the end of the payload doesn't potentially share a cache line
+ *   with some following block. */
+base = xmalloc((CACHE_LINE_SIZE - 1)
+   + ROUND_UP(MEM_ALIGN + size, CACHE_LINE_SIZE));
+
+/* Locate the payload and store a pointer to the base at the beginning. */
+payload = (void **) ROUND_UP((uintptr_t) base, CACHE_LINE_SIZE);
+*payload = base;
+
+return (char *) payload + MEM_ALIGN;
+}
+
+/* Like xmalloc_cacheline() but clears the allocated memory to all zero
+ * bytes. */
+void *
+xzalloc_cacheline(size_t size)
+{
+void *p = xmalloc_cacheline(size);
+memset(p, 0, size);
+return p;
+}
+
+/* Frees a memory block allocated with xmalloc_cacheline() or
+ * xzalloc_cacheline(). */
+void
+free_cacheline(void *p)
+{
+if (p) {
+free(*(void **) ((uintptr_t) p - MEM_ALIGN));
+}
+}
+
 char *

[ovs-dev] [urcu v2 04/15] util: Move CACHE_LINE_SIZE here.

2014-03-11 Thread Ben Pfaff
It will come in handy elsewhere in upcoming commits.

Signed-off-by: Ben Pfaff 
---
 lib/fat-rwlock.c |5 -
 lib/util.h   |5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/fat-rwlock.c b/lib/fat-rwlock.c
index 9ffa37b..3866dda 100644
--- a/lib/fat-rwlock.c
+++ b/lib/fat-rwlock.c
@@ -25,11 +25,6 @@
 #include "ovs-thread.h"
 #include "random.h"
 
-/* This system's cache line size, in bytes.
- * Being wrong hurts performance but not correctness. */
-#define CACHE_LINE_SIZE 64  /* Correct for most CPUs. */
-BUILD_ASSERT_DECL(IS_POW2(CACHE_LINE_SIZE));
-
 struct fat_rwlock_slot {
 /* Membership in rwlock's list of "struct fat_rwlock_slot"s.
  *
diff --git a/lib/util.h b/lib/util.h
index 41fd51d..64516a8 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -145,6 +145,11 @@ is_pow2(uintmax_t x)
 #define RDP2_4(X) (RDP2_5(X) | (RDP2_5(X) >> 2))
 #define RDP2_5(X) (  (X) | (  (X) >> 1))
 
+/* This system's cache line size, in bytes.
+ * Being wrong hurts performance but not correctness. */
+#define CACHE_LINE_SIZE 64
+BUILD_ASSERT_DECL(IS_POW2(CACHE_LINE_SIZE));
+
 #ifndef MIN
 #define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
 #endif
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [urcu v2 15/15] dpif-netdev: Use RCU to protect data.

2014-03-11 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
 lib/dpif-netdev.c |  148 -
 1 file changed, 43 insertions(+), 105 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d9e7a1a..eb437d1 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -49,6 +49,7 @@
 #include "odp-util.h"
 #include "ofp-print.h"
 #include "ofpbuf.h"
+#include "ovs-rcu.h"
 #include "packets.h"
 #include "poll-loop.h"
 #include "random.h"
@@ -245,12 +246,6 @@ struct dp_netdev_flow {
 const struct hmap_node node; /* In owning dp_netdev's 'flow_table'. */
 const struct flow flow;  /* The flow that created this entry. */
 
-/* Number of references.
- * The classifier owns one reference.
- * Any thread trying to keep a rule from being freed should hold its own
- * reference. */
-struct ovs_refcount ref_cnt;
-
 /* Protects members marked OVS_GUARDED.
  *
  * Acquire after datapath's flow_mutex. */
@@ -266,12 +261,10 @@ struct dp_netdev_flow {
  * Reading 'actions' requires 'mutex'.
  * Writing 'actions' requires 'mutex' and (to allow for transactions) the
  * datapath's flow_mutex. */
-struct dp_netdev_actions *actions OVS_GUARDED;
+OVSRCU_TYPE(struct dp_netdev_actions *) actions;
 };
 
-static struct dp_netdev_flow *dp_netdev_flow_ref(
-const struct dp_netdev_flow *);
-static void dp_netdev_flow_unref(struct dp_netdev_flow *);
+static void dp_netdev_flow_free(struct dp_netdev_flow *);
 
 /* Contained by struct dp_netdev_flow's 'stats' member.  */
 struct dp_netdev_flow_stats {
@@ -294,8 +287,6 @@ struct dp_netdev_flow_stats {
  * 'flow' is the dp_netdev_flow for which 'flow->actions == actions') or that
  * owns a reference to 'actions->ref_cnt' (or both). */
 struct dp_netdev_actions {
-struct ovs_refcount ref_cnt;
-
 /* These members are immutable: they do not change during the struct's
  * lifetime.  */
 struct nlattr *actions; /* Sequence of OVS_ACTION_ATTR_* attributes. */
@@ -304,9 +295,9 @@ struct dp_netdev_actions {
 
 struct dp_netdev_actions *dp_netdev_actions_create(const struct nlattr *,
size_t);
-struct dp_netdev_actions *dp_netdev_actions_ref(
-const struct dp_netdev_actions *);
-void dp_netdev_actions_unref(struct dp_netdev_actions *);
+struct dp_netdev_actions *dp_netdev_flow_get_actions(
+const struct dp_netdev_flow *);
+static void dp_netdev_actions_free(struct dp_netdev_actions *);
 
 /* A thread that receives packets from some ports, looks them up in the flow
  * table, and executes the actions it finds. */
@@ -870,6 +861,24 @@ dpif_netdev_port_query_by_name(const struct dpif *dpif, 
const char *devname,
 }
 
 static void
+dp_netdev_flow_free(struct dp_netdev_flow *flow)
+{
+struct dp_netdev_flow_stats *bucket;
+size_t i;
+
+OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &flow->stats) {
+ovs_mutex_destroy(&bucket->mutex);
+free_cacheline(bucket);
+}
+ovsthread_stats_destroy(&flow->stats);
+
+cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
+dp_netdev_actions_free(dp_netdev_flow_get_actions(flow));
+ovs_mutex_destroy(&flow->mutex);
+free(flow);
+}
+
+static void
 dp_netdev_remove_flow(struct dp_netdev *dp, struct dp_netdev_flow *flow)
 OVS_REQ_WRLOCK(dp->cls.rwlock)
 OVS_REQUIRES(dp->flow_mutex)
@@ -879,39 +888,7 @@ dp_netdev_remove_flow(struct dp_netdev *dp, struct 
dp_netdev_flow *flow)
 
 classifier_remove(&dp->cls, cr);
 hmap_remove(&dp->flow_table, node);
-dp_netdev_flow_unref(flow);
-}
-
-static struct dp_netdev_flow *
-dp_netdev_flow_ref(const struct dp_netdev_flow *flow_)
-{
-struct dp_netdev_flow *flow = CONST_CAST(struct dp_netdev_flow *, flow_);
-if (flow) {
-ovs_refcount_ref(&flow->ref_cnt);
-}
-return flow;
-}
-
-static void
-dp_netdev_flow_unref(struct dp_netdev_flow *flow)
-{
-if (flow && ovs_refcount_unref(&flow->ref_cnt) == 1) {
-struct dp_netdev_flow_stats *bucket;
-size_t i;
-
-OVSTHREAD_STATS_FOR_EACH_BUCKET (bucket, i, &flow->stats) {
-ovs_mutex_destroy(&bucket->mutex);
-free_cacheline(bucket);
-}
-ovsthread_stats_destroy(&flow->stats);
-
-cls_rule_destroy(CONST_CAST(struct cls_rule *, &flow->cr));
-ovs_mutex_lock(&flow->mutex);
-dp_netdev_actions_unref(flow->actions);
-ovs_mutex_unlock(&flow->mutex);
-ovs_mutex_destroy(&flow->mutex);
-free(flow);
-}
+ovsrcu_postpone(dp_netdev_flow_free, flow);
 }
 
 static void
@@ -1030,7 +1007,6 @@ dp_netdev_lookup_flow(const struct dp_netdev *dp, const 
struct flow *flow)
 
 fat_rwlock_rdlock(&dp->cls.rwlock);
 netdev_flow = dp_netdev_flow_cast(classifier_lookup(&dp->cls, flow, NULL));
-dp_netdev_flow_ref(netdev_flow);
 fat_rwlock_unlock(&dp->cls.rwlock);
 
 return netdev_flow;
@@ -1045,7 +1021,7 @@ dp_netdev_find_flow(con

Re: [ovs-dev] [PATCH] ovsdb-server: Truncate file for Windows.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:35AM -0700, Gurucharan Shetty wrote:
> There is no ftruncate() in visual studio. There is a _chsize_s()
> which has a similar functionality.
> 
> Signed-off-by: Gurucharan Shetty 

This implementation of ftruncate() does not return anything if
_chsize_s() succeeds.  (It should return 0.)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] vlog: Array intialization for Windows.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:36AM -0700, Gurucharan Shetty wrote:
> Visual studio does not understand array initialization
> done this way:
> [first ... last] = value
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] util: Maximum path length for Windows.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:37AM -0700, Gurucharan Shetty wrote:
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] uuid: Use current time to compute sha.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:38AM -0700, Gurucharan Shetty wrote:
> Windows does not have the getppid(), getuid(), getgid() functions.
> We do get a random seed from CryptGenRandom(). That seed along with
> process id and current time hopefully is good enough.
> 
> Signed-off-by: Gurucharan Shetty 

I see that there was a bug that 'now' wasn't getting folded in.  Let's
fix that for all systems, not just for Windows.

I think that this could be a little nicer by adding a function like
this:

static void
sha1_update_int(struct sha1 *sha1_ctx, uintmax_t x)
{
   sha1_update(sha1_ctx, &x, sizeof x);
}

and then we can delete the variables and the assignments and the
changes to windefs.h and just write:

sha1_update_int(&sha1_ctx, getpid());
#ifndef _WIN32
sha1_update_int(&sha1_ctx, getppid());
sha1_update_int(&sha1_ctx, getuid());
sha1_update_int(&sha1_ctx, getgid());
#endif

> ---
>  include/windows/windefs.h |4 
>  lib/uuid.c|6 ++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/include/windows/windefs.h b/include/windows/windefs.h
> index 6d36adc..6fde1ea 100644
> --- a/include/windows/windefs.h
> +++ b/include/windows/windefs.h
> @@ -32,4 +32,8 @@
>  #define u_int32_t uint32_t
>  #define u_int64_t uint64_t
>  
> +typedef int pid_t;
> +typedef int uid_t;
> +typedef int gid_t;
> +
>  #endif /* windefs.h */
> diff --git a/lib/uuid.c b/lib/uuid.c
> index 315c851..9ac5b55 100644
> --- a/lib/uuid.c
> +++ b/lib/uuid.c
> @@ -220,17 +220,23 @@ do_init(void)
>  get_entropy_or_die(random_seed, sizeof random_seed);
>  xgettimeofday(&now);
>  pid = getpid();
> +#ifndef _WIN32
>  ppid = getppid();
>  uid = getuid();
>  gid = getgid();
> +#endif
>  
>  /* Convert seed into key. */
>  sha1_init(&sha1_ctx);
>  sha1_update(&sha1_ctx, random_seed, sizeof random_seed);
>  sha1_update(&sha1_ctx, &pid, sizeof pid);
> +#ifndef _WIN32
>  sha1_update(&sha1_ctx, &ppid, sizeof ppid);
>  sha1_update(&sha1_ctx, &uid, sizeof uid);
>  sha1_update(&sha1_ctx, &gid, sizeof gid);
> +#else
> +sha1_update(&sha1_ctx, &now, sizeof now);
> +#endif
>  sha1_final(&sha1_ctx, sha1);
>  
>  /* Generate key. */
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] byte-order: htonll() and ntohll() for Windows.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:39AM -0700, Gurucharan Shetty wrote:
> These functions exist, so don't provide them.
> 
> Signed-off-by: Gurucharan Shetty 

How curious.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] windows: Add stub headers for windows.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:42AM -0700, Gurucharan Shetty wrote:
> Windows does not have a bunch of headers that
> are available in Linux. Instead of littering the code
> with #ifndef _WIN32, add stub headers.
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovs-thread: count the number of cpu cores.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:41AM -0700, Gurucharan Shetty wrote:
> We use the number of cpu cores to determine the number
> of threads that we spawn. We are not yet sure what is
> the ideal number of OVS userspace threads that can run
> on Hyper-V. Till we figure that out, use the same logic
> of counting CPU cores in Windows too.
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] README-lisp: improve LISP documentation

2014-03-11 Thread Ben Pfaff
On Tue, Mar 11, 2014 at 07:02:28PM +0200, Lorand Jakab wrote:
> People familiar with LISP are used to the concept of a mapping cache in
> a LISP Tunnel Router.  Explain how that concept maps to OVS flow rules.
> Additionally, mention that eth0 need not be added in all example
> scenarios.
> 
> Signed-off-by: Lorand Jakab 

Applied, thanks!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ofp-print: __attribute__ format in visual studio.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 10:56:40AM -0700, Gurucharan Shetty wrote:
> Visual studio does not understand __attribute__ format. All the
> violations would be caught with gcc in linux. So, it should be okay
> removing that.
> 
> Signed-off-by: Gurucharan Shetty 

Please change this code to use the PRINTF_FORMAT macro from compiler.h
instead.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovs-thread: count the number of cpu cores.

2014-03-11 Thread YAMAMOTO Takashi
> We use the number of cpu cores to determine the number
> of threads that we spawn. We are not yet sure what is
> the ideal number of OVS userspace threads that can run
> on Hyper-V. Till we figure that out, use the same logic
> of counting CPU cores in Windows too.
> 
> Signed-off-by: Gurucharan Shetty 
> ---
>  lib/ovs-thread.c |6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index b6b51c7..85de014 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -509,10 +509,16 @@ count_cpu_cores(void)
>  static long int n_cores;
>  
>  if (ovsthread_once_start(&once)) {
> +#ifndef _WIN32
>  parse_cpuinfo(&n_cores);
>  if (!n_cores) {
>  n_cores = sysconf(_SC_NPROCESSORS_ONLN);
>  }
> +#else
> +SYSTEM_INFO sysinfo;
> +GetSystemInfo(&sysinfo);
> +n_cores = sysinfo.dwNumberOfProcessors;

does this need to be wrapped by ovsthread_once?

YAMAMOTO Takashi

> +#endif
>  ovsthread_once_done(&once);
>  }
>  
> -- 
> 1.7.9.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [bond megaflow v2 3/5] lib/metaflow: add new flow fields for recirculation

2014-03-11 Thread Andy Zhou
Signed-off-by: Andy Zhou 

---
v1 -> v2:  Minor adjustment.
---
 lib/match.c |   26 
 lib/match.h |6 ++
 lib/meta-flow.c |   60 +++
 lib/meta-flow.h |2 ++
 4 files changed, 94 insertions(+)

diff --git a/lib/match.c b/lib/match.c
index 12e58a2..8789413 100644
--- a/lib/match.c
+++ b/lib/match.c
@@ -165,6 +165,32 @@ match_zero_wildcarded_fields(struct match *match)
 }
 
 void
+match_set_dp_hash(struct match *match, uint32_t value)
+{
+match_set_dp_hash_masked(match, value, UINT32_MAX);
+}
+
+void
+match_set_dp_hash_masked(struct match *match, uint32_t value, uint32_t mask)
+{
+match->wc.masks.dp_hash = mask;
+match->flow.dp_hash = value & mask;
+}
+
+void
+match_set_recirc_id(struct match *match, uint32_t value)
+{
+match_set_recirc_id_masked(match, value, UINT32_MAX);
+}
+
+void
+match_set_recirc_id_masked(struct match *match, uint32_t value, uint32_t mask)
+{
+match->wc.masks.recirc_id = mask;
+match->flow.recirc_id = value & mask;
+}
+
+void
 match_set_reg(struct match *match, unsigned int reg_idx, uint32_t value)
 {
 match_set_reg_masked(match, reg_idx, value, UINT32_MAX);
diff --git a/lib/match.h b/lib/match.h
index 7a8ae68..95c8e67 100644
--- a/lib/match.h
+++ b/lib/match.h
@@ -41,6 +41,12 @@ void match_init_catchall(struct match *);
 
 void match_zero_wildcarded_fields(struct match *);
 
+void match_set_dp_hash(struct match *, uint32_t value);
+void match_set_dp_hash_masked(struct match *, uint32_t value, uint32_t mask);
+
+void match_set_recirc_id(struct match *, uint32_t value);
+void match_set_recirc_id_masked(struct match *, uint32_t value, uint32_t mask);
+
 void match_set_reg(struct match *, unsigned int reg_idx, uint32_t value);
 void match_set_reg_masked(struct match *, unsigned int reg_idx,
   uint32_t value, uint32_t mask);
diff --git a/lib/meta-flow.c b/lib/meta-flow.c
index d90477a..248d1f0 100644
--- a/lib/meta-flow.c
+++ b/lib/meta-flow.c
@@ -52,6 +52,30 @@ const struct mf_field mf_fields[MFF_N_IDS] = {
 /* ##  ## */
 
 {
+MFF_DP_HASH, "dp_hash", NULL,
+MF_FIELD_SIZES(be32),
+MFM_FULLY,
+MFS_HEXADECIMAL,
+MFP_NONE,
+false,
+NXM_NX_DP_HASH, "NXM_NX_DP_HASH",
+NXM_NX_DP_HASH, "NXM_NX_DP_HASH",
+OFPUTIL_P_NXM_OXM_ANY,
+OFPUTIL_P_NXM_OXM_ANY,
+-1,
+}, {
+MFF_RECIRC_ID, "recirc_id", NULL,
+MF_FIELD_SIZES(be32),
+MFM_FULLY,
+MFS_DECIMAL,
+MFP_NONE,
+false,
+NXM_NX_RECIRC_ID, "NXM_NX_RECIRC_ID",
+NXM_NX_RECIRC_ID, "NXM_NX_RECIRC_ID",
+OFPUTIL_P_NXM_OXM_ANY,
+OFPUTIL_P_NXM_OXM_ANY,
+-1,
+}, {
 MFF_TUN_ID, "tun_id", "tunnel_id",
 MF_FIELD_SIZES(be64),
 MFM_FULLY,
@@ -879,6 +903,10 @@ bool
 mf_is_all_wild(const struct mf_field *mf, const struct flow_wildcards *wc)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+return !wc->masks.dp_hash;
+case MFF_RECIRC_ID:
+return !wc->masks.recirc_id;
 case MFF_TUN_SRC:
 return !wc->masks.tunnel.ip_src;
 case MFF_TUN_DST:
@@ -1124,6 +1152,8 @@ bool
 mf_is_value_valid(const struct mf_field *mf, const union mf_value *value)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+case MFF_RECIRC_ID:
 case MFF_TUN_ID:
 case MFF_TUN_SRC:
 case MFF_TUN_DST:
@@ -1217,6 +1247,12 @@ mf_get_value(const struct mf_field *mf, const struct 
flow *flow,
  union mf_value *value)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+value->be32 = htonl(flow->dp_hash);
+break;
+case MFF_RECIRC_ID:
+value->be32 = htonl(flow->recirc_id);
+break;
 case MFF_TUN_ID:
 value->be64 = flow->tunnel.tun_id;
 break;
@@ -1409,6 +1445,12 @@ mf_set_value(const struct mf_field *mf,
  const union mf_value *value, struct match *match)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+match_set_dp_hash(match, ntohl(value->be32));
+break;
+case MFF_RECIRC_ID:
+match_set_recirc_id(match, ntohl(value->be32));
+break;
 case MFF_TUN_ID:
 match_set_tun_id(match, value->be64);
 break;
@@ -1622,6 +1664,12 @@ mf_set_flow_value(const struct mf_field *mf,
   const union mf_value *value, struct flow *flow)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+flow->dp_hash = ntohl(value->be32);
+break;
+case MFF_RECIRC_ID:
+flow->recirc_id = ntohl(value->be32);
+break;
 case MFF_TUN_ID:
 flow->tunnel.tun_id = value->be64;
 break;
@@ -1834,6 +1882,14 @@ void
 mf_set_wild(const struct mf_field *mf, struct match *match)
 {
 switch (mf->id) {
+case MFF_DP_HASH:
+match->flow.dp_hash = 0;
+match->wc.masks.dp_hash = 0;
+break;
+case MFF_RECIRC_ID:
+match->flow.rec

[ovs-dev] [bond megaflow v2 4/5] dpif-netdev: user space datapath recirculation

2014-03-11 Thread Andy Zhou
Add basic recirculation infrastructure and user space
data path support for it. The following bond mega flow patch will
make use of this infrastructure.

Signed-off-by: Andy Zhou 

---
v1->v2:  Rewritten based on having post recirculation rules stored
 in table 254.
---
 include/linux/openvswitch.h |   35 -
 lib/dpif-netdev.c   |   35 -
 lib/dpif.c  |3 +-
 lib/odp-execute.c   |9 +
 lib/odp-execute.h   |2 +-
 lib/odp-util.c  |   91 ++-
 lib/packets.c   |   16 +---
 lib/packets.h   |   16 +---
 ofproto/ofproto-dpif.c  |2 +-
 ofproto/ofproto-dpif.h  |   63 --
 10 files changed, 251 insertions(+), 21 deletions(-)

diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index d1ff5ec..af951f5 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -307,7 +307,8 @@ enum ovs_key_attr {
OVS_KEY_ATTR_TUNNEL,/* Nested set of ovs_tunnel attributes */
OVS_KEY_ATTR_SCTP,  /* struct ovs_key_sctp */
OVS_KEY_ATTR_TCP_FLAGS, /* be16 TCP flags. */
-
+   OVS_KEY_ATTR_DP_HASH,   /* u32 hash value */
+   OVS_KEY_ATTR_RECIRC_ID, /* u32 recirc id */
 #ifdef __KERNEL__
OVS_KEY_ATTR_IPV4_TUNNEL,  /* struct ovs_key_ipv4_tunnel */
 #endif
@@ -530,6 +531,36 @@ struct ovs_action_push_vlan {
__be16 vlan_tci;/* 802.1Q TCI (VLAN ID and priority). */
 };
 
+/* Recirculation ID needs to be unique per back-end.
+ * It can be any value except zero. Use the following
+ * defines to restrict the range further.
+ */
+#define RECIRC_ID_BASE   300
+#define RECIRC_ID_SIZE   65535
+
+/* Data path hash algorithm for computing Datapath hash.
+ *
+ * The Algorithm type only specifies the fields in a flow
+ * will be used as part of the hash. Each datapath is free
+ * to use its own hash algorithm. The hash value will be
+ * opaque to the user space daemon.
+ */
+enum ovs_recirc_hash_alg {
+   OVS_RECIRC_HASH_ALG_NONE,
+   OVS_RECIRC_HASH_ALG_L4,
+};
+/*
+ * struct ovs_action_recirc - %OVS_ACTION_ATTR_RECIRC action argument.
+ * @recirc_id: The Recirculation label, Zero is invalid.
+ * @hash_alg: Algorithm used to compute hash prior to recirculation.
+ * @hash_bias: bias used for computing hash.  used to compute hash prior to 
recirculation.
+ */
+struct ovs_action_recirc {
+   uint8_t   hash_alg; /* One of ovs_dp_hash_alg */
+   __be32  hash_bias;
+   __be32  recirc_id;  /* Recirculation label. */
+};
+
 /**
  * enum ovs_action_attr - Action types.
  *
@@ -553,6 +584,7 @@ struct ovs_action_push_vlan {
  * indicate the new packet contents. This could potentially still be
  * %ETH_P_MPLS if the resulting MPLS label stack is not empty.  If there
  * is no MPLS label stack, as determined by ethertype, no action is taken.
+ * @OVS_ACTION_RECIRC: Recirculate within the data path.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -569,6 +601,7 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_SAMPLE,   /* Nested OVS_SAMPLE_ATTR_*. */
OVS_ACTION_ATTR_PUSH_MPLS,/* struct ovs_action_push_mpls. */
OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */
+   OVS_ACTION_ATTR_RECIRC,   /* struct ovs_action_recirc. */
__OVS_ACTION_ATTR_MAX
 };
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5897f8b..58bcfc1 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1157,6 +1157,19 @@ dpif_netdev_flow_get(const struct dpif *dpif,
 return error;
 }
 
+/* To support recirculation, user space datapath requires all rules
+ * have exact match to the metadata field, which will store the
+ * recirculation ID. */
+static void
+dp_netdev_match_init(struct match *match, const struct flow *flow,
+const struct flow_wildcards *wc)
+{
+struct flow_wildcards netdev_wc = *wc;
+
+netdev_wc.masks.metadata = UINT64_MAX;
+match_init(match, flow, &netdev_wc);
+}
+
 static int
 dp_netdev_flow_add(struct dp_netdev *dp, const struct flow *flow,
const struct flow_wildcards *wc,
@@ -1176,7 +1189,7 @@ dp_netdev_flow_add(struct dp_netdev *dp, const struct 
flow *flow,
 
 netdev_flow->actions = dp_netdev_actions_create(actions, actions_len);
 
-match_init(&match, flow, wc);
+dp_netdev_match_init(&match, flow, wc);
 cls_rule_init(CONST_CAST(struct cls_rule *, &netdev_flow->cr),
   &match, NETDEV_RULE_PRIORITY);
 fat_rwlock_wrlock(&dp->cls.rwlock);
@@ -1808,7 +1821,7 @@ struct dp_netdev_execute_aux {
 
 static void
 dp_execute_cb(void *aux_, struct ofpbuf *packet,
-  const struct pkt_metadata *md OVS_UNUSED,
+  struct pkt_metadata *md,
   const struct nlat

[ovs-dev] [bond megaflow v2 1/5] ofproto-dpif: Added Per backer recirculation ID management

2014-03-11 Thread Andy Zhou
Recirculation ID needs to be unique per datapath. Its usage will be
tracked by the backer that corresponds to the datapath.

In theory, Recirculation ID can be any uint32_t value, except 0. This
implementation limits to a smaller range just for ease of debugging.
Make the range size 0 effectively disables recirculation.

Signed-off-by: Andy Zhou 

---
v1->v2:  No major changes.
 Comments about recirculation is added in ofproto-dpif.h in
 patch 4
---
 ofproto/ofproto-dpif.c |  152 +++-
 ofproto/ofproto-dpif.h |4 ++
 2 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 8c43ee9..252e6b4 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -238,6 +238,29 @@ COVERAGE_DEFINE(rev_port_toggled);
 COVERAGE_DEFINE(rev_flow_table);
 COVERAGE_DEFINE(rev_mac_learning);
 
+/* A map for recirculation ID allocation. */
+struct rid_map {
+struct hmap map;
+};
+
+struct rid_node {
+struct hmap_node node;
+uint32_t recirc_id;
+};
+
+struct recirc_id_set {
+struct rid_map ridmap;
+uint32_t base; /* IDs in the range of [base, base + n_ids). */
+uint32_t n_ids;/* Total number of ids in the set. */
+uint32_t next_free_id; /* Possible next free id. */
+};
+
+static void recirc_id_set_init(struct recirc_id_set *set, uint32_t base,
+  uint32_t n_ids);
+static void recirc_id_set_uninit(struct recirc_id_set *set);
+static uint32_t recirc_id_alloc(struct recirc_id_set *set);
+static void recirc_id_free(struct recirc_id_set *set, uint32_t recirc_id);
+
 /* All datapaths of a given type share a single dpif backer instance. */
 struct dpif_backer {
 char *type;
@@ -254,6 +277,10 @@ struct dpif_backer {
 
 bool recv_set_enable; /* Enables or disables receiving packets. */
 
+/* Recirculation. */
+struct ovs_mutex recirc_id_lock; /* lock for recirc_ids */
+struct recirc_id_set rids;   /* Managing recirculation ID. */
+
 /* True if the datapath supports variable-length
  * OVS_USERSPACE_ATTR_USERDATA in OVS_ACTION_ATTR_USERSPACE actions.
  * False if the datapath supports only 8-byte (or shorter) userdata. */
@@ -781,9 +808,9 @@ close_dpif_backer(struct dpif_backer *backer)
 ovs_rwlock_destroy(&backer->odp_to_ofport_lock);
 hmap_destroy(&backer->odp_to_ofport_map);
 shash_find_and_delete(&all_dpif_backers, backer->type);
+recirc_id_set_uninit(&backer->rids);
 free(backer->type);
 dpif_close(backer->dpif);
-
 free(backer);
 }
 
@@ -901,6 +928,9 @@ open_dpif_backer(const char *type, struct dpif_backer 
**backerp)
 udpif_set_threads(backer->udpif, n_handlers, n_revalidators);
 }
 
+ovs_mutex_init(&backer->recirc_id_lock);
+recirc_id_set_init(&backer->rids, 0, 0);
+
 return error;
 }
 
@@ -4474,6 +4504,126 @@ vsp_add(struct ofport_dpif *port, ofp_port_t 
realdev_ofp_port, int vid)
 ovs_mutex_unlock(&ofproto->vsp_mutex);
 }
 
+static void
+recirc_id_set_init(struct recirc_id_set *set, uint32_t base,
+  uint32_t n_ids)
+{
+set->base = base;
+set->n_ids = n_ids;
+set->next_free_id = base;
+hmap_init(&set->ridmap.map);
+}
+
+static void
+recirc_id_set_uninit(struct recirc_id_set *set)
+{
+struct rid_node *rid, *next;
+
+HMAP_FOR_EACH_SAFE(rid, next, node, &set->ridmap.map) {
+hmap_remove(&set->ridmap.map, &rid->node);
+free(rid);
+}
+
+hmap_destroy(&set->ridmap.map);
+}
+
+static struct rid_node *
+ridset_find(struct recirc_id_set *set, uint32_t id)
+{
+size_t hash;
+struct rid_node *rid;
+
+hash = hash_int(id, 0);
+HMAP_FOR_EACH_WITH_HASH(rid, node, hash, &set->ridmap.map) {
+if (id == rid->recirc_id) {
+return rid;
+}
+}
+return NULL;
+}
+
+static struct rid_node *
+ridset_add(struct recirc_id_set *set, uint32_t id)
+{
+struct rid_node *rid = xmalloc(sizeof *rid);
+size_t hash;
+
+rid->recirc_id = id;
+hash = hash_int(id, 0);
+hmap_insert(&set->ridmap.map, &rid->node, hash);
+return rid;
+}
+
+static uint32_t
+recirc_id_alloc(struct recirc_id_set *set)
+{
+uint32_t id;
+
+if (set->n_ids == 0)
+return 0;
+
+if (!(ridset_find(set, set->next_free_id))) {
+id = set->next_free_id;
+goto found_free_id;
+}
+
+for (id = set->base; id < set->base + set->n_ids; id++) {
+if ((ridset_find(set, id)))
+goto found_free_id;
+}
+
+/* Not available. */
+return 0;
+
+found_free_id:
+ridset_add(set, id);
+
+if (id < set->base + set->n_ids) {
+set->next_free_id = id + 1;
+} else {
+set->next_free_id = set->base;
+}
+
+return id;
+}
+
+static void
+recirc_id_free(struct recirc_id_set *set, uint32_t recirc_id)
+{
+struct rid_node *rid;
+if (recirc_id > set->base && (recirc_id <= set->base + set->n_ids)) {
+ 

Re: [ovs-dev] [PATCH V3 3/5] datapath: Allow each vport to have an array of 'port_id's.

2014-03-11 Thread Ben Pfaff
On Fri, Mar 07, 2014 at 06:04:40PM -0800, Alex Wang wrote:
> In order to allow handlers directly read upcalls from datapath,
> we need to support per-handler netlink socket for each vport in
> datapath.  This commit makes this happen.  Also, it is guaranteed
> backward and forward compatibility with previous branch.
> 
> Signed-off-by: Alex Wang 

This needs Jesse or Pravin's review.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [bond megaflow v2 2/5] lib/flow: add dp_hash and recirc_id to struct flow

2014-03-11 Thread Andy Zhou
Signed-off-by: Andy Zhou 

---
v1->v2:  No major changes
---
 include/openflow/nicira-ext.h |   26 ++
 lib/flow.c|6 --
 lib/flow.h|   13 ++---
 lib/match.c   |   12 +++-
 lib/nx-match.c|   15 ++-
 lib/ofp-util.c|2 +-
 ofproto/ofproto-dpif-xlate.c  |2 +-
 7 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h
index 22939f4..03b3776 100644
--- a/include/openflow/nicira-ext.h
+++ b/include/openflow/nicira-ext.h
@@ -1803,6 +1803,32 @@ OFP_ASSERT(sizeof(struct nx_action_output_reg) == 24);
 #define NXM_NX_TCP_FLAGS   NXM_HEADER  (0x0001, 34, 2)
 #define NXM_NX_TCP_FLAGS_W NXM_HEADER_W(0x0001, 34, 2)
 
+/* Metadata dp_hash.
+ *
+ * The dp_hash is used to carry the flow hash computed in the
+ * datapath.
+ *
+ * Prereqs: None.
+ *
+ * Format: 32-bit integer in network byte order.
+ *
+ * Masking: Fully maskable. */
+#define NXM_NX_DP_HASH   NXM_HEADER  (0x0001, 35, 4)
+#define NXM_NX_DP_HASH_W NXM_HEADER_W(0x0001, 35, 4)
+
+/* Metadata recirc_id.
+ *
+ * The recirc_id used for recirculation. 0 is reserved
+ * for initially received packet.
+ *
+ * Prereqs: None.
+ *
+ * Format: 32-bit integer in network byte order.
+ *
+ * Masking: not maskable. */
+#define NXM_NX_RECIRC_ID   NXM_HEADER  (0x0001, 36, 4)
+#define NXM_NX_RECIRC_ID_W NXM_HEADER_W(0x0001, 36, 4)
+
 /* ## - ## */
 /* ## Requests and replies. ## */
 /* ## - ## */
diff --git a/lib/flow.c b/lib/flow.c
index 82d6729..6c60f04 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -532,8 +532,10 @@ flow_unwildcard_tp_ports(const struct flow *flow, struct 
flow_wildcards *wc)
 void
 flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd)
 {
-BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
 
+fmd->dp_hash = flow->dp_hash;
+fmd->recirc_id = flow->recirc_id;
 fmd->tun_id = flow->tunnel.tun_id;
 fmd->tun_src = flow->tunnel.ip_src;
 fmd->tun_dst = flow->tunnel.ip_dst;
@@ -1177,7 +1179,7 @@ flow_push_mpls(struct flow *flow, int n, ovs_be16 
mpls_eth_type,
 flow->mpls_lse[0] = set_mpls_lse_values(ttl, tc, 1, htonl(label));
 
 /* Clear all L3 and L4 fields. */
-BUILD_ASSERT(FLOW_WC_SEQ == 24);
+BUILD_ASSERT(FLOW_WC_SEQ == 25);
 memset((char *) flow + FLOW_SEGMENT_2_ENDS_AT, 0,
sizeof(struct flow) - FLOW_SEGMENT_2_ENDS_AT);
 }
diff --git a/lib/flow.h b/lib/flow.h
index 8165bcf..8e1349d 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -37,7 +37,7 @@ struct pkt_metadata;
 /* This sequence number should be incremented whenever anything involving flows
  * or the wildcarding of flows changes.  This will cause build assertion
  * failures in places which likely need to be updated. */
-#define FLOW_WC_SEQ 24
+#define FLOW_WC_SEQ 25
 
 #define FLOW_N_REGS 8
 BUILD_ASSERT_DECL(FLOW_N_REGS <= NXM_NX_MAX_REGS);
@@ -97,6 +97,11 @@ union flow_in_port {
  * be looked at.  This enables better wildcarding for datapath flows.
  */
 struct flow {
+/* Recirculation */
+uint32_t dp_hash;   /* Datapth computed hash value. The exact
+   computation is opaque to the user space.*/
+uint32_t recirc_id; /* Must be exact match. */
+
 /* L1 */
 struct flow_tnl tunnel; /* Encapsulating tunnel parameters. */
 ovs_be64 metadata;  /* OpenFlow Metadata. */
@@ -139,8 +144,8 @@ BUILD_ASSERT_DECL(sizeof(struct flow) % 4 == 0);
 
 /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
 BUILD_ASSERT_DECL(offsetof(struct flow, tp_dst) + 2
-  == sizeof(struct flow_tnl) + 164
-  && FLOW_WC_SEQ == 24);
+  == sizeof(struct flow_tnl) + 172
+  && FLOW_WC_SEQ == 25);
 
 /* Incremental points at which flow classification may be performed in
  * segments.
@@ -165,6 +170,8 @@ extern const uint8_t flow_segment_u32s[];
 
 /* Represents the metadata fields of struct flow. */
 struct flow_metadata {
+uint32_t dp_hash;/* Datapath computed hash field. */
+uint32_t recirc_id;  /* Recirculation ID. */
 ovs_be64 tun_id; /* Encapsulating tunnel ID. */
 ovs_be32 tun_src;/* Tunnel outer IPv4 src addr */
 ovs_be32 tun_dst;/* Tunnel outer IPv4 dst addr */
diff --git a/lib/match.c b/lib/match.c
index b2a25fd..12e58a2 100644
--- a/lib/match.c
+++ b/lib/match.c
@@ -895,7 +895,7 @@ match_format(const struct match *match, struct ds *s, 
unsigned int priority)
 
 int i;
 
-BUILD_ASSERT_DECL(FLOW_WC_SEQ == 24);
+BUILD_ASSERT_DECL(FLOW_WC_SEQ == 25);
 
 if (priority != OFP_DEFAULT_PRIORITY) {
 ds_put_format(s, "priority=%u,", priority);
@@ -903,6 +903,16 @@ match_format(const struct matc

Re: [ovs-dev] [RFC 0/9] Cache the modules affected by xlate_actions().

2014-03-11 Thread Ben Pfaff
On Thu, Mar 06, 2014 at 05:20:23PM -0800, Joe Stringer wrote:
> This series implements a cache for xlate_actions() so that full flow
> translation does not need to be performed for long-lived flows; instead,
> references are kept to the modules affected by it, and this cache can be used
> to attribute stats and effect other side-effects of xlation.
> 
> This patch series improves CRR performance mildly in the typical case; 
> modestly
> for cases where revalidation is constantly occurring; and makes the work for
> revalidator threads much more efficient, allowing more flows to be maintained
> in the datapath. Note that patch #8 causes revalidators to remove any flows
> that are not experiencing high throughput.
> 
> The "modules" which logic is implemented for are:
> * Stats attribution:
> + Rules
> + Netdevs
> + Bonds
> + Mirrors
> * Side effects:
> + Learn actions - hard timeout refresh
> + Normal actions - mac learning
> 
> There are minor testsuite failures in the later patches, but as a whole this 
> is
> fairly complete. I am looking for feedback on the approach, specifically
> whether I may have missed some modules, if there are any holes in the logic
> where typical operation may cause failures, and if there is perhaps areas that
> I do not go far enough. I do plan to look into shifting ukey creation into
> handler threads, but that is a later topic.

I applied patches 1 through 4 to master.

I think I'm happy with patches 5 through 7, but I want Ethan's
feedback on them.

You noted that the later patches cause testsuite failures.  We'll need
to resolve those, one way or another, before we can apply them.  Any
particular feedback you're looking for in the meantime?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3 2/5] dpif-netdev: Implement the API functions to allow multiple handler threads read upcall.

2014-03-11 Thread Alex Wang
Thanks Ben,

Great to hear more from you~

Please see my reply inline,

>
> > Signed-off-by: Alex Wang 
> >
> > ---
> > V2 -> V3:
> > - rebase.
> >
> > PATCH -> V2:
> > - explain the drop of upcall queueing priority in dpif-netdev.
> > - use mhash to calculate the 5-tuple hash.
>
> Why does dpif_netdev_recv_set() ignore its 'enable' argument?
>


I saw in current dpif-netdev.c, the dpif_netdev_recv_set() does nothing.

I didn't ask for the reason.  But now seems to be the good time, do you know
why it is not implemented?

If there is the need, I'd like follow the dpif-provider definition and
implement it.



> dp_netdev_refresh_queues() has two callers, each of which is a wrapper
> that takes the queue rwlock write-lock.  I guess that taking the lock
> could be moved into dp_netdev_refresh_queues().
>


Yes, I'll do that,



> dp_netdev_init_queue() and dp_netdev_uninit_queue() are short
> functions with only a single caller each.  I think that the code
> would be clearer if they were integrated into their callers
>

I'll do that,



> I would be more comfortable if dpif_netdev_recv() and
> dpif_netdev_recv_wait() checked that the handler_id passed in was in
> the currently valid range.
>
>
Thx, I'll do that,


> As written, flow_hash_5tuple() will incorporate ICMP type and code
> into the hash (because those are stored into tp_src and tp_dst).  Is
> that desirable?
>


I'm okay with that.  I think counting ICMP type and code in hash will not
cause
particular handler receiving unfairly more ICMP related upcalls.  (Is this
what you
concerned?)




Also, want to ask that if there is any issue with the first patch?


Thanks,
Alex Wang,
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [bond megaflow v2 0/5] Bond megaflow and recirculation

2014-03-11 Thread Andy Zhou
This is the first patch series to add megaflow support for bond interface
using recirculation.

The final goal is that the bonding port selection will be made within the
datapath, using recirculation.  Currently, the bonding port selection is
made in the user space, preventing possible wider mega flow being generated.

This patch adds the necessary infrastructure for the next patch series.
Since it is still work in progress, no features are enabled by default.

Andy Zhou (5):
  ofproto-dpif: Added Per backer recirculation ID management
  lib/flow: add dp_hash and recirc_id to struct flow
  lib/metaflow: add new flow fields for recirculation
  dpif-netdev: user space datapath recirculation
  ofproto/bond: Implement bond megaflow using recirculation

 include/linux/openvswitch.h   |   35 +++-
 include/openflow/nicira-ext.h |   26 +++
 lib/dpif-netdev.c |   53 +-
 lib/dpif.c|3 +-
 lib/flow.c|8 +-
 lib/flow.h|   13 +-
 lib/match.c   |   38 +++-
 lib/match.h   |6 +
 lib/meta-flow.c   |   60 ++
 lib/meta-flow.h   |2 +
 lib/nx-match.c|   15 +-
 lib/odp-execute.c |9 +
 lib/odp-execute.h |2 +-
 lib/odp-util.c|  106 ++-
 lib/odp-util.h|4 +-
 lib/ofp-util.c|2 +-
 lib/packets.c |   16 +-
 lib/packets.h |   16 +-
 ofproto/bond.c|  271 +--
 ofproto/bond.h|   33 +++-
 ofproto/ofproto-dpif-upcall.c |   13 +-
 ofproto/ofproto-dpif-xlate.c  |   80 ++--
 ofproto/ofproto-dpif-xlate.h  |   21 ++-
 ofproto/ofproto-dpif.c|  407 +++--
 ofproto/ofproto-dpif.h|   68 ++-
 ofproto/ofproto.c |   47 +++--
 ofproto/ofproto.h |1 +
 tests/lacp.at |   24 ++-
 tests/ofproto-dpif.at |  207 +++--
 tests/test-odp.c  |3 +-
 30 files changed, 1369 insertions(+), 220 deletions(-)

-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [RFC 7/9] ofproto-dpif: Allow caching of xlate_actions() effects.

2014-03-11 Thread Ben Pfaff
On Thu, Mar 06, 2014 at 05:20:30PM -0800, Joe Stringer wrote:
> This patch adds a new object called 'struct xlate_cache' which can be
> set in 'struct xlate_in', and passed to xlate_actions() to cache the
> modules affected by this flow translation. Subsequently, the caller can
> pass the xcache to xlate_from_cache() to credit stats and perform side
> effects for a lower cost than full flow translation.
> 
> Initial testing shows mild TCP CRR performance increase (~5%) and a
> drastic decrease in flow dump duration. This is expected to allow
> significantly more flows to be maintained in the datapath.
> 
> Signed-off-by: Joe Stringer 

I'm folding in some minor style fixes:

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index ecc545f..bd78cbc 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3366,7 +3366,8 @@ xlate_credit_stats_dev(struct xc_entry *entry, struct 
dpif_flow_stats *push)
 }
 
 struct xlate_cache *
-xlate_cache_new(void) {
+xlate_cache_new(void)
+{
 struct xlate_cache *xc = xmalloc(sizeof *xc);
 
 ofpbuf_init(&xc->entries, 512);
@@ -3374,7 +3375,8 @@ xlate_cache_new(void) {
 }
 
 static struct xc_entry *
-xlate_cache_add_entry(struct xlate_cache *xc, enum stats_type type) {
+xlate_cache_add_entry(struct xlate_cache *xc, enum stats_type type)
+{
 struct xc_entry *entry;
 
 entry = ofpbuf_put_uninit(&xc->entries, sizeof *entry);
@@ -3395,7 +3397,8 @@ xlate_cache_normal(struct ofproto_dpif *ofproto, struct 
flow *flow, int vlan)
 return;
 }
 
-xbundle = lookup_input_bundle(xbridge, flow->in_port.ofp_port, false, 
NULL);
+xbundle = lookup_input_bundle(xbridge, flow->in_port.ofp_port, false,
+  NULL);
 if (!xbundle) {
 return;
 }
@@ -3409,7 +3412,7 @@ xlate_from_cache(struct xlate_cache *xc, struct 
dpif_flow_stats *push)
 struct xc_entry *entry;
 struct ofpbuf entries = xc->entries;
 
-XC_ENTRY_FOR_EACH(entry, entries, xc) {
+XC_ENTRY_FOR_EACH (entry, entries, xc) {
 switch (entry->type) {
 case XC_RULE:
 rule_dpif_credit_stats(entry->u.rule, push);
@@ -3463,7 +3466,9 @@ xlate_dev_unref(struct xc_entry *entry)
 }
 }
 
-void xlate_cache_clear(struct xlate_cache *xc) {
+void
+xlate_cache_clear(struct xlate_cache *xc)
+{
 struct xc_entry *entry;
 struct ofpbuf entries;
 
@@ -3471,7 +3476,7 @@ void xlate_cache_clear(struct xlate_cache *xc) {
 return;
 }
 
-XC_ENTRY_FOR_EACH(entry, entries, xc) {
+XC_ENTRY_FOR_EACH (entry, entries, xc) {
 switch (entry->type) {
 case XC_RULE:
 rule_dpif_unref(entry->u.rule);
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3 2/5] dpif-netdev: Implement the API functions to allow multiple handler threads read upcall.

2014-03-11 Thread Ben Pfaff
On Fri, Mar 07, 2014 at 06:04:39PM -0800, Alex Wang wrote:
> This commit implements the API functions to allow multiple handler
> threads read upcall.
> 
> Also, this commit removes the handling priority of DPIF_UC_MISS
> over DPIF_UC_ACTION.  So, both misses will be put to the same
> queue.  The decision is based on the fact that a lot has changed
> since the age when flow setup rate is most treasured and starving
> all actions in the presence of any flow misses doesn't seem like
> a sound balancing solution.
> 
> Thusly the current implementation will be put in testing and
> investigation for better balancing solution will continue if
> there is an issue.
> 
> Signed-off-by: Alex Wang 
> 
> ---
> V2 -> V3:
> - rebase.
> 
> PATCH -> V2:
> - explain the drop of upcall queueing priority in dpif-netdev.
> - use mhash to calculate the 5-tuple hash.

Why does dpif_netdev_recv_set() ignore its 'enable' argument?

dp_netdev_refresh_queues() has two callers, each of which is a wrapper
that takes the queue rwlock write-lock.  I guess that taking the lock
could be moved into dp_netdev_refresh_queues().

dp_netdev_init_queue() and dp_netdev_uninit_queue() are short
functions with only a single caller each.  I think that the code
would be clearer if they were integrated into their callers.

I would be more comfortable if dpif_netdev_recv() and
dpif_netdev_recv_wait() checked that the handler_id passed in was in
the currently valid range.

As written, flow_hash_5tuple() will incorporate ICMP type and code
into the hash (because those are stored into tp_src and tp_dst).  Is
that desirable?

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v8 repost 2] ofproto: Honour Table Mod settings for table-miss handling

2014-03-11 Thread Simon Horman
This reworks lookup of rules for both table 0 and table action translation.
The result is that Table Mod settings, which can alter the miss-behaviour
of tables, including table 0, on a per-table basis may be honoured.

Previous patches proposed by myself which build on earlier merged patches
by Andy Zhou implement the ofproto side of Table Mod. So with this patch
the feature should be complete.

Neither this patch, nor any other patches it builds on, alter the default
behaviour of Open vSwitch. And in particular the OpenFlow1.1 behaviour is
the default regardless of which OpenFlow version is negotiated between the
switch and the controller.

An implementation detail, which lends itself to future work, is the
handling of OFPTC_TABLE_MISS_CONTINUE. If a table has this behaviour set by
Table Mod and a miss occurs then a loop is created, skipping to the next
table. It is quite easy to create a situation where this loop covers ~255
tables which is very expensive as the lookup for each table involves taking
locks, amongst other things.

Cc: Andy Zhou 
Signed-off-by: Simon Horman 

---
v8
* Rebase

v7
* As suggested by Ben Pfaff
  - Used simplified loop/case logic in rule_dpif_lookup_from_table()
+ I re-introduced next_id so that table_id is not advanced to
  ofproto->up.n_tables.
  - Consolidate logic of xlate_table_action() and rule_dpif_lookup()
into rule_dpif_lookup_from_table()
  - Do not honour table mod settings in the case of a resubmit action

v4 - v6
* Rebase

v3
* Remove bogus fall-through comment in OFPTC_TABLE_MISS_CONTROLLER case
  in rule_dpif_lookup_from_table(). This case always returns.
* Add fall-through from OFPTC_TABLE_MISS_CONTROLLER
  to OFPTC_TABLE_MISS_DROP in xlate_table_action() to handle
  situations where packet_in is not allowed.
* Re-arrange rule_dpif_lookup_from_table() and xlate_table_action()
  to only have one call to choose_miss_rule(). This seems tidier.
* Consistently call ovs-ofctl monitor without a -P argument
* Remove ofproto parameter name from declaration of table_get_config().
  This is in keeping with the coding style of other declarations in
  ofproto.h
* Only allow supported configurations in table_mod message

v2
* Use ofproto->up.n_tables instead of N_TABLES as the number
  of tables present.
* Do not use the presence of rules in a table to indicate that it is
  present. Instead treat all tables, other than TBL_INTERNAL, that
  exist in ofproto.up as present. This simplifies things slightly
  and seems to be a more logical approach.
* As pointed out by Yamamoto-san
  - Read config for each table that is missed rather
than just the first one. A logic bug that somehow I was blind to.
---
 OPENFLOW-1.1+|   5 -
 ofproto/ofproto-dpif-xlate.c |  77 ++
 ofproto/ofproto-dpif.c   | 183 --
 ofproto/ofproto-dpif.h   |  12 +-
 ofproto/ofproto.c|  14 +-
 ofproto/ofproto.h|   5 +
 tests/ofproto-dpif.at| 353 +++
 7 files changed, 600 insertions(+), 49 deletions(-)

diff --git a/OPENFLOW-1.1+ b/OPENFLOW-1.1+
index 1b8a0ee..4363d28 100644
--- a/OPENFLOW-1.1+
+++ b/OPENFLOW-1.1+
@@ -54,11 +54,6 @@ OpenFlow 1.1
 The list of remaining work items for OpenFlow 1.1 is below.  It is
 probably incomplete.
 
-* OFPT_TABLE_MOD message.  This is new in OF1.1, so we need to
-  implement it.  It should be implemented so that the default OVS
-  behavior does not change.  Simon Horman has posted a patch.
-  [required for OF1.1 and OF1.2]
-
 * MPLS.  Simon Horman maintains a patch series that adds this
   feature.  This is partially merged.
   [optional for OF1.1+]
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index ccf0b75..0b3d963 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -225,8 +225,9 @@ static void xlate_actions__(struct xlate_in *, struct 
xlate_out *)
 OVS_REQ_RDLOCK(xlate_rwlock);
 static void xlate_normal(struct xlate_ctx *);
 static void xlate_report(struct xlate_ctx *, const char *);
-static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
-   uint8_t table_id, bool may_packet_in);
+static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
+   uint8_t table_id, bool may_packet_in,
+   bool force_controller_on_miss);
 static bool input_vid_is_valid(uint16_t vid, struct xbundle *, bool warn);
 static uint16_t input_vid_to_vlan(const struct xbundle *, uint16_t vid);
 static void output_normal(struct xlate_ctx *, const struct xbundle *,
@@ -1719,14 +1720,14 @@ compose_output_action__(struct xlate_ctx *ctx, 
ofp_port_t ofp_port,
 ctx->xout->slow |= special;
 } else if (may_receive(peer, ctx)) {
 if (xport_stp_forward_state(peer)) {
-xlate_table_action(ctx, flow->in_port.ofp_port, 0, true

[ovs-dev] [PATCH 0/4] ofproto: Support OF version-specific table-miss behaviours

2014-03-11 Thread Simon Horman
OpenFlow 1.1 and 1.2 specify that if a table-miss occurs then the default
behaviour is to forward the packet the controller using a packet-in
message. And until this patch this is the default behaviour that Open
vSwitch uses for all OpenFlow versions.

OpenFlow1.3+ specifies that if a table-miss occurs then the default
behaviour is simply to drop the packet. This patch implements this
behaviour using the following logic:

If a table-miss occurs and the table-miss behaviour for the table
has not been set using a table_mod (in which case it is no longer
the default setting) then:

* Installing a facet in the datapath with a drop action in the
  if there are no pre-OF1.3 controllers connected which would receive
  an packet_in message.

  Note that this covers both the case where there are only OF1.3
  controllers and the case where there are no controllers at all.

* Otherwise sent a packet_in message to all pre-OF1.3 controllers.

  This covers both the case where there are only pre-OF1.3
  controllers and there are both pre-OF1.3 and OF1.3+ controllers.


The last patch in this series depends on
"[PATCH v8 repost 2] ofproto: Honour Table Mod settings for table-miss handling"

To aid review this series and its dependency are available in git at
https://github.com/horms/openvswitch.git devel/table_miss

Simon Horman (4):
  ofp-util: Use enum ofp_table_config in struct ofputil_table_mod
  connmgr: Remove unnecessary reason fixup logic
  ofproto-dpip: Set generated_by_table_miss for miss rule
  ofproto: Support OF version-specific table-miss behaviours

 OPENFLOW-1.1+|  7 
 lib/ofp-util.h   |  2 +-
 ofproto/connmgr.c| 92 ++--
 ofproto/connmgr.h|  1 +
 ofproto/ofproto-dpif-xlate.c |  5 ++-
 ofproto/ofproto-dpif.c   | 69 +++--
 ofproto/ofproto-dpif.h   |  3 +-
 ofproto/ofproto-provider.h   | 14 +--
 ofproto/ofproto.c|  8 ++--
 ofproto/ofproto.h| 22 ++-
 tests/ofproto-dpif.at| 42 
 tests/tunnel.at  |  2 +-
 12 files changed, 183 insertions(+), 84 deletions(-)

-- 
1.8.5.2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 4/4] ofproto: Support OF version-specific table-miss behaviours

2014-03-11 Thread Simon Horman
OpenFlow 1.1 and 1.2 specify that if a table-miss occurs then the default
behaviour is to forward the packet the controller using a packet-in
message. And until this patch this is the default behaviour that Open
vSwitch uses for all OpenFlow versions.

OpenFlow1.3+ specifies that if a table-miss occurs then the default
behaviour is simply to drop the packet. This patch implements this
behaviour using the following logic:

If a table-miss occurs and the table-miss behaviour for the table
has not been set using a table_mod (in which case it is no longer
the default setting) then:

* Installing a facet in the datapath with a drop action in the
  if there are no pre-OF1.3 controllers connected which would receive
  an packet_in message.

  Note that this covers both the case where there are only OF1.3
  controllers and the case where there are no controllers at all.

* Otherwise sent a packet_in message to all pre-OF1.3 controllers.

  This covers both the case where there are only pre-OF1.3
  controllers and there are both pre-OF1.3 and OF1.3+ controllers.

Signed-off-by: Simon Horman 
---
 OPENFLOW-1.1+  |  7 --
 ofproto/connmgr.c  | 62 +-
 ofproto/connmgr.h  |  1 +
 ofproto/ofproto-dpif.c | 57 --
 ofproto/ofproto-dpif.h |  1 +
 ofproto/ofproto-provider.h |  2 +-
 ofproto/ofproto.c  |  8 +++---
 ofproto/ofproto.h  | 22 ++--
 tests/ofproto-dpif.at  | 42 +--
 tests/tunnel.at|  2 +-
 10 files changed, 163 insertions(+), 41 deletions(-)

diff --git a/OPENFLOW-1.1+ b/OPENFLOW-1.1+
index 4363d28..dd505c9 100644
--- a/OPENFLOW-1.1+
+++ b/OPENFLOW-1.1+
@@ -89,13 +89,6 @@ didn't compare the specs carefully yet.)
 * Add OFPMP_TABLE_FEATURES statistics.  Alexander Wu has posted a
   patch series.  [optional for OF1.3+]
 
-* More flexible table miss support.
-  This requires the following.
-  - Change the default table-miss action (in the absense of table-miss
-entry) from packet_in to drop for OF1.3+.  Decide what to do if
-a switch is configured to support multiple OF versions.
-  [required for OF1.3+]
-
 * IPv6 extension header handling support.  Fully implementing this
   requires kernel support.  This likely will take some careful and
   probably time-consuming design work.  The actual coding, once
diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
index 954a0f9..f4fc85d 100644
--- a/ofproto/connmgr.c
+++ b/ofproto/connmgr.c
@@ -1410,6 +1410,65 @@ ofconn_receives_async_msg(const struct ofconn *ofconn,
 return true;
 }
 
+/* The default "table-miss" behaviour for OpenFlow1.3+ is to drop the
+ * packet rather than to send the packet to the controller.
+ *
+ * This function returns false to indicate the packet should be dropped if
+ * the controller action was the result of the default table-miss behaviour
+ * and the controller is using OpenFlow1.3+.
+ *
+ * Otherwise true is returned to indicate the packet should be forwarded to
+ * the controller */
+static bool
+ofconn_wants_packet_in_on_miss(struct ofconn *ofconn,
+   const struct ofproto_packet_in *pin)
+{
+if (pin->generated_by_table_miss) {
+enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
+
+if (protocol != OFPUTIL_P_NONE
+&& ofputil_protocol_to_ofp_version(protocol) >= OFP13_VERSION) {
+enum ofproto_table_config config;
+
+config = ofproto_table_get_config(ofconn->connmgr->ofproto,
+  pin->up.table_id);
+if (config == OFPROTO_TABLE_MISS_DEFAULT) {
+return false;
+}
+}
+}
+return true;
+}
+
+/* The default "table-miss" behaviour for OpenFlow1.3+ is to drop the
+ * packet rather than to send the packet to the controller.
+ *
+ * This function returns false to indicate that a packet_in message
+ * for a "table-miss" should be sent to at least one controller.
+ * That is there is at least one controller with controller_id 0
+ * which connected using an OpenFlow version earlier than OpenFlow1.3.
+ *
+ * False otherwise.
+ *
+ * This logic assumes that "table-miss" packet_in messages
+ * are always sent to controller_id 0. */
+bool
+connmgr_wants_packet_in_on_miss(struct connmgr *mgr)
+{
+struct ofconn *ofconn;
+
+LIST_FOR_EACH (ofconn, node, &mgr->all_conns) {
+enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
+
+if (ofconn->controller_id == 0 &&
+(protocol == OFPUTIL_P_NONE ||
+ ofputil_protocol_to_ofp_version(protocol) < OFP13_VERSION)) {
+return true;
+}
+}
+return false;
+}
+
 /* Returns a human-readable name for an OpenFlow connection between 'mgr' and
  * 'target', suitable for use in log messages for identifying the connection.
 

[ovs-dev] [PATCH 3/4] ofproto-dpip: Set generated_by_table_miss for miss rule

2014-03-11 Thread Simon Horman
An assumption for this change is assuming that miss-rules are added by
add_internal_flow().

If so it seems to me that the current rule_is_table_miss() does not match
such rules. This is because add_internal_flows() adds rules with the mask
of register 0 set to all 1s due to the following line in
add_internal_flow().

match_set_reg(&fm.match, 0, id);

This patch proposes an alternate approach which is to provide
a function to determine if a rule is a miss rule. And infer
that any rule that results in execute_controller_action() being
called has a controller action present and if that rule is internal
then it is the miss rule. This allows generated_by_table_miss for miss
rules.

Signed-off-by: Simon Horman 
---
 ofproto/ofproto-dpif-xlate.c |  5 -
 ofproto/ofproto-dpif.c   | 12 ++--
 ofproto/ofproto-dpif.h   |  2 +-
 ofproto/ofproto-provider.h   | 12 +---
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 7f78460..52f55ad 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2140,8 +2140,11 @@ execute_controller_action(struct xlate_ctx *ctx, int len,
 
 pin->controller_id = controller_id;
 pin->send_len = len;
+/* If a rule is internal and has a controller action,
+ * which is implied by the rule being processed here,
+ * then it is the rule to handle a table miss. */
 pin->generated_by_table_miss = (ctx->rule
-&& rule_dpif_is_table_miss(ctx->rule));
+&& rule_dpif_is_internal(ctx->rule));
 ofproto_dpif_send_packet_in(ctx->xbridge->ofproto, pin);
 ofpbuf_delete(packet);
 }
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 61521ed..433c0c0 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3023,9 +3023,9 @@ rule_dpif_is_fail_open(const struct rule_dpif *rule)
 }
 
 bool
-rule_dpif_is_table_miss(const struct rule_dpif *rule)
+rule_dpif_is_internal(const struct rule_dpif *rule)
 {
-return rule_is_table_miss(&rule->up);
+return rule_is_internal(&rule->up);
 }
 
 ovs_be64
@@ -4408,6 +4408,14 @@ ofproto_dpif_unixctl_init(void)
 unixctl_command_register("dpif/dump-flows", "[-m] bridge", 1, 2,
  ofproto_unixctl_dpif_dump_flows, NULL);
 }
+
+
+/* Returns true if 'rule' is an internal rule, false otherwise. */
+bool
+rule_is_internal(const struct rule *rule)
+{
+return rule->table_id == TBL_INTERNAL;
+}
 
 /* Linux VLAN device support (e.g. "eth0.10" for VLAN 10.)
  *
diff --git a/ofproto/ofproto-dpif.h b/ofproto/ofproto-dpif.h
index f526104..df8d79e 100644
--- a/ofproto/ofproto-dpif.h
+++ b/ofproto/ofproto-dpif.h
@@ -89,7 +89,7 @@ void rule_dpif_credit_stats(struct rule_dpif *rule ,
 const struct dpif_flow_stats *);
 
 bool rule_dpif_is_fail_open(const struct rule_dpif *);
-bool rule_dpif_is_table_miss(const struct rule_dpif *);
+bool rule_dpif_is_internal(const struct rule_dpif *);
 
 struct rule_actions *rule_dpif_get_actions(const struct rule_dpif *);
 
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index d116451..e8ed2e9 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -408,17 +408,7 @@ struct rule_actions *rule_get_actions(const struct rule 
*rule)
 struct rule_actions *rule_get_actions__(const struct rule *rule)
 OVS_REQUIRES(rule->mutex);
 
-/* Returns true if 'rule' is an OpenFlow 1.3 "table-miss" rule, false
- * otherwise.
- *
- * ("Table-miss" rules are special because a packet_in generated through one
- * uses OFPR_NO_MATCH as its reason, whereas packet_ins generated by any other
- * rule use OFPR_ACTION.) */
-static inline bool
-rule_is_table_miss(const struct rule *rule)
-{
-return rule->cr.priority == 0 && cls_rule_is_catchall(&rule->cr);
-}
+bool rule_is_internal(const struct rule *);
 
 /* A set of actions within a "struct rule".
  *
-- 
1.8.5.2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/4] connmgr: Remove unnecessary reason fixup logic

2014-03-11 Thread Simon Horman
A packet_in message may be sent for one of two reasons.

1. As the result of an controller action supplied in a rule.
   This is executed if a packet matches the match for the rule.
   The packet_in reason is explicitly part of the action and
   is thus correct.

2. As the result of the failure to match a packet against any rule.
   In this case a rule present in TBL_INTERNAL is used. This rule
   has the packet_in reason set to OFPR_NO_MATCH.

   This is the behaviour described in OpenFlow 1.1 and 1.2 when
   a miss occurs. And to date it is the behaviour provided
   by Open vSwtich for all OpenFlow versions.

   When this behaviour is in effect OFPR_NO_MATCH is the desired
   packet_in reason and by virtue of the TBL_INTERNAL rule described
   above is always that value.

So for both cases the packet_in reason is always correct.
And following this reasoning there seems to be no need to fix it up.

What there is a need for, which this patch does not address, is for packets
to be dropped rather than forwarded to OpenFlow1.3 controllers when a miss
occurs and an alternate behaviour has not been selected using Table Mod. I
plan to address this in subsequent patches.

As well as not been necessary, I believe that the logic that this patch
removes has not been executed because pin->generated_by_table_miss is never
true without a separate patch that I have proposed, "ofproto-dpip: Set
generated_by_table_miss for miss rule".  So I do not expect there to be any
run-time affect of this change.

This patch does not remove pin->generated_by_table_miss, although it
is now set but never used, as I plan to use when implementing the
OpenFlow1.3 drop-on-miss behaviour described above.

Cc: YAMAMOTO Takashi 
Signed-off-by: Simon Horman 
---
 ofproto/connmgr.c | 32 
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
index 033ab7d..954a0f9 100644
--- a/ofproto/connmgr.c
+++ b/ofproto/connmgr.c
@@ -1452,8 +1452,7 @@ ofconn_send(const struct ofconn *ofconn, struct ofpbuf 
*msg,
 
 /* Sending asynchronous messages. */
 
-static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in,
-   enum ofp_packet_in_reason wire_reason);
+static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in);
 
 /* Sends an OFPT_PORT_STATUS message with 'opp' and 'reason' to appropriate
  * controllers managed by 'mgr'.  For messages caused by a controller
@@ -1526,25 +1525,6 @@ connmgr_send_flow_removed(struct connmgr *mgr,
 }
 }
 
-/* Normally a send-to-controller action uses reason OFPR_ACTION.  However, in
- * OpenFlow 1.3 and later, packet_ins generated by a send-to-controller action
- * in a "table-miss" flow (one with priority 0 and completely wildcarded) are
- * sent as OFPR_NO_MATCH.  This function returns the reason that should
- * actually be sent on 'ofconn' for 'pin'. */
-static enum ofp_packet_in_reason
-wire_reason(struct ofconn *ofconn, const struct ofproto_packet_in *pin)
-{
-if (pin->generated_by_table_miss && pin->up.reason == OFPR_ACTION) {
-enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
-
-if (protocol != OFPUTIL_P_NONE
-&& ofputil_protocol_to_ofp_version(protocol) >= OFP13_VERSION) {
-return OFPR_NO_MATCH;
-}
-}
-return pin->up.reason;
-}
-
 /* Given 'pin', sends an OFPT_PACKET_IN message to each OpenFlow controller as
  * necessary according to their individual configurations.
  *
@@ -1556,11 +1536,9 @@ connmgr_send_packet_in(struct connmgr *mgr,
 struct ofconn *ofconn;
 
 LIST_FOR_EACH (ofconn, node, &mgr->all_conns) {
-enum ofp_packet_in_reason reason = wire_reason(ofconn, pin);
-
-if (ofconn_receives_async_msg(ofconn, OAM_PACKET_IN, reason)
+if (ofconn_receives_async_msg(ofconn, OAM_PACKET_IN, pin->up.reason)
 && ofconn->controller_id == pin->controller_id) {
-schedule_packet_in(ofconn, *pin, reason);
+schedule_packet_in(ofconn, *pin);
 }
 }
 }
@@ -1586,8 +1564,7 @@ do_send_packet_ins(struct ofconn *ofconn, struct list 
*txq)
 /* Takes 'pin', composes an OpenFlow packet-in message from it, and passes it
  * to 'ofconn''s packet scheduler for sending. */
 static void
-schedule_packet_in(struct ofconn *ofconn, struct ofproto_packet_in pin,
-   enum ofp_packet_in_reason wire_reason)
+schedule_packet_in(struct ofconn *ofconn, struct ofproto_packet_in pin)
 {
 struct connmgr *mgr = ofconn->connmgr;
 uint16_t controller_max_len;
@@ -1595,7 +1572,6 @@ schedule_packet_in(struct ofconn *ofconn, struct 
ofproto_packet_in pin,
 
 pin.up.total_len = pin.up.packet_len;
 
-pin.up.reason = wire_reason;
 if (pin.up.reason == OFPR_ACTION) {
 controller_max_len = pin.send_len;  /* max_len */
 } else {
-- 
1.8.5.2

___
dev mailing list
dev@openvswitch.org
http:/

[ovs-dev] [PATCH 1/4] ofp-util: Use enum ofp_table_config in struct ofputil_table_mod

2014-03-11 Thread Simon Horman
Use enum ofp_table_config as the type of the 'config' field
of struct ofputil_table_mod. This reflects the usage
of the field.

Signed-off-by: Simon Horman 
---
 lib/ofp-util.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/ofp-util.h b/lib/ofp-util.h
index bf02b9f..7483646 100644
--- a/lib/ofp-util.h
+++ b/lib/ofp-util.h
@@ -591,7 +591,7 @@ struct ofpbuf *ofputil_encode_port_mod(const struct 
ofputil_port_mod *,
 /* Abstract ofp_table_mod. */
 struct ofputil_table_mod {
 uint8_t table_id; /* ID of the table, 0xff indicates all tables. */
-uint32_t config;
+enum ofp_table_config config;
 };
 
 enum ofperr ofputil_decode_table_mod(const struct ofp_header *,
-- 
1.8.5.2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/4] connmgr: Remove unnecessary reason fixup logic

2014-03-11 Thread YAMAMOTO Takashi
> A packet_in message may be sent for one of two reasons.
> 
> 1. As the result of an controller action supplied in a rule.
>This is executed if a packet matches the match for the rule.
>The packet_in reason is explicitly part of the action and
>is thus correct.

for OFPACT_CONTROLLER (NX thing) it's correct.
for OFPACT_OUTPUT (OF standard way), it isn't.

for the latter, reason=OFPR_ACTION is hardcoded in xlate_output_action.
OF1.3 (1.3.3 5.4.) states reason=table-miss for table-miss flow entry
and thus needs the fixup logic in question.

YAMAMOTO Takashi

> 
> 2. As the result of the failure to match a packet against any rule.
>In this case a rule present in TBL_INTERNAL is used. This rule
>has the packet_in reason set to OFPR_NO_MATCH.
> 
>This is the behaviour described in OpenFlow 1.1 and 1.2 when
>a miss occurs. And to date it is the behaviour provided
>by Open vSwtich for all OpenFlow versions.
> 
>When this behaviour is in effect OFPR_NO_MATCH is the desired
>packet_in reason and by virtue of the TBL_INTERNAL rule described
>above is always that value.
> 
> So for both cases the packet_in reason is always correct.
> And following this reasoning there seems to be no need to fix it up.
> 
> What there is a need for, which this patch does not address, is for packets
> to be dropped rather than forwarded to OpenFlow1.3 controllers when a miss
> occurs and an alternate behaviour has not been selected using Table Mod. I
> plan to address this in subsequent patches.
> 
> As well as not been necessary, I believe that the logic that this patch
> removes has not been executed because pin->generated_by_table_miss is never
> true without a separate patch that I have proposed, "ofproto-dpip: Set
> generated_by_table_miss for miss rule".  So I do not expect there to be any
> run-time affect of this change.
> 
> This patch does not remove pin->generated_by_table_miss, although it
> is now set but never used, as I plan to use when implementing the
> OpenFlow1.3 drop-on-miss behaviour described above.
> 
> Cc: YAMAMOTO Takashi 
> Signed-off-by: Simon Horman 
> ---
>  ofproto/connmgr.c | 32 
>  1 file changed, 4 insertions(+), 28 deletions(-)
> 
> diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> index 033ab7d..954a0f9 100644
> --- a/ofproto/connmgr.c
> +++ b/ofproto/connmgr.c
> @@ -1452,8 +1452,7 @@ ofconn_send(const struct ofconn *ofconn, struct ofpbuf 
> *msg,
>  ^L
>  /* Sending asynchronous messages. */
>  
> -static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in,
> -   enum ofp_packet_in_reason wire_reason);
> +static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in);
>  
>  /* Sends an OFPT_PORT_STATUS message with 'opp' and 'reason' to appropriate
>   * controllers managed by 'mgr'.  For messages caused by a controller
> @@ -1526,25 +1525,6 @@ connmgr_send_flow_removed(struct connmgr *mgr,
>  }
>  }
>  
> -/* Normally a send-to-controller action uses reason OFPR_ACTION.  However, in
> - * OpenFlow 1.3 and later, packet_ins generated by a send-to-controller 
> action
> - * in a "table-miss" flow (one with priority 0 and completely wildcarded) are
> - * sent as OFPR_NO_MATCH.  This function returns the reason that should
> - * actually be sent on 'ofconn' for 'pin'. */
> -static enum ofp_packet_in_reason
> -wire_reason(struct ofconn *ofconn, const struct ofproto_packet_in *pin)
> -{
> -if (pin->generated_by_table_miss && pin->up.reason == OFPR_ACTION) {
> -enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
> -
> -if (protocol != OFPUTIL_P_NONE
> -&& ofputil_protocol_to_ofp_version(protocol) >= OFP13_VERSION) {
> -return OFPR_NO_MATCH;
> -}
> -}
> -return pin->up.reason;
> -}
> -
>  /* Given 'pin', sends an OFPT_PACKET_IN message to each OpenFlow controller 
> as
>   * necessary according to their individual configurations.
>   *
> @@ -1556,11 +1536,9 @@ connmgr_send_packet_in(struct connmgr *mgr,
>  struct ofconn *ofconn;
>  
>  LIST_FOR_EACH (ofconn, node, &mgr->all_conns) {
> -enum ofp_packet_in_reason reason = wire_reason(ofconn, pin);
> -
> -if (ofconn_receives_async_msg(ofconn, OAM_PACKET_IN, reason)
> +if (ofconn_receives_async_msg(ofconn, OAM_PACKET_IN, pin->up.reason)
>  && ofconn->controller_id == pin->controller_id) {
> -schedule_packet_in(ofconn, *pin, reason);
> +schedule_packet_in(ofconn, *pin);
>  }
>  }
>  }
> @@ -1586,8 +1564,7 @@ do_send_packet_ins(struct ofconn *ofconn, struct list 
> *txq)
>  /* Takes 'pin', composes an OpenFlow packet-in message from it, and passes it
>   * to 'ofconn''s packet scheduler for sending. */
>  static void
> -schedule_packet_in(struct ofconn *ofconn, struct ofproto_packet_in pin,
> -   enum ofp_packet_in_reason wire_reason)
> +schedule_packet_in(struct o

Re: [ovs-dev] [PATCH] ovs-thread: count the number of cpu cores.

2014-03-11 Thread Gurucharan Shetty
>
> does this need to be wrapped by ovsthread_once?

I think so, looking at all the places count_cpu_cores() is called
from. I am probably missing what you are pointing at.

>
> YAMAMOTO Takashi
>
>> +#endif
>>  ovsthread_once_done(&once);
>>  }
>>
>> --
>> 1.7.9.5
>>
>> ___
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovs-thread: count the number of cpu cores.

2014-03-11 Thread YAMAMOTO Takashi
>> does this need to be wrapped by ovsthread_once?
> 
> I think so, looking at all the places count_cpu_cores() is called
> from. I am probably missing what you are pointing at.

it seems that i misremembered what GetSystemInfo was.
sorry for noise.

YAMAMOTO Takashi

> 
>>
>> YAMAMOTO Takashi
>>
>>> +#endif
>>>  ovsthread_once_done(&once);
>>>  }
>>>
>>> --
>>> 1.7.9.5
>>>
>>> ___
>>> dev mailing list
>>> dev@openvswitch.org
>>> http://openvswitch.org/mailman/listinfo/dev
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v8 repost 2] ofproto: Honour Table Mod settings for table-miss handling

2014-03-11 Thread Ben Pfaff
On Wed, Mar 12, 2014 at 11:14:52AM +0900, Simon Horman wrote:
> This reworks lookup of rules for both table 0 and table action translation.
> The result is that Table Mod settings, which can alter the miss-behaviour
> of tables, including table 0, on a per-table basis may be honoured.
> 
> Previous patches proposed by myself which build on earlier merged patches
> by Andy Zhou implement the ofproto side of Table Mod. So with this patch
> the feature should be complete.
> 
> Neither this patch, nor any other patches it builds on, alter the default
> behaviour of Open vSwitch. And in particular the OpenFlow1.1 behaviour is
> the default regardless of which OpenFlow version is negotiated between the
> switch and the controller.
> 
> An implementation detail, which lends itself to future work, is the
> handling of OFPTC_TABLE_MISS_CONTINUE. If a table has this behaviour set by
> Table Mod and a miss occurs then a loop is created, skipping to the next
> table. It is quite easy to create a situation where this loop covers ~255
> tables which is very expensive as the lookup for each table involves taking
> locks, amongst other things.
> 
> Cc: Andy Zhou 
> Signed-off-by: Simon Horman 

I think I made a wrong suggestion, or at least we did not understand it
the same way.  I remember saying a version or two back that I didn't
like the code duplication and to please reduce it if possible.  But
adding a new struct and a new callback function made it worse, in my
opinion.  Let's go back to the last version that didn't add callbacks,
and then I'll take another stab at that one.

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/4] ofp-util: Use enum ofp_table_config in struct ofputil_table_mod

2014-03-11 Thread Ben Pfaff
On Wed, Mar 12, 2014 at 11:22:37AM +0900, Simon Horman wrote:
> Use enum ofp_table_config as the type of the 'config' field
> of struct ofputil_table_mod. This reflects the usage
> of the field.
> 
> Signed-off-by: Simon Horman 

Applied.

I hope that you will heed Yamamoto-san's comments and resubmit the rest
of the series.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] lib/classifier: Use a prefix tree to optimize ports wildcarding.

2014-03-11 Thread Ben Pfaff
On Mon, Mar 10, 2014 at 06:49:55PM -0700, Ethan Jackson wrote:
> Yep this is the same thing.  I've successfully shown that this
> algorithm is the best for our use case and was intending to implement
> it myself.  However, for many rather complicated reasons it became
> important to have a prototype implementation quickly, so Jarno
> graciously offered to pound out the implementation.
> 
> On the issue of the algorithm choice.  We tried a bunch of things, but
> when we simulated it against some real firewall rule sets, we found
> that the simplest is the best: a decision tree.

Can we have some kind of quantitative explanation of the benefits in the
commit message?  I know from our face-to-face discussions that you have
measured how this reduces the number of megaflows required to cover the
whole port space in various circumstances.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3 2/5] dpif-netdev: Implement the API functions to allow multiple handler threads read upcall.

2014-03-11 Thread Ben Pfaff
On Tue, Mar 11, 2014 at 06:11:27PM -0700, Alex Wang wrote:
> > > Signed-off-by: Alex Wang 
> > >
> > > ---
> > > V2 -> V3:
> > > - rebase.
> > >
> > > PATCH -> V2:
> > > - explain the drop of upcall queueing priority in dpif-netdev.
> > > - use mhash to calculate the 5-tuple hash.
> >
> > Why does dpif_netdev_recv_set() ignore its 'enable' argument?
> 
> I saw in current dpif-netdev.c, the dpif_netdev_recv_set() does nothing.
> 
> I didn't ask for the reason.  But now seems to be the good time, do you know
> why it is not implemented?
> 
> If there is the need, I'd like follow the dpif-provider definition and
> implement it.

I don't know why it does nothing.  I see that it has been that way for
years, so obviously it is not particularly harmful.  But I would prefer
to follow the definition, if it does not require too much work to do so.

> > As written, flow_hash_5tuple() will incorporate ICMP type and code
> > into the hash (because those are stored into tp_src and tp_dst).  Is
> > that desirable?
> >
> 
> 
> I'm okay with that.  I think counting ICMP type and code in hash will not
> cause
> particular handler receiving unfairly more ICMP related upcalls.  (Is this
> what you
> concerned?)

I am not concerned about fairness in this case.  I am wondering whether
one should ensure ordered delivery of ICMP with different type and
code.  My guess is that it does not matter, so let's leave this as-is
unless anyone speaks up.

> Also, want to ask that if there is any issue with the first patch?

I don't recall what I thought about it, I'll look at it again now.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/4] connmgr: Remove unnecessary reason fixup logic

2014-03-11 Thread Simon Horman
On Wed, Mar 12, 2014 at 12:00:12PM +0900, YAMAMOTO Takashi wrote:
> > A packet_in message may be sent for one of two reasons.
> > 
> > 1. As the result of an controller action supplied in a rule.
> >This is executed if a packet matches the match for the rule.
> >The packet_in reason is explicitly part of the action and
> >is thus correct.
> 
> for OFPACT_CONTROLLER (NX thing) it's correct.
> for OFPACT_OUTPUT (OF standard way), it isn't.
> 
> for the latter, reason=OFPR_ACTION is hardcoded in xlate_output_action.
> OF1.3 (1.3.3 5.4.) states reason=table-miss for table-miss flow entry
> and thus needs the fixup logic in question.

Thanks, I think understand now.

But I wonder if there are some problems that cause the fixup never to be
invoked.

i. For the reasons described in patch 3 of this series
   rule_dpif_is_table_miss() always returns false and thus
   pin->generated_by_table_miss is always false.

   This means that the if condition in wire_reason() is never true.

ii. It seems to me that the fixup should be invoked if
pin->generated_by_table_miss is false as what you describe
immediately above is that the fixup logic is needed
explicit output actions rather than for table misses.

And I believe there are further problems which we reach (like a bonus level
of an arcade game) once the above issues are addressed:

Bonus i. The reason is forced to no_match even for
 OFPACT_CONTROLLER with reason set to action.

 I wonder if this could be resolved by adding a from_output_action
 field to struct ofproto_packet_in and setting it using a parameter
 passed to execute_controller_action().

Bonus ii. parse_controller() emits an output action rather than a
  controller action if the reason is action and the controller_id
  is 0. This seems to assume that an output action always
  has action as its reason. But as we know this is not
  the case for OpenFlow 1.3+

  I propose simplifying parse_controller() to always emit
  a controller action.

My proposal is as follows:

* Drop this patch
* Address the issues described above
* Add a test to exercise the fixup (I have one locally)
* Rebase patch 4 of this series


Does that sound reasonable to you or am I still confused about
how this is fixup?

> 
> YAMAMOTO Takashi
> 
> > 
> > 2. As the result of the failure to match a packet against any rule.
> >In this case a rule present in TBL_INTERNAL is used. This rule
> >has the packet_in reason set to OFPR_NO_MATCH.
> > 
> >This is the behaviour described in OpenFlow 1.1 and 1.2 when
> >a miss occurs. And to date it is the behaviour provided
> >by Open vSwtich for all OpenFlow versions.
> > 
> >When this behaviour is in effect OFPR_NO_MATCH is the desired
> >packet_in reason and by virtue of the TBL_INTERNAL rule described
> >above is always that value.
> > 
> > So for both cases the packet_in reason is always correct.
> > And following this reasoning there seems to be no need to fix it up.
> > 
> > What there is a need for, which this patch does not address, is for packets
> > to be dropped rather than forwarded to OpenFlow1.3 controllers when a miss
> > occurs and an alternate behaviour has not been selected using Table Mod. I
> > plan to address this in subsequent patches.
> > 
> > As well as not been necessary, I believe that the logic that this patch
> > removes has not been executed because pin->generated_by_table_miss is never
> > true without a separate patch that I have proposed, "ofproto-dpip: Set
> > generated_by_table_miss for miss rule".  So I do not expect there to be any
> > run-time affect of this change.
> > 
> > This patch does not remove pin->generated_by_table_miss, although it
> > is now set but never used, as I plan to use when implementing the
> > OpenFlow1.3 drop-on-miss behaviour described above.
> > 
> > Cc: YAMAMOTO Takashi 
> > Signed-off-by: Simon Horman 
> > ---
> >  ofproto/connmgr.c | 32 
> >  1 file changed, 4 insertions(+), 28 deletions(-)
> > 
> > diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> > index 033ab7d..954a0f9 100644
> > --- a/ofproto/connmgr.c
> > +++ b/ofproto/connmgr.c
> > @@ -1452,8 +1452,7 @@ ofconn_send(const struct ofconn *ofconn, struct 
> > ofpbuf *msg,
> >  ^L
> >  /* Sending asynchronous messages. */
> >  
> > -static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in,
> > -   enum ofp_packet_in_reason wire_reason);
> > +static void schedule_packet_in(struct ofconn *, struct ofproto_packet_in);
> >  
> >  /* Sends an OFPT_PORT_STATUS message with 'opp' and 'reason' to appropriate
> >   * controllers managed by 'mgr'.  For messages caused by a controller
> > @@ -1526,25 +1525,6 @@ connmgr_send_flow_removed(struct connmgr *mgr,
> >  }
> >  }
> >  
> > -/* Normally a send-to-controller action uses reason OFPR_ACTION.  However, 
> > in
> > - * OpenFlow 1

Re: [ovs-dev] [PATCH V3 1/5] dpif: Change dpif API to allow multiple handler threads read upcall.

2014-03-11 Thread Ben Pfaff
On Fri, Mar 07, 2014 at 06:04:38PM -0800, Alex Wang wrote:
> This commit changes the API in 'dpif-provider.h' to allow multiple
> handler threads call dpif_recv() simultaneously.
> 
> Signed-off-by: Alex Wang 
> 
> ---
> V2 -> V3:
> - detach the channel refreshing from recv_set().
> 
> Note:
>   the recv_set() still calls the refresh_channels(), the first time receiving
>   packet is enabled.  This is in that, when opening the dpif, the
>   open_dpif_backer() calls check_variable_length_userdata() and
>   check_max_mpls_depth() before setting the handler/revalidator thread.  Both
>   of these function requires channels established in dpif.
> 
>   If we move the udpif_set_threads() before the two check functions, both the
>   revalidator and the function iteslf will try deleting the flow installed.
>   This will cause the warning of "failed to flow_del" and unittest failures.
> 
> PATCH -> V2:
> - fold in Ben's clarification.
> - refine the comments.
> - invoke dpif_handlers_set() in udpif_set_threads().  this is a bug.
>   the previous code will cause the handlers polling from closed
>   fd.

I'm still trying to figure out whether I properly understand the new
API.  Here's a suggested replacement for the first paragraph of the
comment on 'port_get_pid'.  Is it correct?

/* Returns the Netlink PID value to supply in OVS_ACTION_ATTR_USERSPACE
 * actions as the OVS_USERSPACE_ATTR_PID attribute's value, for use in
 * flows whose packets arrived on port 'port_no'.  In the case where the
 * provider allocates multiple Netlink PIDs to a single port, it may use
 * 'hash' to spread load among them.  The caller need not use a particular
 * hash function, because it is not generally necessary to avoid reordering
 * between packets received via flow misses (which are spread among PIDs by
 * the datapath internally) and those received via userspace actions (which
 * are spread via the return value of this function).  A 5-tuple hash is
 * suitable.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v8 repost 2] ofproto: Honour Table Mod settings for table-miss handling

2014-03-11 Thread Simon Horman
On Tue, Mar 11, 2014 at 09:08:20PM -0700, Ben Pfaff wrote:
> On Wed, Mar 12, 2014 at 11:14:52AM +0900, Simon Horman wrote:
> > This reworks lookup of rules for both table 0 and table action translation.
> > The result is that Table Mod settings, which can alter the miss-behaviour
> > of tables, including table 0, on a per-table basis may be honoured.
> > 
> > Previous patches proposed by myself which build on earlier merged patches
> > by Andy Zhou implement the ofproto side of Table Mod. So with this patch
> > the feature should be complete.
> > 
> > Neither this patch, nor any other patches it builds on, alter the default
> > behaviour of Open vSwitch. And in particular the OpenFlow1.1 behaviour is
> > the default regardless of which OpenFlow version is negotiated between the
> > switch and the controller.
> > 
> > An implementation detail, which lends itself to future work, is the
> > handling of OFPTC_TABLE_MISS_CONTINUE. If a table has this behaviour set by
> > Table Mod and a miss occurs then a loop is created, skipping to the next
> > table. It is quite easy to create a situation where this loop covers ~255
> > tables which is very expensive as the lookup for each table involves taking
> > locks, amongst other things.
> > 
> > Cc: Andy Zhou 
> > Signed-off-by: Simon Horman 
> 
> I think I made a wrong suggestion, or at least we did not understand it
> the same way.  I remember saying a version or two back that I didn't
> like the code duplication and to please reduce it if possible.  But
> adding a new struct and a new callback function made it worse, in my
> opinion.  Let's go back to the last version that didn't add callbacks,
> and then I'll take another stab at that one.

Sure, I'll prepare v9 accordingly.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ofproto: Update rule's priority in eviction group.

2014-03-11 Thread Ben Pfaff
On Sun, Mar 09, 2014 at 05:48:04PM +0800, kmindg wrote:
> We do call heap_rebuild in ofproto_run, but we do not update rule's
> priority with latest hard_timeout and idle_timeout before heap_rebuild.
> 
> This patch ensures that rule's priority has been updated before
> heap_rebuild, and adds two test cases to check eviction with modified
> hard_timeout and idle_timwout.
> 
> Signed-off-by: kmindg 

Good catch.  I applied this.

I left the declaration of heap_raw_change() in heap.h.  I like to be
able to see a summary of all the available functions at the top of the
header, even if some of the functions have inline implementations later
in the header.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] stp: Fix bpdu tx problem in listening state

2014-03-11 Thread Ben Pfaff
On Sun, Mar 09, 2014 at 05:48:52PM +0800, kmindg wrote:
> The restriction only allows to send bpdu in forwarding state in
> compose_output_action__. But a port could send bpdu in listening
> and learning state according to comments in lib/stp.h(State of
> an STP port).
> 
> Signed-off-by: kmindg 

This appears to fix a bug, but I'm not sure exactly how bad that bug is,
or what the symptoms of it are.  Does this patch mean that currently OVS
isn't sending out BPDUs in certain STP states?  (In other words, that
STP is currently broken?)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v9] ofproto: Honour Table Mod settings for table-miss handling

2014-03-11 Thread Simon Horman
This reworks lookup of rules for both table 0 and table action translation.
The result is that Table Mod settings, which can alter the miss-behaviour
of tables, including table 0, on a per-table basis may be honoured.

Previous patches proposed by myself which build on earlier merged patches
by Andy Zhou implement the ofproto side of Table Mod. So with this patch
the feature should be complete.

Neither this patch, nor any other patches it builds on, alter the default
behaviour of Open vSwitch. And in particular the OpenFlow1.1 behaviour is
the default regardless of which OpenFlow version is negotiated between the
switch and the controller.

An implementation detail, which lends itself to future work, is the
handling of OFPTC_TABLE_MISS_CONTINUE. If a table has this behaviour set by
Table Mod and a miss occurs then a loop is created, skipping to the next
table. It is quite easy to create a situation where this loop covers ~255
tables which is very expensive as the lookup for each table involves taking
locks, amongst other things.

Cc: Andy Zhou 
Signed-off-by: Simon Horman 

---
v9
* As suggested by Ben Pfaff
  - Reverse consolidation of xlate_table_action() and rule_dpif_lookup()
into rule_dpif_lookup_from_table() that was introduced in v7.
+ There may be a cleaner approach but this was not it.

v8
* Rebase

v7
* As suggested by Ben Pfaff
  - Used simplified loop/case logic in rule_dpif_lookup_from_table()
+ I re-introduced next_id so that table_id is not advanced to
  ofproto->up.n_tables.
  - Consolidate logic of xlate_table_action() and rule_dpif_lookup()
into rule_dpif_lookup_from_table()
  - Do not honour table mod settings in the case of a resubmit action

v4 - v6
* Rebase

v3
* Remove bogus fall-through comment in OFPTC_TABLE_MISS_CONTROLLER case
  in rule_dpif_lookup_from_table(). This case always returns.
* Add fall-through from OFPTC_TABLE_MISS_CONTROLLER
  to OFPTC_TABLE_MISS_DROP in xlate_table_action() to handle
  situations where packet_in is not allowed.
* Re-arrange rule_dpif_lookup_from_table() and xlate_table_action()
  to only have one call to choose_miss_rule(). This seems tidier.
* Consistently call ovs-ofctl monitor without a -P argument
* Remove ofproto parameter name from declaration of table_get_config().
  This is in keeping with the coding style of other declarations in
  ofproto.h
* Only allow supported configurations in table_mod message

v2
* Use ofproto->up.n_tables instead of N_TABLES as the number
  of tables present.
* Do not use the presence of rules in a table to indicate that it is
  present. Instead treat all tables, other than TBL_INTERNAL, that
  exist in ofproto.up as present. This simplifies things slightly
  and seems to be a more logical approach.
* As pointed out by Yamamoto-san
  - Read config for each table that is missed rather
than just the first one. A logic bug that somehow I was blind to.
---
 OPENFLOW-1.1+|   5 -
 ofproto/ofproto-dpif-xlate.c |  71 +
 ofproto/ofproto-dpif.c   | 119 +--
 ofproto/ofproto-dpif.h   |  20 ++-
 ofproto/ofproto.c|  14 +-
 ofproto/ofproto.h|   5 +
 tests/ofproto-dpif.at| 353 +++
 7 files changed, 540 insertions(+), 47 deletions(-)

diff --git a/OPENFLOW-1.1+ b/OPENFLOW-1.1+
index 1b8a0ee..4363d28 100644
--- a/OPENFLOW-1.1+
+++ b/OPENFLOW-1.1+
@@ -54,11 +54,6 @@ OpenFlow 1.1
 The list of remaining work items for OpenFlow 1.1 is below.  It is
 probably incomplete.
 
-* OFPT_TABLE_MOD message.  This is new in OF1.1, so we need to
-  implement it.  It should be implemented so that the default OVS
-  behavior does not change.  Simon Horman has posted a patch.
-  [required for OF1.1 and OF1.2]
-
 * MPLS.  Simon Horman maintains a patch series that adds this
   feature.  This is partially merged.
   [optional for OF1.1+]
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index eb4931e..d93e5ca 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -226,8 +226,9 @@ static void xlate_actions__(struct xlate_in *, struct 
xlate_out *)
 OVS_REQ_RDLOCK(xlate_rwlock);
 static void xlate_normal(struct xlate_ctx *);
 static void xlate_report(struct xlate_ctx *, const char *);
-static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
-   uint8_t table_id, bool may_packet_in);
+static void xlate_table_action(struct xlate_ctx *, ofp_port_t in_port,
+   uint8_t table_id, bool may_packet_in,
+   bool force_controller_on_miss);
 static bool input_vid_is_valid(uint16_t vid, struct xbundle *, bool warn);
 static uint16_t input_vid_to_vlan(const struct xbundle *, uint16_t vid);
 static void output_normal(struct xlate_ctx *, const struct xbundle *,
@@ -1720,14 +1721,14 @@ compose_output_action__(struct xlate_ctx *ctx, 
ofp_por

Re: [ovs-dev] [PATCH] ofproto: Update rule's priority in eviction group.

2014-03-11 Thread Kmindg G
On Wed, Mar 12, 2014 at 1:36 PM, Ben Pfaff  wrote:
> On Sun, Mar 09, 2014 at 05:48:04PM +0800, kmindg wrote:
>> We do call heap_rebuild in ofproto_run, but we do not update rule's
>> priority with latest hard_timeout and idle_timeout before heap_rebuild.
>>
>> This patch ensures that rule's priority has been updated before
>> heap_rebuild, and adds two test cases to check eviction with modified
>> hard_timeout and idle_timwout.
>>
>> Signed-off-by: kmindg 
>
> Good catch.  I applied this.
>
> I left the declaration of heap_raw_change() in heap.h.  I like to be
> able to see a summary of all the available functions at the top of the
> header, even if some of the functions have inline implementations later
> in the header.

OK, I understand that now. :)
Thanks for applying this patch.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev