date:20120928

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Giuseppe Lettieri


Hi all,

we have finally found the time to finish work on this patch. We have 
addressed all the issues that were mentioned in the previous mails, plus 
some more that showed up later. We now pass all unit tests on both Linux 
and FreeBSD and we can more confidently offer the patch to your 
attention, for possible inclusion. As usual, any comment would be much 
appreciated.


Regards,
Giuseppe

Il 27/08/2012 02:00, Ben Pfaff ha scritto:

On Fri, Aug 24, 2012 at 10:25:02AM +0200, Giuseppe Lettieri wrote:

Il 20/08/2012 20:48, Ed Maste ha scritto:

3. The datapath thread waits in poll() with each port's file descriptor in

the fd array.  However there's currently no fd to signal a bridge
reconfiguration, so the poll() has to timeout before the thread will pick up
the new config on the next time through its loop.

We are considering to add the notification in dpif_netdev_run(), a
callback which is now empty in the #ifdef THREADED case. According
to our tests this solves the problem in practice, but we are worried
that it may be a misuse of the interface, since the corresponding
dpif_netdev_wait() callback would still be empty. Any comment would
be much appreciated.

There's no inherent reason that a "wait" function has to be nonempty if
the corresponding "run" function is nonempty.  The question is simply
whether the poll loop will wake up when "run" needs to do some work, and
if you're confident that that will happen (I don't have enough context
here to check myself) then that's fine.




--
Dr. Ing. Giuseppe Lettieri
Dipartimento di Ingegneria della Informazione
Universita' di Pisa
Largo Lucio Lazzarino 2, 56122 Pisa - Italy
Ph. : (+39) 050-2217.649 (direct) .599 (switch)
Fax : (+39) 050-2217.600
e-mail: g.letti...@iet.unipi.it

>From f42467cae9be8003a95b25f1c409fc7b3cee0440 Mon Sep 17 00:00:00 2001
From: Giuseppe Lettieri 
Date: Fri, 28 Sep 2012 12:05:18 +0200
Subject: [PATCH] Threaded userspace datapath

This patch refactors the userlevel datapath (i.e., datapath_type=netdev)
into two threads: one only forwards the packets for which a flow is found,
and the other does all other processing. This arrangement can speed-up
forwarding performance, measuread in pps, by a factor of 5-10.

To enable compilation of the threaded datapath, pass '--enable-threaded'
to configure.

Signed-off-by: Gaetano Catalli 
Signed-off-by: Ed Maste 
Signed-off-by: Giuseppe Lettieri 
---
 configure.ac  |1 +
 lib/automake.mk   |1 +
 lib/dispatch.h|9 +
 lib/dpif-netdev.c |  473 -
 lib/fatal-signal.c|2 +-
 lib/netdev-bsd.c  |   91 ++
 lib/netdev-dummy.c|  123 -
 lib/netdev-linux.c|   44 +
 lib/netdev-provider.h |   19 ++
 lib/netdev-vport.c|8 +
 lib/netdev.c  |   22 +++
 lib/netdev.h  |7 +
 lib/vlog.c|   16 ++
 m4/openvswitch.m4 |   19 ++
 14 files changed, 827 insertions(+), 8 deletions(-)
 create mode 100644 lib/dispatch.h

diff --git a/configure.ac b/configure.ac
index 9bdffea..a70232f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -43,6 +43,7 @@ AC_SEARCH_LIBS([clock_gettime], [rt])
 AC_SEARCH_LIBS([timer_create], [rt])
 AC_SEARCH_LIBS([pcap_open_live], [pcap])
 
+OVS_CHECK_THREADED
 OVS_CHECK_COVERAGE
 OVS_CHECK_NDEBUG
 OVS_CHECK_NETLINK
diff --git a/lib/automake.mk b/lib/automake.mk
index 94b86f6..3479ec0 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -37,6 +37,7 @@ lib_libopenvswitch_a_SOURCES = \
 	lib/daemon.c \
 	lib/daemon.h \
 	lib/dhcp.h \
+	lib/dispatch.h \
 	lib/dummy.c \
 	lib/dummy.h \
 	lib/dhparams.h \
diff --git a/lib/dispatch.h b/lib/dispatch.h
new file mode 100644
index 000..80ac9c7
--- /dev/null
+++ b/lib/dispatch.h
@@ -0,0 +1,9 @@
+#include 
+#include "ofpbuf.h"
+
+#ifndef DISPATCH_H
+#define DISPATCH_H 1
+
+typedef void (*pkt_handler)(u_char *user, struct ofpbuf* buf);
+
+#endif /* DISPATCH_H */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index dc4479e..e7013db 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -32,6 +32,15 @@
 #include 
 #include 
 
+#ifdef THREADED
+#include 
+#include 
+
+#include "socket-util.h"
+#include "fatal-signal.h"
+#include "dispatch.h"
+#endif
+
 #include "csum.h"
 #include "dpif.h"
 #include "dpif-provider.h"
@@ -55,6 +64,18 @@
 #include "vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(dpif_netdev);
+/* Pthread lock macros, nops in the non-threaded case. */
+#ifdef THREADED
+#define INIT_MUTEX(mutex) pthread_mutex_init(mutex, NULL)
+#define DESTROY_MUTEX(mutex) pthread_mutex_destroy(mutex)
+#define LOCK(mutex) pthread_mutex_lock(mutex)
+#define UNLOCK(mutex) pthread_mutex_unlock(mutex)
+#else
+#define INIT_MUTEX(mutex)
+#define DESTROY_MUTEX(mutex)
+#define LOCK(mutex)
+#define UNLOCK(mutex)
+#endif
 
 /* Configuration parameters. */
 enum { MAX_PORTS = 256 };   /* Maximum number of ports. */
@@ -80,6 +101,49 @@ struct dp_netdev_queue {
 unsigned int head, tail;
 };
 
+#ifdef THREADED
+str

Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches

2012-09-28 Thread ravi kerur

On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman  wrote:

> On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote:
> > On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur  wrote:
> > > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross 
> wrote:
> > >> On Thu, Sep 27, 2012 at 9:26 AM, Ben Pfaff  wrote:
> > >> > On Wed, Sep 26, 2012 at 04:13:41PM +0900, Simon Horman wrote:
> > >> >> This patch provides an implementation of the non-datapath portions
> > >> >> of MPLS matches and actions.
> > >> >>
> > >> >> This patch is based on top of Ben Pfaff's series,
> > >> >> "set-field action support"
> > >> >>
> > >> >> Cc: Isaku Yamahata 
> > >> >> Cc: Ravi K 
> > >> >
> > >> > I think Ravi's full last name is "Kerur".
> > >> >
> > >> >> Signed-off-by: Simon Horman 
> > >> >
> > >> > Jesse, do you have any concerns about this?  I haven't read past the
> > >> > diffstat yet.  But if it seems like a reasonable intermediate
> approach
> > >> > then I'm happy to review it.
> > >>
> > >> I don't think there is inherently anything wrong in starting with a
> > >> userspace-only approach.  I have a couple of specific concerns based
> > >> on briefly skimming the patch:
> > >>  * It seems like this is really the userspace half of the code which
> > >> assumes that the kernel portions will still be doing the work on the
> > >> actual packet flows.  If that's the case then I don't think that
> > >> userspace support can go in independently.  Otherwise, userspace
> > >> should really be self-contained and setup slow-path flows to do the
> > >> work itself.
> > >
> > >
> > > mpls userspace is completely self-contained i.e. doesn't depend on OVS
> > > kernel code. I am saying this based on implementation and testing.
> During
> > > testing no OVS kernel module was loaded and testing such as ping,
> iperf,
> > > netperf and scp executed.
> >
> > At the very least, userspace-only code shouldn't add anything to
> > either the userspace/kernel interface or odp-util.c.  I also don't see
> > how any MPLS action will actually get processed.  I do see that MPLS
> > actions send packets to the local port and then install a flow but I
> > think there is a misunderstanding because that doesn't make a lot of
> > sense to me.
>
> That portion is probably my handiwork. Could you give some guidance on
> a method that would work? I'm more than happy to rework things.
>
> > >>  * If it is truly userspace only, the handling of multiple levels of
>  tags
> > >> seems a little incomplete since we actually have the full packet.
> > >
> > >
> > > Can you elaborate  please?
> >
> > Multiple levels of tags aren't handled, for example, when you pop a tag.
>
> Do you think that adding support for multiple levels of tags to the kernel
> is appropriate. Or should such cases be bounced to user-space?
>

 multiple tags do work in both userspace and kernel i.e. any
combination of push/pop actions. I am not sure what the confusion is?

Simon: I have not looked into your patch, has anything changed here?

>
> > >>  If this is supposed to be a quick stepping stone to kernel support
> > >> that seems less important since we will no longer have complete packet
> > >> access.
> > >>
> > >> So it basically comes down to what the short term plans are.  There's
> > >> also Leo's patch (which I haven't looked at) that I can post if there
> > >> are plans to do kernel support.
> > >
> > >
> > >  so will there be separate/different kernel interface?
> >
> > Separate from what?
>

 Our earlier disagreement was on packet recirculation for multiple
actions. You mention about Leo's patch and I assumed it contains packet
recirculation support and you want mpls changes for that new piece of
kernel datapath and wanted clarification on that.

> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> >
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] ofproto: Rationlise comments in handle_flow_mod()

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 03:07:08PM +0900, Simon Horman wrote:
> Signed-off-by: Simon Horman 

Applied, thanks!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 12:49:16PM +0200, Giuseppe Lettieri wrote:
> we have finally found the time to finish work on this patch. We have
> addressed all the issues that were mentioned in the previous mails,
> plus some more that showed up later. We now pass all unit tests on
> both Linux and FreeBSD and we can more confidently offer the patch
> to your attention, for possible inclusion. As usual, any comment
> would be much appreciated.

This is looking good.  I read through some of the code and I have a
few comments.

I see a number of XXX comments still left.  Some of them, I guess, can
just be removed, but others I wonder about, and it would be nice to
have your opinion.

dp_netdev_notifier_init() and netdev_dummy_listen() both call pipe()
then set_nonblocking() on each fd.  I see the same pattern in a few
other places in the tree.  Perhaps we should have a helper in
socket-util.c to eliminate some redundancy.

I see "sizeof(fds) / sizeof(fds[0])" in a few places.  We have
ARRAY_SIZE in util.h for that.

In dp_netdev_notifier_ack(), dp_netdev_notifier_ack1(), and
dp_netdev_notifier_notify(), I think that the system calls should
retry on EINTR.  That would be cautious practice, anyhow.

I think that there are now synchronization issues in vlog.  The
vlog_write_file() implementation assumes a single-threaded model
(worker_request() isn't thread-safe).  We could introduce per-thread
state so that only the main thread of a process uses worker_request()
and other threads call write() directly from vlog_write_file() (since
write() should be thread-safe).

In dpif_netdev_exit_hook(), I'd be inclined to just use an ack and
skip the pthread_cancel().  Thread cancellation always seems like a
big hammer that is best avoided when one can.  (Another approach would
be to close the write side of the notification pipe.  Then the read
side would get EOF after draining any remaining acks, an unmistakable
sign.)

Does that give you enough to think about for a while?  I'll read the
next version of the patch in more detail.

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 09:23:01AM -0700, Ben Pfaff wrote:
> I think that there are now synchronization issues in vlog.  The
> vlog_write_file() implementation assumes a single-threaded model
> (worker_request() isn't thread-safe).  We could introduce per-thread
> state so that only the main thread of a process uses worker_request()
> and other threads call write() directly from vlog_write_file() (since
> write() should be thread-safe).

Oh, also I see some static data in vlog_valist() we'd have to do
something about.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v3] ofp-actions: Add parsing of set_field actions

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 02:59:10PM +0900, Simon Horman wrote:
> Based heavily on work by Isaku Yamahata 
> 
> Cc: Isaku Yamahata 
> Signed-off-by: Simon Horman 

Thanks, I added a test and tweaked the documentation and applied this
to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [RFC PATCH 0/5] Userspace tunnel logic

2012-09-28 Thread Jesse Gross

This is a port of the kernel tunneling logic to userspace, taking
advantage of the outer header information in the flow.  It logically
fits in after Kyle's tunnel kernel patch and before Justin's single
datapath work.  I expect that the final patch in this series will
be replaced by Justin's better indirection layer, along with some
other integration work.

There are a few TODO items noted at the top of tunne.c, mostly
around integration with ofproto-dpif and flags from the kernel.
Review is welcome but I'm not planning on applying this until the
other components are ready, so this is really just informational.

Jesse Gross (5):
  flow: Extend struct flow to contain tunnel outer header.
  odp-util: Enable communicating outer tunnel to/from the kernel.
  netdev-dummy: Add tunnel variation.
  tunnel: Userspace implementation of tunnel manipulation.
  ofproto-dpif: Basic hooks for rewriting tunnel packets.

 NEWS   |4 +
 lib/automake.mk|2 +
 lib/dpif-linux.c   |2 +-
 lib/dpif-netdev.c  |4 +-
 lib/dpif-provider.h|2 +-
 lib/dpif.c |2 +-
 lib/flow.c |   55 +++-
 lib/flow.h |   19 +-
 lib/learning-switch.c  |7 +-
 lib/match.c|   15 +-
 lib/meta-flow.c|8 +-
 lib/netdev-dummy.c |   88 ++
 lib/nx-match.c |3 +-
 lib/odp-util.c |   75 -
 lib/ofp-print.c|2 +-
 lib/ofp-util.c |6 +-
 lib/tunnel.c   |  739 
 lib/tunnel.h   |   36 +++
 ofproto/ofproto-dpif.c |   54 +++-
 ofproto/ofproto-provider.h |4 +-
 ofproto/ofproto.c  |4 +-
 tests/automake.mk  |1 +
 tests/odp.at   |   12 +-
 tests/ofp-print.at |6 +-
 tests/ofproto-dpif.at  |   58 ++--
 tests/ofproto.at   |   12 +-
 tests/test-classifier.c|   40 +--
 tests/test-flows.c |2 +-
 tests/testsuite.at |1 +
 tests/tunnel.at|  112 +++
 vswitchd/vswitch.xml   |   14 +-
 31 files changed, 1252 insertions(+), 137 deletions(-)
 create mode 100644 lib/tunnel.c
 create mode 100644 lib/tunnel.h
 create mode 100644 tests/tunnel.at

-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.

2012-09-28 Thread Jesse Gross

Now that the kernel is supplying information about the outer IP
header for tunneled packets userspace needs to be able to track it
as part of the flow.  For the time being this is only used internally
by OVS and not exposed outwards to OpenFlow.  As a result, this
threads the information throughout userspace but simply stores the
existing tun_id in it.

Signed-off-by: Jesse Gross 
---
 lib/dpif-linux.c   |2 +-
 lib/dpif-netdev.c  |4 ++--
 lib/dpif-provider.h|2 +-
 lib/dpif.c |2 +-
 lib/flow.c |   55 +++
 lib/flow.h |   19 ---
 lib/learning-switch.c  |7 +-
 lib/match.c|   15 
 lib/meta-flow.c|8 +++
 lib/nx-match.c |3 ++-
 lib/odp-util.c |   12 +-
 lib/ofp-print.c|2 +-
 lib/ofp-util.c |6 ++---
 ofproto/ofproto-dpif.c |   14 +++
 ofproto/ofproto-provider.h |4 ++--
 ofproto/ofproto.c  |4 ++--
 tests/ofp-print.at |6 ++---
 tests/ofproto-dpif.at  |   56 ++--
 tests/ofproto.at   |   12 +-
 tests/test-classifier.c|   40 +++
 tests/test-flows.c |2 +-
 21 files changed, 169 insertions(+), 106 deletions(-)

diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index a2eb490..3a4a4e6 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -1399,7 +1399,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no,
 uint64_t action;
 
 ofpbuf_use_const(&packet, data, size);
-flow_extract(&packet, 0, htonll(0), 0, &flow);
+flow_extract(&packet, 0, NULL, 0, &flow);
 
 ofpbuf_use_stack(&key, &keybuf, sizeof keybuf);
 odp_flow_key_from_flow(&key, &flow);
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1ce14ce..797cb06 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -916,7 +916,7 @@ dpif_netdev_execute(struct dpif *dpif, const struct 
dpif_execute *execute)
 ofpbuf_reserve(©, DP_NETDEV_HEADROOM);
 ofpbuf_put(©, execute->packet->data, execute->packet->size);
 
-flow_extract(©, 0, 0, -1, &key);
+flow_extract(©, 0, NULL, -1, &key);
 error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len,
   &key);
 if (!error) {
@@ -1014,7 +1014,7 @@ dp_netdev_port_input(struct dp_netdev *dp, struct 
dp_netdev_port *port,
 if (packet->size < ETH_HEADER_LEN) {
 return;
 }
-flow_extract(packet, 0, 0, odp_port_to_ofp_port(port->port_no), &key);
+flow_extract(packet, 0, NULL, odp_port_to_ofp_port(port->port_no), &key);
 flow = dp_netdev_lookup_flow(dp, &key);
 if (flow) {
 dp_netdev_flow_used(flow, packet);
diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 317e617..ffe084a 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -292,7 +292,7 @@ struct dpif_class {
  * taken from the flow specified in the 'execute->key_len' bytes of
  * 'execute->key'.  ('execute->key' is mostly redundant with
  * 'execute->packet', but it contains some metadata that cannot be
- * recovered from 'execute->packet', such as tun_id and in_port.) */
+ * recovered from 'execute->packet', such as tunnel and in_port.) */
 int (*execute)(struct dpif *dpif, const struct dpif_execute *execute);
 
 /* Executes each of the 'n_ops' operations in 'ops' on 'dpif', in the order
diff --git a/lib/dpif.c b/lib/dpif.c
index 2968966..7f859d7 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -977,7 +977,7 @@ dpif_execute__(struct dpif *dpif, const struct dpif_execute 
*execute)
  * the Ethernet frame specified in 'packet' taken from the flow specified in
  * the 'key_len' bytes of 'key'.  ('key' is mostly redundant with 'packet', but
  * it contains some metadata that cannot be recovered from 'packet', such as
- * tun_id and in_port.)
+ * tunnel and in_port.)
  *
  * Returns 0 if successful, otherwise a positive errno value. */
 int
diff --git a/lib/flow.c b/lib/flow.c
index e517a03..cc41cfe 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -316,7 +316,7 @@ invalid:
 
 }
 
-/* Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id', and
+/* Initializes 'flow' members from 'packet', 'skb_priority', 'tnl', and
  * 'ofp_in_port'.
  *
  * Initializes 'packet' header pointers as follows:
@@ -334,8 +334,8 @@ invalid:
  *  present and has a correct length, and otherwise NULL.
  */
 void
-flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
- uint16_t ofp_in_port, struct flow *flow)
+flow_extract(struct ofpbuf *packet, uint32_t skb_priority,
+ struct flow_tnl *tnl, uint16_t ofp_in_port, struct flow *flow)
 {
 struct ofpbuf b = *packet;
 struct eth_header *eth;
@@ -343,7 +343,10 @@ flow_extract(struct ofpbuf *packet, uint32_t skb_priority,

[ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Jesse Gross

Now that both the kernel and userspace can handle information about
the tunnel outer headers this adds userspace support for communicating
between the two.  At the moment userspace doesn't know how to do
anything with the extra information on receive and will only generate
actions containing tun_id.  However, both sides know how to ignore the
extra information.

Signed-off-by: Jesse Gross 
---
 NEWS  |2 ++
 lib/odp-util.c|   75 +
 tests/odp.at  |   12 
 tests/ofproto-dpif.at |2 +-
 4 files changed, 73 insertions(+), 18 deletions(-)

diff --git a/NEWS b/NEWS
index 71b2e29..6cd2947 100644
--- a/NEWS
+++ b/NEWS
@@ -37,6 +37,8 @@ post-v1.8.0
 - The autopath action.
 - Interface type "null".
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
+- Tunneling requires the version of the kernel module paired with this
+  release (or a newer release).
 
 
 v1.8.0 - xx xxx 
diff --git a/lib/odp-util.c b/lib/odp-util.c
index f4b30e9..d46189f 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -925,6 +925,30 @@ parse_odp_key_attr(const char *s, const struct simap 
*port_names,
 }
 
 {
+char tun_id_s[32];
+unsigned long long int flags;
+int tos, ttl;
+struct ovs_key_ipv4_tunnel tun_key;
+int n = -1;
+
+if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF],"
+   "flags=%lli,src="IP_SCAN_FMT",dst="IP_SCAN_FMT
+   ",tos=%i,ttl=%i)%n", tun_id_s, &flags,
+IP_SCAN_ARGS(&tun_key.ipv4_src),
+IP_SCAN_ARGS(&tun_key.ipv4_dst), &tos, &ttl,
+&n) > 0 && n > 0) {
+tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+tun_key.tun_flags = flags;
+tun_key.ipv4_tos = tos;
+tun_key.ipv4_ttl = ttl;
+memset(&tun_key.pad, 0, sizeof tun_key.pad);
+nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL, &tun_key,
+  sizeof tun_key);
+return n;
+}
+}
+
+{
 unsigned long long int in_port;
 int n = -1;
 
@@ -1273,8 +1297,18 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct 
flow *flow)
 nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority);
 }
 
-if (flow->tunnel.tun_id != htonll(0)) {
-nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tunnel.tun_id);
+if (flow->tunnel.ip_dst != 0) {
+struct ovs_key_ipv4_tunnel *tun_key;
+
+tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL,
+   sizeof *tun_key);
+tun_key->tun_id = flow->tunnel.tun_id;
+tun_key->ipv4_src = flow->tunnel.ip_src;
+tun_key->ipv4_dst = flow->tunnel.ip_dst;
+tun_key->tun_flags = 0; // XXX: Need flags
+tun_key->ipv4_tos = flow->tunnel.ip_tos;
+tun_key->ipv4_ttl = flow->tunnel.ip_ttl;
+memset(&tun_key->pad, 0, sizeof tun_key->pad);
 }
 
 if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) {
@@ -1765,9 +1799,17 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t 
key_len,
 expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY;
 }
 
-if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) {
-flow->tunnel.tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]);
-expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID;
+if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) {
+const struct ovs_key_ipv4_tunnel *tun_key;
+
+tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]);
+flow->tunnel.tun_id = tun_key->tun_id;
+flow->tunnel.ip_src = tun_key->ipv4_src;
+flow->tunnel.ip_dst = tun_key->ipv4_dst;
+flow->tunnel.flags = 0; // XXX need flags
+flow->tunnel.ip_tos = tun_key->ipv4_tos;
+flow->tunnel.ip_ttl = tun_key->ipv4_ttl;
+expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL;
 }
 
 if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) {
@@ -1858,16 +1900,27 @@ commit_set_action(struct ofpbuf *odp_actions, enum 
ovs_key_attr key_type,
 }
 
 static void
-commit_set_tun_id_action(const struct flow *flow, struct flow *base,
+commit_set_tunnel_action(const struct flow *flow, struct flow *base,
  struct ofpbuf *odp_actions)
 {
-if (base->tunnel.tun_id == flow->tunnel.tun_id) {
+struct ovs_key_ipv4_tunnel tun_key;
+
+if (!memcmp(&base->tunnel, &flow->tunnel, sizeof base->tunnel)) {
 return;
 }
-base->tunnel.tun_id = flow->tunnel.tun_id;
 
-commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
-  &base->tunnel.tun_id, sizeof(base->tunnel.tun_id));
+memcpy(&base->tunnel, &flow->tunnel, sizeof base->tunnel);
+
+tun_key.tun_id = base->tunnel.tun_id;
+tun_key.ip

[ovs-dev] [RFC PATCH 3/5] netdev-dummy: Add tunnel variation.

2012-09-28 Thread Jesse Gross

In order to be able to unit test the tunnel flow setup code we
need to be able to generate netdevs that appear to be like
tunnel devices.  This adds a dummy_tunnel netdev with two
differences from the normal dummy devices: the ability to set and
read config and a different type to identify them as needing
tunnel translation.

Signed-off-by: Jesse Gross 
---
 lib/netdev-dummy.c |   88 
 1 file changed, 88 insertions(+)

diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 6aa4084..bb65c12 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -29,6 +29,7 @@
 #include "packets.h"
 #include "poll-loop.h"
 #include "shash.h"
+#include "smap.h"
 #include "sset.h"
 #include "unixctl.h"
 #include "vlog.h"
@@ -37,6 +38,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dummy);
 
 struct netdev_dev_dummy {
 struct netdev_dev netdev_dev;
+struct smap config;
 uint8_t hwaddr[ETH_ADDR_LEN];
 int mtu;
 struct netdev_stats stats;
@@ -94,6 +96,7 @@ netdev_dummy_create(const struct netdev_class *class, const 
char *name,
 
 netdev_dev = xzalloc(sizeof *netdev_dev);
 netdev_dev_init(&netdev_dev->netdev_dev, name, class);
+smap_init(&netdev_dev->config);
 netdev_dev->hwaddr[0] = 0xaa;
 netdev_dev->hwaddr[1] = 0x55;
 netdev_dev->hwaddr[2] = n >> 24;
@@ -119,12 +122,35 @@ netdev_dummy_destroy(struct netdev_dev *netdev_dev_)
 {
 struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
 
+smap_destroy(&netdev_dev->config);
 shash_find_and_delete(&dummy_netdev_devs,
   netdev_dev_get_name(netdev_dev_));
 free(netdev_dev);
 }
 
 static int
+netdev_dummy_get_config(struct netdev_dev *netdev_dev_, struct smap *args)
+{
+struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
+
+smap_destroy(args);
+smap_clone(args, &netdev_dev->config);
+
+return 0;
+}
+
+static int
+netdev_dummy_set_config(struct netdev_dev *netdev_dev_, const struct smap 
*args)
+{
+struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
+
+smap_destroy(&netdev_dev->config);
+smap_clone(&netdev_dev->config, args);
+
+return 0;
+}
+
+static int
 netdev_dummy_open(struct netdev_dev *netdev_dev_, struct netdev **netdevp)
 {
 struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
@@ -370,6 +396,67 @@ static const struct netdev_class dummy_class = {
 netdev_dummy_change_seq
 };
 
+static const struct netdev_class dummy_tunnel_class = {
+"dummy_tunnel",
+NULL,   /* init */
+NULL,   /* run */
+NULL,   /* wait */
+
+netdev_dummy_create,
+netdev_dummy_destroy,
+netdev_dummy_get_config,
+netdev_dummy_set_config,
+
+netdev_dummy_open,
+netdev_dummy_close,
+
+netdev_dummy_listen,
+netdev_dummy_recv,
+netdev_dummy_recv_wait,
+netdev_dummy_drain,
+
+NULL,   /* send */
+NULL,   /* send_wait */
+
+netdev_dummy_set_etheraddr,
+netdev_dummy_get_etheraddr,
+netdev_dummy_get_mtu,
+netdev_dummy_set_mtu,
+NULL,   /* get_ifindex */
+NULL,   /* get_carrier */
+NULL,   /* get_carrier_resets */
+NULL,   /* get_miimon */
+netdev_dummy_get_stats,
+netdev_dummy_set_stats,
+
+NULL,   /* get_features */
+NULL,   /* set_advertisements */
+
+NULL,   /* set_policing */
+NULL,   /* get_qos_types */
+NULL,   /* get_qos_capabilities */
+NULL,   /* get_qos */
+NULL,   /* set_qos */
+NULL,   /* get_queue */
+NULL,   /* set_queue */
+NULL,   /* delete_queue */
+NULL,   /* get_queue_stats */
+NULL,   /* dump_queues */
+NULL,   /* dump_queue_stats */
+
+NULL,   /* get_in4 */
+NULL,   /* set_in4 */
+NULL,   /* get_in6 */
+NULL,   /* add_router */
+NULL,   /* get_next_hop */
+NULL,   /* get_drv_info */
+NULL,   /* arp_lookup */
+
+netdev_dummy_update_flags,
+
+netdev_dummy_change_seq
+};
+
 static struct ofpbuf *
 eth_from_packet_or_flow(const char *s)
 {
@@ -529,4 +616,5 @@ netdev_dummy_register(bool override)
 sset_destroy(&types);
 }
 netdev_register_provider(&dummy_class);
+netdev_register_provider(&dummy_tunnel_class);
 }
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.

2012-09-28 Thread Jesse Gross

The kernel tunneling code currently needs to handle a large number
of operations when tunnel packets are encapsulated and decapsulated.
Some examples of this are: finding the correct tunnel port on receive,
TTL and ToS inheritance, ECN handling, etc.  All of these can be done
on a per-flow basis in userspace now that we have both the inner and
outer header information, which allows us to both simpify the kernel
and take advantage of userspace's information.  This ports the logic
from the kernel to userspace and also pulls in the tunnel-specific
configuration handling from netdev-vport.c.  Once tunnel packets are
redirected into this code, the redundant pieces can be removed from
other places.

Signed-off-by: Jesse Gross 
---
 NEWS |2 +
 lib/automake.mk  |2 +
 lib/tunnel.c |  739 ++
 lib/tunnel.h |   36 +++
 tests/automake.mk|1 +
 tests/testsuite.at   |1 +
 tests/tunnel.at  |  112 
 vswitchd/vswitch.xml |   14 +-
 8 files changed, 896 insertions(+), 11 deletions(-)
 create mode 100644 lib/tunnel.c
 create mode 100644 lib/tunnel.h
 create mode 100644 tests/tunnel.at

diff --git a/NEWS b/NEWS
index 6cd2947..d02369a 100644
--- a/NEWS
+++ b/NEWS
@@ -39,6 +39,8 @@ post-v1.8.0
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
 - Tunneling requires the version of the kernel module paired with this
   release (or a newer release).
+- Inheritance of the Don't Fragment bit in IP tunnels (df_inherit) is
+  no longer supported.
 
 
 v1.8.0 - xx xxx 
diff --git a/lib/automake.mk b/lib/automake.mk
index 94b86f6..9513cd9 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -181,6 +181,8 @@ lib_libopenvswitch_a_SOURCES = \
lib/timeval.h \
lib/token-bucket.c \
lib/token-bucket.h \
+   lib/tunnel.c \
+   lib/tunnel.h \
lib/type-props.h \
lib/unaligned.h \
lib/unicode.c \
diff --git a/lib/tunnel.c b/lib/tunnel.c
new file mode 100644
index 000..115ff07
--- /dev/null
+++ b/lib/tunnel.c
@@ -0,0 +1,739 @@
+/* Copyright (c) 2012 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+
+#include "byte-order.h"
+#include "daemon.h"
+#include "dirs.h"
+#include "dynamic-string.h"
+#include "hash.h"
+#include "hmap.h"
+#include "packets.h"
+#include "smap.h"
+#include "socket-util.h"
+#include "tunnel.h"
+#include "unixctl.h"
+#include "vlog.h"
+
+/* TODO:
+ *
+ * Better hooks for tunnel creation/deletion/input/output
+ * Mechanism to create DP ports
+ * Ability to generate actions on input for ECN
+ * Revalidate flows on port add/remove/reconfigure
+ * Port stats
+ * Kernel interface needs flags defined (in particular for keys)
+ * IPsec flag needed?
+ */
+
+VLOG_DEFINE_THIS_MODULE(tunnel);
+
+#define DEFAULT_TTL 64
+
+struct tnl_match {
+ovs_be64 in_key;
+ovs_be32 ip_src;
+ovs_be32 ip_dst;
+uint32_t dp_portno;
+bool key_present;
+bool in_key_flow;
+uint8_t zeroed[2];
+};
+BUILD_ASSERT_DECL(sizeof(struct tnl_match) % 8 == 0);
+
+struct tnl_port {
+struct hmap_node match_node;
+struct hmap_node portno_node;
+
+const struct netdev *netdev;
+struct tnl_match match;
+ovs_be64 out_key;
+uint32_t tnl_portno;
+uint8_t ttl;
+uint8_t tos;
+bool out_key_flow;
+bool key_present;
+bool ttl_inherit;
+bool tos_inherit;
+bool dont_fragment;
+bool csum;
+};
+
+static struct hmap tnl_match_map = HMAP_INITIALIZER(&tnl_match_map);
+static struct hmap tnl_portno_map = HMAP_INITIALIZER(&tnl_portno_map);
+
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+static struct vlog_rate_limit dbg_rl = VLOG_RATE_LIMIT_INIT(60, 60);
+
+static int tnl_config_get(const struct netdev *, struct tnl_port *);
+static struct tnl_port *tnl_find(struct tnl_match *);
+static struct tnl_port *tnl_find_exact(struct tnl_match *);
+static uint32_t tnl_hash(struct tnl_match *);
+static void tnl_match_fmt(const struct tnl_match *, struct ds *);
+static unixctl_cb_func tnl_port_dump;
+static void tnl_port_mod_log(uint32_t tnl_portno, const char *action);
+static struct tnl_port *tnl_portno_find(uint32_t tnl_portno);
+
+void tnl_init(void)
+{
+static bool inited;
+
+if (inited) {
+return;
+}
+inited = true;
+
+unixctl_command_register("tunnel/show", "[interface]", 0, 1,
+

[ovs-dev] [RFC PATCH 5/5] ofproto-dpif: Basic hooks for rewriting tunnel packets.

2012-09-28 Thread Jesse Gross

This is a basic set of hooks into the ofproto-dpif code so that
tunneled packets can be have their flows rewritten to reflect the
outward facing view of how ports are laid out.  This is far from
being a comprehensive layer and should be replaced once a better
level of indirection is available.

Signed-off-by: Jesse Gross 
---
 ofproto/ofproto-dpif.c |   40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 16633f1..8f0e65e 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -50,6 +50,7 @@
 #include "poll-loop.h"
 #include "simap.h"
 #include "timer.h"
+#include "tunnel.h"
 #include "unaligned.h"
 #include "unixctl.h"
 #include "vlan-bitmap.h"
@@ -803,6 +804,8 @@ construct(struct ofproto *ofproto_)
 hmap_init(&ofproto->vlandev_map);
 hmap_init(&ofproto->realdev_vid_map);
 
+tnl_init();
+
 hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node,
 hash_string(ofproto->up.name, 0));
 memset(&ofproto->stats, 0, sizeof ofproto->stats);
@@ -2536,6 +2539,11 @@ port_add(struct ofproto *ofproto_, struct netdev 
*netdev, uint16_t *ofp_portp)
 
 error = dpif_port_add(ofproto->dpif, netdev, &odp_port);
 if (!error) {
+error = tnl_port_add(odp_port, odp_port, netdev);
+if (error) {
+dpif_port_del(ofproto->dpif, odp_port);
+return error;
+}
 *ofp_portp = odp_port_to_ofp_port(odp_port);
 }
 return error;
@@ -2545,9 +2553,12 @@ static int
 port_del(struct ofproto *ofproto_, uint16_t ofp_port)
 {
 struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+uint16_t odp_port = ofp_port_to_odp_port(ofp_port);
 int error;
 
-error = dpif_port_del(ofproto->dpif, ofp_port_to_odp_port(ofp_port));
+tnl_port_del(odp_port);
+
+error = dpif_port_del(ofproto->dpif, odp_port);
 if (!error) {
 struct ofport_dpif *ofport = get_ofp_port(ofproto, ofp_port);
 if (ofport) {
@@ -3025,8 +3036,9 @@ handle_flow_miss(struct ofproto_dpif *ofproto, struct 
flow_miss *miss,
  * an ODP_FIT_* value that indicates how well 'key' fits our expectations for
  * what a flow key should contain.
  *
- * This function also includes some logic to help make VLAN splinters
- * transparent to the rest of the upcall processing logic.  In particular, if
+ * This function also includes some logic to help with ports that have special
+ * meaning to their flows such as tunnels and VLAN splinters.  Tunnels may
+ * map their source and destination addresses to a logical port whereas if
  * the extracted in_port is a VLAN splinter port, it replaces flow->in_port by
  * the "real" port, sets flow->vlan_tci correctly for the VLAN of the VLAN
  * splinter port, and pushes a VLAN header onto 'packet' (if it is nonnull).
@@ -3043,13 +3055,21 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif 
*ofproto,
   struct ofpbuf *packet)
 {
 enum odp_key_fitness fitness;
+bool adjusted;
+int err;
 
 fitness = odp_flow_key_to_flow(key, key_len, flow);
 if (fitness == ODP_FIT_ERROR) {
 return fitness;
 }
-*initial_tci = flow->vlan_tci;
 
+err = tnl_port_receive(flow, &adjusted);
+if (err) {
+/* XXX: We should really install a drop flow here. */
+return ODP_FIT_ERROR;
+}
+
+*initial_tci = flow->vlan_tci;
 if (vsp_adjust_flow(ofproto, flow)) {
 if (packet) {
 /* Make the packet resemble the flow, so that it gets sent to an
@@ -3069,10 +3089,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif 
*ofproto,
 eth_push_vlan(packet, flow->vlan_tci);
 }
 
-/* Let the caller know that we can't reproduce 'key' from 'flow'. */
-if (fitness == ODP_FIT_PERFECT) {
-fitness = ODP_FIT_TOO_MUCH;
-}
+adjusted = true;
+}
+
+/* Let the caller know that we can't reproduce 'key' from 'flow'. */
+if (adjusted && fitness == ODP_FIT_PERFECT) {
+fitness = ODP_FIT_TOO_MUCH;
 }
 
 return fitness;
@@ -4981,6 +5003,8 @@ compose_output_action__(struct action_xlate_ctx *ctx, 
uint16_t ofp_port,
 if (out_port != odp_port) {
 ctx->flow.vlan_tci = htons(0);
 }
+out_port = tnl_port_send(&ctx->flow, out_port);
+
 commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
 nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port);
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 3:49 AM, Giuseppe Lettieri
 wrote:
> Hi all,
>
> we have finally found the time to finish work on this patch. We have
> addressed all the issues that were mentioned in the previous mails, plus
> some more that showed up later. We now pass all unit tests on both Linux and
> FreeBSD and we can more confidently offer the patch to your attention, for
> possible inclusion. As usual, any comment would be much appreciated.

I haven't been following this work all that closely but earlier on
there was a discussion of how the performance benefits might be
related to how poll is used.  Restructuring that seems like a much
more desirable path but I never saw any real follow up.  What was the
reason for continuing to go down this path?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH 0/2] Add flow-based tunneling support

2012-09-28 Thread Kyle Mestery

These patches build on the first two patches which Simon Horman sent out in
May to move Open vSwitch towards flow-based tunneling. The first patche adds
a tun_key, deprecating the tun_id member of the ovs_skb_cb struct. This patche
retain compatibilty with existing tunneling, but once the userspace code is
submitted, this will be deprecated. The second patch makes an attempt at adding
the new tun_key structure into the flow matching logic in the kernel.

Kyle Mestery (2):
  This is a first pass at providing a tun_key which can be used as
the basis for flow-based tunnelling. The tun_key includes and
replaces the tun_id in both struct ovs_skb_cb and struct
sw_tun_key.
  Move struct ovs_key_ipv4_tunnel out of struct sw_flow_key and into   
 struct sw_flow. This allows it to "float" and be used for matching
only when needed. Modify the matching code in
ovs_flow_tbl_lookup() to match on the tunnel header if it's
set.

 NEWS|  3 ++
 datapath/actions.c  | 25 +++--
 datapath/datapath.c | 37 +--
 datapath/datapath.h |  6 ++-
 datapath/flow.c | 88 +---
 datapath/flow.h | 25 -
 datapath/tunnel.c   | 89 +++--
 datapath/tunnel.h   | 14 ++-
 datapath/vport-capwap.c | 12 +++---
 datapath/vport-gre.c| 15 
 datapath/vport.c|  2 +-
 include/linux/openvswitch.h | 13 ++-
 lib/dpif-netdev.c   |  1 +
 lib/odp-util.c  | 52 +-
 lib/odp-util.h  |  3 +-
 15 files changed, 289 insertions(+), 96 deletions(-)

-- 
1.7.11.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH 1/2] Add support for tun_key to OVS datapath

2012-09-28 Thread Kyle Mestery

This is a first pass at providing a tun_key which can be
used as the basis for flow-based tunnelling. The
tun_key includes and replaces the tun_id in both struct
ovs_skb_cb and struct sw_tun_key.

This patch allows all existing tun_id behaviour to still work. Existing
users of tun_id are redirected to tun_key->tun_id to retain compatibility.
However, when the userspace code is updated to make use of the new tun_key,
the old behaviour will be deprecated and removed.

NOTE: With these changes, the tunneling code no longer assumes input and
output keys are symmetric.  If they are not, PMTUD needs to be disabled
for tunneling to work.

Signed-off-by: Kyle Mestery 
Cc: Simon Horman 
Cc: Jesse Gross 
---
V6:
- Address latest round of comments from Jesse.
V5:
- Address another round of comments from Jesse.
V4:
- Address 2 comments from Jesse:
  - When processing actions, if OVS_CB(skb)->tun_key is NULL, point it at one
on the stack temporarily. This goes away when we remove the ability to set
tun_id outside the scope of tun_key.
  - Move tun_key to the end of struct sw_flow_key.
V3:
- Fix issues found during review by Jesse.
- Add a NEWS entry around tunnel code no longer assuming symmetric input and
  output tunnel keys.

V2:
- Fix blank line addition/removal found by Simon.
- Fix hex printing output found by Simon.
---
 NEWS|  3 ++
 datapath/actions.c  | 24 ++---
 datapath/datapath.c | 10 +-
 datapath/datapath.h |  5 +--
 datapath/flow.c | 61 +++-
 datapath/flow.h | 12 ---
 datapath/tunnel.c   | 84 +++--
 datapath/tunnel.h   | 14 ++--
 datapath/vport-capwap.c | 12 +++
 datapath/vport-gre.c| 15 
 datapath/vport.c|  2 +-
 include/linux/openvswitch.h | 13 ++-
 lib/dpif-netdev.c   |  1 +
 lib/odp-util.c  | 15 +++-
 lib/odp-util.h  |  3 +-
 15 files changed, 202 insertions(+), 72 deletions(-)

diff --git a/NEWS b/NEWS
index 29fd9f3..ae831e3 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,8 @@
 post-v1.8.0
 
+- The tunneling code no longer assumes input and output keys are symmetric.
+  If they are not, PMTUD needs to be disabled for tunneling to work. Note
+  this only applies to flow-based keys.
 - FreeBSD is now a supported platform, thanks to code contributions from
   Gaetano Catalli, Ed Maste, and Giuseppe Lettieri.
 - ovs-bugtool: New --ovs option to report only OVS related information.
diff --git a/datapath/actions.c b/datapath/actions.c
index ec9b595..a70cde8 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -333,7 +333,8 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
 }
 
 static int execute_set_action(struct sk_buff *skb,
-const struct nlattr *nested_attr)
+const struct nlattr *nested_attr,
+struct ovs_key_ipv4_tunnel *tun_key)
 {
int err = 0;
 
@@ -343,7 +344,21 @@ static int execute_set_action(struct sk_buff *skb,
break;
 
case OVS_KEY_ATTR_TUN_ID:
-   OVS_CB(skb)->tun_id = nla_get_be64(nested_attr);
+   if (OVS_CB(skb)->tun_key == NULL) {
+   /* If tun_key is NULL for this skb, assign it to
+* a value the caller passed in for action processing
+* and output. This can disappear once we drop support
+* for setting tun_id outside of tun_key.
+*/
+   memset(tun_key, 0, sizeof(struct ovs_key_ipv4_tunnel));
+   OVS_CB(skb)->tun_key = tun_key;
+   }
+
+   OVS_CB(skb)->tun_key->tun_id = nla_get_be64(nested_attr);
+   break;
+
+   case OVS_KEY_ATTR_IPV4_TUNNEL:
+   OVS_CB(skb)->tun_key = nla_data(nested_attr);
break;
 
case OVS_KEY_ATTR_ETHERNET:
@@ -377,6 +392,7 @@ static int do_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
int prev_port = -1;
const struct nlattr *a;
int rem;
+   struct ovs_key_ipv4_tunnel tun_key;
 
for (a = attr, rem = len; rem > 0;
 a = nla_next(a, &rem)) {
@@ -407,7 +423,7 @@ static int do_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
break;
 
case OVS_ACTION_ATTR_SET:
-   err = execute_set_action(skb, nla_data(a));
+   err = execute_set_action(skb, nla_data(a), &tun_key);
break;
 
case OVS_ACTION_ATTR_SAMPLE:
@@ -469,7 +485,7 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff 
*skb)
goto out_loop;
}
 
-   OVS_CB(skb)->tun_id = 0;
+   OVS_CB(skb)->tun_key = NUL

[ovs-dev] [PATCH 2/2] Add flow matching for tunnel keys

2012-09-28 Thread Kyle Mestery

Move struct ovs_key_ipv4_tunnel out of struct
sw_flow_key and into struct sw_flow. This allows it to
"float" and be used for matching only when needed.
Modify the matching code in ovs_flow_tbl_lookup() to
match on the tunnel header if it's set.

In particular, I'd like review of the method used for adding the matching
logic here. The method I've chosen here adds one additional check during
flow lookup to see if a tunnel key is set, and if so will add one comparison
before matching a flow. If this is deemed suboptimal, I'm open to suggestions
around optimizing this.

Signed-off-by: Kyle Mestery 
---
 datapath/actions.c  |  1 +
 datapath/datapath.c | 29 ++--
 datapath/datapath.h |  1 +
 datapath/flow.c | 65 +
 datapath/flow.h | 17 +++---
 datapath/tunnel.c   |  5 +++--
 lib/odp-util.c  | 37 ++
 7 files changed, 109 insertions(+), 46 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index a70cde8..063d3af 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -288,6 +288,7 @@ static int output_userspace(struct datapath *dp, struct 
sk_buff *skb,
 
upcall.cmd = OVS_PACKET_CMD_ACTION;
upcall.key = &OVS_CB(skb)->flow->key;
+   upcall.tun_key = &OVS_CB(skb)->flow->tun_key;
upcall.userdata = NULL;
upcall.pid = 0;
 
diff --git a/datapath/datapath.c b/datapath/datapath.c
index d8a198e..845494c 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -313,10 +313,11 @@ void ovs_dp_process_received_packet(struct vport *p, 
struct sk_buff *skb)
 
if (!OVS_CB(skb)->flow) {
struct sw_flow_key key;
+   struct ovs_key_ipv4_tunnel tun_key;
int key_len;
 
/* Extract flow from 'skb' into 'key'. */
-   error = ovs_flow_extract(skb, p->port_no, &key, &key_len);
+   error = ovs_flow_extract(skb, p->port_no, &key, &tun_key, 
&key_len);
if (unlikely(error)) {
kfree_skb(skb);
return;
@@ -324,12 +325,13 @@ void ovs_dp_process_received_packet(struct vport *p, 
struct sk_buff *skb)
 
/* Look up flow. */
flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table),
-  &key, key_len);
+  &key, &tun_key, key_len);
if (unlikely(!flow)) {
struct dp_upcall_info upcall;
 
upcall.cmd = OVS_PACKET_CMD_MISS;
upcall.key = &key;
+   upcall.tun_key = &tun_key;
upcall.userdata = NULL;
upcall.pid = p->upcall_pid;
ovs_dp_upcall(dp, skb, &upcall);
@@ -492,7 +494,7 @@ static int queue_userspace_packet(struct net *net, int 
dp_ifindex,
upcall->dp_ifindex = dp_ifindex;
 
nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_KEY);
-   ovs_flow_to_nlattrs(upcall_info->key, user_skb);
+   ovs_flow_to_nlattrs(upcall_info->key, upcall_info->tun_key, user_skb);
nla_nest_end(user_skb, nla);
 
if (upcall_info->userdata)
@@ -787,13 +789,13 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, 
struct genl_info *info)
if (IS_ERR(flow))
goto err_kfree_skb;
 
-   err = ovs_flow_extract(packet, -1, &flow->key, &key_len);
+   err = ovs_flow_extract(packet, -1, &flow->key, &flow->tun_key, 
&key_len);
if (err)
goto err_flow_put;
 
err = ovs_flow_metadata_from_nlattrs(&flow->key.phy.priority,
 &flow->key.phy.in_port,
-&flow->key.tun.tun_key,
+&flow->tun_key,
 a[OVS_PACKET_ATTR_KEY]);
if (err)
goto err_flow_put;
@@ -922,7 +924,7 @@ static int ovs_flow_cmd_fill_info(struct sw_flow *flow, 
struct datapath *dp,
nla = nla_nest_start(skb, OVS_FLOW_ATTR_KEY);
if (!nla)
goto nla_put_failure;
-   err = ovs_flow_to_nlattrs(&flow->key, skb);
+   err = ovs_flow_to_nlattrs(&flow->key, &flow->tun_key, skb);
if (err)
goto error;
nla_nest_end(skb, nla);
@@ -1016,6 +1018,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
struct nlattr **a = info->attrs;
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key;
+   struct ovs_key_ipv4_tunnel tun_key;
struct sw_flow *flow;
struct sk_buff *reply;
struct datapath *dp;
@@ -1027,7 +1030,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
error = -EINVAL;
if (!a[OVS_FLOW_ATTR_KEY])
goto error;
-   er

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Kyle Mestery (kmestery)

On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote:
> Now that both the kernel and userspace can handle information about
> the tunnel outer headers this adds userspace support for communicating
> between the two.  At the moment userspace doesn't know how to do
> anything with the extra information on receive and will only generate
> actions containing tun_id.  However, both sides know how to ignore the
> extra information.
> 
> Signed-off-by: Jesse Gross 

The patch I just submitted overlaps with this patch a little bit. I think there 
wasn't
a really set delineation between what I was doing and what you were doing, so
we may have to merge this patch with mine. I think we both pretty much made
similar changes though, so it shouldn't be too bad. Acking this patch for now,
assuming we'll work through the merge bits.

Acked-by: Kyle Mestery 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 11:12:21AM -0700, Jesse Gross wrote:
> Now that the kernel is supplying information about the outer IP
> header for tunneled packets userspace needs to be able to track it
> as part of the flow.  For the time being this is only used internally
> by OVS and not exposed outwards to OpenFlow.  As a result, this
> threads the information throughout userspace but simply stores the
> existing tun_id in it.
> 
> Signed-off-by: Jesse Gross 

Is this gracefully extensible if later we need to support IPv6 tunnel
endpoints?

It took me a while to figure out the purpose of the 'tnl' parameter to
flow_extract().  I think it would have been more obvious if the
parameter were marked 'const'.

If you can be satisfied with 16 bits of flags (only 3 flags exist so
far) then struct flow_tnl can be reduced by 4 bytes, including 2 bytes
of padding.

Should struct flow_metadata get a struct flow_tnl member?  I guess it
depends on whether we plan to expose anything other than tunnel ID via
OpenFlow.

format_tunnel_flags() is kind of oddly written.  It doesn't need to be
a loop if you dropped the "else"s and put "if (flags)" before the
final "else" clause.

The flow_format() output might be easier to read if it omitted all
tunnel information if there isn't any, instead of writing "tunnel(0)".

match_format() doesn't print any of the new fields.  I assume that's
intentional?

In process_packet_in(),
memset(&tnl, 0, sizeof tnl);
tnl.tun_id = pi.fmd.tun_id;

flow_extract(&pkt, 0, &tnl, pi.fmd.in_port, &flow);
might more simply be written as:
flow_extract(&pkt, 0, NULL, pi.fmd.in_port, &flow);
flow.tunnel.tun_id = pi.fmd.tun_id;
and I do see it written that way in other places.
Similarly in ofproto_unixctl_trace().

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] "brctl show" doesn't list attached interfaces

2012-09-28 Thread Jesse Gross

On Thu, Sep 27, 2012 at 8:36 PM, Ben Pfaff  wrote:
> On Fri, Sep 28, 2012 at 11:56:47AM +0900, Tadaaki Nagao wrote:
>> In "Re: [ovs-dev] "brctl show" doesn't list attached interfaces",
>> Ben Pfaff  wrote:
>> > I see datapaths in brctl show output even without brcompat:
>> >
>> > oot@debian:~# rmmod brcompat
>> > Error: Module brcompat is not currently loaded
>> > root@debian:~# brctl show
>> > bridge name bridge id   STP enabled interfaces
>> > root@debian:~# ovs-vsctl add-br br0
>> > root@debian:~# brctl show
>> > bridge name bridge id   STP enabled interfaces
>> > br0 .d61b592dd94d   no
>> > root@debian:~# ovs-dpctl show
>> > system@br0:
>> > lookups: hit:0 missed:0 lost:0
>> > flows: 0
>> > port 0: br0 (internal)
>> > root@debian:~#
>> >
>> > I assumed that was what Isaku meant.
>>
>> (Though I'm not Isaku ;-),
>
> Sorry about that.  I apologize.
>
>> what I tried to meant was, as Jesse wrote, OVS ports aren't appearing
>> in brctl output.  I wasn't clear enough, sorry about it.
>>
>> Actually, we are trying OVS + KVM + LibVirt + CloudStack.  From what I
>> heard from my colleague, KVM support code in CloudStack uses "brctl show"
>> to list the ports attached to a bridge.  Simply changing them to use
>> "ovs-dpctl show" would work for us, but I thought it would be better if
>> compatibility to Linux bridge is also kept in brctl output.
>
> It would be better to use "ovs-vsctl list-ports".  The output of
> "ovs-dpctl show" is an implementation detail that is subject to change
> and, in fact, we plan to change soon.

Regardless of the mechanism, it is better to build on the OVS native
commands rather than relying on brctl since they are much more robust.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] "brctl show" doesn't list attached interfaces

2012-09-28 Thread Jesse Gross

On Thu, Sep 27, 2012 at 8:18 PM, Tadaaki Nagao  wrote:
> Hi,
>
> In "Re: [ovs-dev] "brctl show" doesn't list attached interfaces",
> Jesse Gross  wrote:
>> I think it's the opposite problem - that OVS ports aren't appearing in
>> brctl output.  There's a bug with namespaces on certain kernels when
>> populating the sysfs entries that Pravin ran into so we don't do that
>> on the theory that missing entries are better than crashing.
>
> In datapath/dp_sysfs_dp.c:ovs_dp_sysfs_add_dp() there's a similar NULL
> check with the same "could panic" comment, which makes sense to me because
> sysfs_create_group() is actually called after the check.
>
> 8 #ifdef CONFIG_NET_NS
> /* Due to bug in 2.6.32 kernel, sysfs_create_group() could panic
>  * in other namespace than init_net. Following check is to avoid it. 
> */
> if (!kobj->sd)
> return -ENOENT;
> #endif
> /* Create /sys/class/net//bridge directory. */
> err = sysfs_create_group(kobj, &bridge_group);
> 8
> OTOH, after the check in datapath/dp_sysfs_if.c:ovs_dp_sysfs_add_if()
> there's no call to sysfs_create_group() as I mentioned in the first mail.
> And that made me uncertain what the check is actually meant for...

Pravin originally wrote that code, so I'll let him comment on the call
in ovs_dp_sysfs_add_if().
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 11:12:22AM -0700, Jesse Gross wrote:
> Now that both the kernel and userspace can handle information about
> the tunnel outer headers this adds userspace support for communicating
> between the two.  At the moment userspace doesn't know how to do
> anything with the extra information on receive and will only generate
> actions containing tun_id.  However, both sides know how to ignore the
> extra information.
> 
> Signed-off-by: Jesse Gross 

What's your thought about how to handle the "XXX: need flags" bits?
(Or does a later patch address that?  I haven't read ahead.)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 3/5] netdev-dummy: Add tunnel variation.

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 11:12:23AM -0700, Jesse Gross wrote:
> In order to be able to unit test the tunnel flow setup code we
> need to be able to generate netdevs that appear to be like
> tunnel devices.  This adds a dummy_tunnel netdev with two
> differences from the normal dummy devices: the ability to set and
> read config and a different type to identify them as needing
> tunnel translation.
> 
> Signed-off-by: Jesse Gross 

Seems reasonable, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH] lib: Add xpipe_nonblocking helper

2012-09-28 Thread Ed Maste

Signed-off-by: Ed Maste 
---
As suggested by Ben Pfaff in the 'Threaded userspace datapath' thread.
This looks like a useful utility function refactoring, independent of
any outcome of that discussion.

 lib/fatal-signal.c | 4 +---
 lib/process.c  | 4 +---
 lib/signals.c  | 4 +---
 lib/socket-util.c  | 7 +++
 lib/socket-util.h  | 1 +
 5 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c
index 21ebb5a..7cfbd05 100644
--- a/lib/fatal-signal.c
+++ b/lib/fatal-signal.c
@@ -74,9 +74,7 @@ fatal_signal_init(void)

 inited = true;

-xpipe(signal_fds);
-xset_nonblocking(signal_fds[0]);
-xset_nonblocking(signal_fds[1]);
+xpipe_nonblocking(signal_fds);

 sigemptyset(&fatal_signal_set);
 for (i = 0; i < ARRAY_SIZE(fatal_signals); i++) {
diff --git a/lib/process.c b/lib/process.c
index 91dfc06..9f5c35f 100644
--- a/lib/process.c
+++ b/lib/process.c
@@ -82,9 +82,7 @@ process_init(void)
 inited = true;

 /* Create notification pipe. */
-xpipe(fds);
-xset_nonblocking(fds[0]);
-xset_nonblocking(fds[1]);
+xpipe_nonblocking(fds);

 /* Set up child termination signal handler. */
 memset(&sa, 0, sizeof sa);
diff --git a/lib/signals.c b/lib/signals.c
index b712f7e..152afcf 100644
--- a/lib/signals.c
+++ b/lib/signals.c
@@ -63,9 +63,7 @@ signal_init(void)
 static bool inited;
 if (!inited) {
 inited = true;
-xpipe(fds);
-xset_nonblocking(fds[0]);
-xset_nonblocking(fds[1]);
+xpipe_nonblocking(fds);
 }
 }

diff --git a/lib/socket-util.c b/lib/socket-util.c
index e32c03d..f8b44cc 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -894,6 +894,14 @@ xpipe(int fds[2])
 }

 void
+xpipe_nonblocking(int fds[2])
+{
+xpipe(fds);
+xset_nonblocking(fds[0]);
+xset_nonblocking(fds[1]);
+}
+
+void
 xsocketpair(int domain, int type, int protocol, int fds[2])
 {
 if (socketpair(domain, type, protocol, fds)) {
diff --git a/lib/socket-util.h b/lib/socket-util.h
index bacb236..a00b32e 100644
--- a/lib/socket-util.h
+++ b/lib/socket-util.h
@@ -64,6 +64,7 @@ int fsync_parent_dir(const char *file_name);
 int get_mtime(const char *file_name, struct timespec *mtime);

 void xpipe(int fds[2]);
+void xpipe_nonblocking(int fds[2]);

 char *describe_fd(int fd);

--
1.7.11.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 12:54 PM, Ben Pfaff  wrote:
> On Fri, Sep 28, 2012 at 11:12:21AM -0700, Jesse Gross wrote:
>> Now that the kernel is supplying information about the outer IP
>> header for tunneled packets userspace needs to be able to track it
>> as part of the flow.  For the time being this is only used internally
>> by OVS and not exposed outwards to OpenFlow.  As a result, this
>> threads the information throughout userspace but simply stores the
>> existing tun_id in it.
>>
>> Signed-off-by: Jesse Gross 
>
> Is this gracefully extensible if later we need to support IPv6 tunnel
> endpoints?

We'll need to add extra space for the longer addresses and update the
tunnel code that cares, of course, but I don't think there is any
other real work.

> It took me a while to figure out the purpose of the 'tnl' parameter to
> flow_extract().  I think it would have been more obvious if the
> parameter were marked 'const'.

OK.

> If you can be satisfied with 16 bits of flags (only 3 flags exist so
> far) then struct flow_tnl can be reduced by 4 bytes, including 2 bytes
> of padding.

On 64-bit machines that would still result in the compiler adding 4
bytes of padding since it's no long 8-byte aligned.  However, this
patch already makes sure that we memset the whole struct so we can
take the benefit on 32-bit machines.

> Should struct flow_metadata get a struct flow_tnl member?  I guess it
> depends on whether we plan to expose anything other than tunnel ID via
> OpenFlow.

There are no current plans for any of this to be visible to OpenFlow.

> format_tunnel_flags() is kind of oddly written.  It doesn't need to be
> a loop if you dropped the "else"s and put "if (flags)" before the
> final "else" clause.

Hmm, yes, I'm not sure why I wrote it that way.

> The flow_format() output might be easier to read if it omitted all
> tunnel information if there isn't any, instead of writing "tunnel(0)".

It's fine to get rid of it.

> match_format() doesn't print any of the new fields.  I assume that's
> intentional?

They're not currently part of the match.

> In process_packet_in(),
> memset(&tnl, 0, sizeof tnl);
> tnl.tun_id = pi.fmd.tun_id;
>
> flow_extract(&pkt, 0, &tnl, pi.fmd.in_port, &flow);
> might more simply be written as:
> flow_extract(&pkt, 0, NULL, pi.fmd.in_port, &flow);
> flow.tunnel.tun_id = pi.fmd.tun_id;
> and I do see it written that way in other places.
> Similarly in ofproto_unixctl_trace().

Yes, I guess that's cleaner.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 11:12:24AM -0700, Jesse Gross wrote:
> The kernel tunneling code currently needs to handle a large number
> of operations when tunnel packets are encapsulated and decapsulated.
> Some examples of this are: finding the correct tunnel port on receive,
> TTL and ToS inheritance, ECN handling, etc.  All of these can be done
> on a per-flow basis in userspace now that we have both the inner and
> outer header information, which allows us to both simpify the kernel
> and take advantage of userspace's information.  This ports the logic
> from the kernel to userspace and also pulls in the tunnel-specific
> configuration handling from netdev-vport.c.  Once tunnel packets are
> redirected into this code, the redundant pieces can be removed from
> other places.
> 
> Signed-off-by: Jesse Gross 

I guess that DF inheritance is not an important feature?  (Should we
accept the option and just warn about it, for better compatibility?)

In tunnel.c, I would move #include "tunnel.h" just after #include
, to ensure that tunnel.h is self-contained.

In tunnel.c, we usually put a line break just before the function's
name in a function definition.

It looks like tnl_port_add__() succeeds if the tunnel type is invalid,
but I don't know why.  Oh, maybe I see--this layer is meant to be a
sort of shim that leaves non-tunnels alone?  Some kind of high-level
comment (maybe in tunnel.h) would be helpful.

In tnl_hash(), since you made sure that struct tnl_match is a multiple
of 4 bytes, you could use hash_words() instead of hash_bytes() for a
slight speedup.

I didn't read tnl_find() or tnl_config_get() carefully.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 1:55 PM, Ben Pfaff  wrote:
> On Fri, Sep 28, 2012 at 11:12:22AM -0700, Jesse Gross wrote:
>> Now that both the kernel and userspace can handle information about
>> the tunnel outer headers this adds userspace support for communicating
>> between the two.  At the moment userspace doesn't know how to do
>> anything with the extra information on receive and will only generate
>> actions containing tun_id.  However, both sides know how to ignore the
>> extra information.
>>
>> Signed-off-by: Jesse Gross 
>
> What's your thought about how to handle the "XXX: need flags" bits?
> (Or does a later patch address that?  I haven't read ahead.)

The kernel interface in Kyle's first patch has a place for flags but
doesn't actually define any yet.  As I worked on the userspace
portions I decided that it made sense to expose those three that are
in userspace.  We'll have to add them to the kernel interface as well
since there's no way currently to translate in odp-util.c, which is
what those XXX comments refer to.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] lib: Add xpipe_nonblocking helper

2012-09-28 Thread Ben Pfaff

On Fri, Sep 28, 2012 at 09:06:41PM +, Ed Maste wrote:
> Signed-off-by: Ed Maste 
> ---
> As suggested by Ben Pfaff in the 'Threaded userspace datapath' thread.
> This looks like a useful utility function refactoring, independent of
> any outcome of that discussion.

Applied to master, thank you.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 2:18 PM, Ben Pfaff  wrote:
> On Fri, Sep 28, 2012 at 11:12:24AM -0700, Jesse Gross wrote:
>> The kernel tunneling code currently needs to handle a large number
>> of operations when tunnel packets are encapsulated and decapsulated.
>> Some examples of this are: finding the correct tunnel port on receive,
>> TTL and ToS inheritance, ECN handling, etc.  All of these can be done
>> on a per-flow basis in userspace now that we have both the inner and
>> outer header information, which allows us to both simpify the kernel
>> and take advantage of userspace's information.  This ports the logic
>> from the kernel to userspace and also pulls in the tunnel-specific
>> configuration handling from netdev-vport.c.  Once tunnel packets are
>> redirected into this code, the redundant pieces can be removed from
>> other places.
>>
>> Signed-off-by: Jesse Gross 
>
> I guess that DF inheritance is not an important feature?  (Should we
> accept the option and just warn about it, for better compatibility?)

By default we always set the DF bit to on and the times that I've
heard of people wanting to change that is when they know things are
broken and want to completely turn it off, so I doubt that inheritance
is really useful.

I wish that we had exposed this as one of our fragment states in the
flow setup (as in "none; none, don't fragment; first; later) but we
don't currently have that information in userspace.  We could add it
to the interface but it doesn't seem worth it for something that will
probably never get used.

We do accept and ignore unknown flags with a warning.  It will be a
generic message in this case but I think that's probably fine here.

> In tunnel.c, I would move #include "tunnel.h" just after #include
> , to ensure that tunnel.h is self-contained.

OK.

> In tunnel.c, we usually put a line break just before the function's
> name in a function definition.

OK, I wasn't sure whether we were coming or going to that style.

> It looks like tnl_port_add__() succeeds if the tunnel type is invalid,
> but I don't know why.  Oh, maybe I see--this layer is meant to be a
> sort of shim that leaves non-tunnels alone?  Some kind of high-level
> comment (maybe in tunnel.h) would be helpful.

Yes, currently it allows non-tunnels to be passed in to the various
public functions.  I'll leave it to Justin as to whether that's the
best behavior once we have better integration.  For now I've added a
comment to document the current behavior.

> In tnl_hash(), since you made sure that struct tnl_match is a multiple
> of 4 bytes, you could use hash_words() instead of hash_bytes() for a
> slight speedup.

Good idea.

Thanks for the reviews.  I'm going to send out the updated series, not
because I think that it has changed enough to require more review but
to make sure that anyone trying to build on top of it has an updated
version.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] Altavoz soporte para IPAD / Visita la Rioja y sus vinos / Radio FM pulsera

2012-09-28 Thread GlobalBono

Cabestan. Layout Simple
   Asegúrate de no perderte ninguna oferta, añade ofer...@globalbono.com a tu 
lista de contactos.
Si no ves correctamente las imágenes, pulsa [ 
http://email.globalbono.com/E28092012142809.cfm?WL=1599&WS=208833_8707685&WA=822
 ] aquí
[ http://email.globalbono.com/Go/index.cfm?WL=564&WS=208833_8707685&WA=822 ]
 Tus Ofertas de hoy
Síguenos en:
[ http://email.globalbono.com/Go/index.cfm?WL=50&WS=208833_8707685&WA=822 ] [ 
http://email.globalbono.com/Go/index.cfm?WL=51&WS=208833_8707685&WA=822 ] 







[ http://email.globalbono.com/Go/index.cfm?WL=1152&WS=208833_8707685&WA=822 ] 



[ http://email.globalbono.com/Go/index.cfm?WL=1579&WS=208833_8707685&WA=822 ] 
46% Dto. Radio FM para el brazo
Si eres un amante de la radio pero tu inconveniente es que no paras quieto, 
aquí tienes la radio que se...

5,94¤


[ http://email.globalbono.com/Go/index.cfm?WL=1579&WS=208833_8707685&WA=822 ] 


Antes

10,94¤Dto: 46%
Ahorro: 5,00¤
















[ http://email.globalbono.com/Go/index.cfm?WL=1578&WS=208833_8707685&WA=822 ] 
41% Dto. Altavoz soporte para Ipad
Base de soporte para Ipad y otros dispositivos de audio con altavoces. Perfecto 
para llevar a cualquier...

17,24¤


[ http://email.globalbono.com/Go/index.cfm?WL=1578&WS=208833_8707685&WA=822 ] 


Antes

29,00¤Dto: 41%
Ahorro: 11,76¤
















[ http://email.globalbono.com/Go/index.cfm?WL=1582&WS=208833_8707685&WA=822 ] 
47% Dto. Funda de polipiel para Ipad
Si quieres proteger tu Ipad de forma segura para así evitar rallazos en su 
pantalla, aquí tienes una fu...

11,64¤


[ http://email.globalbono.com/Go/index.cfm?WL=1582&WS=208833_8707685&WA=822 ] 


Antes

22,00¤Dto: 47%
Ahorro: 10,36¤
















[ http://email.globalbono.com/Go/index.cfm?WL=1581&WS=208833_8707685&WA=822 ] 
54% Dto. BODEGAS MUGA en La Rioja. Cata un vino con historia.
La escapada a BODEGAS MUGA en La Rioja te permite adentrarte en el 
maravilloso y singular...

55,00¤


[ http://email.globalbono.com/Go/index.cfm?WL=1581&WS=208833_8707685&WA=822 ] 


Antes

120,00¤Dto: 54%
Ahorro: 65,00¤
















[ http://email.globalbono.com/Go/index.cfm?WL=1580&WS=208833_8707685&WA=822 ] 
65% Dto. Vive Madrid a lo grande. Glamour y noches de diversión.
La escapada COPAS CON GLAMOUR te da acceso a un mundo exclusivo y 
privilegiado. Podrás vi...

45,00¤


[ http://email.globalbono.com/Go/index.cfm?WL=1580&WS=208833_8707685&WA=822 ] 


Antes

130,00¤Dto: 65%
Ahorro: 85,00¤
















[ http://email.globalbono.com/Go/index.cfm?WL=1583&WS=208833_8707685&WA=822 ] 
56% Dto. Escapada RELAX en Ávila. Descansa el cuerpo y la mente.
Disfruta de una completa sesión de relajación con esta estupenda escapada que 
te ofrecemos. Podrás disf...

140,00¤


[ http://email.globalbono.com/Go/index.cfm?WL=1583&WS=208833_8707685&WA=822 ] 


Antes

320,00¤Dto: 56%
Ahorro: 180,00¤















[ http://email.globalbono.com/Go/index.cfm?WL=706&WS=208833_8707685&WA=822 ] 



De conformidad con lo que establece la Ley Orgánica 15/1999 de Protección de 
Datos de Carácter Personal, le informamos que sus datos personales serán 
incorporados a un fichero bajo la responsabilidad de GlobalBono, con la 
finalidad de poder atender los compromisos derivados de la relación que 
mantenemos con usted. Puede ejercer sus derechos de acceso, cancelación, 
rectificación y oposición mediante un escrito a la dirección email 
i...@globalbono.com. Si en el plazo de 30 días no nos comunica lo contrario, 
entenderemos que los datos no han sido modificados, que se compromete a 
notificarnos cualquier variación y que tenemos el consentimiento para 
utilizarlos a fin de poder fidelizar la relación entre las partes.
   ¿Quieres dejar de recibir este email? Haz click  [ 
http://email.globalbono.com/baja/form_baja_GB.cfm?WL=182&WS=208833_8707685&WA=822
 ] aquí. 


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [RFC PATCH v2 1/5] flow: Extend struct flow to contain tunnel outer header.

2012-09-28 Thread Jesse Gross

Now that the kernel is supplying information about the outer IP
header for tunneled packets userspace needs to be able to track it
as part of the flow.  For the time being this is only used internally
by OVS and not exposed outwards to OpenFlow.  As a result, this
threads the information throughout userspace but simply stores the
existing tun_id in it.

Signed-off-by: Jesse Gross 
---
 lib/dpif-linux.c   |2 +-
 lib/dpif-netdev.c  |4 ++--
 lib/dpif-provider.h|2 +-
 lib/dpif.c |2 +-
 lib/flow.c |   55 +++
 lib/flow.h |   20 
 lib/learning-switch.c  |3 ++-
 lib/match.c|   15 
 lib/meta-flow.c|8 +++
 lib/nx-match.c |3 ++-
 lib/odp-util.c |   12 +-
 lib/ofp-print.c|2 +-
 lib/ofp-util.c |6 ++---
 ofproto/ofproto-dpif.c |   11 +
 ofproto/ofproto-provider.h |4 ++--
 ofproto/ofproto.c  |4 ++--
 tests/ofp-print.at |6 ++---
 tests/ofproto-dpif.at  |   56 ++--
 tests/ofproto.at   |   12 +-
 tests/test-classifier.c|   40 +++
 tests/test-flows.c |2 +-
 21 files changed, 162 insertions(+), 107 deletions(-)

diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index a2eb490..3a4a4e6 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -1399,7 +1399,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no,
 uint64_t action;
 
 ofpbuf_use_const(&packet, data, size);
-flow_extract(&packet, 0, htonll(0), 0, &flow);
+flow_extract(&packet, 0, NULL, 0, &flow);
 
 ofpbuf_use_stack(&key, &keybuf, sizeof keybuf);
 odp_flow_key_from_flow(&key, &flow);
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1ce14ce..797cb06 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -916,7 +916,7 @@ dpif_netdev_execute(struct dpif *dpif, const struct 
dpif_execute *execute)
 ofpbuf_reserve(©, DP_NETDEV_HEADROOM);
 ofpbuf_put(©, execute->packet->data, execute->packet->size);
 
-flow_extract(©, 0, 0, -1, &key);
+flow_extract(©, 0, NULL, -1, &key);
 error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len,
   &key);
 if (!error) {
@@ -1014,7 +1014,7 @@ dp_netdev_port_input(struct dp_netdev *dp, struct 
dp_netdev_port *port,
 if (packet->size < ETH_HEADER_LEN) {
 return;
 }
-flow_extract(packet, 0, 0, odp_port_to_ofp_port(port->port_no), &key);
+flow_extract(packet, 0, NULL, odp_port_to_ofp_port(port->port_no), &key);
 flow = dp_netdev_lookup_flow(dp, &key);
 if (flow) {
 dp_netdev_flow_used(flow, packet);
diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 317e617..ffe084a 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -292,7 +292,7 @@ struct dpif_class {
  * taken from the flow specified in the 'execute->key_len' bytes of
  * 'execute->key'.  ('execute->key' is mostly redundant with
  * 'execute->packet', but it contains some metadata that cannot be
- * recovered from 'execute->packet', such as tun_id and in_port.) */
+ * recovered from 'execute->packet', such as tunnel and in_port.) */
 int (*execute)(struct dpif *dpif, const struct dpif_execute *execute);
 
 /* Executes each of the 'n_ops' operations in 'ops' on 'dpif', in the order
diff --git a/lib/dpif.c b/lib/dpif.c
index 2968966..7f859d7 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -977,7 +977,7 @@ dpif_execute__(struct dpif *dpif, const struct dpif_execute 
*execute)
  * the Ethernet frame specified in 'packet' taken from the flow specified in
  * the 'key_len' bytes of 'key'.  ('key' is mostly redundant with 'packet', but
  * it contains some metadata that cannot be recovered from 'packet', such as
- * tun_id and in_port.)
+ * tunnel and in_port.)
  *
  * Returns 0 if successful, otherwise a positive errno value. */
 int
diff --git a/lib/flow.c b/lib/flow.c
index e517a03..93bd9b2 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -316,7 +316,7 @@ invalid:
 
 }
 
-/* Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id', and
+/* Initializes 'flow' members from 'packet', 'skb_priority', 'tnl', and
  * 'ofp_in_port'.
  *
  * Initializes 'packet' header pointers as follows:
@@ -334,8 +334,9 @@ invalid:
  *  present and has a correct length, and otherwise NULL.
  */
 void
-flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
- uint16_t ofp_in_port, struct flow *flow)
+flow_extract(struct ofpbuf *packet, uint32_t skb_priority,
+ const struct flow_tnl *tnl, uint16_t ofp_in_port,
+ struct flow *flow)
 {
 struct ofpbuf b = *packet;
 struct eth_header *eth;
@@ -343,7 +344,10 @@ flow_extract(struct ofpbuf *packet, uint32

[ovs-dev] [RFC PATCH v2 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Jesse Gross

Now that both the kernel and userspace can handle information about
the tunnel outer headers this adds userspace support for communicating
between the two.  At the moment userspace doesn't know how to do
anything with the extra information on receive and will only generate
actions containing tun_id.  However, both sides know how to ignore the
extra information.

Signed-off-by: Jesse Gross 
---
 NEWS  |2 ++
 lib/odp-util.c|   75 +
 tests/odp.at  |   12 
 tests/ofproto-dpif.at |2 +-
 4 files changed, 73 insertions(+), 18 deletions(-)

diff --git a/NEWS b/NEWS
index ae831e3..6265e2f 100644
--- a/NEWS
+++ b/NEWS
@@ -38,6 +38,8 @@ post-v1.8.0
 - The autopath action.
 - Interface type "null".
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
+- Tunneling requires the version of the kernel module paired with this
+  release (or a newer release).
 
 
 v1.8.0 - xx xxx 
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 9ed17ed..ac0042c 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -925,6 +925,30 @@ parse_odp_key_attr(const char *s, const struct simap 
*port_names,
 }
 
 {
+char tun_id_s[32];
+unsigned long long int flags;
+int tos, ttl;
+struct ovs_key_ipv4_tunnel tun_key;
+int n = -1;
+
+if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF],"
+   "flags=%lli,src="IP_SCAN_FMT",dst="IP_SCAN_FMT
+   ",tos=%i,ttl=%i)%n", tun_id_s, &flags,
+IP_SCAN_ARGS(&tun_key.ipv4_src),
+IP_SCAN_ARGS(&tun_key.ipv4_dst), &tos, &ttl,
+&n) > 0 && n > 0) {
+tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+tun_key.tun_flags = flags;
+tun_key.ipv4_tos = tos;
+tun_key.ipv4_ttl = ttl;
+memset(&tun_key.pad, 0, sizeof tun_key.pad);
+nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL, &tun_key,
+  sizeof tun_key);
+return n;
+}
+}
+
+{
 unsigned long long int in_port;
 int n = -1;
 
@@ -1273,8 +1297,18 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct 
flow *flow)
 nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority);
 }
 
-if (flow->tunnel.tun_id != htonll(0)) {
-nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tunnel.tun_id);
+if (flow->tunnel.ip_dst != 0) {
+struct ovs_key_ipv4_tunnel *tun_key;
+
+tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL,
+   sizeof *tun_key);
+tun_key->tun_id = flow->tunnel.tun_id;
+tun_key->ipv4_src = flow->tunnel.ip_src;
+tun_key->ipv4_dst = flow->tunnel.ip_dst;
+tun_key->tun_flags = 0; // XXX: Need flags
+tun_key->ipv4_tos = flow->tunnel.ip_tos;
+tun_key->ipv4_ttl = flow->tunnel.ip_ttl;
+memset(&tun_key->pad, 0, sizeof tun_key->pad);
 }
 
 if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) {
@@ -1765,9 +1799,17 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t 
key_len,
 expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY;
 }
 
-if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) {
-flow->tunnel.tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]);
-expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID;
+if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) {
+const struct ovs_key_ipv4_tunnel *tun_key;
+
+tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]);
+flow->tunnel.tun_id = tun_key->tun_id;
+flow->tunnel.ip_src = tun_key->ipv4_src;
+flow->tunnel.ip_dst = tun_key->ipv4_dst;
+flow->tunnel.flags = 0; // XXX need flags
+flow->tunnel.ip_tos = tun_key->ipv4_tos;
+flow->tunnel.ip_ttl = tun_key->ipv4_ttl;
+expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL;
 }
 
 if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) {
@@ -1858,16 +1900,27 @@ commit_set_action(struct ofpbuf *odp_actions, enum 
ovs_key_attr key_type,
 }
 
 static void
-commit_set_tun_id_action(const struct flow *flow, struct flow *base,
+commit_set_tunnel_action(const struct flow *flow, struct flow *base,
  struct ofpbuf *odp_actions)
 {
-if (base->tunnel.tun_id == flow->tunnel.tun_id) {
+struct ovs_key_ipv4_tunnel tun_key;
+
+if (!memcmp(&base->tunnel, &flow->tunnel, sizeof base->tunnel)) {
 return;
 }
-base->tunnel.tun_id = flow->tunnel.tun_id;
 
-commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
-  &base->tunnel.tun_id, sizeof(base->tunnel.tun_id));
+memcpy(&base->tunnel, &flow->tunnel, sizeof base->tunnel);
+
+tun_key.tun_id = base->tunnel.tun_id;
+tun_key.ip

[ovs-dev] [RFC PATCH v2 3/5] netdev-dummy: Add tunnel variation.

2012-09-28 Thread Jesse Gross

In order to be able to unit test the tunnel flow setup code we
need to be able to generate netdevs that appear to be like
tunnel devices.  This adds a dummy_tunnel netdev with two
differences from the normal dummy devices: the ability to set and
read config and a different type to identify them as needing
tunnel translation.

Signed-off-by: Jesse Gross 
---
 lib/netdev-dummy.c |   88 
 1 file changed, 88 insertions(+)

diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 6aa4084..bb65c12 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -29,6 +29,7 @@
 #include "packets.h"
 #include "poll-loop.h"
 #include "shash.h"
+#include "smap.h"
 #include "sset.h"
 #include "unixctl.h"
 #include "vlog.h"
@@ -37,6 +38,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dummy);
 
 struct netdev_dev_dummy {
 struct netdev_dev netdev_dev;
+struct smap config;
 uint8_t hwaddr[ETH_ADDR_LEN];
 int mtu;
 struct netdev_stats stats;
@@ -94,6 +96,7 @@ netdev_dummy_create(const struct netdev_class *class, const 
char *name,
 
 netdev_dev = xzalloc(sizeof *netdev_dev);
 netdev_dev_init(&netdev_dev->netdev_dev, name, class);
+smap_init(&netdev_dev->config);
 netdev_dev->hwaddr[0] = 0xaa;
 netdev_dev->hwaddr[1] = 0x55;
 netdev_dev->hwaddr[2] = n >> 24;
@@ -119,12 +122,35 @@ netdev_dummy_destroy(struct netdev_dev *netdev_dev_)
 {
 struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
 
+smap_destroy(&netdev_dev->config);
 shash_find_and_delete(&dummy_netdev_devs,
   netdev_dev_get_name(netdev_dev_));
 free(netdev_dev);
 }
 
 static int
+netdev_dummy_get_config(struct netdev_dev *netdev_dev_, struct smap *args)
+{
+struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
+
+smap_destroy(args);
+smap_clone(args, &netdev_dev->config);
+
+return 0;
+}
+
+static int
+netdev_dummy_set_config(struct netdev_dev *netdev_dev_, const struct smap 
*args)
+{
+struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
+
+smap_destroy(&netdev_dev->config);
+smap_clone(&netdev_dev->config, args);
+
+return 0;
+}
+
+static int
 netdev_dummy_open(struct netdev_dev *netdev_dev_, struct netdev **netdevp)
 {
 struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_);
@@ -370,6 +396,67 @@ static const struct netdev_class dummy_class = {
 netdev_dummy_change_seq
 };
 
+static const struct netdev_class dummy_tunnel_class = {
+"dummy_tunnel",
+NULL,   /* init */
+NULL,   /* run */
+NULL,   /* wait */
+
+netdev_dummy_create,
+netdev_dummy_destroy,
+netdev_dummy_get_config,
+netdev_dummy_set_config,
+
+netdev_dummy_open,
+netdev_dummy_close,
+
+netdev_dummy_listen,
+netdev_dummy_recv,
+netdev_dummy_recv_wait,
+netdev_dummy_drain,
+
+NULL,   /* send */
+NULL,   /* send_wait */
+
+netdev_dummy_set_etheraddr,
+netdev_dummy_get_etheraddr,
+netdev_dummy_get_mtu,
+netdev_dummy_set_mtu,
+NULL,   /* get_ifindex */
+NULL,   /* get_carrier */
+NULL,   /* get_carrier_resets */
+NULL,   /* get_miimon */
+netdev_dummy_get_stats,
+netdev_dummy_set_stats,
+
+NULL,   /* get_features */
+NULL,   /* set_advertisements */
+
+NULL,   /* set_policing */
+NULL,   /* get_qos_types */
+NULL,   /* get_qos_capabilities */
+NULL,   /* get_qos */
+NULL,   /* set_qos */
+NULL,   /* get_queue */
+NULL,   /* set_queue */
+NULL,   /* delete_queue */
+NULL,   /* get_queue_stats */
+NULL,   /* dump_queues */
+NULL,   /* dump_queue_stats */
+
+NULL,   /* get_in4 */
+NULL,   /* set_in4 */
+NULL,   /* get_in6 */
+NULL,   /* add_router */
+NULL,   /* get_next_hop */
+NULL,   /* get_drv_info */
+NULL,   /* arp_lookup */
+
+netdev_dummy_update_flags,
+
+netdev_dummy_change_seq
+};
+
 static struct ofpbuf *
 eth_from_packet_or_flow(const char *s)
 {
@@ -529,4 +616,5 @@ netdev_dummy_register(bool override)
 sset_destroy(&types);
 }
 netdev_register_provider(&dummy_class);
+netdev_register_provider(&dummy_tunnel_class);
 }
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [RFC PATCH v2 4/5] tunnel: Userspace implementation of tunnel manipulation.

2012-09-28 Thread Jesse Gross

The kernel tunneling code currently needs to handle a large number
of operations when tunnel packets are encapsulated and decapsulated.
Some examples of this are: finding the correct tunnel port on receive,
TTL and ToS inheritance, ECN handling, etc.  All of these can be done
on a per-flow basis in userspace now that we have both the inner and
outer header information, which allows us to both simpify the kernel
and take advantage of userspace's information.  This ports the logic
from the kernel to userspace and also pulls in the tunnel-specific
configuration handling from netdev-vport.c.  Once tunnel packets are
redirected into this code, the redundant pieces can be removed from
other places.

Signed-off-by: Jesse Gross 
---
 NEWS |2 +
 lib/automake.mk  |2 +
 lib/tunnel.c |  759 ++
 lib/tunnel.h |   45 +++
 tests/automake.mk|1 +
 tests/testsuite.at   |1 +
 tests/tunnel.at  |  112 
 vswitchd/vswitch.xml |   14 +-
 8 files changed, 925 insertions(+), 11 deletions(-)
 create mode 100644 lib/tunnel.c
 create mode 100644 lib/tunnel.h
 create mode 100644 tests/tunnel.at

diff --git a/NEWS b/NEWS
index 6265e2f..be401f4 100644
--- a/NEWS
+++ b/NEWS
@@ -40,6 +40,8 @@ post-v1.8.0
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
 - Tunneling requires the version of the kernel module paired with this
   release (or a newer release).
+- Inheritance of the Don't Fragment bit in IP tunnels (df_inherit) is
+  no longer supported.
 
 
 v1.8.0 - xx xxx 
diff --git a/lib/automake.mk b/lib/automake.mk
index 94b86f6..9513cd9 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -181,6 +181,8 @@ lib_libopenvswitch_a_SOURCES = \
lib/timeval.h \
lib/token-bucket.c \
lib/token-bucket.h \
+   lib/tunnel.c \
+   lib/tunnel.h \
lib/type-props.h \
lib/unaligned.h \
lib/unicode.c \
diff --git a/lib/tunnel.c b/lib/tunnel.c
new file mode 100644
index 000..2217b83
--- /dev/null
+++ b/lib/tunnel.c
@@ -0,0 +1,759 @@
+/* Copyright (c) 2012 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+
+#include "byte-order.h"
+#include "daemon.h"
+#include "dirs.h"
+#include "dynamic-string.h"
+#include "hash.h"
+#include "hmap.h"
+#include "packets.h"
+#include "smap.h"
+#include "socket-util.h"
+#include "tunnel.h"
+#include "unixctl.h"
+#include "vlog.h"
+
+/* TODO:
+ *
+ * Better hooks for tunnel creation/deletion/input/output
+ * Mechanism to create DP ports
+ * Ability to generate actions on input for ECN
+ * Revalidate flows on port add/remove/reconfigure
+ * Port stats
+ * Kernel interface needs flags defined (in particular for keys)
+ * IPsec flag needed?
+ */
+
+VLOG_DEFINE_THIS_MODULE(tunnel);
+
+#define DEFAULT_TTL 64
+
+struct tnl_match {
+ovs_be64 in_key;
+ovs_be32 ip_src;
+ovs_be32 ip_dst;
+uint32_t dp_portno;
+bool key_present;
+bool in_key_flow;
+};
+BUILD_ASSERT_DECL(sizeof(struct tnl_match) % 4 == 0);
+
+struct tnl_port {
+struct hmap_node match_node;
+struct hmap_node portno_node;
+
+const struct netdev *netdev;
+struct tnl_match match;
+ovs_be64 out_key;
+uint32_t tnl_portno;
+uint8_t ttl;
+uint8_t tos;
+bool out_key_flow;
+bool key_present;
+bool ttl_inherit;
+bool tos_inherit;
+bool dont_fragment;
+bool csum;
+};
+
+static struct hmap tnl_match_map = HMAP_INITIALIZER(&tnl_match_map);
+static struct hmap tnl_portno_map = HMAP_INITIALIZER(&tnl_portno_map);
+
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+static struct vlog_rate_limit dbg_rl = VLOG_RATE_LIMIT_INIT(60, 60);
+
+static int tnl_config_get(const struct netdev *, struct tnl_port *);
+static struct tnl_port *tnl_find(struct tnl_match *);
+static struct tnl_port *tnl_find_exact(struct tnl_match *);
+static uint32_t tnl_hash(struct tnl_match *);
+static void tnl_match_fmt(const struct tnl_match *, struct ds *);
+static unixctl_cb_func tnl_port_dump;
+static void tnl_port_mod_log(uint32_t tnl_portno, const char *action);
+static struct tnl_port *tnl_portno_find(uint32_t tnl_portno);
+
+void
+tnl_init(void)
+{
+static bool inited;
+
+if (inited) {
+return;
+}
+inited = true;
+
+unixctl_command_register("tunnel/show", "[interface]", 0, 1,
+ tnl_port_dump, N

[ovs-dev] [RFC PATCH v2 5/5] ofproto-dpif: Basic hooks for rewriting tunnel packets.

2012-09-28 Thread Jesse Gross

This is a basic set of hooks into the ofproto-dpif code so that
tunneled packets can be have their flows rewritten to reflect the
outward facing view of how ports are laid out.  This is far from
being a comprehensive layer and should be replaced once a better
level of indirection is available.

Signed-off-by: Jesse Gross 
---
 ofproto/ofproto-dpif.c |   40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index ff6df41..6fbe406 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -50,6 +50,7 @@
 #include "poll-loop.h"
 #include "simap.h"
 #include "timer.h"
+#include "tunnel.h"
 #include "unaligned.h"
 #include "unixctl.h"
 #include "vlan-bitmap.h"
@@ -803,6 +804,8 @@ construct(struct ofproto *ofproto_)
 hmap_init(&ofproto->vlandev_map);
 hmap_init(&ofproto->realdev_vid_map);
 
+tnl_init();
+
 hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node,
 hash_string(ofproto->up.name, 0));
 memset(&ofproto->stats, 0, sizeof ofproto->stats);
@@ -2536,6 +2539,11 @@ port_add(struct ofproto *ofproto_, struct netdev 
*netdev, uint16_t *ofp_portp)
 
 error = dpif_port_add(ofproto->dpif, netdev, &odp_port);
 if (!error) {
+error = tnl_port_add(odp_port, odp_port, netdev);
+if (error) {
+dpif_port_del(ofproto->dpif, odp_port);
+return error;
+}
 *ofp_portp = odp_port_to_ofp_port(odp_port);
 }
 return error;
@@ -2545,9 +2553,12 @@ static int
 port_del(struct ofproto *ofproto_, uint16_t ofp_port)
 {
 struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+uint16_t odp_port = ofp_port_to_odp_port(ofp_port);
 int error;
 
-error = dpif_port_del(ofproto->dpif, ofp_port_to_odp_port(ofp_port));
+tnl_port_del(odp_port);
+
+error = dpif_port_del(ofproto->dpif, odp_port);
 if (!error) {
 struct ofport_dpif *ofport = get_ofp_port(ofproto, ofp_port);
 if (ofport) {
@@ -3025,8 +3036,9 @@ handle_flow_miss(struct ofproto_dpif *ofproto, struct 
flow_miss *miss,
  * an ODP_FIT_* value that indicates how well 'key' fits our expectations for
  * what a flow key should contain.
  *
- * This function also includes some logic to help make VLAN splinters
- * transparent to the rest of the upcall processing logic.  In particular, if
+ * This function also includes some logic to help with ports that have special
+ * meaning to their flows such as tunnels and VLAN splinters.  Tunnels may
+ * map their source and destination addresses to a logical port whereas if
  * the extracted in_port is a VLAN splinter port, it replaces flow->in_port by
  * the "real" port, sets flow->vlan_tci correctly for the VLAN of the VLAN
  * splinter port, and pushes a VLAN header onto 'packet' (if it is nonnull).
@@ -3043,13 +3055,21 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif 
*ofproto,
   struct ofpbuf *packet)
 {
 enum odp_key_fitness fitness;
+bool adjusted;
+int err;
 
 fitness = odp_flow_key_to_flow(key, key_len, flow);
 if (fitness == ODP_FIT_ERROR) {
 return fitness;
 }
-*initial_tci = flow->vlan_tci;
 
+err = tnl_port_receive(flow, &adjusted);
+if (err) {
+/* XXX: We should really install a drop flow here. */
+return ODP_FIT_ERROR;
+}
+
+*initial_tci = flow->vlan_tci;
 if (vsp_adjust_flow(ofproto, flow)) {
 if (packet) {
 /* Make the packet resemble the flow, so that it gets sent to an
@@ -3069,10 +3089,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif 
*ofproto,
 eth_push_vlan(packet, flow->vlan_tci);
 }
 
-/* Let the caller know that we can't reproduce 'key' from 'flow'. */
-if (fitness == ODP_FIT_PERFECT) {
-fitness = ODP_FIT_TOO_MUCH;
-}
+adjusted = true;
+}
+
+/* Let the caller know that we can't reproduce 'key' from 'flow'. */
+if (adjusted && fitness == ODP_FIT_PERFECT) {
+fitness = ODP_FIT_TOO_MUCH;
 }
 
 return fitness;
@@ -4981,6 +5003,8 @@ compose_output_action__(struct action_xlate_ctx *ctx, 
uint16_t ofp_port,
 if (out_port != odp_port) {
 ctx->flow.vlan_tci = htons(0);
 }
+out_port = tnl_port_send(&ctx->flow, out_port);
+
 commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
 nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port);
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Giuseppe Lettieri

To me, it was that the existing code mostly worked and had already proved to be 
a real performance improvement. This coupled with the limited manpower we could 
pour on the project. I understand that turning a single-threaded program into a 
multi-threaded one looks scary, and with reason. On the other end, reworking 
the main loop of the bridge looks just as scary, and there is still the risk to 
be disappointed by the final performance. 

More positively, this approach only affects the userlevel datapath, which is 
also the one that really needs a performance boost. A change in the main loop 
of the bridge would be much more intrusive, I think, since all datapaths would 
likely need to be touched.


Il giorno 28/set/2012, alle ore 20:48, Jesse Gross  ha 
scritto:

> I haven't been following this work all that closely but earlier on
> there was a discussion of how the performance benefits might be
> related to how poll is used.  Restructuring that seems like a much
> more desirable path but I never saw any real follow up.  What was the
> reason for continuing to go down this path?


Dr. Ing. Giuseppe Lettieri
Dipartimento di Ingegneria della Informazione
Universita' di Pisa
Largo Lucio Lazzarino 2, 56122 Pisa - Italy
Ph. : (+39) 050-2217.649 (direct) .599 (switch)
Fax : (+39) 050-2217.600
e-mail: g.letti...@iet.unipi.it

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery)
 wrote:
> On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote:
>> Now that both the kernel and userspace can handle information about
>> the tunnel outer headers this adds userspace support for communicating
>> between the two.  At the moment userspace doesn't know how to do
>> anything with the extra information on receive and will only generate
>> actions containing tun_id.  However, both sides know how to ignore the
>> extra information.
>>
>> Signed-off-by: Jesse Gross 
>
>
> The patch I just submitted overlaps with this patch a little bit. I think 
> there wasn't
> a really set delineation between what I was doing and what you were doing, so
> we may have to merge this patch with mine. I think we both pretty much made
> similar changes though, so it shouldn't be too bad. Acking this patch for now,
> assuming we'll work through the merge bits.

I agree the changes are pretty similar.  My inclination is that it's a
little easier to use the version in my patch if that's OK with you for
a couple of reasons:
 * I'm not sure that the change logically belongs in the second patch
of your series.  It could be moved into the first patch though.
 * There isn't much benefit to userspace supporting both models of
kernel tunnel implementation because the kernel will reject any
actions that it doesn't understand.  As a result, once this patch goes
in we will immediately require a new version of the kernel module.
Ideally I'd like to push that requirement back as late as practical so
that we can start applying patches without worrying about breaking
compatibility more than necessary.  However, when we do put it in, we
might as well go all the way and drop support for the old mechanism in
userspace.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Kyle Mestery (kmestery)

On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote:
> On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery)
>  wrote:
>> On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote:
>>> Now that both the kernel and userspace can handle information about
>>> the tunnel outer headers this adds userspace support for communicating
>>> between the two.  At the moment userspace doesn't know how to do
>>> anything with the extra information on receive and will only generate
>>> actions containing tun_id.  However, both sides know how to ignore the
>>> extra information.
>>> 
>>> Signed-off-by: Jesse Gross 
>> 
>> 
>> The patch I just submitted overlaps with this patch a little bit. I think 
>> there wasn't
>> a really set delineation between what I was doing and what you were doing, so
>> we may have to merge this patch with mine. I think we both pretty much made
>> similar changes though, so it shouldn't be too bad. Acking this patch for 
>> now,
>> assuming we'll work through the merge bits.
> 
> I agree the changes are pretty similar.  My inclination is that it's a
> little easier to use the version in my patch if that's OK with you for
> a couple of reasons:
> * I'm not sure that the change logically belongs in the second patch
> of your series.  It could be moved into the first patch though.
> * There isn't much benefit to userspace supporting both models of
> kernel tunnel implementation because the kernel will reject any
> actions that it doesn't understand.  As a result, once this patch goes
> in we will immediately require a new version of the kernel module.
> Ideally I'd like to push that requirement back as late as practical so
> that we can start applying patches without worrying about breaking
> compatibility more than necessary.  However, when we do put it in, we
> might as well go all the way and drop support for the old mechanism in
> userspace.

That makes sense to me. Do you want me to rebase my first patch and fold this
one into that? I can then rework my second patch based on this. Or even, we
could move this patch as a second patch in my series, and I could just rebase
my third patch.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 3:44 PM, Kyle Mestery (kmestery)
 wrote:
> On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote:
>> On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery)
>>  wrote:
>>> On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote:
 Now that both the kernel and userspace can handle information about
 the tunnel outer headers this adds userspace support for communicating
 between the two.  At the moment userspace doesn't know how to do
 anything with the extra information on receive and will only generate
 actions containing tun_id.  However, both sides know how to ignore the
 extra information.

 Signed-off-by: Jesse Gross 
>>>
>>>
>>> The patch I just submitted overlaps with this patch a little bit. I think 
>>> there wasn't
>>> a really set delineation between what I was doing and what you were doing, 
>>> so
>>> we may have to merge this patch with mine. I think we both pretty much made
>>> similar changes though, so it shouldn't be too bad. Acking this patch for 
>>> now,
>>> assuming we'll work through the merge bits.
>>
>> I agree the changes are pretty similar.  My inclination is that it's a
>> little easier to use the version in my patch if that's OK with you for
>> a couple of reasons:
>> * I'm not sure that the change logically belongs in the second patch
>> of your series.  It could be moved into the first patch though.
>> * There isn't much benefit to userspace supporting both models of
>> kernel tunnel implementation because the kernel will reject any
>> actions that it doesn't understand.  As a result, once this patch goes
>> in we will immediately require a new version of the kernel module.
>> Ideally I'd like to push that requirement back as late as practical so
>> that we can start applying patches without worrying about breaking
>> compatibility more than necessary.  However, when we do put it in, we
>> might as well go all the way and drop support for the old mechanism in
>> userspace.
>
> That makes sense to me. Do you want me to rebase my first patch and fold this
> one into that? I can then rework my second patch based on this. Or even, we
> could move this patch as a second patch in my series, and I could just rebase
> my third patch.

I actually don't think that much needs to be done - this patch depends
on the previous patch to do all the userspace flow changes and we
don't really want to roll all of them together.  I would just drop the
odp-util.c changes from your second patch and then your patches fit in
right before my series.  That will avoid the conflict and shouldn't
have any dependency problems.

I haven't had a chance to review your patches that closely yet but my
impression is that the bulk of the second patch doesn't have any
changes visible to userspace.  Is that right?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.

2012-09-28 Thread Kyle Mestery (kmestery)

On Sep 28, 2012, at 5:54 PM, Jesse Gross wrote:
> On Fri, Sep 28, 2012 at 3:44 PM, Kyle Mestery (kmestery)
>  wrote:
>> On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote:
>>> On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery)
>>>  wrote:
 On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote:
> Now that both the kernel and userspace can handle information about
> the tunnel outer headers this adds userspace support for communicating
> between the two.  At the moment userspace doesn't know how to do
> anything with the extra information on receive and will only generate
> actions containing tun_id.  However, both sides know how to ignore the
> extra information.
> 
> Signed-off-by: Jesse Gross 
 
 
 The patch I just submitted overlaps with this patch a little bit. I think 
 there wasn't
 a really set delineation between what I was doing and what you were doing, 
 so
 we may have to merge this patch with mine. I think we both pretty much made
 similar changes though, so it shouldn't be too bad. Acking this patch for 
 now,
 assuming we'll work through the merge bits.
>>> 
>>> I agree the changes are pretty similar.  My inclination is that it's a
>>> little easier to use the version in my patch if that's OK with you for
>>> a couple of reasons:
>>> * I'm not sure that the change logically belongs in the second patch
>>> of your series.  It could be moved into the first patch though.
>>> * There isn't much benefit to userspace supporting both models of
>>> kernel tunnel implementation because the kernel will reject any
>>> actions that it doesn't understand.  As a result, once this patch goes
>>> in we will immediately require a new version of the kernel module.
>>> Ideally I'd like to push that requirement back as late as practical so
>>> that we can start applying patches without worrying about breaking
>>> compatibility more than necessary.  However, when we do put it in, we
>>> might as well go all the way and drop support for the old mechanism in
>>> userspace.
>> 
>> That makes sense to me. Do you want me to rebase my first patch and fold this
>> one into that? I can then rework my second patch based on this. Or even, we
>> could move this patch as a second patch in my series, and I could just rebase
>> my third patch.
> 
> I actually don't think that much needs to be done - this patch depends
> on the previous patch to do all the userspace flow changes and we
> don't really want to roll all of them together.  I would just drop the
> odp-util.c changes from your second patch and then your patches fit in
> right before my series.  That will avoid the conflict and shouldn't
> have any dependency problems.
> 
OK, got it.

> I haven't had a chance to review your patches that closely yet but my
> impression is that the bulk of the second patch doesn't have any
> changes visible to userspace.  Is that right?

That's correct.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches

2012-09-28 Thread Jesse Gross

On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman  wrote:
> On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote:
>> On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur  wrote:
>> > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross  wrote:
>> >> On Thu, Sep 27, 2012 at 9:26 AM, Ben Pfaff  wrote:
>> >> > On Wed, Sep 26, 2012 at 04:13:41PM +0900, Simon Horman wrote:
>> >> >> This patch provides an implementation of the non-datapath portions
>> >> >> of MPLS matches and actions.
>> >> >>
>> >> >> This patch is based on top of Ben Pfaff's series,
>> >> >> "set-field action support"
>> >> >>
>> >> >> Cc: Isaku Yamahata 
>> >> >> Cc: Ravi K 
>> >> >
>> >> > I think Ravi's full last name is "Kerur".
>> >> >
>> >> >> Signed-off-by: Simon Horman 
>> >> >
>> >> > Jesse, do you have any concerns about this?  I haven't read past the
>> >> > diffstat yet.  But if it seems like a reasonable intermediate approach
>> >> > then I'm happy to review it.
>> >>
>> >> I don't think there is inherently anything wrong in starting with a
>> >> userspace-only approach.  I have a couple of specific concerns based
>> >> on briefly skimming the patch:
>> >>  * It seems like this is really the userspace half of the code which
>> >> assumes that the kernel portions will still be doing the work on the
>> >> actual packet flows.  If that's the case then I don't think that
>> >> userspace support can go in independently.  Otherwise, userspace
>> >> should really be self-contained and setup slow-path flows to do the
>> >> work itself.
>> >
>> >
>> > mpls userspace is completely self-contained i.e. doesn't depend on OVS
>> > kernel code. I am saying this based on implementation and testing. During
>> > testing no OVS kernel module was loaded and testing such as ping, iperf,
>> > netperf and scp executed.
>>
>> At the very least, userspace-only code shouldn't add anything to
>> either the userspace/kernel interface or odp-util.c.  I also don't see
>> how any MPLS action will actually get processed.  I do see that MPLS
>> actions send packets to the local port and then install a flow but I
>> think there is a misunderstanding because that doesn't make a lot of
>> sense to me.
>
> That portion is probably my handiwork. Could you give some guidance on
> a method that would work? I'm more than happy to rework things.

Currently this is changing the userspace flow to reflect any
modifications that we want to make.  This is normally correct because
then later when we want to actually output a packet (where any
modifications would get noticed) we do a comparison from the last
output and the current flow and generate any kernel actions necessary.
 However, in this case we don't have any kernel actions to do MPLS
modifications so the flow changes just get ignored.

Instead, what we need to do is set up a slow path flow to explicitly
send all MPLS packets to userspace if there is any MPLS match or
action.  We don't currently have anything in this category but it
would use the same infrastructure as CFM and LACP which also need to
go to userspace since they are actually consumed there.  Once the MPLS
packets are in userspace, it should actually modify them directly
rather than just change the flow.

>> >>  * If it is truly userspace only, the handling of multiple levels of  tags
>> >> seems a little incomplete since we actually have the full packet.
>> >
>> >
>> > Can you elaborate  please?
>>
>> Multiple levels of tags aren't handled, for example, when you pop a tag.
>
> Do you think that adding support for multiple levels of tags to the kernel
> is appropriate. Or should such cases be bounced to user-space?

The kernel should support enough tags to handle the common cases.
Initially I would make this 1 but I could see it becoming 2 in the
future.  However, regardless of how many tags we support it's always
possible that a greater number of tags are used than we support.  I
can see three ways to handle this:

1. Don't support it.  You can run into a similar problem with vlan
tags because it's possible to pop tags off and then do further
operations but the kernel doesn't look deep enough into the packet.
There isn't really a mechanism to deal with this today and to my
knowledge nobody has actually complained.

2. Handle in userspace.  It's certainly a reasonable approach for corner cases.

3. Some form of recirculation.  Allowing the kernel to look at a
packet, pop some tags off, and then look at it again allows processing
an infinite number of tags in a manner that won't significantly affect
throughput (although each round will result in more flow setups).  It
also allows use cases like pop MPLS tag and do IP routing, which is
both fairly common and can't be realistically achieved any other way
in kernel.

If the short term goal is a kernel implementation, I would do #1 since
it's much simpler, is a strict increase in functionality from what we
have today and doesn't preclude doing #3 in the future.  If you want
something more complete then you can handle the

Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 6:25 AM, ravi kerur  wrote:
> On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman  wrote:
>> On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote:
>> > On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur  wrote:
>> > > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross 
>> > >>  * If it is truly userspace only, the handling of multiple levels of
>> > >> tags
>> > >> seems a little incomplete since we actually have the full packet.
>> > >
>> > >
>> > > Can you elaborate  please?
>> >
>> > Multiple levels of tags aren't handled, for example, when you pop a tag.
>>
>> Do you think that adding support for multiple levels of tags to the kernel
>> is appropriate. Or should such cases be bounced to user-space?
>
>
>  multiple tags do work in both userspace and kernel i.e. any combination
> of push/pop actions. I am not sure what the confusion is?
>
> Simon: I have not looked into your patch, has anything changed here?

This patch is not the same as yours.  Please actually read it before
making assertions.

>> > >>  If this is supposed to be a quick stepping stone to kernel support
>> > >> that seems less important since we will no longer have complete
>> > >> packet
>> > >> access.
>> > >>
>> > >> So it basically comes down to what the short term plans are.  There's
>> > >> also Leo's patch (which I haven't looked at) that I can post if there
>> > >> are plans to do kernel support.
>> > >
>> > >
>> > >  so will there be separate/different kernel interface?
>> >
>> > Separate from what?
>
>
>  Our earlier disagreement was on packet recirculation for multiple
> actions. You mention about Leo's patch and I assumed it contains packet
> recirculation support and you want mpls changes for that new piece of kernel
> datapath and wanted clarification on that.

All I'm looking for is something either implements recirculation or at
the very least doesn't preclude it from being done in the future.  The
patch that I was referring to takes the latter approach and implements
just the basic operations in the kernel.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] Threaded userspace datapath

2012-09-28 Thread Jesse Gross

On Fri, Sep 28, 2012 at 3:17 PM, Giuseppe Lettieri
 wrote:
> To me, it was that the existing code mostly worked and had already proved to 
> be a real performance improvement. This coupled with the limited manpower we 
> could pour on the project. I understand that turning a single-threaded 
> program into a multi-threaded one looks scary, and with reason. On the other 
> end, reworking the main loop of the bridge looks just as scary, and there is 
> still the risk to be disappointed by the final performance.
>
> More positively, this approach only affects the userlevel datapath, which is 
> also the one that really needs a performance boost. A change in the main loop 
> of the bridge would be much more intrusive, I think, since all datapaths 
> would likely need to be touched.

Does this actually require changing the main loop?  If it really is
the poll related changes that you mention it seems like they could be
contained to the netdevs and userspace datapath.  You could handle a
batch of packets each time through the main loop.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

42 matches

Mail list logo