Re: [ovs-dev] Threaded userspace datapath
Hi all, we have finally found the time to finish work on this patch. We have addressed all the issues that were mentioned in the previous mails, plus some more that showed up later. We now pass all unit tests on both Linux and FreeBSD and we can more confidently offer the patch to your attention, for possible inclusion. As usual, any comment would be much appreciated. Regards, Giuseppe Il 27/08/2012 02:00, Ben Pfaff ha scritto: On Fri, Aug 24, 2012 at 10:25:02AM +0200, Giuseppe Lettieri wrote: Il 20/08/2012 20:48, Ed Maste ha scritto: 3. The datapath thread waits in poll() with each port's file descriptor in the fd array. However there's currently no fd to signal a bridge reconfiguration, so the poll() has to timeout before the thread will pick up the new config on the next time through its loop. We are considering to add the notification in dpif_netdev_run(), a callback which is now empty in the #ifdef THREADED case. According to our tests this solves the problem in practice, but we are worried that it may be a misuse of the interface, since the corresponding dpif_netdev_wait() callback would still be empty. Any comment would be much appreciated. There's no inherent reason that a "wait" function has to be nonempty if the corresponding "run" function is nonempty. The question is simply whether the poll loop will wake up when "run" needs to do some work, and if you're confident that that will happen (I don't have enough context here to check myself) then that's fine. -- Dr. Ing. Giuseppe Lettieri Dipartimento di Ingegneria della Informazione Universita' di Pisa Largo Lucio Lazzarino 2, 56122 Pisa - Italy Ph. : (+39) 050-2217.649 (direct) .599 (switch) Fax : (+39) 050-2217.600 e-mail: g.letti...@iet.unipi.it >From f42467cae9be8003a95b25f1c409fc7b3cee0440 Mon Sep 17 00:00:00 2001 From: Giuseppe Lettieri Date: Fri, 28 Sep 2012 12:05:18 +0200 Subject: [PATCH] Threaded userspace datapath This patch refactors the userlevel datapath (i.e., datapath_type=netdev) into two threads: one only forwards the packets for which a flow is found, and the other does all other processing. This arrangement can speed-up forwarding performance, measuread in pps, by a factor of 5-10. To enable compilation of the threaded datapath, pass '--enable-threaded' to configure. Signed-off-by: Gaetano Catalli Signed-off-by: Ed Maste Signed-off-by: Giuseppe Lettieri --- configure.ac |1 + lib/automake.mk |1 + lib/dispatch.h|9 + lib/dpif-netdev.c | 473 - lib/fatal-signal.c|2 +- lib/netdev-bsd.c | 91 ++ lib/netdev-dummy.c| 123 - lib/netdev-linux.c| 44 + lib/netdev-provider.h | 19 ++ lib/netdev-vport.c|8 + lib/netdev.c | 22 +++ lib/netdev.h |7 + lib/vlog.c| 16 ++ m4/openvswitch.m4 | 19 ++ 14 files changed, 827 insertions(+), 8 deletions(-) create mode 100644 lib/dispatch.h diff --git a/configure.ac b/configure.ac index 9bdffea..a70232f 100644 --- a/configure.ac +++ b/configure.ac @@ -43,6 +43,7 @@ AC_SEARCH_LIBS([clock_gettime], [rt]) AC_SEARCH_LIBS([timer_create], [rt]) AC_SEARCH_LIBS([pcap_open_live], [pcap]) +OVS_CHECK_THREADED OVS_CHECK_COVERAGE OVS_CHECK_NDEBUG OVS_CHECK_NETLINK diff --git a/lib/automake.mk b/lib/automake.mk index 94b86f6..3479ec0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -37,6 +37,7 @@ lib_libopenvswitch_a_SOURCES = \ lib/daemon.c \ lib/daemon.h \ lib/dhcp.h \ + lib/dispatch.h \ lib/dummy.c \ lib/dummy.h \ lib/dhparams.h \ diff --git a/lib/dispatch.h b/lib/dispatch.h new file mode 100644 index 000..80ac9c7 --- /dev/null +++ b/lib/dispatch.h @@ -0,0 +1,9 @@ +#include +#include "ofpbuf.h" + +#ifndef DISPATCH_H +#define DISPATCH_H 1 + +typedef void (*pkt_handler)(u_char *user, struct ofpbuf* buf); + +#endif /* DISPATCH_H */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index dc4479e..e7013db 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -32,6 +32,15 @@ #include #include +#ifdef THREADED +#include +#include + +#include "socket-util.h" +#include "fatal-signal.h" +#include "dispatch.h" +#endif + #include "csum.h" #include "dpif.h" #include "dpif-provider.h" @@ -55,6 +64,18 @@ #include "vlog.h" VLOG_DEFINE_THIS_MODULE(dpif_netdev); +/* Pthread lock macros, nops in the non-threaded case. */ +#ifdef THREADED +#define INIT_MUTEX(mutex) pthread_mutex_init(mutex, NULL) +#define DESTROY_MUTEX(mutex) pthread_mutex_destroy(mutex) +#define LOCK(mutex) pthread_mutex_lock(mutex) +#define UNLOCK(mutex) pthread_mutex_unlock(mutex) +#else +#define INIT_MUTEX(mutex) +#define DESTROY_MUTEX(mutex) +#define LOCK(mutex) +#define UNLOCK(mutex) +#endif /* Configuration parameters. */ enum { MAX_PORTS = 256 }; /* Maximum number of ports. */ @@ -80,6 +101,49 @@ struct dp_netdev_queue { unsigned int head, tail; }; +#ifdef THREADED +str
Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches
On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman wrote: > On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote: > > On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur wrote: > > > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross > wrote: > > >> On Thu, Sep 27, 2012 at 9:26 AM, Ben Pfaff wrote: > > >> > On Wed, Sep 26, 2012 at 04:13:41PM +0900, Simon Horman wrote: > > >> >> This patch provides an implementation of the non-datapath portions > > >> >> of MPLS matches and actions. > > >> >> > > >> >> This patch is based on top of Ben Pfaff's series, > > >> >> "set-field action support" > > >> >> > > >> >> Cc: Isaku Yamahata > > >> >> Cc: Ravi K > > >> > > > >> > I think Ravi's full last name is "Kerur". > > >> > > > >> >> Signed-off-by: Simon Horman > > >> > > > >> > Jesse, do you have any concerns about this? I haven't read past the > > >> > diffstat yet. But if it seems like a reasonable intermediate > approach > > >> > then I'm happy to review it. > > >> > > >> I don't think there is inherently anything wrong in starting with a > > >> userspace-only approach. I have a couple of specific concerns based > > >> on briefly skimming the patch: > > >> * It seems like this is really the userspace half of the code which > > >> assumes that the kernel portions will still be doing the work on the > > >> actual packet flows. If that's the case then I don't think that > > >> userspace support can go in independently. Otherwise, userspace > > >> should really be self-contained and setup slow-path flows to do the > > >> work itself. > > > > > > > > > mpls userspace is completely self-contained i.e. doesn't depend on OVS > > > kernel code. I am saying this based on implementation and testing. > During > > > testing no OVS kernel module was loaded and testing such as ping, > iperf, > > > netperf and scp executed. > > > > At the very least, userspace-only code shouldn't add anything to > > either the userspace/kernel interface or odp-util.c. I also don't see > > how any MPLS action will actually get processed. I do see that MPLS > > actions send packets to the local port and then install a flow but I > > think there is a misunderstanding because that doesn't make a lot of > > sense to me. > > That portion is probably my handiwork. Could you give some guidance on > a method that would work? I'm more than happy to rework things. > > > >> * If it is truly userspace only, the handling of multiple levels of > tags > > >> seems a little incomplete since we actually have the full packet. > > > > > > > > > Can you elaborate please? > > > > Multiple levels of tags aren't handled, for example, when you pop a tag. > > Do you think that adding support for multiple levels of tags to the kernel > is appropriate. Or should such cases be bounced to user-space? > multiple tags do work in both userspace and kernel i.e. any combination of push/pop actions. I am not sure what the confusion is? Simon: I have not looked into your patch, has anything changed here? > > > >> If this is supposed to be a quick stepping stone to kernel support > > >> that seems less important since we will no longer have complete packet > > >> access. > > >> > > >> So it basically comes down to what the short term plans are. There's > > >> also Leo's patch (which I haven't looked at) that I can post if there > > >> are plans to do kernel support. > > > > > > > > > so will there be separate/different kernel interface? > > > > Separate from what? > Our earlier disagreement was on packet recirculation for multiple actions. You mention about Leo's patch and I assumed it contains packet recirculation support and you want mpls changes for that new piece of kernel datapath and wanted clarification on that. > > ___ > > dev mailing list > > dev@openvswitch.org > > http://openvswitch.org/mailman/listinfo/dev > > > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ofproto: Rationlise comments in handle_flow_mod()
On Fri, Sep 28, 2012 at 03:07:08PM +0900, Simon Horman wrote: > Signed-off-by: Simon Horman Applied, thanks! ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Threaded userspace datapath
On Fri, Sep 28, 2012 at 12:49:16PM +0200, Giuseppe Lettieri wrote: > we have finally found the time to finish work on this patch. We have > addressed all the issues that were mentioned in the previous mails, > plus some more that showed up later. We now pass all unit tests on > both Linux and FreeBSD and we can more confidently offer the patch > to your attention, for possible inclusion. As usual, any comment > would be much appreciated. This is looking good. I read through some of the code and I have a few comments. I see a number of XXX comments still left. Some of them, I guess, can just be removed, but others I wonder about, and it would be nice to have your opinion. dp_netdev_notifier_init() and netdev_dummy_listen() both call pipe() then set_nonblocking() on each fd. I see the same pattern in a few other places in the tree. Perhaps we should have a helper in socket-util.c to eliminate some redundancy. I see "sizeof(fds) / sizeof(fds[0])" in a few places. We have ARRAY_SIZE in util.h for that. In dp_netdev_notifier_ack(), dp_netdev_notifier_ack1(), and dp_netdev_notifier_notify(), I think that the system calls should retry on EINTR. That would be cautious practice, anyhow. I think that there are now synchronization issues in vlog. The vlog_write_file() implementation assumes a single-threaded model (worker_request() isn't thread-safe). We could introduce per-thread state so that only the main thread of a process uses worker_request() and other threads call write() directly from vlog_write_file() (since write() should be thread-safe). In dpif_netdev_exit_hook(), I'd be inclined to just use an ack and skip the pthread_cancel(). Thread cancellation always seems like a big hammer that is best avoided when one can. (Another approach would be to close the write side of the notification pipe. Then the read side would get EOF after draining any remaining acks, an unmistakable sign.) Does that give you enough to think about for a while? I'll read the next version of the patch in more detail. Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Threaded userspace datapath
On Fri, Sep 28, 2012 at 09:23:01AM -0700, Ben Pfaff wrote: > I think that there are now synchronization issues in vlog. The > vlog_write_file() implementation assumes a single-threaded model > (worker_request() isn't thread-safe). We could introduce per-thread > state so that only the main thread of a process uses worker_request() > and other threads call write() directly from vlog_write_file() (since > write() should be thread-safe). Oh, also I see some static data in vlog_valist() we'd have to do something about. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v3] ofp-actions: Add parsing of set_field actions
On Fri, Sep 28, 2012 at 02:59:10PM +0900, Simon Horman wrote: > Based heavily on work by Isaku Yamahata > > Cc: Isaku Yamahata > Signed-off-by: Simon Horman Thanks, I added a test and tweaked the documentation and applied this to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC PATCH 0/5] Userspace tunnel logic
This is a port of the kernel tunneling logic to userspace, taking advantage of the outer header information in the flow. It logically fits in after Kyle's tunnel kernel patch and before Justin's single datapath work. I expect that the final patch in this series will be replaced by Justin's better indirection layer, along with some other integration work. There are a few TODO items noted at the top of tunne.c, mostly around integration with ofproto-dpif and flags from the kernel. Review is welcome but I'm not planning on applying this until the other components are ready, so this is really just informational. Jesse Gross (5): flow: Extend struct flow to contain tunnel outer header. odp-util: Enable communicating outer tunnel to/from the kernel. netdev-dummy: Add tunnel variation. tunnel: Userspace implementation of tunnel manipulation. ofproto-dpif: Basic hooks for rewriting tunnel packets. NEWS |4 + lib/automake.mk|2 + lib/dpif-linux.c |2 +- lib/dpif-netdev.c |4 +- lib/dpif-provider.h|2 +- lib/dpif.c |2 +- lib/flow.c | 55 +++- lib/flow.h | 19 +- lib/learning-switch.c |7 +- lib/match.c| 15 +- lib/meta-flow.c|8 +- lib/netdev-dummy.c | 88 ++ lib/nx-match.c |3 +- lib/odp-util.c | 75 - lib/ofp-print.c|2 +- lib/ofp-util.c |6 +- lib/tunnel.c | 739 lib/tunnel.h | 36 +++ ofproto/ofproto-dpif.c | 54 +++- ofproto/ofproto-provider.h |4 +- ofproto/ofproto.c |4 +- tests/automake.mk |1 + tests/odp.at | 12 +- tests/ofp-print.at |6 +- tests/ofproto-dpif.at | 58 ++-- tests/ofproto.at | 12 +- tests/test-classifier.c| 40 +-- tests/test-flows.c |2 +- tests/testsuite.at |1 + tests/tunnel.at| 112 +++ vswitchd/vswitch.xml | 14 +- 31 files changed, 1252 insertions(+), 137 deletions(-) create mode 100644 lib/tunnel.c create mode 100644 lib/tunnel.h create mode 100644 tests/tunnel.at -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.
Now that the kernel is supplying information about the outer IP header for tunneled packets userspace needs to be able to track it as part of the flow. For the time being this is only used internally by OVS and not exposed outwards to OpenFlow. As a result, this threads the information throughout userspace but simply stores the existing tun_id in it. Signed-off-by: Jesse Gross --- lib/dpif-linux.c |2 +- lib/dpif-netdev.c |4 ++-- lib/dpif-provider.h|2 +- lib/dpif.c |2 +- lib/flow.c | 55 +++ lib/flow.h | 19 --- lib/learning-switch.c |7 +- lib/match.c| 15 lib/meta-flow.c|8 +++ lib/nx-match.c |3 ++- lib/odp-util.c | 12 +- lib/ofp-print.c|2 +- lib/ofp-util.c |6 ++--- ofproto/ofproto-dpif.c | 14 +++ ofproto/ofproto-provider.h |4 ++-- ofproto/ofproto.c |4 ++-- tests/ofp-print.at |6 ++--- tests/ofproto-dpif.at | 56 ++-- tests/ofproto.at | 12 +- tests/test-classifier.c| 40 +++ tests/test-flows.c |2 +- 21 files changed, 169 insertions(+), 106 deletions(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index a2eb490..3a4a4e6 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -1399,7 +1399,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no, uint64_t action; ofpbuf_use_const(&packet, data, size); -flow_extract(&packet, 0, htonll(0), 0, &flow); +flow_extract(&packet, 0, NULL, 0, &flow); ofpbuf_use_stack(&key, &keybuf, sizeof keybuf); odp_flow_key_from_flow(&key, &flow); diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1ce14ce..797cb06 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -916,7 +916,7 @@ dpif_netdev_execute(struct dpif *dpif, const struct dpif_execute *execute) ofpbuf_reserve(©, DP_NETDEV_HEADROOM); ofpbuf_put(©, execute->packet->data, execute->packet->size); -flow_extract(©, 0, 0, -1, &key); +flow_extract(©, 0, NULL, -1, &key); error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len, &key); if (!error) { @@ -1014,7 +1014,7 @@ dp_netdev_port_input(struct dp_netdev *dp, struct dp_netdev_port *port, if (packet->size < ETH_HEADER_LEN) { return; } -flow_extract(packet, 0, 0, odp_port_to_ofp_port(port->port_no), &key); +flow_extract(packet, 0, NULL, odp_port_to_ofp_port(port->port_no), &key); flow = dp_netdev_lookup_flow(dp, &key); if (flow) { dp_netdev_flow_used(flow, packet); diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h index 317e617..ffe084a 100644 --- a/lib/dpif-provider.h +++ b/lib/dpif-provider.h @@ -292,7 +292,7 @@ struct dpif_class { * taken from the flow specified in the 'execute->key_len' bytes of * 'execute->key'. ('execute->key' is mostly redundant with * 'execute->packet', but it contains some metadata that cannot be - * recovered from 'execute->packet', such as tun_id and in_port.) */ + * recovered from 'execute->packet', such as tunnel and in_port.) */ int (*execute)(struct dpif *dpif, const struct dpif_execute *execute); /* Executes each of the 'n_ops' operations in 'ops' on 'dpif', in the order diff --git a/lib/dpif.c b/lib/dpif.c index 2968966..7f859d7 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -977,7 +977,7 @@ dpif_execute__(struct dpif *dpif, const struct dpif_execute *execute) * the Ethernet frame specified in 'packet' taken from the flow specified in * the 'key_len' bytes of 'key'. ('key' is mostly redundant with 'packet', but * it contains some metadata that cannot be recovered from 'packet', such as - * tun_id and in_port.) + * tunnel and in_port.) * * Returns 0 if successful, otherwise a positive errno value. */ int diff --git a/lib/flow.c b/lib/flow.c index e517a03..cc41cfe 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -316,7 +316,7 @@ invalid: } -/* Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id', and +/* Initializes 'flow' members from 'packet', 'skb_priority', 'tnl', and * 'ofp_in_port'. * * Initializes 'packet' header pointers as follows: @@ -334,8 +334,8 @@ invalid: * present and has a correct length, and otherwise NULL. */ void -flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id, - uint16_t ofp_in_port, struct flow *flow) +flow_extract(struct ofpbuf *packet, uint32_t skb_priority, + struct flow_tnl *tnl, uint16_t ofp_in_port, struct flow *flow) { struct ofpbuf b = *packet; struct eth_header *eth; @@ -343,7 +343,10 @@ flow_extract(struct ofpbuf *packet, uint32_t skb_priority,
[ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
Now that both the kernel and userspace can handle information about the tunnel outer headers this adds userspace support for communicating between the two. At the moment userspace doesn't know how to do anything with the extra information on receive and will only generate actions containing tun_id. However, both sides know how to ignore the extra information. Signed-off-by: Jesse Gross --- NEWS |2 ++ lib/odp-util.c| 75 + tests/odp.at | 12 tests/ofproto-dpif.at |2 +- 4 files changed, 73 insertions(+), 18 deletions(-) diff --git a/NEWS b/NEWS index 71b2e29..6cd2947 100644 --- a/NEWS +++ b/NEWS @@ -37,6 +37,8 @@ post-v1.8.0 - The autopath action. - Interface type "null". - Numeric values for reserved ports (see "ovs-ofctl" note above). +- Tunneling requires the version of the kernel module paired with this + release (or a newer release). v1.8.0 - xx xxx diff --git a/lib/odp-util.c b/lib/odp-util.c index f4b30e9..d46189f 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -925,6 +925,30 @@ parse_odp_key_attr(const char *s, const struct simap *port_names, } { +char tun_id_s[32]; +unsigned long long int flags; +int tos, ttl; +struct ovs_key_ipv4_tunnel tun_key; +int n = -1; + +if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]," + "flags=%lli,src="IP_SCAN_FMT",dst="IP_SCAN_FMT + ",tos=%i,ttl=%i)%n", tun_id_s, &flags, +IP_SCAN_ARGS(&tun_key.ipv4_src), +IP_SCAN_ARGS(&tun_key.ipv4_dst), &tos, &ttl, +&n) > 0 && n > 0) { +tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0)); +tun_key.tun_flags = flags; +tun_key.ipv4_tos = tos; +tun_key.ipv4_ttl = ttl; +memset(&tun_key.pad, 0, sizeof tun_key.pad); +nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL, &tun_key, + sizeof tun_key); +return n; +} +} + +{ unsigned long long int in_port; int n = -1; @@ -1273,8 +1297,18 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow) nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority); } -if (flow->tunnel.tun_id != htonll(0)) { -nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tunnel.tun_id); +if (flow->tunnel.ip_dst != 0) { +struct ovs_key_ipv4_tunnel *tun_key; + +tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL, + sizeof *tun_key); +tun_key->tun_id = flow->tunnel.tun_id; +tun_key->ipv4_src = flow->tunnel.ip_src; +tun_key->ipv4_dst = flow->tunnel.ip_dst; +tun_key->tun_flags = 0; // XXX: Need flags +tun_key->ipv4_tos = flow->tunnel.ip_tos; +tun_key->ipv4_ttl = flow->tunnel.ip_ttl; +memset(&tun_key->pad, 0, sizeof tun_key->pad); } if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) { @@ -1765,9 +1799,17 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY; } -if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) { -flow->tunnel.tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]); -expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID; +if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) { +const struct ovs_key_ipv4_tunnel *tun_key; + +tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]); +flow->tunnel.tun_id = tun_key->tun_id; +flow->tunnel.ip_src = tun_key->ipv4_src; +flow->tunnel.ip_dst = tun_key->ipv4_dst; +flow->tunnel.flags = 0; // XXX need flags +flow->tunnel.ip_tos = tun_key->ipv4_tos; +flow->tunnel.ip_ttl = tun_key->ipv4_ttl; +expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL; } if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) { @@ -1858,16 +1900,27 @@ commit_set_action(struct ofpbuf *odp_actions, enum ovs_key_attr key_type, } static void -commit_set_tun_id_action(const struct flow *flow, struct flow *base, +commit_set_tunnel_action(const struct flow *flow, struct flow *base, struct ofpbuf *odp_actions) { -if (base->tunnel.tun_id == flow->tunnel.tun_id) { +struct ovs_key_ipv4_tunnel tun_key; + +if (!memcmp(&base->tunnel, &flow->tunnel, sizeof base->tunnel)) { return; } -base->tunnel.tun_id = flow->tunnel.tun_id; -commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID, - &base->tunnel.tun_id, sizeof(base->tunnel.tun_id)); +memcpy(&base->tunnel, &flow->tunnel, sizeof base->tunnel); + +tun_key.tun_id = base->tunnel.tun_id; +tun_key.ip
[ovs-dev] [RFC PATCH 3/5] netdev-dummy: Add tunnel variation.
In order to be able to unit test the tunnel flow setup code we need to be able to generate netdevs that appear to be like tunnel devices. This adds a dummy_tunnel netdev with two differences from the normal dummy devices: the ability to set and read config and a different type to identify them as needing tunnel translation. Signed-off-by: Jesse Gross --- lib/netdev-dummy.c | 88 1 file changed, 88 insertions(+) diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c index 6aa4084..bb65c12 100644 --- a/lib/netdev-dummy.c +++ b/lib/netdev-dummy.c @@ -29,6 +29,7 @@ #include "packets.h" #include "poll-loop.h" #include "shash.h" +#include "smap.h" #include "sset.h" #include "unixctl.h" #include "vlog.h" @@ -37,6 +38,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dummy); struct netdev_dev_dummy { struct netdev_dev netdev_dev; +struct smap config; uint8_t hwaddr[ETH_ADDR_LEN]; int mtu; struct netdev_stats stats; @@ -94,6 +96,7 @@ netdev_dummy_create(const struct netdev_class *class, const char *name, netdev_dev = xzalloc(sizeof *netdev_dev); netdev_dev_init(&netdev_dev->netdev_dev, name, class); +smap_init(&netdev_dev->config); netdev_dev->hwaddr[0] = 0xaa; netdev_dev->hwaddr[1] = 0x55; netdev_dev->hwaddr[2] = n >> 24; @@ -119,12 +122,35 @@ netdev_dummy_destroy(struct netdev_dev *netdev_dev_) { struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); +smap_destroy(&netdev_dev->config); shash_find_and_delete(&dummy_netdev_devs, netdev_dev_get_name(netdev_dev_)); free(netdev_dev); } static int +netdev_dummy_get_config(struct netdev_dev *netdev_dev_, struct smap *args) +{ +struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); + +smap_destroy(args); +smap_clone(args, &netdev_dev->config); + +return 0; +} + +static int +netdev_dummy_set_config(struct netdev_dev *netdev_dev_, const struct smap *args) +{ +struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); + +smap_destroy(&netdev_dev->config); +smap_clone(&netdev_dev->config, args); + +return 0; +} + +static int netdev_dummy_open(struct netdev_dev *netdev_dev_, struct netdev **netdevp) { struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); @@ -370,6 +396,67 @@ static const struct netdev_class dummy_class = { netdev_dummy_change_seq }; +static const struct netdev_class dummy_tunnel_class = { +"dummy_tunnel", +NULL, /* init */ +NULL, /* run */ +NULL, /* wait */ + +netdev_dummy_create, +netdev_dummy_destroy, +netdev_dummy_get_config, +netdev_dummy_set_config, + +netdev_dummy_open, +netdev_dummy_close, + +netdev_dummy_listen, +netdev_dummy_recv, +netdev_dummy_recv_wait, +netdev_dummy_drain, + +NULL, /* send */ +NULL, /* send_wait */ + +netdev_dummy_set_etheraddr, +netdev_dummy_get_etheraddr, +netdev_dummy_get_mtu, +netdev_dummy_set_mtu, +NULL, /* get_ifindex */ +NULL, /* get_carrier */ +NULL, /* get_carrier_resets */ +NULL, /* get_miimon */ +netdev_dummy_get_stats, +netdev_dummy_set_stats, + +NULL, /* get_features */ +NULL, /* set_advertisements */ + +NULL, /* set_policing */ +NULL, /* get_qos_types */ +NULL, /* get_qos_capabilities */ +NULL, /* get_qos */ +NULL, /* set_qos */ +NULL, /* get_queue */ +NULL, /* set_queue */ +NULL, /* delete_queue */ +NULL, /* get_queue_stats */ +NULL, /* dump_queues */ +NULL, /* dump_queue_stats */ + +NULL, /* get_in4 */ +NULL, /* set_in4 */ +NULL, /* get_in6 */ +NULL, /* add_router */ +NULL, /* get_next_hop */ +NULL, /* get_drv_info */ +NULL, /* arp_lookup */ + +netdev_dummy_update_flags, + +netdev_dummy_change_seq +}; + static struct ofpbuf * eth_from_packet_or_flow(const char *s) { @@ -529,4 +616,5 @@ netdev_dummy_register(bool override) sset_destroy(&types); } netdev_register_provider(&dummy_class); +netdev_register_provider(&dummy_tunnel_class); } -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.
The kernel tunneling code currently needs to handle a large number of operations when tunnel packets are encapsulated and decapsulated. Some examples of this are: finding the correct tunnel port on receive, TTL and ToS inheritance, ECN handling, etc. All of these can be done on a per-flow basis in userspace now that we have both the inner and outer header information, which allows us to both simpify the kernel and take advantage of userspace's information. This ports the logic from the kernel to userspace and also pulls in the tunnel-specific configuration handling from netdev-vport.c. Once tunnel packets are redirected into this code, the redundant pieces can be removed from other places. Signed-off-by: Jesse Gross --- NEWS |2 + lib/automake.mk |2 + lib/tunnel.c | 739 ++ lib/tunnel.h | 36 +++ tests/automake.mk|1 + tests/testsuite.at |1 + tests/tunnel.at | 112 vswitchd/vswitch.xml | 14 +- 8 files changed, 896 insertions(+), 11 deletions(-) create mode 100644 lib/tunnel.c create mode 100644 lib/tunnel.h create mode 100644 tests/tunnel.at diff --git a/NEWS b/NEWS index 6cd2947..d02369a 100644 --- a/NEWS +++ b/NEWS @@ -39,6 +39,8 @@ post-v1.8.0 - Numeric values for reserved ports (see "ovs-ofctl" note above). - Tunneling requires the version of the kernel module paired with this release (or a newer release). +- Inheritance of the Don't Fragment bit in IP tunnels (df_inherit) is + no longer supported. v1.8.0 - xx xxx diff --git a/lib/automake.mk b/lib/automake.mk index 94b86f6..9513cd9 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -181,6 +181,8 @@ lib_libopenvswitch_a_SOURCES = \ lib/timeval.h \ lib/token-bucket.c \ lib/token-bucket.h \ + lib/tunnel.c \ + lib/tunnel.h \ lib/type-props.h \ lib/unaligned.h \ lib/unicode.c \ diff --git a/lib/tunnel.c b/lib/tunnel.c new file mode 100644 index 000..115ff07 --- /dev/null +++ b/lib/tunnel.c @@ -0,0 +1,739 @@ +/* Copyright (c) 2012 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include + +#include "byte-order.h" +#include "daemon.h" +#include "dirs.h" +#include "dynamic-string.h" +#include "hash.h" +#include "hmap.h" +#include "packets.h" +#include "smap.h" +#include "socket-util.h" +#include "tunnel.h" +#include "unixctl.h" +#include "vlog.h" + +/* TODO: + * + * Better hooks for tunnel creation/deletion/input/output + * Mechanism to create DP ports + * Ability to generate actions on input for ECN + * Revalidate flows on port add/remove/reconfigure + * Port stats + * Kernel interface needs flags defined (in particular for keys) + * IPsec flag needed? + */ + +VLOG_DEFINE_THIS_MODULE(tunnel); + +#define DEFAULT_TTL 64 + +struct tnl_match { +ovs_be64 in_key; +ovs_be32 ip_src; +ovs_be32 ip_dst; +uint32_t dp_portno; +bool key_present; +bool in_key_flow; +uint8_t zeroed[2]; +}; +BUILD_ASSERT_DECL(sizeof(struct tnl_match) % 8 == 0); + +struct tnl_port { +struct hmap_node match_node; +struct hmap_node portno_node; + +const struct netdev *netdev; +struct tnl_match match; +ovs_be64 out_key; +uint32_t tnl_portno; +uint8_t ttl; +uint8_t tos; +bool out_key_flow; +bool key_present; +bool ttl_inherit; +bool tos_inherit; +bool dont_fragment; +bool csum; +}; + +static struct hmap tnl_match_map = HMAP_INITIALIZER(&tnl_match_map); +static struct hmap tnl_portno_map = HMAP_INITIALIZER(&tnl_portno_map); + +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); +static struct vlog_rate_limit dbg_rl = VLOG_RATE_LIMIT_INIT(60, 60); + +static int tnl_config_get(const struct netdev *, struct tnl_port *); +static struct tnl_port *tnl_find(struct tnl_match *); +static struct tnl_port *tnl_find_exact(struct tnl_match *); +static uint32_t tnl_hash(struct tnl_match *); +static void tnl_match_fmt(const struct tnl_match *, struct ds *); +static unixctl_cb_func tnl_port_dump; +static void tnl_port_mod_log(uint32_t tnl_portno, const char *action); +static struct tnl_port *tnl_portno_find(uint32_t tnl_portno); + +void tnl_init(void) +{ +static bool inited; + +if (inited) { +return; +} +inited = true; + +unixctl_command_register("tunnel/show", "[interface]", 0, 1, +
[ovs-dev] [RFC PATCH 5/5] ofproto-dpif: Basic hooks for rewriting tunnel packets.
This is a basic set of hooks into the ofproto-dpif code so that tunneled packets can be have their flows rewritten to reflect the outward facing view of how ports are laid out. This is far from being a comprehensive layer and should be replaced once a better level of indirection is available. Signed-off-by: Jesse Gross --- ofproto/ofproto-dpif.c | 40 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 16633f1..8f0e65e 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -50,6 +50,7 @@ #include "poll-loop.h" #include "simap.h" #include "timer.h" +#include "tunnel.h" #include "unaligned.h" #include "unixctl.h" #include "vlan-bitmap.h" @@ -803,6 +804,8 @@ construct(struct ofproto *ofproto_) hmap_init(&ofproto->vlandev_map); hmap_init(&ofproto->realdev_vid_map); +tnl_init(); + hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node, hash_string(ofproto->up.name, 0)); memset(&ofproto->stats, 0, sizeof ofproto->stats); @@ -2536,6 +2539,11 @@ port_add(struct ofproto *ofproto_, struct netdev *netdev, uint16_t *ofp_portp) error = dpif_port_add(ofproto->dpif, netdev, &odp_port); if (!error) { +error = tnl_port_add(odp_port, odp_port, netdev); +if (error) { +dpif_port_del(ofproto->dpif, odp_port); +return error; +} *ofp_portp = odp_port_to_ofp_port(odp_port); } return error; @@ -2545,9 +2553,12 @@ static int port_del(struct ofproto *ofproto_, uint16_t ofp_port) { struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_); +uint16_t odp_port = ofp_port_to_odp_port(ofp_port); int error; -error = dpif_port_del(ofproto->dpif, ofp_port_to_odp_port(ofp_port)); +tnl_port_del(odp_port); + +error = dpif_port_del(ofproto->dpif, odp_port); if (!error) { struct ofport_dpif *ofport = get_ofp_port(ofproto, ofp_port); if (ofport) { @@ -3025,8 +3036,9 @@ handle_flow_miss(struct ofproto_dpif *ofproto, struct flow_miss *miss, * an ODP_FIT_* value that indicates how well 'key' fits our expectations for * what a flow key should contain. * - * This function also includes some logic to help make VLAN splinters - * transparent to the rest of the upcall processing logic. In particular, if + * This function also includes some logic to help with ports that have special + * meaning to their flows such as tunnels and VLAN splinters. Tunnels may + * map their source and destination addresses to a logical port whereas if * the extracted in_port is a VLAN splinter port, it replaces flow->in_port by * the "real" port, sets flow->vlan_tci correctly for the VLAN of the VLAN * splinter port, and pushes a VLAN header onto 'packet' (if it is nonnull). @@ -3043,13 +3055,21 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto, struct ofpbuf *packet) { enum odp_key_fitness fitness; +bool adjusted; +int err; fitness = odp_flow_key_to_flow(key, key_len, flow); if (fitness == ODP_FIT_ERROR) { return fitness; } -*initial_tci = flow->vlan_tci; +err = tnl_port_receive(flow, &adjusted); +if (err) { +/* XXX: We should really install a drop flow here. */ +return ODP_FIT_ERROR; +} + +*initial_tci = flow->vlan_tci; if (vsp_adjust_flow(ofproto, flow)) { if (packet) { /* Make the packet resemble the flow, so that it gets sent to an @@ -3069,10 +3089,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto, eth_push_vlan(packet, flow->vlan_tci); } -/* Let the caller know that we can't reproduce 'key' from 'flow'. */ -if (fitness == ODP_FIT_PERFECT) { -fitness = ODP_FIT_TOO_MUCH; -} +adjusted = true; +} + +/* Let the caller know that we can't reproduce 'key' from 'flow'. */ +if (adjusted && fitness == ODP_FIT_PERFECT) { +fitness = ODP_FIT_TOO_MUCH; } return fitness; @@ -4981,6 +5003,8 @@ compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port, if (out_port != odp_port) { ctx->flow.vlan_tci = htons(0); } +out_port = tnl_port_send(&ctx->flow, out_port); + commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions); nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port); -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Threaded userspace datapath
On Fri, Sep 28, 2012 at 3:49 AM, Giuseppe Lettieri wrote: > Hi all, > > we have finally found the time to finish work on this patch. We have > addressed all the issues that were mentioned in the previous mails, plus > some more that showed up later. We now pass all unit tests on both Linux and > FreeBSD and we can more confidently offer the patch to your attention, for > possible inclusion. As usual, any comment would be much appreciated. I haven't been following this work all that closely but earlier on there was a discussion of how the performance benefits might be related to how poll is used. Restructuring that seems like a much more desirable path but I never saw any real follow up. What was the reason for continuing to go down this path? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 0/2] Add flow-based tunneling support
These patches build on the first two patches which Simon Horman sent out in May to move Open vSwitch towards flow-based tunneling. The first patche adds a tun_key, deprecating the tun_id member of the ovs_skb_cb struct. This patche retain compatibilty with existing tunneling, but once the userspace code is submitted, this will be deprecated. The second patch makes an attempt at adding the new tun_key structure into the flow matching logic in the kernel. Kyle Mestery (2): This is a first pass at providing a tun_key which can be used as the basis for flow-based tunnelling. The tun_key includes and replaces the tun_id in both struct ovs_skb_cb and struct sw_tun_key. Move struct ovs_key_ipv4_tunnel out of struct sw_flow_key and into struct sw_flow. This allows it to "float" and be used for matching only when needed. Modify the matching code in ovs_flow_tbl_lookup() to match on the tunnel header if it's set. NEWS| 3 ++ datapath/actions.c | 25 +++-- datapath/datapath.c | 37 +-- datapath/datapath.h | 6 ++- datapath/flow.c | 88 +--- datapath/flow.h | 25 - datapath/tunnel.c | 89 +++-- datapath/tunnel.h | 14 ++- datapath/vport-capwap.c | 12 +++--- datapath/vport-gre.c| 15 datapath/vport.c| 2 +- include/linux/openvswitch.h | 13 ++- lib/dpif-netdev.c | 1 + lib/odp-util.c | 52 +- lib/odp-util.h | 3 +- 15 files changed, 289 insertions(+), 96 deletions(-) -- 1.7.11.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/2] Add support for tun_key to OVS datapath
This is a first pass at providing a tun_key which can be used as the basis for flow-based tunnelling. The tun_key includes and replaces the tun_id in both struct ovs_skb_cb and struct sw_tun_key. This patch allows all existing tun_id behaviour to still work. Existing users of tun_id are redirected to tun_key->tun_id to retain compatibility. However, when the userspace code is updated to make use of the new tun_key, the old behaviour will be deprecated and removed. NOTE: With these changes, the tunneling code no longer assumes input and output keys are symmetric. If they are not, PMTUD needs to be disabled for tunneling to work. Signed-off-by: Kyle Mestery Cc: Simon Horman Cc: Jesse Gross --- V6: - Address latest round of comments from Jesse. V5: - Address another round of comments from Jesse. V4: - Address 2 comments from Jesse: - When processing actions, if OVS_CB(skb)->tun_key is NULL, point it at one on the stack temporarily. This goes away when we remove the ability to set tun_id outside the scope of tun_key. - Move tun_key to the end of struct sw_flow_key. V3: - Fix issues found during review by Jesse. - Add a NEWS entry around tunnel code no longer assuming symmetric input and output tunnel keys. V2: - Fix blank line addition/removal found by Simon. - Fix hex printing output found by Simon. --- NEWS| 3 ++ datapath/actions.c | 24 ++--- datapath/datapath.c | 10 +- datapath/datapath.h | 5 +-- datapath/flow.c | 61 +++- datapath/flow.h | 12 --- datapath/tunnel.c | 84 +++-- datapath/tunnel.h | 14 ++-- datapath/vport-capwap.c | 12 +++ datapath/vport-gre.c| 15 datapath/vport.c| 2 +- include/linux/openvswitch.h | 13 ++- lib/dpif-netdev.c | 1 + lib/odp-util.c | 15 +++- lib/odp-util.h | 3 +- 15 files changed, 202 insertions(+), 72 deletions(-) diff --git a/NEWS b/NEWS index 29fd9f3..ae831e3 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,8 @@ post-v1.8.0 +- The tunneling code no longer assumes input and output keys are symmetric. + If they are not, PMTUD needs to be disabled for tunneling to work. Note + this only applies to flow-based keys. - FreeBSD is now a supported platform, thanks to code contributions from Gaetano Catalli, Ed Maste, and Giuseppe Lettieri. - ovs-bugtool: New --ovs option to report only OVS related information. diff --git a/datapath/actions.c b/datapath/actions.c index ec9b595..a70cde8 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -333,7 +333,8 @@ static int sample(struct datapath *dp, struct sk_buff *skb, } static int execute_set_action(struct sk_buff *skb, -const struct nlattr *nested_attr) +const struct nlattr *nested_attr, +struct ovs_key_ipv4_tunnel *tun_key) { int err = 0; @@ -343,7 +344,21 @@ static int execute_set_action(struct sk_buff *skb, break; case OVS_KEY_ATTR_TUN_ID: - OVS_CB(skb)->tun_id = nla_get_be64(nested_attr); + if (OVS_CB(skb)->tun_key == NULL) { + /* If tun_key is NULL for this skb, assign it to +* a value the caller passed in for action processing +* and output. This can disappear once we drop support +* for setting tun_id outside of tun_key. +*/ + memset(tun_key, 0, sizeof(struct ovs_key_ipv4_tunnel)); + OVS_CB(skb)->tun_key = tun_key; + } + + OVS_CB(skb)->tun_key->tun_id = nla_get_be64(nested_attr); + break; + + case OVS_KEY_ATTR_IPV4_TUNNEL: + OVS_CB(skb)->tun_key = nla_data(nested_attr); break; case OVS_KEY_ATTR_ETHERNET: @@ -377,6 +392,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, int prev_port = -1; const struct nlattr *a; int rem; + struct ovs_key_ipv4_tunnel tun_key; for (a = attr, rem = len; rem > 0; a = nla_next(a, &rem)) { @@ -407,7 +423,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, break; case OVS_ACTION_ATTR_SET: - err = execute_set_action(skb, nla_data(a)); + err = execute_set_action(skb, nla_data(a), &tun_key); break; case OVS_ACTION_ATTR_SAMPLE: @@ -469,7 +485,7 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb) goto out_loop; } - OVS_CB(skb)->tun_id = 0; + OVS_CB(skb)->tun_key = NUL
[ovs-dev] [PATCH 2/2] Add flow matching for tunnel keys
Move struct ovs_key_ipv4_tunnel out of struct sw_flow_key and into struct sw_flow. This allows it to "float" and be used for matching only when needed. Modify the matching code in ovs_flow_tbl_lookup() to match on the tunnel header if it's set. In particular, I'd like review of the method used for adding the matching logic here. The method I've chosen here adds one additional check during flow lookup to see if a tunnel key is set, and if so will add one comparison before matching a flow. If this is deemed suboptimal, I'm open to suggestions around optimizing this. Signed-off-by: Kyle Mestery --- datapath/actions.c | 1 + datapath/datapath.c | 29 ++-- datapath/datapath.h | 1 + datapath/flow.c | 65 + datapath/flow.h | 17 +++--- datapath/tunnel.c | 5 +++-- lib/odp-util.c | 37 ++ 7 files changed, 109 insertions(+), 46 deletions(-) diff --git a/datapath/actions.c b/datapath/actions.c index a70cde8..063d3af 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -288,6 +288,7 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb, upcall.cmd = OVS_PACKET_CMD_ACTION; upcall.key = &OVS_CB(skb)->flow->key; + upcall.tun_key = &OVS_CB(skb)->flow->tun_key; upcall.userdata = NULL; upcall.pid = 0; diff --git a/datapath/datapath.c b/datapath/datapath.c index d8a198e..845494c 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -313,10 +313,11 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) if (!OVS_CB(skb)->flow) { struct sw_flow_key key; + struct ovs_key_ipv4_tunnel tun_key; int key_len; /* Extract flow from 'skb' into 'key'. */ - error = ovs_flow_extract(skb, p->port_no, &key, &key_len); + error = ovs_flow_extract(skb, p->port_no, &key, &tun_key, &key_len); if (unlikely(error)) { kfree_skb(skb); return; @@ -324,12 +325,13 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) /* Look up flow. */ flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), - &key, key_len); + &key, &tun_key, key_len); if (unlikely(!flow)) { struct dp_upcall_info upcall; upcall.cmd = OVS_PACKET_CMD_MISS; upcall.key = &key; + upcall.tun_key = &tun_key; upcall.userdata = NULL; upcall.pid = p->upcall_pid; ovs_dp_upcall(dp, skb, &upcall); @@ -492,7 +494,7 @@ static int queue_userspace_packet(struct net *net, int dp_ifindex, upcall->dp_ifindex = dp_ifindex; nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_KEY); - ovs_flow_to_nlattrs(upcall_info->key, user_skb); + ovs_flow_to_nlattrs(upcall_info->key, upcall_info->tun_key, user_skb); nla_nest_end(user_skb, nla); if (upcall_info->userdata) @@ -787,13 +789,13 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info) if (IS_ERR(flow)) goto err_kfree_skb; - err = ovs_flow_extract(packet, -1, &flow->key, &key_len); + err = ovs_flow_extract(packet, -1, &flow->key, &flow->tun_key, &key_len); if (err) goto err_flow_put; err = ovs_flow_metadata_from_nlattrs(&flow->key.phy.priority, &flow->key.phy.in_port, -&flow->key.tun.tun_key, +&flow->tun_key, a[OVS_PACKET_ATTR_KEY]); if (err) goto err_flow_put; @@ -922,7 +924,7 @@ static int ovs_flow_cmd_fill_info(struct sw_flow *flow, struct datapath *dp, nla = nla_nest_start(skb, OVS_FLOW_ATTR_KEY); if (!nla) goto nla_put_failure; - err = ovs_flow_to_nlattrs(&flow->key, skb); + err = ovs_flow_to_nlattrs(&flow->key, &flow->tun_key, skb); if (err) goto error; nla_nest_end(skb, nla); @@ -1016,6 +1018,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info) struct nlattr **a = info->attrs; struct ovs_header *ovs_header = info->userhdr; struct sw_flow_key key; + struct ovs_key_ipv4_tunnel tun_key; struct sw_flow *flow; struct sk_buff *reply; struct datapath *dp; @@ -1027,7 +1030,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info) error = -EINVAL; if (!a[OVS_FLOW_ATTR_KEY]) goto error; - er
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote: > Now that both the kernel and userspace can handle information about > the tunnel outer headers this adds userspace support for communicating > between the two. At the moment userspace doesn't know how to do > anything with the extra information on receive and will only generate > actions containing tun_id. However, both sides know how to ignore the > extra information. > > Signed-off-by: Jesse Gross The patch I just submitted overlaps with this patch a little bit. I think there wasn't a really set delineation between what I was doing and what you were doing, so we may have to merge this patch with mine. I think we both pretty much made similar changes though, so it shouldn't be too bad. Acking this patch for now, assuming we'll work through the merge bits. Acked-by: Kyle Mestery ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.
On Fri, Sep 28, 2012 at 11:12:21AM -0700, Jesse Gross wrote: > Now that the kernel is supplying information about the outer IP > header for tunneled packets userspace needs to be able to track it > as part of the flow. For the time being this is only used internally > by OVS and not exposed outwards to OpenFlow. As a result, this > threads the information throughout userspace but simply stores the > existing tun_id in it. > > Signed-off-by: Jesse Gross Is this gracefully extensible if later we need to support IPv6 tunnel endpoints? It took me a while to figure out the purpose of the 'tnl' parameter to flow_extract(). I think it would have been more obvious if the parameter were marked 'const'. If you can be satisfied with 16 bits of flags (only 3 flags exist so far) then struct flow_tnl can be reduced by 4 bytes, including 2 bytes of padding. Should struct flow_metadata get a struct flow_tnl member? I guess it depends on whether we plan to expose anything other than tunnel ID via OpenFlow. format_tunnel_flags() is kind of oddly written. It doesn't need to be a loop if you dropped the "else"s and put "if (flags)" before the final "else" clause. The flow_format() output might be easier to read if it omitted all tunnel information if there isn't any, instead of writing "tunnel(0)". match_format() doesn't print any of the new fields. I assume that's intentional? In process_packet_in(), memset(&tnl, 0, sizeof tnl); tnl.tun_id = pi.fmd.tun_id; flow_extract(&pkt, 0, &tnl, pi.fmd.in_port, &flow); might more simply be written as: flow_extract(&pkt, 0, NULL, pi.fmd.in_port, &flow); flow.tunnel.tun_id = pi.fmd.tun_id; and I do see it written that way in other places. Similarly in ofproto_unixctl_trace(). Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] "brctl show" doesn't list attached interfaces
On Thu, Sep 27, 2012 at 8:36 PM, Ben Pfaff wrote: > On Fri, Sep 28, 2012 at 11:56:47AM +0900, Tadaaki Nagao wrote: >> In "Re: [ovs-dev] "brctl show" doesn't list attached interfaces", >> Ben Pfaff wrote: >> > I see datapaths in brctl show output even without brcompat: >> > >> > oot@debian:~# rmmod brcompat >> > Error: Module brcompat is not currently loaded >> > root@debian:~# brctl show >> > bridge name bridge id STP enabled interfaces >> > root@debian:~# ovs-vsctl add-br br0 >> > root@debian:~# brctl show >> > bridge name bridge id STP enabled interfaces >> > br0 .d61b592dd94d no >> > root@debian:~# ovs-dpctl show >> > system@br0: >> > lookups: hit:0 missed:0 lost:0 >> > flows: 0 >> > port 0: br0 (internal) >> > root@debian:~# >> > >> > I assumed that was what Isaku meant. >> >> (Though I'm not Isaku ;-), > > Sorry about that. I apologize. > >> what I tried to meant was, as Jesse wrote, OVS ports aren't appearing >> in brctl output. I wasn't clear enough, sorry about it. >> >> Actually, we are trying OVS + KVM + LibVirt + CloudStack. From what I >> heard from my colleague, KVM support code in CloudStack uses "brctl show" >> to list the ports attached to a bridge. Simply changing them to use >> "ovs-dpctl show" would work for us, but I thought it would be better if >> compatibility to Linux bridge is also kept in brctl output. > > It would be better to use "ovs-vsctl list-ports". The output of > "ovs-dpctl show" is an implementation detail that is subject to change > and, in fact, we plan to change soon. Regardless of the mechanism, it is better to build on the OVS native commands rather than relying on brctl since they are much more robust. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] "brctl show" doesn't list attached interfaces
On Thu, Sep 27, 2012 at 8:18 PM, Tadaaki Nagao wrote: > Hi, > > In "Re: [ovs-dev] "brctl show" doesn't list attached interfaces", > Jesse Gross wrote: >> I think it's the opposite problem - that OVS ports aren't appearing in >> brctl output. There's a bug with namespaces on certain kernels when >> populating the sysfs entries that Pravin ran into so we don't do that >> on the theory that missing entries are better than crashing. > > In datapath/dp_sysfs_dp.c:ovs_dp_sysfs_add_dp() there's a similar NULL > check with the same "could panic" comment, which makes sense to me because > sysfs_create_group() is actually called after the check. > > 8#ifdef CONFIG_NET_NS > /* Due to bug in 2.6.32 kernel, sysfs_create_group() could panic > * in other namespace than init_net. Following check is to avoid it. > */ > if (!kobj->sd) > return -ENOENT; > #endif > /* Create /sys/class/net//bridge directory. */ > err = sysfs_create_group(kobj, &bridge_group); > 8 > OTOH, after the check in datapath/dp_sysfs_if.c:ovs_dp_sysfs_add_if() > there's no call to sysfs_create_group() as I mentioned in the first mail. > And that made me uncertain what the check is actually meant for... Pravin originally wrote that code, so I'll let him comment on the call in ovs_dp_sysfs_add_if(). ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Fri, Sep 28, 2012 at 11:12:22AM -0700, Jesse Gross wrote: > Now that both the kernel and userspace can handle information about > the tunnel outer headers this adds userspace support for communicating > between the two. At the moment userspace doesn't know how to do > anything with the extra information on receive and will only generate > actions containing tun_id. However, both sides know how to ignore the > extra information. > > Signed-off-by: Jesse Gross What's your thought about how to handle the "XXX: need flags" bits? (Or does a later patch address that? I haven't read ahead.) ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 3/5] netdev-dummy: Add tunnel variation.
On Fri, Sep 28, 2012 at 11:12:23AM -0700, Jesse Gross wrote: > In order to be able to unit test the tunnel flow setup code we > need to be able to generate netdevs that appear to be like > tunnel devices. This adds a dummy_tunnel netdev with two > differences from the normal dummy devices: the ability to set and > read config and a different type to identify them as needing > tunnel translation. > > Signed-off-by: Jesse Gross Seems reasonable, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] lib: Add xpipe_nonblocking helper
Signed-off-by: Ed Maste --- As suggested by Ben Pfaff in the 'Threaded userspace datapath' thread. This looks like a useful utility function refactoring, independent of any outcome of that discussion. lib/fatal-signal.c | 4 +--- lib/process.c | 4 +--- lib/signals.c | 4 +--- lib/socket-util.c | 7 +++ lib/socket-util.h | 1 + 5 files changed, 11 insertions(+), 9 deletions(-) diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c index 21ebb5a..7cfbd05 100644 --- a/lib/fatal-signal.c +++ b/lib/fatal-signal.c @@ -74,9 +74,7 @@ fatal_signal_init(void) inited = true; -xpipe(signal_fds); -xset_nonblocking(signal_fds[0]); -xset_nonblocking(signal_fds[1]); +xpipe_nonblocking(signal_fds); sigemptyset(&fatal_signal_set); for (i = 0; i < ARRAY_SIZE(fatal_signals); i++) { diff --git a/lib/process.c b/lib/process.c index 91dfc06..9f5c35f 100644 --- a/lib/process.c +++ b/lib/process.c @@ -82,9 +82,7 @@ process_init(void) inited = true; /* Create notification pipe. */ -xpipe(fds); -xset_nonblocking(fds[0]); -xset_nonblocking(fds[1]); +xpipe_nonblocking(fds); /* Set up child termination signal handler. */ memset(&sa, 0, sizeof sa); diff --git a/lib/signals.c b/lib/signals.c index b712f7e..152afcf 100644 --- a/lib/signals.c +++ b/lib/signals.c @@ -63,9 +63,7 @@ signal_init(void) static bool inited; if (!inited) { inited = true; -xpipe(fds); -xset_nonblocking(fds[0]); -xset_nonblocking(fds[1]); +xpipe_nonblocking(fds); } } diff --git a/lib/socket-util.c b/lib/socket-util.c index e32c03d..f8b44cc 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -894,6 +894,14 @@ xpipe(int fds[2]) } void +xpipe_nonblocking(int fds[2]) +{ +xpipe(fds); +xset_nonblocking(fds[0]); +xset_nonblocking(fds[1]); +} + +void xsocketpair(int domain, int type, int protocol, int fds[2]) { if (socketpair(domain, type, protocol, fds)) { diff --git a/lib/socket-util.h b/lib/socket-util.h index bacb236..a00b32e 100644 --- a/lib/socket-util.h +++ b/lib/socket-util.h @@ -64,6 +64,7 @@ int fsync_parent_dir(const char *file_name); int get_mtime(const char *file_name, struct timespec *mtime); void xpipe(int fds[2]); +void xpipe_nonblocking(int fds[2]); char *describe_fd(int fd); -- 1.7.11.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 1/5] flow: Extend struct flow to contain tunnel outer header.
On Fri, Sep 28, 2012 at 12:54 PM, Ben Pfaff wrote: > On Fri, Sep 28, 2012 at 11:12:21AM -0700, Jesse Gross wrote: >> Now that the kernel is supplying information about the outer IP >> header for tunneled packets userspace needs to be able to track it >> as part of the flow. For the time being this is only used internally >> by OVS and not exposed outwards to OpenFlow. As a result, this >> threads the information throughout userspace but simply stores the >> existing tun_id in it. >> >> Signed-off-by: Jesse Gross > > Is this gracefully extensible if later we need to support IPv6 tunnel > endpoints? We'll need to add extra space for the longer addresses and update the tunnel code that cares, of course, but I don't think there is any other real work. > It took me a while to figure out the purpose of the 'tnl' parameter to > flow_extract(). I think it would have been more obvious if the > parameter were marked 'const'. OK. > If you can be satisfied with 16 bits of flags (only 3 flags exist so > far) then struct flow_tnl can be reduced by 4 bytes, including 2 bytes > of padding. On 64-bit machines that would still result in the compiler adding 4 bytes of padding since it's no long 8-byte aligned. However, this patch already makes sure that we memset the whole struct so we can take the benefit on 32-bit machines. > Should struct flow_metadata get a struct flow_tnl member? I guess it > depends on whether we plan to expose anything other than tunnel ID via > OpenFlow. There are no current plans for any of this to be visible to OpenFlow. > format_tunnel_flags() is kind of oddly written. It doesn't need to be > a loop if you dropped the "else"s and put "if (flags)" before the > final "else" clause. Hmm, yes, I'm not sure why I wrote it that way. > The flow_format() output might be easier to read if it omitted all > tunnel information if there isn't any, instead of writing "tunnel(0)". It's fine to get rid of it. > match_format() doesn't print any of the new fields. I assume that's > intentional? They're not currently part of the match. > In process_packet_in(), > memset(&tnl, 0, sizeof tnl); > tnl.tun_id = pi.fmd.tun_id; > > flow_extract(&pkt, 0, &tnl, pi.fmd.in_port, &flow); > might more simply be written as: > flow_extract(&pkt, 0, NULL, pi.fmd.in_port, &flow); > flow.tunnel.tun_id = pi.fmd.tun_id; > and I do see it written that way in other places. > Similarly in ofproto_unixctl_trace(). Yes, I guess that's cleaner. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.
On Fri, Sep 28, 2012 at 11:12:24AM -0700, Jesse Gross wrote: > The kernel tunneling code currently needs to handle a large number > of operations when tunnel packets are encapsulated and decapsulated. > Some examples of this are: finding the correct tunnel port on receive, > TTL and ToS inheritance, ECN handling, etc. All of these can be done > on a per-flow basis in userspace now that we have both the inner and > outer header information, which allows us to both simpify the kernel > and take advantage of userspace's information. This ports the logic > from the kernel to userspace and also pulls in the tunnel-specific > configuration handling from netdev-vport.c. Once tunnel packets are > redirected into this code, the redundant pieces can be removed from > other places. > > Signed-off-by: Jesse Gross I guess that DF inheritance is not an important feature? (Should we accept the option and just warn about it, for better compatibility?) In tunnel.c, I would move #include "tunnel.h" just after #include , to ensure that tunnel.h is self-contained. In tunnel.c, we usually put a line break just before the function's name in a function definition. It looks like tnl_port_add__() succeeds if the tunnel type is invalid, but I don't know why. Oh, maybe I see--this layer is meant to be a sort of shim that leaves non-tunnels alone? Some kind of high-level comment (maybe in tunnel.h) would be helpful. In tnl_hash(), since you made sure that struct tnl_match is a multiple of 4 bytes, you could use hash_words() instead of hash_bytes() for a slight speedup. I didn't read tnl_find() or tnl_config_get() carefully. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Fri, Sep 28, 2012 at 1:55 PM, Ben Pfaff wrote: > On Fri, Sep 28, 2012 at 11:12:22AM -0700, Jesse Gross wrote: >> Now that both the kernel and userspace can handle information about >> the tunnel outer headers this adds userspace support for communicating >> between the two. At the moment userspace doesn't know how to do >> anything with the extra information on receive and will only generate >> actions containing tun_id. However, both sides know how to ignore the >> extra information. >> >> Signed-off-by: Jesse Gross > > What's your thought about how to handle the "XXX: need flags" bits? > (Or does a later patch address that? I haven't read ahead.) The kernel interface in Kyle's first patch has a place for flags but doesn't actually define any yet. As I worked on the userspace portions I decided that it made sense to expose those three that are in userspace. We'll have to add them to the kernel interface as well since there's no way currently to translate in odp-util.c, which is what those XXX comments refer to. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] lib: Add xpipe_nonblocking helper
On Fri, Sep 28, 2012 at 09:06:41PM +, Ed Maste wrote: > Signed-off-by: Ed Maste > --- > As suggested by Ben Pfaff in the 'Threaded userspace datapath' thread. > This looks like a useful utility function refactoring, independent of > any outcome of that discussion. Applied to master, thank you. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 4/5] tunnel: Userspace implementation of tunnel manipulation.
On Fri, Sep 28, 2012 at 2:18 PM, Ben Pfaff wrote: > On Fri, Sep 28, 2012 at 11:12:24AM -0700, Jesse Gross wrote: >> The kernel tunneling code currently needs to handle a large number >> of operations when tunnel packets are encapsulated and decapsulated. >> Some examples of this are: finding the correct tunnel port on receive, >> TTL and ToS inheritance, ECN handling, etc. All of these can be done >> on a per-flow basis in userspace now that we have both the inner and >> outer header information, which allows us to both simpify the kernel >> and take advantage of userspace's information. This ports the logic >> from the kernel to userspace and also pulls in the tunnel-specific >> configuration handling from netdev-vport.c. Once tunnel packets are >> redirected into this code, the redundant pieces can be removed from >> other places. >> >> Signed-off-by: Jesse Gross > > I guess that DF inheritance is not an important feature? (Should we > accept the option and just warn about it, for better compatibility?) By default we always set the DF bit to on and the times that I've heard of people wanting to change that is when they know things are broken and want to completely turn it off, so I doubt that inheritance is really useful. I wish that we had exposed this as one of our fragment states in the flow setup (as in "none; none, don't fragment; first; later) but we don't currently have that information in userspace. We could add it to the interface but it doesn't seem worth it for something that will probably never get used. We do accept and ignore unknown flags with a warning. It will be a generic message in this case but I think that's probably fine here. > In tunnel.c, I would move #include "tunnel.h" just after #include > , to ensure that tunnel.h is self-contained. OK. > In tunnel.c, we usually put a line break just before the function's > name in a function definition. OK, I wasn't sure whether we were coming or going to that style. > It looks like tnl_port_add__() succeeds if the tunnel type is invalid, > but I don't know why. Oh, maybe I see--this layer is meant to be a > sort of shim that leaves non-tunnels alone? Some kind of high-level > comment (maybe in tunnel.h) would be helpful. Yes, currently it allows non-tunnels to be passed in to the various public functions. I'll leave it to Justin as to whether that's the best behavior once we have better integration. For now I've added a comment to document the current behavior. > In tnl_hash(), since you made sure that struct tnl_match is a multiple > of 4 bytes, you could use hash_words() instead of hash_bytes() for a > slight speedup. Good idea. Thanks for the reviews. I'm going to send out the updated series, not because I think that it has changed enough to require more review but to make sure that anyone trying to build on top of it has an updated version. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Altavoz soporte para IPAD / Visita la Rioja y sus vinos / Radio FM pulsera
Cabestan. Layout Simple Asegúrate de no perderte ninguna oferta, añade ofer...@globalbono.com a tu lista de contactos. Si no ves correctamente las imágenes, pulsa [ http://email.globalbono.com/E28092012142809.cfm?WL=1599&WS=208833_8707685&WA=822 ] aquí [ http://email.globalbono.com/Go/index.cfm?WL=564&WS=208833_8707685&WA=822 ] Tus Ofertas de hoy Síguenos en: [ http://email.globalbono.com/Go/index.cfm?WL=50&WS=208833_8707685&WA=822 ] [ http://email.globalbono.com/Go/index.cfm?WL=51&WS=208833_8707685&WA=822 ] [ http://email.globalbono.com/Go/index.cfm?WL=1152&WS=208833_8707685&WA=822 ] [ http://email.globalbono.com/Go/index.cfm?WL=1579&WS=208833_8707685&WA=822 ] 46% Dto. Radio FM para el brazo Si eres un amante de la radio pero tu inconveniente es que no paras quieto, aquí tienes la radio que se... 5,94¤ [ http://email.globalbono.com/Go/index.cfm?WL=1579&WS=208833_8707685&WA=822 ] Antes 10,94¤Dto: 46% Ahorro: 5,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1578&WS=208833_8707685&WA=822 ] 41% Dto. Altavoz soporte para Ipad Base de soporte para Ipad y otros dispositivos de audio con altavoces. Perfecto para llevar a cualquier... 17,24¤ [ http://email.globalbono.com/Go/index.cfm?WL=1578&WS=208833_8707685&WA=822 ] Antes 29,00¤Dto: 41% Ahorro: 11,76¤ [ http://email.globalbono.com/Go/index.cfm?WL=1582&WS=208833_8707685&WA=822 ] 47% Dto. Funda de polipiel para Ipad Si quieres proteger tu Ipad de forma segura para así evitar rallazos en su pantalla, aquí tienes una fu... 11,64¤ [ http://email.globalbono.com/Go/index.cfm?WL=1582&WS=208833_8707685&WA=822 ] Antes 22,00¤Dto: 47% Ahorro: 10,36¤ [ http://email.globalbono.com/Go/index.cfm?WL=1581&WS=208833_8707685&WA=822 ] 54% Dto. BODEGAS MUGA en La Rioja. Cata un vino con historia. La escapada a BODEGAS MUGA en La Rioja te permite adentrarte en el maravilloso y singular... 55,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1581&WS=208833_8707685&WA=822 ] Antes 120,00¤Dto: 54% Ahorro: 65,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1580&WS=208833_8707685&WA=822 ] 65% Dto. Vive Madrid a lo grande. Glamour y noches de diversión. La escapada COPAS CON GLAMOUR te da acceso a un mundo exclusivo y privilegiado. Podrás vi... 45,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1580&WS=208833_8707685&WA=822 ] Antes 130,00¤Dto: 65% Ahorro: 85,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1583&WS=208833_8707685&WA=822 ] 56% Dto. Escapada RELAX en Ávila. Descansa el cuerpo y la mente. Disfruta de una completa sesión de relajación con esta estupenda escapada que te ofrecemos. Podrás disf... 140,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=1583&WS=208833_8707685&WA=822 ] Antes 320,00¤Dto: 56% Ahorro: 180,00¤ [ http://email.globalbono.com/Go/index.cfm?WL=706&WS=208833_8707685&WA=822 ] De conformidad con lo que establece la Ley Orgánica 15/1999 de Protección de Datos de Carácter Personal, le informamos que sus datos personales serán incorporados a un fichero bajo la responsabilidad de GlobalBono, con la finalidad de poder atender los compromisos derivados de la relación que mantenemos con usted. Puede ejercer sus derechos de acceso, cancelación, rectificación y oposición mediante un escrito a la dirección email i...@globalbono.com. Si en el plazo de 30 días no nos comunica lo contrario, entenderemos que los datos no han sido modificados, que se compromete a notificarnos cualquier variación y que tenemos el consentimiento para utilizarlos a fin de poder fidelizar la relación entre las partes. ¿Quieres dejar de recibir este email? Haz click [ http://email.globalbono.com/baja/form_baja_GB.cfm?WL=182&WS=208833_8707685&WA=822 ] aquí. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC PATCH v2 1/5] flow: Extend struct flow to contain tunnel outer header.
Now that the kernel is supplying information about the outer IP header for tunneled packets userspace needs to be able to track it as part of the flow. For the time being this is only used internally by OVS and not exposed outwards to OpenFlow. As a result, this threads the information throughout userspace but simply stores the existing tun_id in it. Signed-off-by: Jesse Gross --- lib/dpif-linux.c |2 +- lib/dpif-netdev.c |4 ++-- lib/dpif-provider.h|2 +- lib/dpif.c |2 +- lib/flow.c | 55 +++ lib/flow.h | 20 lib/learning-switch.c |3 ++- lib/match.c| 15 lib/meta-flow.c|8 +++ lib/nx-match.c |3 ++- lib/odp-util.c | 12 +- lib/ofp-print.c|2 +- lib/ofp-util.c |6 ++--- ofproto/ofproto-dpif.c | 11 + ofproto/ofproto-provider.h |4 ++-- ofproto/ofproto.c |4 ++-- tests/ofp-print.at |6 ++--- tests/ofproto-dpif.at | 56 ++-- tests/ofproto.at | 12 +- tests/test-classifier.c| 40 +++ tests/test-flows.c |2 +- 21 files changed, 162 insertions(+), 107 deletions(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index a2eb490..3a4a4e6 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -1399,7 +1399,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no, uint64_t action; ofpbuf_use_const(&packet, data, size); -flow_extract(&packet, 0, htonll(0), 0, &flow); +flow_extract(&packet, 0, NULL, 0, &flow); ofpbuf_use_stack(&key, &keybuf, sizeof keybuf); odp_flow_key_from_flow(&key, &flow); diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1ce14ce..797cb06 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -916,7 +916,7 @@ dpif_netdev_execute(struct dpif *dpif, const struct dpif_execute *execute) ofpbuf_reserve(©, DP_NETDEV_HEADROOM); ofpbuf_put(©, execute->packet->data, execute->packet->size); -flow_extract(©, 0, 0, -1, &key); +flow_extract(©, 0, NULL, -1, &key); error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len, &key); if (!error) { @@ -1014,7 +1014,7 @@ dp_netdev_port_input(struct dp_netdev *dp, struct dp_netdev_port *port, if (packet->size < ETH_HEADER_LEN) { return; } -flow_extract(packet, 0, 0, odp_port_to_ofp_port(port->port_no), &key); +flow_extract(packet, 0, NULL, odp_port_to_ofp_port(port->port_no), &key); flow = dp_netdev_lookup_flow(dp, &key); if (flow) { dp_netdev_flow_used(flow, packet); diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h index 317e617..ffe084a 100644 --- a/lib/dpif-provider.h +++ b/lib/dpif-provider.h @@ -292,7 +292,7 @@ struct dpif_class { * taken from the flow specified in the 'execute->key_len' bytes of * 'execute->key'. ('execute->key' is mostly redundant with * 'execute->packet', but it contains some metadata that cannot be - * recovered from 'execute->packet', such as tun_id and in_port.) */ + * recovered from 'execute->packet', such as tunnel and in_port.) */ int (*execute)(struct dpif *dpif, const struct dpif_execute *execute); /* Executes each of the 'n_ops' operations in 'ops' on 'dpif', in the order diff --git a/lib/dpif.c b/lib/dpif.c index 2968966..7f859d7 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -977,7 +977,7 @@ dpif_execute__(struct dpif *dpif, const struct dpif_execute *execute) * the Ethernet frame specified in 'packet' taken from the flow specified in * the 'key_len' bytes of 'key'. ('key' is mostly redundant with 'packet', but * it contains some metadata that cannot be recovered from 'packet', such as - * tun_id and in_port.) + * tunnel and in_port.) * * Returns 0 if successful, otherwise a positive errno value. */ int diff --git a/lib/flow.c b/lib/flow.c index e517a03..93bd9b2 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -316,7 +316,7 @@ invalid: } -/* Initializes 'flow' members from 'packet', 'skb_priority', 'tun_id', and +/* Initializes 'flow' members from 'packet', 'skb_priority', 'tnl', and * 'ofp_in_port'. * * Initializes 'packet' header pointers as follows: @@ -334,8 +334,9 @@ invalid: * present and has a correct length, and otherwise NULL. */ void -flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id, - uint16_t ofp_in_port, struct flow *flow) +flow_extract(struct ofpbuf *packet, uint32_t skb_priority, + const struct flow_tnl *tnl, uint16_t ofp_in_port, + struct flow *flow) { struct ofpbuf b = *packet; struct eth_header *eth; @@ -343,7 +344,10 @@ flow_extract(struct ofpbuf *packet, uint32
[ovs-dev] [RFC PATCH v2 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
Now that both the kernel and userspace can handle information about the tunnel outer headers this adds userspace support for communicating between the two. At the moment userspace doesn't know how to do anything with the extra information on receive and will only generate actions containing tun_id. However, both sides know how to ignore the extra information. Signed-off-by: Jesse Gross --- NEWS |2 ++ lib/odp-util.c| 75 + tests/odp.at | 12 tests/ofproto-dpif.at |2 +- 4 files changed, 73 insertions(+), 18 deletions(-) diff --git a/NEWS b/NEWS index ae831e3..6265e2f 100644 --- a/NEWS +++ b/NEWS @@ -38,6 +38,8 @@ post-v1.8.0 - The autopath action. - Interface type "null". - Numeric values for reserved ports (see "ovs-ofctl" note above). +- Tunneling requires the version of the kernel module paired with this + release (or a newer release). v1.8.0 - xx xxx diff --git a/lib/odp-util.c b/lib/odp-util.c index 9ed17ed..ac0042c 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -925,6 +925,30 @@ parse_odp_key_attr(const char *s, const struct simap *port_names, } { +char tun_id_s[32]; +unsigned long long int flags; +int tos, ttl; +struct ovs_key_ipv4_tunnel tun_key; +int n = -1; + +if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]," + "flags=%lli,src="IP_SCAN_FMT",dst="IP_SCAN_FMT + ",tos=%i,ttl=%i)%n", tun_id_s, &flags, +IP_SCAN_ARGS(&tun_key.ipv4_src), +IP_SCAN_ARGS(&tun_key.ipv4_dst), &tos, &ttl, +&n) > 0 && n > 0) { +tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0)); +tun_key.tun_flags = flags; +tun_key.ipv4_tos = tos; +tun_key.ipv4_ttl = ttl; +memset(&tun_key.pad, 0, sizeof tun_key.pad); +nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL, &tun_key, + sizeof tun_key); +return n; +} +} + +{ unsigned long long int in_port; int n = -1; @@ -1273,8 +1297,18 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow) nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority); } -if (flow->tunnel.tun_id != htonll(0)) { -nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tunnel.tun_id); +if (flow->tunnel.ip_dst != 0) { +struct ovs_key_ipv4_tunnel *tun_key; + +tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL, + sizeof *tun_key); +tun_key->tun_id = flow->tunnel.tun_id; +tun_key->ipv4_src = flow->tunnel.ip_src; +tun_key->ipv4_dst = flow->tunnel.ip_dst; +tun_key->tun_flags = 0; // XXX: Need flags +tun_key->ipv4_tos = flow->tunnel.ip_tos; +tun_key->ipv4_ttl = flow->tunnel.ip_ttl; +memset(&tun_key->pad, 0, sizeof tun_key->pad); } if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) { @@ -1765,9 +1799,17 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY; } -if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) { -flow->tunnel.tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]); -expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID; +if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) { +const struct ovs_key_ipv4_tunnel *tun_key; + +tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]); +flow->tunnel.tun_id = tun_key->tun_id; +flow->tunnel.ip_src = tun_key->ipv4_src; +flow->tunnel.ip_dst = tun_key->ipv4_dst; +flow->tunnel.flags = 0; // XXX need flags +flow->tunnel.ip_tos = tun_key->ipv4_tos; +flow->tunnel.ip_ttl = tun_key->ipv4_ttl; +expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL; } if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) { @@ -1858,16 +1900,27 @@ commit_set_action(struct ofpbuf *odp_actions, enum ovs_key_attr key_type, } static void -commit_set_tun_id_action(const struct flow *flow, struct flow *base, +commit_set_tunnel_action(const struct flow *flow, struct flow *base, struct ofpbuf *odp_actions) { -if (base->tunnel.tun_id == flow->tunnel.tun_id) { +struct ovs_key_ipv4_tunnel tun_key; + +if (!memcmp(&base->tunnel, &flow->tunnel, sizeof base->tunnel)) { return; } -base->tunnel.tun_id = flow->tunnel.tun_id; -commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID, - &base->tunnel.tun_id, sizeof(base->tunnel.tun_id)); +memcpy(&base->tunnel, &flow->tunnel, sizeof base->tunnel); + +tun_key.tun_id = base->tunnel.tun_id; +tun_key.ip
[ovs-dev] [RFC PATCH v2 3/5] netdev-dummy: Add tunnel variation.
In order to be able to unit test the tunnel flow setup code we need to be able to generate netdevs that appear to be like tunnel devices. This adds a dummy_tunnel netdev with two differences from the normal dummy devices: the ability to set and read config and a different type to identify them as needing tunnel translation. Signed-off-by: Jesse Gross --- lib/netdev-dummy.c | 88 1 file changed, 88 insertions(+) diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c index 6aa4084..bb65c12 100644 --- a/lib/netdev-dummy.c +++ b/lib/netdev-dummy.c @@ -29,6 +29,7 @@ #include "packets.h" #include "poll-loop.h" #include "shash.h" +#include "smap.h" #include "sset.h" #include "unixctl.h" #include "vlog.h" @@ -37,6 +38,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_dummy); struct netdev_dev_dummy { struct netdev_dev netdev_dev; +struct smap config; uint8_t hwaddr[ETH_ADDR_LEN]; int mtu; struct netdev_stats stats; @@ -94,6 +96,7 @@ netdev_dummy_create(const struct netdev_class *class, const char *name, netdev_dev = xzalloc(sizeof *netdev_dev); netdev_dev_init(&netdev_dev->netdev_dev, name, class); +smap_init(&netdev_dev->config); netdev_dev->hwaddr[0] = 0xaa; netdev_dev->hwaddr[1] = 0x55; netdev_dev->hwaddr[2] = n >> 24; @@ -119,12 +122,35 @@ netdev_dummy_destroy(struct netdev_dev *netdev_dev_) { struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); +smap_destroy(&netdev_dev->config); shash_find_and_delete(&dummy_netdev_devs, netdev_dev_get_name(netdev_dev_)); free(netdev_dev); } static int +netdev_dummy_get_config(struct netdev_dev *netdev_dev_, struct smap *args) +{ +struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); + +smap_destroy(args); +smap_clone(args, &netdev_dev->config); + +return 0; +} + +static int +netdev_dummy_set_config(struct netdev_dev *netdev_dev_, const struct smap *args) +{ +struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); + +smap_destroy(&netdev_dev->config); +smap_clone(&netdev_dev->config, args); + +return 0; +} + +static int netdev_dummy_open(struct netdev_dev *netdev_dev_, struct netdev **netdevp) { struct netdev_dev_dummy *netdev_dev = netdev_dev_dummy_cast(netdev_dev_); @@ -370,6 +396,67 @@ static const struct netdev_class dummy_class = { netdev_dummy_change_seq }; +static const struct netdev_class dummy_tunnel_class = { +"dummy_tunnel", +NULL, /* init */ +NULL, /* run */ +NULL, /* wait */ + +netdev_dummy_create, +netdev_dummy_destroy, +netdev_dummy_get_config, +netdev_dummy_set_config, + +netdev_dummy_open, +netdev_dummy_close, + +netdev_dummy_listen, +netdev_dummy_recv, +netdev_dummy_recv_wait, +netdev_dummy_drain, + +NULL, /* send */ +NULL, /* send_wait */ + +netdev_dummy_set_etheraddr, +netdev_dummy_get_etheraddr, +netdev_dummy_get_mtu, +netdev_dummy_set_mtu, +NULL, /* get_ifindex */ +NULL, /* get_carrier */ +NULL, /* get_carrier_resets */ +NULL, /* get_miimon */ +netdev_dummy_get_stats, +netdev_dummy_set_stats, + +NULL, /* get_features */ +NULL, /* set_advertisements */ + +NULL, /* set_policing */ +NULL, /* get_qos_types */ +NULL, /* get_qos_capabilities */ +NULL, /* get_qos */ +NULL, /* set_qos */ +NULL, /* get_queue */ +NULL, /* set_queue */ +NULL, /* delete_queue */ +NULL, /* get_queue_stats */ +NULL, /* dump_queues */ +NULL, /* dump_queue_stats */ + +NULL, /* get_in4 */ +NULL, /* set_in4 */ +NULL, /* get_in6 */ +NULL, /* add_router */ +NULL, /* get_next_hop */ +NULL, /* get_drv_info */ +NULL, /* arp_lookup */ + +netdev_dummy_update_flags, + +netdev_dummy_change_seq +}; + static struct ofpbuf * eth_from_packet_or_flow(const char *s) { @@ -529,4 +616,5 @@ netdev_dummy_register(bool override) sset_destroy(&types); } netdev_register_provider(&dummy_class); +netdev_register_provider(&dummy_tunnel_class); } -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC PATCH v2 4/5] tunnel: Userspace implementation of tunnel manipulation.
The kernel tunneling code currently needs to handle a large number of operations when tunnel packets are encapsulated and decapsulated. Some examples of this are: finding the correct tunnel port on receive, TTL and ToS inheritance, ECN handling, etc. All of these can be done on a per-flow basis in userspace now that we have both the inner and outer header information, which allows us to both simpify the kernel and take advantage of userspace's information. This ports the logic from the kernel to userspace and also pulls in the tunnel-specific configuration handling from netdev-vport.c. Once tunnel packets are redirected into this code, the redundant pieces can be removed from other places. Signed-off-by: Jesse Gross --- NEWS |2 + lib/automake.mk |2 + lib/tunnel.c | 759 ++ lib/tunnel.h | 45 +++ tests/automake.mk|1 + tests/testsuite.at |1 + tests/tunnel.at | 112 vswitchd/vswitch.xml | 14 +- 8 files changed, 925 insertions(+), 11 deletions(-) create mode 100644 lib/tunnel.c create mode 100644 lib/tunnel.h create mode 100644 tests/tunnel.at diff --git a/NEWS b/NEWS index 6265e2f..be401f4 100644 --- a/NEWS +++ b/NEWS @@ -40,6 +40,8 @@ post-v1.8.0 - Numeric values for reserved ports (see "ovs-ofctl" note above). - Tunneling requires the version of the kernel module paired with this release (or a newer release). +- Inheritance of the Don't Fragment bit in IP tunnels (df_inherit) is + no longer supported. v1.8.0 - xx xxx diff --git a/lib/automake.mk b/lib/automake.mk index 94b86f6..9513cd9 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -181,6 +181,8 @@ lib_libopenvswitch_a_SOURCES = \ lib/timeval.h \ lib/token-bucket.c \ lib/token-bucket.h \ + lib/tunnel.c \ + lib/tunnel.h \ lib/type-props.h \ lib/unaligned.h \ lib/unicode.c \ diff --git a/lib/tunnel.c b/lib/tunnel.c new file mode 100644 index 000..2217b83 --- /dev/null +++ b/lib/tunnel.c @@ -0,0 +1,759 @@ +/* Copyright (c) 2012 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include +#include + +#include "byte-order.h" +#include "daemon.h" +#include "dirs.h" +#include "dynamic-string.h" +#include "hash.h" +#include "hmap.h" +#include "packets.h" +#include "smap.h" +#include "socket-util.h" +#include "tunnel.h" +#include "unixctl.h" +#include "vlog.h" + +/* TODO: + * + * Better hooks for tunnel creation/deletion/input/output + * Mechanism to create DP ports + * Ability to generate actions on input for ECN + * Revalidate flows on port add/remove/reconfigure + * Port stats + * Kernel interface needs flags defined (in particular for keys) + * IPsec flag needed? + */ + +VLOG_DEFINE_THIS_MODULE(tunnel); + +#define DEFAULT_TTL 64 + +struct tnl_match { +ovs_be64 in_key; +ovs_be32 ip_src; +ovs_be32 ip_dst; +uint32_t dp_portno; +bool key_present; +bool in_key_flow; +}; +BUILD_ASSERT_DECL(sizeof(struct tnl_match) % 4 == 0); + +struct tnl_port { +struct hmap_node match_node; +struct hmap_node portno_node; + +const struct netdev *netdev; +struct tnl_match match; +ovs_be64 out_key; +uint32_t tnl_portno; +uint8_t ttl; +uint8_t tos; +bool out_key_flow; +bool key_present; +bool ttl_inherit; +bool tos_inherit; +bool dont_fragment; +bool csum; +}; + +static struct hmap tnl_match_map = HMAP_INITIALIZER(&tnl_match_map); +static struct hmap tnl_portno_map = HMAP_INITIALIZER(&tnl_portno_map); + +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); +static struct vlog_rate_limit dbg_rl = VLOG_RATE_LIMIT_INIT(60, 60); + +static int tnl_config_get(const struct netdev *, struct tnl_port *); +static struct tnl_port *tnl_find(struct tnl_match *); +static struct tnl_port *tnl_find_exact(struct tnl_match *); +static uint32_t tnl_hash(struct tnl_match *); +static void tnl_match_fmt(const struct tnl_match *, struct ds *); +static unixctl_cb_func tnl_port_dump; +static void tnl_port_mod_log(uint32_t tnl_portno, const char *action); +static struct tnl_port *tnl_portno_find(uint32_t tnl_portno); + +void +tnl_init(void) +{ +static bool inited; + +if (inited) { +return; +} +inited = true; + +unixctl_command_register("tunnel/show", "[interface]", 0, 1, + tnl_port_dump, N
[ovs-dev] [RFC PATCH v2 5/5] ofproto-dpif: Basic hooks for rewriting tunnel packets.
This is a basic set of hooks into the ofproto-dpif code so that tunneled packets can be have their flows rewritten to reflect the outward facing view of how ports are laid out. This is far from being a comprehensive layer and should be replaced once a better level of indirection is available. Signed-off-by: Jesse Gross --- ofproto/ofproto-dpif.c | 40 1 file changed, 32 insertions(+), 8 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index ff6df41..6fbe406 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -50,6 +50,7 @@ #include "poll-loop.h" #include "simap.h" #include "timer.h" +#include "tunnel.h" #include "unaligned.h" #include "unixctl.h" #include "vlan-bitmap.h" @@ -803,6 +804,8 @@ construct(struct ofproto *ofproto_) hmap_init(&ofproto->vlandev_map); hmap_init(&ofproto->realdev_vid_map); +tnl_init(); + hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node, hash_string(ofproto->up.name, 0)); memset(&ofproto->stats, 0, sizeof ofproto->stats); @@ -2536,6 +2539,11 @@ port_add(struct ofproto *ofproto_, struct netdev *netdev, uint16_t *ofp_portp) error = dpif_port_add(ofproto->dpif, netdev, &odp_port); if (!error) { +error = tnl_port_add(odp_port, odp_port, netdev); +if (error) { +dpif_port_del(ofproto->dpif, odp_port); +return error; +} *ofp_portp = odp_port_to_ofp_port(odp_port); } return error; @@ -2545,9 +2553,12 @@ static int port_del(struct ofproto *ofproto_, uint16_t ofp_port) { struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_); +uint16_t odp_port = ofp_port_to_odp_port(ofp_port); int error; -error = dpif_port_del(ofproto->dpif, ofp_port_to_odp_port(ofp_port)); +tnl_port_del(odp_port); + +error = dpif_port_del(ofproto->dpif, odp_port); if (!error) { struct ofport_dpif *ofport = get_ofp_port(ofproto, ofp_port); if (ofport) { @@ -3025,8 +3036,9 @@ handle_flow_miss(struct ofproto_dpif *ofproto, struct flow_miss *miss, * an ODP_FIT_* value that indicates how well 'key' fits our expectations for * what a flow key should contain. * - * This function also includes some logic to help make VLAN splinters - * transparent to the rest of the upcall processing logic. In particular, if + * This function also includes some logic to help with ports that have special + * meaning to their flows such as tunnels and VLAN splinters. Tunnels may + * map their source and destination addresses to a logical port whereas if * the extracted in_port is a VLAN splinter port, it replaces flow->in_port by * the "real" port, sets flow->vlan_tci correctly for the VLAN of the VLAN * splinter port, and pushes a VLAN header onto 'packet' (if it is nonnull). @@ -3043,13 +3055,21 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto, struct ofpbuf *packet) { enum odp_key_fitness fitness; +bool adjusted; +int err; fitness = odp_flow_key_to_flow(key, key_len, flow); if (fitness == ODP_FIT_ERROR) { return fitness; } -*initial_tci = flow->vlan_tci; +err = tnl_port_receive(flow, &adjusted); +if (err) { +/* XXX: We should really install a drop flow here. */ +return ODP_FIT_ERROR; +} + +*initial_tci = flow->vlan_tci; if (vsp_adjust_flow(ofproto, flow)) { if (packet) { /* Make the packet resemble the flow, so that it gets sent to an @@ -3069,10 +3089,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto, eth_push_vlan(packet, flow->vlan_tci); } -/* Let the caller know that we can't reproduce 'key' from 'flow'. */ -if (fitness == ODP_FIT_PERFECT) { -fitness = ODP_FIT_TOO_MUCH; -} +adjusted = true; +} + +/* Let the caller know that we can't reproduce 'key' from 'flow'. */ +if (adjusted && fitness == ODP_FIT_PERFECT) { +fitness = ODP_FIT_TOO_MUCH; } return fitness; @@ -4981,6 +5003,8 @@ compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port, if (out_port != odp_port) { ctx->flow.vlan_tci = htons(0); } +out_port = tnl_port_send(&ctx->flow, out_port); + commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions); nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port); -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Threaded userspace datapath
To me, it was that the existing code mostly worked and had already proved to be a real performance improvement. This coupled with the limited manpower we could pour on the project. I understand that turning a single-threaded program into a multi-threaded one looks scary, and with reason. On the other end, reworking the main loop of the bridge looks just as scary, and there is still the risk to be disappointed by the final performance. More positively, this approach only affects the userlevel datapath, which is also the one that really needs a performance boost. A change in the main loop of the bridge would be much more intrusive, I think, since all datapaths would likely need to be touched. Il giorno 28/set/2012, alle ore 20:48, Jesse Gross ha scritto: > I haven't been following this work all that closely but earlier on > there was a discussion of how the performance benefits might be > related to how poll is used. Restructuring that seems like a much > more desirable path but I never saw any real follow up. What was the > reason for continuing to go down this path? Dr. Ing. Giuseppe Lettieri Dipartimento di Ingegneria della Informazione Universita' di Pisa Largo Lucio Lazzarino 2, 56122 Pisa - Italy Ph. : (+39) 050-2217.649 (direct) .599 (switch) Fax : (+39) 050-2217.600 e-mail: g.letti...@iet.unipi.it ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery) wrote: > On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote: >> Now that both the kernel and userspace can handle information about >> the tunnel outer headers this adds userspace support for communicating >> between the two. At the moment userspace doesn't know how to do >> anything with the extra information on receive and will only generate >> actions containing tun_id. However, both sides know how to ignore the >> extra information. >> >> Signed-off-by: Jesse Gross > > > The patch I just submitted overlaps with this patch a little bit. I think > there wasn't > a really set delineation between what I was doing and what you were doing, so > we may have to merge this patch with mine. I think we both pretty much made > similar changes though, so it shouldn't be too bad. Acking this patch for now, > assuming we'll work through the merge bits. I agree the changes are pretty similar. My inclination is that it's a little easier to use the version in my patch if that's OK with you for a couple of reasons: * I'm not sure that the change logically belongs in the second patch of your series. It could be moved into the first patch though. * There isn't much benefit to userspace supporting both models of kernel tunnel implementation because the kernel will reject any actions that it doesn't understand. As a result, once this patch goes in we will immediately require a new version of the kernel module. Ideally I'd like to push that requirement back as late as practical so that we can start applying patches without worrying about breaking compatibility more than necessary. However, when we do put it in, we might as well go all the way and drop support for the old mechanism in userspace. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote: > On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery) > wrote: >> On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote: >>> Now that both the kernel and userspace can handle information about >>> the tunnel outer headers this adds userspace support for communicating >>> between the two. At the moment userspace doesn't know how to do >>> anything with the extra information on receive and will only generate >>> actions containing tun_id. However, both sides know how to ignore the >>> extra information. >>> >>> Signed-off-by: Jesse Gross >> >> >> The patch I just submitted overlaps with this patch a little bit. I think >> there wasn't >> a really set delineation between what I was doing and what you were doing, so >> we may have to merge this patch with mine. I think we both pretty much made >> similar changes though, so it shouldn't be too bad. Acking this patch for >> now, >> assuming we'll work through the merge bits. > > I agree the changes are pretty similar. My inclination is that it's a > little easier to use the version in my patch if that's OK with you for > a couple of reasons: > * I'm not sure that the change logically belongs in the second patch > of your series. It could be moved into the first patch though. > * There isn't much benefit to userspace supporting both models of > kernel tunnel implementation because the kernel will reject any > actions that it doesn't understand. As a result, once this patch goes > in we will immediately require a new version of the kernel module. > Ideally I'd like to push that requirement back as late as practical so > that we can start applying patches without worrying about breaking > compatibility more than necessary. However, when we do put it in, we > might as well go all the way and drop support for the old mechanism in > userspace. That makes sense to me. Do you want me to rebase my first patch and fold this one into that? I can then rework my second patch based on this. Or even, we could move this patch as a second patch in my series, and I could just rebase my third patch. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Fri, Sep 28, 2012 at 3:44 PM, Kyle Mestery (kmestery) wrote: > On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote: >> On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery) >> wrote: >>> On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote: Now that both the kernel and userspace can handle information about the tunnel outer headers this adds userspace support for communicating between the two. At the moment userspace doesn't know how to do anything with the extra information on receive and will only generate actions containing tun_id. However, both sides know how to ignore the extra information. Signed-off-by: Jesse Gross >>> >>> >>> The patch I just submitted overlaps with this patch a little bit. I think >>> there wasn't >>> a really set delineation between what I was doing and what you were doing, >>> so >>> we may have to merge this patch with mine. I think we both pretty much made >>> similar changes though, so it shouldn't be too bad. Acking this patch for >>> now, >>> assuming we'll work through the merge bits. >> >> I agree the changes are pretty similar. My inclination is that it's a >> little easier to use the version in my patch if that's OK with you for >> a couple of reasons: >> * I'm not sure that the change logically belongs in the second patch >> of your series. It could be moved into the first patch though. >> * There isn't much benefit to userspace supporting both models of >> kernel tunnel implementation because the kernel will reject any >> actions that it doesn't understand. As a result, once this patch goes >> in we will immediately require a new version of the kernel module. >> Ideally I'd like to push that requirement back as late as practical so >> that we can start applying patches without worrying about breaking >> compatibility more than necessary. However, when we do put it in, we >> might as well go all the way and drop support for the old mechanism in >> userspace. > > That makes sense to me. Do you want me to rebase my first patch and fold this > one into that? I can then rework my second patch based on this. Or even, we > could move this patch as a second patch in my series, and I could just rebase > my third patch. I actually don't think that much needs to be done - this patch depends on the previous patch to do all the userspace flow changes and we don't really want to roll all of them together. I would just drop the odp-util.c changes from your second patch and then your patches fit in right before my series. That will avoid the conflict and shouldn't have any dependency problems. I haven't had a chance to review your patches that closely yet but my impression is that the bulk of the second patch doesn't have any changes visible to userspace. Is that right? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [RFC PATCH 2/5] odp-util: Enable communicating outer tunnel to/from the kernel.
On Sep 28, 2012, at 5:54 PM, Jesse Gross wrote: > On Fri, Sep 28, 2012 at 3:44 PM, Kyle Mestery (kmestery) > wrote: >> On Sep 28, 2012, at 5:40 PM, Jesse Gross wrote: >>> On Fri, Sep 28, 2012 at 11:59 AM, Kyle Mestery (kmestery) >>> wrote: On Sep 28, 2012, at 1:12 PM, Jesse Gross wrote: > Now that both the kernel and userspace can handle information about > the tunnel outer headers this adds userspace support for communicating > between the two. At the moment userspace doesn't know how to do > anything with the extra information on receive and will only generate > actions containing tun_id. However, both sides know how to ignore the > extra information. > > Signed-off-by: Jesse Gross The patch I just submitted overlaps with this patch a little bit. I think there wasn't a really set delineation between what I was doing and what you were doing, so we may have to merge this patch with mine. I think we both pretty much made similar changes though, so it shouldn't be too bad. Acking this patch for now, assuming we'll work through the merge bits. >>> >>> I agree the changes are pretty similar. My inclination is that it's a >>> little easier to use the version in my patch if that's OK with you for >>> a couple of reasons: >>> * I'm not sure that the change logically belongs in the second patch >>> of your series. It could be moved into the first patch though. >>> * There isn't much benefit to userspace supporting both models of >>> kernel tunnel implementation because the kernel will reject any >>> actions that it doesn't understand. As a result, once this patch goes >>> in we will immediately require a new version of the kernel module. >>> Ideally I'd like to push that requirement back as late as practical so >>> that we can start applying patches without worrying about breaking >>> compatibility more than necessary. However, when we do put it in, we >>> might as well go all the way and drop support for the old mechanism in >>> userspace. >> >> That makes sense to me. Do you want me to rebase my first patch and fold this >> one into that? I can then rework my second patch based on this. Or even, we >> could move this patch as a second patch in my series, and I could just rebase >> my third patch. > > I actually don't think that much needs to be done - this patch depends > on the previous patch to do all the userspace flow changes and we > don't really want to roll all of them together. I would just drop the > odp-util.c changes from your second patch and then your patches fit in > right before my series. That will avoid the conflict and shouldn't > have any dependency problems. > OK, got it. > I haven't had a chance to review your patches that closely yet but my > impression is that the bulk of the second patch doesn't have any > changes visible to userspace. Is that right? That's correct. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches
On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman wrote: > On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote: >> On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur wrote: >> > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross wrote: >> >> On Thu, Sep 27, 2012 at 9:26 AM, Ben Pfaff wrote: >> >> > On Wed, Sep 26, 2012 at 04:13:41PM +0900, Simon Horman wrote: >> >> >> This patch provides an implementation of the non-datapath portions >> >> >> of MPLS matches and actions. >> >> >> >> >> >> This patch is based on top of Ben Pfaff's series, >> >> >> "set-field action support" >> >> >> >> >> >> Cc: Isaku Yamahata >> >> >> Cc: Ravi K >> >> > >> >> > I think Ravi's full last name is "Kerur". >> >> > >> >> >> Signed-off-by: Simon Horman >> >> > >> >> > Jesse, do you have any concerns about this? I haven't read past the >> >> > diffstat yet. But if it seems like a reasonable intermediate approach >> >> > then I'm happy to review it. >> >> >> >> I don't think there is inherently anything wrong in starting with a >> >> userspace-only approach. I have a couple of specific concerns based >> >> on briefly skimming the patch: >> >> * It seems like this is really the userspace half of the code which >> >> assumes that the kernel portions will still be doing the work on the >> >> actual packet flows. If that's the case then I don't think that >> >> userspace support can go in independently. Otherwise, userspace >> >> should really be self-contained and setup slow-path flows to do the >> >> work itself. >> > >> > >> > mpls userspace is completely self-contained i.e. doesn't depend on OVS >> > kernel code. I am saying this based on implementation and testing. During >> > testing no OVS kernel module was loaded and testing such as ping, iperf, >> > netperf and scp executed. >> >> At the very least, userspace-only code shouldn't add anything to >> either the userspace/kernel interface or odp-util.c. I also don't see >> how any MPLS action will actually get processed. I do see that MPLS >> actions send packets to the local port and then install a flow but I >> think there is a misunderstanding because that doesn't make a lot of >> sense to me. > > That portion is probably my handiwork. Could you give some guidance on > a method that would work? I'm more than happy to rework things. Currently this is changing the userspace flow to reflect any modifications that we want to make. This is normally correct because then later when we want to actually output a packet (where any modifications would get noticed) we do a comparison from the last output and the current flow and generate any kernel actions necessary. However, in this case we don't have any kernel actions to do MPLS modifications so the flow changes just get ignored. Instead, what we need to do is set up a slow path flow to explicitly send all MPLS packets to userspace if there is any MPLS match or action. We don't currently have anything in this category but it would use the same infrastructure as CFM and LACP which also need to go to userspace since they are actually consumed there. Once the MPLS packets are in userspace, it should actually modify them directly rather than just change the flow. >> >> * If it is truly userspace only, the handling of multiple levels of tags >> >> seems a little incomplete since we actually have the full packet. >> > >> > >> > Can you elaborate please? >> >> Multiple levels of tags aren't handled, for example, when you pop a tag. > > Do you think that adding support for multiple levels of tags to the kernel > is appropriate. Or should such cases be bounced to user-space? The kernel should support enough tags to handle the common cases. Initially I would make this 1 but I could see it becoming 2 in the future. However, regardless of how many tags we support it's always possible that a greater number of tags are used than we support. I can see three ways to handle this: 1. Don't support it. You can run into a similar problem with vlan tags because it's possible to pop tags off and then do further operations but the kernel doesn't look deep enough into the packet. There isn't really a mechanism to deal with this today and to my knowledge nobody has actually complained. 2. Handle in userspace. It's certainly a reasonable approach for corner cases. 3. Some form of recirculation. Allowing the kernel to look at a packet, pop some tags off, and then look at it again allows processing an infinite number of tags in a manner that won't significantly affect throughput (although each round will result in more flow setups). It also allows use cases like pop MPLS tag and do IP routing, which is both fairly common and can't be realistically achieved any other way in kernel. If the short term goal is a kernel implementation, I would do #1 since it's much simpler, is a strict increase in functionality from what we have today and doesn't preclude doing #3 in the future. If you want something more complete then you can handle the
Re: [ovs-dev] [PATCH v2.0] Non-Datapath MPLS actions and matches
On Fri, Sep 28, 2012 at 6:25 AM, ravi kerur wrote: > On Thu, Sep 27, 2012 at 5:44 PM, Simon Horman wrote: >> On Thu, Sep 27, 2012 at 04:45:22PM -0700, Jesse Gross wrote: >> > On Thu, Sep 27, 2012 at 12:05 PM, ravi kerur wrote: >> > > On Thu, Sep 27, 2012 at 11:16 AM, Jesse Gross >> > >> * If it is truly userspace only, the handling of multiple levels of >> > >> tags >> > >> seems a little incomplete since we actually have the full packet. >> > > >> > > >> > > Can you elaborate please? >> > >> > Multiple levels of tags aren't handled, for example, when you pop a tag. >> >> Do you think that adding support for multiple levels of tags to the kernel >> is appropriate. Or should such cases be bounced to user-space? > > > multiple tags do work in both userspace and kernel i.e. any combination > of push/pop actions. I am not sure what the confusion is? > > Simon: I have not looked into your patch, has anything changed here? This patch is not the same as yours. Please actually read it before making assertions. >> > >> If this is supposed to be a quick stepping stone to kernel support >> > >> that seems less important since we will no longer have complete >> > >> packet >> > >> access. >> > >> >> > >> So it basically comes down to what the short term plans are. There's >> > >> also Leo's patch (which I haven't looked at) that I can post if there >> > >> are plans to do kernel support. >> > > >> > > >> > > so will there be separate/different kernel interface? >> > >> > Separate from what? > > > Our earlier disagreement was on packet recirculation for multiple > actions. You mention about Leo's patch and I assumed it contains packet > recirculation support and you want mpls changes for that new piece of kernel > datapath and wanted clarification on that. All I'm looking for is something either implements recirculation or at the very least doesn't preclude it from being done in the future. The patch that I was referring to takes the latter approach and implements just the basic operations in the kernel. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Threaded userspace datapath
On Fri, Sep 28, 2012 at 3:17 PM, Giuseppe Lettieri wrote: > To me, it was that the existing code mostly worked and had already proved to > be a real performance improvement. This coupled with the limited manpower we > could pour on the project. I understand that turning a single-threaded > program into a multi-threaded one looks scary, and with reason. On the other > end, reworking the main loop of the bridge looks just as scary, and there is > still the risk to be disappointed by the final performance. > > More positively, this approach only affects the userlevel datapath, which is > also the one that really needs a performance boost. A change in the main loop > of the bridge would be much more intrusive, I think, since all datapaths > would likely need to be touched. Does this actually require changing the main loop? If it really is the poll related changes that you mention it seems like they could be contained to the netdevs and userspace datapath. You could handle a batch of packets each time through the main loop. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev