Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Tue, Jan 15, 2013 at 11:05:45PM -0800, Ben Pfaff wrote: > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > > diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > > index 5e32965..b421753 100644 > > --- a/include/linux/openvswitch.h > > +++ b/include/linux/openvswitch.h > > @@ -282,6 +282,7 @@ enum ovs_key_attr { > > OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > > OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > > OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > > + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > > OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > > OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > > __OVS_KEY_ATTR_MAX > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > another value that we're already aiming to get upstream. I'd suggest a > value of 62. > > Jesse, does that sound reasonable to you? Another alternative would be > to avoid modifying entirely, one way or another, > until we have real kernel support, which would be a little extra work. I'll use 62 unless I hear otherwise from Jesse. > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. I'll switch over to using OVS_ACTION_SET+OVS_KEY_ATTR_MPLS, it seems to be in keeping with the existing code. > > > +/* Action structure for NXAST_PUSH_VLAN/MPLS. */ > > +struct nx_action_push { > > +ovs_be16 type; /* OFPAT_PUSH_VLAN/MPLS. */ > > The above two comments are wrong, there's no NXAST_PUSH_VLAN. And the > struct should be named nx_action_push_mpls, not more generically. (This > must be a leftover from Ravi's initial patch that also added QinQ.) > > > +ovs_be16 len; /* Length is 8. */ > > +ovs_be32 vendor;/* NX_VENDOR_ID. */ > > +ovs_be16 subtype; /* NXAST_PUSH_MPLS. */ > > +ovs_be16 ethertype; /* Ethertype */ > > +uint8_t pad[4]; > > +}; > > +OFP_ASSERT(sizeof(struct nx_action_push) == 16); > > > @@ -1271,6 +1272,20 @@ dp_netdev_execute_actions(struct dp_netdev *dp, > > eth_pop_vlan(packet); > > break; > > > > +case OVS_ACTION_ATTR_PUSH_MPLS: { > > +const struct ovs_action_push_mpls *mpls = nl_attr_get(a); > > +push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_label); > > +break; > > The similar OVS_ACTION_ATTR_PUSH_VLAN case declares its variable at the > outer block. Please make the code consistent one way or another on this > score. > > The breakdown of code between parse_mpls() and parse_remaining_mpls() > seems odd. I'd just write one function. > > In mf_is_value_valid(), I think that the MFF_MPLS_TC and MFF_MPLS_BOS > cases should check the u8 member, not be32. Same for > mf_random_value(). Also htonl won't be needed for u8. > > Most of the meta-flow functions present the cases in the order label, > tc, bos, but mf_set_value() uses a different order. Can you make it > consistent? > > In enum mf_prereqs, it might be best to add an "L2.5" category for MPLS. > > This adds a double blank line in nx_put_raw(). (Oh the horror!) > > Also in nx_put_raw(), missing space after the comma here: > +nxm_put_8(b,OXM_OF_MPLS_TC, mpls_lse_to_tc(flow->mpls_lse)); > > parse_l3_onward() has a ";;": > +ovs_be16 dl_type = flow->dl_type;; > > parse_l3_onward() could use flow_innermost_dl_type() since that's what > it's effectively calculating as 'dl_type' (maybe it should be > 'inner_dl_type' or 'innermost_dl_type'). > > Is there any sense in handling cases in commit_mpls_action() where the > mpls_depth has to change by more than 1? Should we log an error at > least? > > The comment on struct ofpact_push is getting ahead of ourselves, since > we have an OFPACT_PUSH_VLAN but it doesn't use this structure and no one > has even mentioned PBB for OVS yet, at least not to me: > > +/* OFPACT_PUSH_VLAN/MPLS/PBB > + * > + * used for NXAST_PUSH_MPLS, OFPAT13_PUSH_VLAN/MPLS/PBB */ > +struct ofpact_push { > +struct ofpact ofpact; > +ovs_be16 ethertype; > +}; > > You can remove the bit about "negotiation of an OF1.3 session is not yet > supported" from this comment in ofputil_usable_protocols(): > +/* NXM and OF1.3+ support matching MPLS label */ > +/* Allow for OF1.2 as there doesn't seem to be a > + * particularly good reason not to and negotiation > + * of an OF1.3 session is not yet supported. */ > > In ofputil_normalize_match__() should we also mask off mpls_depth if > !MAY_MPLS? I guess so. > > It's getting late here so I'll resume looking at this patch starting at > lib/packets.c next chance I get. All the above seems reasonable. I've started working on making it so. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] python: Workaround UNIX socket path length limits
>From aa28e8bfb646a54ba91e3545f3c0b9db39eddb7f Mon Sep 17 00:00:00 2001 From: James Page Date: Wed, 16 Jan 2013 10:52:59 + Subject: [PATCH] python: Workaround UNIX socket path length limits To: dev@openvswitch.org UNIX socket paths have a limit in terms of length. On Linux this is 103 characters. This is a problem in Debian/Ubuntu buildds which can generate quite long paths. This patch works around this issue by detecting when socket connection/binding fails due to this issue and accessing the socket path using a file descriptor path in /proc/self/fd. The workaround is limited to Linux. This is based on similar code in the C parts of OpenvSwitch. Signed-off-by: James Page --- python/ovs/socket_util.py | 39 +-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/python/ovs/socket_util.py b/python/ovs/socket_util.py index 8fecbc7..7bfefc4 100644 --- a/python/ovs/socket_util.py +++ b/python/ovs/socket_util.py @@ -70,9 +70,44 @@ def make_unix_socket(style, nonblock, bind_path, connect_path): return 0, sock except socket.error, e: sock.close() -if bind_path is not None: +if (bind_path is not None and +os.path.exists(bind_path)): ovs.fatal_signal.unlink_file_now(bind_path) -return get_exception_errno(e), None +eno = ovs.socket_util.get_exception_errno(e) +if (eno == "AF_UNIX path too long" and +os.uname()[0] == "Linux"): +short_connect_path = None +short_bind_path = None +connect_dirfd = None +bind_dirfd = None +# Try workaround using /proc/self/fd +if connect_path is not None: +dirname = os.path.dirname(connect_path) +basename = os.path.basename(connect_path) +try: +connect_dirfd = os.open(dirname, os.O_DIRECTORY | os.O_RDONLY) +except OSError, err: +return get_exception_errno(e), None +short_connect_path = "/proc/self/fd/%d/%s" % (connect_dirfd, basename) + +if bind_path is not None: +dirname = os.path.dirname(bind_path) +basename = os.path.basename(bind_path) +try: +bind_dirfd = os.open(dirname, os.O_DIRECTORY | os.O_RDONLY) +except OSError, err: +return get_exception_errno(e), None +short_bind_path = "/proc/self/fd/%d/%s" % (bind_dirfd, basename) + +try: +return make_unix_socket(style, nonblock, short_bind_path, short_connect_path) +finally: +if connect_dirfd is not None: +os.close(connect_dirfd) +if bind_dirfd is not None: +os.close(bind_dirfd) +else: +return get_exception_errno(e), None def check_connection_completion(sock): -- 1.8.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] datapath: support Linux 3.8 kernel
>From cce3dd47ae7d9b1e15c79c82a68bee8f24e18c36 Mon Sep 17 00:00:00 2001 From: James Page Date: Mon, 14 Jan 2013 12:23:58 + Subject: [PATCH] datapath: support Linux 3.8 kernel To: dev@openvswitch.org Add Linux 3.8 kernel to the range of supported kernel versions. Signed-off-by: James Page --- datapath/datapath.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index ed69af8..b2854bc 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -61,8 +61,8 @@ #include "vport-internal_dev.h" #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18) || \ -LINUX_VERSION_CODE >= KERNEL_VERSION(3,8,0) -#error Kernels before 2.6.18 or after 3.7 are not supported by this version of Open vSwitch. +LINUX_VERSION_CODE >= KERNEL_VERSION(3,9,0) +#error Kernels before 2.6.18 or after 3.8 are not supported by this version of Open vSwitch. #endif #define REHASH_FLOW_INTERVAL (10 * 60 * HZ) -- 1.8.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] OpenvSwitch for Ubuntu Raring/Kernel 3.8
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hi Ben On 15/01/13 05:53, Ben Pfaff wrote: >>> Also is there a timescale around shipping ovs 1.9? > We had a brief internal discussion about this today. We recommend > version 1.9 to anyone planning a release on the basis of what Open > vSwitch has available now or in the next few months. We should be > releasing OVS 1.9 really soon (within a week or two, I hope). I > don't know of anything we're working on in the 1.9 branch, so it > probably won't change further barring discovery of bugs (which > could just as well get discovered after release and end up in a > 1.9.x release). It would be perfectly reasonable to prepare for a > release on the basis of the current branch-1.9. Great - have done so; I've submitted two patches for the master branch to support this activity; one to workaround long UNIX socket path limitations which where causing test failures in the python code when building in sbuild (there was a similar problem a while back in the C code) and one to extend the range of supported kernels to include 3.8. I also cherry picked commit 2520f4528742decf78a8b375f5389b50977f5e4b to support the new header file layout for kernels >= 3.7. I've done some basic testing of this version on the 3.8 kernel in Ubuntu raring and things looks OK. > What I really want to discourage (even though I doubt you are > considering it) is doing anything based on current master, because > there's been a lot of big changes in internal structure and I'm > sure more bugs will be discovered there before it's really ready. Acknowledged - never had any intent todo this. Thanks for the guidance - much appreciated. Cheers James - -- James Page Ubuntu Core Developer Debian Maintainer james.p...@ubuntu.com -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCAAGBQJQ9o7/AAoJEL/srsug59jDXW8P/1VEcg1qPdqvWfCKfCvMoeB6 /nIfLmiEDVGCZr74O0npp9daByJtBKsOcpFHKZHLH0Ch3FJdsne5IQ1Qad9AUlcz /s8cwNQJhvlOWlGYhMe0Vb48YD8ScgXwCwC2FnlPfmEyhFkreCPkvF0iMmc1kQiw PjvqkWM+mKgWEPLfApUN/zdA6tEG+AeOo2DUTYv365DZSPd7blgF5nqS6TMgxfQH 9qSHQcmMq8uVxZ4l/maOrgbN8ugGxGdWdr9m/A0cPQQud96U1SY2m4ZHm9+fiNz7 nigt6Qigm+6krp3kq9lpcv/LnJRjLBPQKiV1fnIA2x0r4+rZrC/Xs7sJPPC9gL74 L/xEObPiqkQ1/VFh1FkgLiKIz6RLha6cmMmp4t/BOclHpMG0wCucsfAvToiUh1SE g8VJRwFBBovCcRV3HEa9Pam5UyxR/W5hYZa5EOZoZvN+pQDm7RyMg2KsTbwCqc0t tuw9kGOklIkhcPf4Hm7yzRBGek0g/BStTa0/k/EczkZNzbiFeBUO1vDkmrKMwUnk CpWBNL/nDmHmjp70UdR1WWt8o7rPmzctIcy+L6YpRPVe1IL+HweBhi2WaR4o8WqR quahpodw6J056Yj9FJNuJKTFDaOpOLnJdXtP9JAju6fE4guhyl+eEkFvQV3fXEoF Jp+J0rs7kuyf7BFT0rmT =4PkH -END PGP SIGNATURE- ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] netdev: Add NULL get_tunnel_config for BSD.
Commit f431bf7d78f3212d32bb3d122f783c5c796a1576 added new method get_tunnel_config to netdev_class but did not update netdev-bsd. Signed-off-by: Ed Maste --- lib/netdev-bsd.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 9094d04..538f91a 100644 --- a/lib/netdev-bsd.c +++ b/lib/netdev-bsd.c @@ -1255,6 +1255,7 @@ const struct netdev_class netdev_bsd_class = { netdev_bsd_destroy, NULL, /* get_config */ NULL, /* set_config */ +NULL, /* get_tunnel_config */ netdev_bsd_open_system, netdev_bsd_close, @@ -1315,6 +1316,7 @@ const struct netdev_class netdev_tap_class = { netdev_bsd_destroy, NULL, /* get_config */ NULL, /* set_config */ +NULL, /* get_tunnel_config */ netdev_bsd_open_system, netdev_bsd_close, -- 1.7.11.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] OpenvSwitch for Ubuntu Raring/Kernel 3.8
On Wed, Jan 16, 2013 at 3:29 AM, James Page wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Hi Ben > > On 15/01/13 05:53, Ben Pfaff wrote: Also is there a timescale around shipping ovs 1.9? >> We had a brief internal discussion about this today. We recommend >> version 1.9 to anyone planning a release on the basis of what Open >> vSwitch has available now or in the next few months. We should be >> releasing OVS 1.9 really soon (within a week or two, I hope). I >> don't know of anything we're working on in the 1.9 branch, so it >> probably won't change further barring discovery of bugs (which >> could just as well get discovered after release and end up in a >> 1.9.x release). It would be perfectly reasonable to prepare for a >> release on the basis of the current branch-1.9. > > Great - have done so; I've submitted two patches for the master branch > to support this activity; one to workaround long UNIX socket path > limitations which where causing test failures in the python code when > building in sbuild (there was a similar problem a while back in the C > code) and one to extend the range of supported kernels to include 3.8. > > I also cherry picked commit 2520f4528742decf78a8b375f5389b50977f5e4b > to support the new header file layout for kernels >= 3.7. branch-1.9 was actually already supposed to support Linux 3.7 so I put that commit directly on the branch now. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/3] tests: Make test for Linux /proc/self/fd more straightforward.
Presumably we can test for this Linux feature just by seeing whether the directory is there. Another goal is to shorten the code because I intend to make another copy of it in an upcoming commit, to add a similar test for Python. Signed-off-by: Ben Pfaff --- tests/library.at | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/tests/library.at b/tests/library.at index 0765c3f..3334d23 100644 --- a/tests/library.at +++ b/tests/library.at @@ -124,17 +124,7 @@ dnl is about 100 bytes. On Linux, we work around this by indirecting through dnl a directory fd using /proc/self/fd/. We do not have a workaround dnl for other platforms, so we skip the test there. AT_SETUP([test unix socket -- long pathname]) -AT_CHECK([dnl -case `uname` in dnl ( - *[[lL]]inux*) -exit 0 -;; dnl ( - *) -dnl Magic exit code to tell Autotest to skip this test. -exit 77 -;; -esac -]) +AT_SKIP_IF([test ! -d /proc/self/fd]) dnl Linux has a 108 byte limit; this is 150 bytes long. mkdir 012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 cd 012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 3/3] tests: Add test for Python version of long socket name workaround.
Signed-off-by: Ben Pfaff --- tests/library.at | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/tests/library.at b/tests/library.at index 4afd913..b0fe3c9 100644 --- a/tests/library.at +++ b/tests/library.at @@ -114,7 +114,7 @@ m4_foreach( AT_CHECK([test-util testname], [0], [], []) AT_CLEANUP]) -AT_SETUP([test unix socket -- short pathname]) +AT_SETUP([test unix socket, short pathname - C]) AT_CHECK([test-unix-socket x]) AT_CLEANUP @@ -123,7 +123,7 @@ dnl go in a fixed-length field in struct sockaddr_un. Generally the limit dnl is about 100 bytes. On Linux, we work around this by indirecting through dnl a directory fd using /proc/self/fd/. We do not have a workaround dnl for other platforms, so we skip the test there. -AT_SETUP([test unix socket -- long pathname]) +AT_SETUP([test unix socket, long pathname - C]) AT_SKIP_IF([test ! -d /proc/self/fd]) dnl Linux has a 108 byte limit; this is 150 bytes long. longname=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 @@ -131,3 +131,23 @@ mkdir $longname cd $longname AT_CHECK([test-unix-socket ../$longname/socket socket]) AT_CLEANUP + +AT_SETUP([test unix socket, short pathname - Python]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +AT_CHECK([$PYTHON $srcdir/test-unix-socket.py x]) +AT_CLEANUP + +dnl Unix sockets with long names are problematic because the name has to +dnl go in a fixed-length field in struct sockaddr_un. Generally the limit +dnl is about 100 bytes. On Linux, we work around this by indirecting through +dnl a directory fd using /proc/self/fd/. We do not have a workaround +dnl for other platforms, so we skip the test there. +AT_SETUP([test unix socket, long pathname - Python]) +AT_SKIP_IF([test $HAVE_PYTHON = no]) +AT_SKIP_IF([test ! -d /proc/self/fd]) +dnl Linux has a 108 byte limit; this is 150 bytes long. +longname=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 +mkdir $longname +cd $longname +AT_CHECK([$PYTHON $abs_srcdir/test-unix-socket.py ../$longname/socket socket]) +AT_CLEANUP -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] datapath: support Linux 3.8 kernel
On Wed, Jan 16, 2013 at 3:19 AM, James Page wrote: > From cce3dd47ae7d9b1e15c79c82a68bee8f24e18c36 Mon Sep 17 00:00:00 2001 > From: James Page > Date: Mon, 14 Jan 2013 12:23:58 + > Subject: [PATCH] datapath: support Linux 3.8 kernel > To: dev@openvswitch.org > > Add Linux 3.8 kernel to the range of supported kernel versions. > > Signed-off-by: James Page Applied, thanks. I decided to apply this to branch-1.9 as well since there are no actual code changes. Since it will be a LTS release this will help extend its life. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] python: Workaround UNIX socket path length limits
On Wed, Jan 16, 2013 at 11:12:09AM +, James Page wrote: > From aa28e8bfb646a54ba91e3545f3c0b9db39eddb7f Mon Sep 17 00:00:00 2001 > From: James Page > Date: Wed, 16 Jan 2013 10:52:59 + > Subject: [PATCH] python: Workaround UNIX socket path length limits > To: dev@openvswitch.org > > UNIX socket paths have a limit in terms of length. On Linux > this is 103 characters. > > This is a problem in Debian/Ubuntu buildds which can generate > quite long paths. > > This patch works around this issue by detecting when socket > connection/binding fails due to this issue and accessing the > socket path using a file descriptor path in /proc/self/fd. > > The workaround is limited to Linux. > > This is based on similar code in the C parts of OpenvSwitch. > > Signed-off-by: James Page Applied. This patch didn't add a test like the one for the existing C code, so I sent out an implementation, starting here: http://openvswitch.org/pipermail/dev/2013-January/024535.html ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 2/3] tests: Make long name more readable by introducing a shell variable.
Signed-off-by: Ben Pfaff --- tests/library.at |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tests/library.at b/tests/library.at index 3334d23..4afd913 100644 --- a/tests/library.at +++ b/tests/library.at @@ -126,7 +126,8 @@ dnl for other platforms, so we skip the test there. AT_SETUP([test unix socket -- long pathname]) AT_SKIP_IF([test ! -d /proc/self/fd]) dnl Linux has a 108 byte limit; this is 150 bytes long. -mkdir 012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 -cd 012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 -AT_CHECK([test-unix-socket ../012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789/socket socket]) +longname=012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 +mkdir $longname +cd $longname +AT_CHECK([test-unix-socket ../$longname/socket socket]) AT_CLEANUP -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] netdev: Add NULL get_tunnel_config for BSD.
On Wed, Jan 16, 2013 at 09:57:52AM -0500, Ed Maste wrote: > Commit f431bf7d78f3212d32bb3d122f783c5c796a1576 added new method > get_tunnel_config to netdev_class but did not update netdev-bsd. > > Signed-off-by: Ed Maste Applied to master. I'm sorry that I missed this in review. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] bridge: Remove restriction on socket name.
On Tue, Jan 15, 2013 at 12:23:46PM -0800, Pavithra Ramesh wrote: > Following patch removes restriction on the listening socket name that gets > configured as bridge controller. Currently, we only connect to sockets in a > specific directory with the name of the bridge. This patch removes the > restriction on the bridge name, keeping the directory restriction. > > Bug #14029 > > Signed-off-by: Pavithra Ramesh This regresses a bit since it switched from equal_pathnames() to just strncmp(). How about this version instead. --8<--cut here-->8-- From: Pavithra Ramesh Date: Tue, 15 Jan 2013 12:23:46 -0800 Subject: [PATCH] bridge: Remove restriction on socket name. Following patch removes restriction on the listening socket name that gets configured as bridge controller. Currently, we only connect to sockets in a specific directory with the name of the bridge. This patch removes the restriction on the bridge name, keeping the directory restriction. Bug #14029. Signed-off-by: Pavithra Ramesh Signed-off-by: Ben Pfaff --- vswitchd/bridge.c | 60 ++--- 1 file changed, 43 insertions(+), 17 deletions(-) diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 348faef..c4ef9ea 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -2730,11 +2730,18 @@ bridge_configure_local_iface_netdev(struct bridge *br, /* Returns true if 'a' and 'b' are the same except that any number of slashes * in either string are treated as equal to any number of slashes in the other, - * e.g. "x///y" is equal to "x/y". */ + * e.g. "x///y" is equal to "x/y". + * + * Also, if 'b_stoplen' bytes from 'b' are found to be equal to corresponding + * bytes from 'a', the function considers this success. Specify 'b_stoplen' as + * SIZE_MAX to compare all of 'a' to all of 'b' rather than just a prefix of + * 'b' against a prefix of 'a'. + */ static bool -equal_pathnames(const char *a, const char *b) +equal_pathnames(const char *a, const char *b, size_t b_stoplen) { -while (*a == *b) { +const char *b_start = b; +while (b - b_start < b_stoplen && *a == *b) { if (*a == '/') { a += strspn(a, "/"); b += strspn(b, "/"); @@ -2792,21 +2799,40 @@ bridge_configure_remotes(struct bridge *br, static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); char *whitelist; -whitelist = xasprintf("unix:%s/%s.controller", +if (!strncmp(c->target, "unix:", 5)) { +/* Connect to a listening socket */ +whitelist = xasprintf("unix:%s/", ovs_rundir()); +if (!equal_pathnames(c->target, whitelist, + strlen(whitelist))) { +VLOG_ERR_RL(&rl, "bridge %s: Not connecting to socket " + "controller \"%s\" due to possibility for " + "remote exploit. Instead, specify socket " + "in whitelisted \"%s\" or connect to " + "\"unix:%s/%s.mgmt\" (which is always " + "available without special configuration).", + br->name, c->target, whitelist, ovs_rundir(), br->name); -if (!equal_pathnames(c->target, whitelist)) { -/* Prevent remote ovsdb-server users from accessing arbitrary - * Unix domain sockets and overwriting arbitrary local - * files. */ -VLOG_ERR_RL(&rl, "bridge %s: Not adding Unix domain socket " -"controller \"%s\" due to possibility for remote " -"exploit. Instead, specify whitelisted \"%s\" or " -"connect to \"unix:%s/%s.mgmt\" (which is always " -"available without special configuration).", -br->name, c->target, whitelist, -ovs_rundir(), br->name); -free(whitelist); -continue; +free(whitelist); +continue; +} +} else { + whitelist = xasprintf("punix:%s/%s.controller", + ovs_rundir(), br->name); + if (!equal_pathnames(c->target, whitelist, SIZE_MAX)) { + /* Prevent remote ovsdb-server users from accessing +* arbitrary Unix domain sockets and overwriting arbitrary +* local files. */ + VLOG_ERR_RL(&rl, "bridge %s: Not adding Unix domain socket " + "controller \"%s\" due to possibility of " + "overwriting local files. Instead, specify " + "whitelisted \"%s\" or conne
[ovs-dev] freight forwarder & logistics provider shared photos with you
Dear My Friend Nice day, Hyun Young is a leading professional freight forwarder and logistics provider who focus on the shipment from South China to all the world. Hyun Young started freight forwarding operation at Shenzhen in 2004. Based at Shenzhen, our ambition have pushed us forward to expand to other cities in south of China. Now we have capacity of handing shipment to or from all the ports in south of China. Holds while whole - heartedly achieves the best enterprise objective, With the great support of our global agency, we provide services to our customers through process-driven operation team, advanced information system, and strong management team. Glance to our company: 1. Sea Freight, included FCL&LCL; 2. Air Freight; 3. Express, included DHL,UPS,FEDEX,SAGAWA and SCOREJP; 4. Import & Export; 5. Land Transportation. We seek no strongest only more specialized, senior. Your satisfied will be our maximal pride. Shenzhen Hyun Young International Transportation CO.,LTD Jacky Yang Add: Floor 7&8, South Bao’an Road, Luohu District, Shenzhen, Guangdong, China. <>___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h >> index 5e32965..b421753 100644 >> --- a/include/linux/openvswitch.h >> +++ b/include/linux/openvswitch.h >> @@ -282,6 +282,7 @@ enum ovs_key_attr { >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ >> __OVS_KEY_ATTR_MAX > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > another value that we're already aiming to get upstream. I'd suggest a > value of 62. > > Jesse, does that sound reasonable to you? Another alternative would be > to avoid modifying entirely, one way or another, > until we have real kernel support, which would be a little extra work. It's not ideal but I think it's fine for the time being. Incidentally, Pravin is working on something similar from the kernel side to allow transformation of Netlink tunnel attributes into something more efficient for packet processing. To ensure that there is no leakage, I would do two things: wrap all MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't have any handling code into odp-util.c. That way if another new attribute later starts using this temporary value then an old version of OVS userspace will ignore it rather than trying to interprete it as MPLS. If we do that then we can just put the MPLS attribute directly after IPV4_TUNNEL and new attributes would be inserted in between. The MPLS value could change but that shouldn't matter because there the value would be entirely internal. This seems easier to deal with than allocating at both ends. > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. For the PUSH/POP actions I think we also have the same issue with inserting in the middle of the list and can use the same solution. In the comment about the EtherType above struct ovs_action_push_mpls I would also say that non-MPLS EtherTypes will get rejected rather than result in a corrupt packet since that seems like better behavior. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [ovs-discuss] which ovs version will support the action set_field
To expand on my answer: You will have to change it to use ovs-vsctl. On Wed, Jan 16, 2013 at 07:32:08PM +, Henkel, Michael wrote: > Nope, this gives me an "operation not supported" when starting the > container... lxc seems to use the brctl utility for adding a veth to a > bridge... > > -Original Message- > From: Ben Pfaff [mailto:b...@nicira.com] > Sent: Mittwoch, 16. Januar 2013 20:30 > To: Henkel, Michael > Cc: disc...@openvswitch.org > Subject: Re: [ovs-discuss] which ovs version will support the action set_field > > On Wed, Jan 16, 2013 at 06:40:11PM +, Henkel, Michael wrote: > > Do you have any suggestion on how to get a lxc interface connected to > > the ovs without the brcompat module? > > ovs-vsctl? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] bridge: Remove restriction on socket name.
On Wed, Jan 16, 2013 at 09:54:02AM -0800, Ben Pfaff wrote: > On Tue, Jan 15, 2013 at 12:23:46PM -0800, Pavithra Ramesh wrote: > > Following patch removes restriction on the listening socket name that gets > > configured as bridge controller. Currently, we only connect to sockets in a > > specific directory with the name of the bridge. This patch removes the > > restriction on the bridge name, keeping the directory restriction. > > > > Bug #14029 > > > > Signed-off-by: Pavithra Ramesh > > This regresses a bit since it switched from equal_pathnames() to just > strncmp(). How about this version instead. Pavithra told me it looked good off-list, so I applied this to master. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ovs-vsctl: Add --if-exists option to many database commands.
On Tue, Jan 15, 2013 at 04:57:27PM -0800, Gurucharan Shetty wrote: > On Tue, Jan 15, 2013 at 1:24 PM, Ben Pfaff wrote: > > A few ovs-vsctl commands have accepted --if-exists options for some time, > > to make it possible to execute them in cases where it doesn't really > > matter if the records they touch exist. This commit adds this option to > > other commands. > > > > This is intended for initial use with "ovs-vsctl set interface > > ofport_request=" commands in ovs-ctl for upgrades from OVS 1.9 > > to later versions. > > There is a whitespace error. Otherwise looks good. Thanks. I will > re-spin my patches. Thanks, I applied this to master. Oops, I forgot to fix the whitespace error. Sorry about that. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 1/2] ovs-save: Add a helper command to maintain ofport value.
On Tue, Jan 15, 2013 at 06:39:23PM -0800, Gurucharan Shetty wrote: > v1-v2: Changed commit message and added the --if-exists option to ovs-vsctl. Thanks, looks good. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 2/2] ovs-ctl.in: Restore ofport values across force-reload-kmod.
On Tue, Jan 15, 2013 at 06:39:24PM -0800, Gurucharan Shetty wrote: > v1-v2: > - Save ofports only if we are upgrading from a pre-1.10 branch. > - Change commit message. Thanks, looks good. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] datapath: More flexible kernel/userspace tunneling attribute.
Thanks for comments. I posted patch according to comments except refactoring netlink-parsing. I will post separate patch for that. --Pravin. On Tue, Jan 15, 2013 at 10:38 AM, Jesse Gross wrote: > On Fri, Jan 11, 2013 at 5:23 PM, Pravin B Shelar wrote: >> diff --git a/datapath/datapath.c b/datapath/datapath.c >> index 30e26a7..4ed00cf 100644 >> --- a/datapath/datapath.c >> +++ b/datapath/datapath.c >> static int validate_sample(const struct nlattr *attr, >> - const struct sw_flow_key *key, int depth) >> + const struct sw_flow_key *key, int depth, >> + struct sw_flow_actions *sfa) >> { >> const struct nlattr *attrs[OVS_SAMPLE_ATTR_MAX + 1]; >> const struct nlattr *probability, *actions; >> @@ -451,7 +453,7 @@ static int validate_sample(const struct nlattr *attr, >> actions = attrs[OVS_SAMPLE_ATTR_ACTIONS]; >> if (!actions || (nla_len(actions) && nla_len(actions) < NLA_HDRLEN)) >> return -EINVAL; >> - return validate_actions(actions, key, depth + 1); >> + return validate_and_copy_actions(actions, key, depth + 1, sfa); > > I think this will result in the inner actions being output before the > sample attribute itself. > >> +static int nla_put(struct sw_flow_actions *sfa, int attrtype, void >> *data, int len) > > I would probably give this a name other than nla_put since it makes it > sound like it is part of the Netlink library itself. > >> +{ >> + struct nlattr *a = (struct nlattr *) ((unsigned char *)sfa->actions >> + sfa->used); >> + >> + if (NLA_ALIGN(nla_attr_size(len)) > (sfa->actions_len - sfa->used)); >> + return -EMSGSIZE; > > What if we made actions_len the current length of the attributes that > we've put and then used ksize() to check for overflow? That way we > wouldn't need sfa->used and actions_len will always reflect the > correct data. > >> +static int validate_and_copy_set_tun(const struct nlattr *attr, >> +struct sw_flow_actions *sfa) >> +{ >> + struct ovs_key_ipv4_tunnel tun_key; >> + int err; >> + >> + err = ipv4_tun_from_nlattr(attr, &tun_key); >> + if (err) >> + return err; >> + >> + err = nla_put(sfa, OVS_KEY_ATTR_IPV4_TUNNEL, &tun_key, >> sizeof(tun_key)); >> + if (err) >> + return err; > > I think this will result in only having the OVS_KEY_ATTR_IPV4_TUNNEL > and not the wrapping set attribute. > >> @@ -600,12 +653,16 @@ static int validate_actions(const struct nlattr *attr, >> }; >> const struct ovs_action_push_vlan *vlan; >> int type = nla_type(a); >> + bool set_tun; >> + >> > > Extra blank line. > >> @@ -634,13 +691,13 @@ static int validate_actions(const struct nlattr *attr, >> break; >> >> case OVS_ACTION_ATTR_SET: >> - err = validate_set(a, key); >> + err = validate_set(a, key, sfa, &set_tun); >> if (err) >> return err; >> break; >> >> case OVS_ACTION_ATTR_SAMPLE: >> - err = validate_sample(a, key, depth); >> + err = validate_sample(a, key, depth, sfa); > > I think if we have a sample attribute with a nested tunnel attribute > then we will get both the original translated tunnel key and the > original full sample attribute. > > Don't we need to translate this back the other way as well? If we > there is a query for a flow we should return the original value. > >> @@ -951,9 +1013,16 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff >> *skb, struct genl_info *info) >> >> /* Validate actions. */ >> if (a[OVS_FLOW_ATTR_ACTIONS]) { >> - error = validate_actions(a[OVS_FLOW_ATTR_ACTIONS], &key, 0); >> - if (error) >> + acts = ovs_flow_actions_alloc(a[OVS_FLOW_ATTR_ACTIONS]); >> + error = PTR_ERR(acts); >> + if (IS_ERR(acts)) >> + goto error; >> + >> + error = validate_and_copy_actions(a[OVS_FLOW_ATTR_ACTIONS], >> &key, 0, acts); >> + if (error) { >> + ovs_flow_deferred_free_acts(acts); > > We could just kfree() the actions here, no need to use RCU. > > Doesn't this leak the actions if we later encounter an error? > >> @@ -1026,21 +1087,14 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff >> *skb, struct genl_info *info) >> /* Update actions. */ >> old_acts = rcu_dereference_protected(flow->sf_acts, >> lockdep_genl_is_held()); >> - acts_attrs = a[OVS_FLOW_ATTR_ACTIONS]; >> - if (acts_attrs && >> - (old_acts->actions_len != nla_len(ac
[ovs-dev] [PATCH] v2 datapath: More flexible kernel/userspace tunneling attribute.
Fixed according to comments from Jesse, v1-v2: - Fixed sample action. - Fixed set ipv4-tunnel. - Converted tunnel flags to netlink flags. - Fixed dump-flow. --8<--cut here-->8-- Following patch breaks down single ipv4_tunnel netlink attribute into individual member attributes. It will help when we extend tunneling parameters in future. Signed-off-by: Pravin B Shelar Bug #14611 --- datapath/datapath.c | 327 +++ datapath/flow.c | 172 +-- datapath/flow.h | 16 ++- datapath/tunnel.h |5 + include/linux/openvswitch.h | 31 ++-- lib/dpif-netdev.c |2 +- lib/odp-util.c | 236 ++- tests/odp.at| 11 +- 8 files changed, 582 insertions(+), 218 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index ed69af8..4ed40e2 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -423,16 +423,83 @@ static int flush_flows(struct datapath *dp) return 0; } -static int validate_actions(const struct nlattr *attr, - const struct sw_flow_key *key, int depth); +static struct nlattr *reserve_sfa_size(struct sw_flow_actions **sfa, int attr_len) +{ + + struct sw_flow_actions *new; + struct nlattr *a; + + if (NLA_ALIGN(attr_len) <= (ksize(*sfa) - (*sfa)->actions_len)) + goto out; + + if (ksize(*sfa) * 2 > MAX_ACTIONS_BUFSIZE) + return ERR_PTR(-EMSGSIZE); + + new = ovs_flow_actions_alloc(ksize(*sfa) * 2); + if (IS_ERR(new)) + return (void *)new; + + memcpy(new->actions, (*sfa)->actions, (*sfa)->actions_len); + new->actions_len = (*sfa)->actions_len; + kfree(*sfa); + *sfa = new; + +out: + a = (struct nlattr *) ((unsigned char *)(*sfa)->actions + + (*sfa)->actions_len); + (*sfa)->actions_len = (*sfa)->actions_len + NLA_ALIGN(attr_len); + return a; +} -static int validate_sample(const struct nlattr *attr, - const struct sw_flow_key *key, int depth) +static int add_action(struct sw_flow_actions **sfa, int attrtype, void *data, int len) +{ + struct nlattr *a; + + a = reserve_sfa_size(sfa, nla_attr_size(len)); + if (IS_ERR(a)) + return PTR_ERR(a); + + a->nla_type = attrtype; + a->nla_len = nla_attr_size(len); + + if (data) + memcpy(nla_data(a), data, len); + memset((unsigned char *) a + a->nla_len, 0, nla_padlen(len)); + + return 0; +} + +static inline int add_nested_action_start(struct sw_flow_actions **sfa, int attrtype) +{ + int used = (*sfa)->actions_len; + int err; + + err = add_action(sfa, attrtype, NULL, 0); + if (err) + return err; + + return used; +} + +static inline void add_nested_action_end(struct sw_flow_actions *sfa, int st_offset) +{ + struct nlattr *a = (struct nlattr *) ((unsigned char *)sfa->actions + st_offset); + + a->nla_len = sfa->actions_len - st_offset; +} + +static int validate_and_copy_actions(const struct nlattr *attr, + const struct sw_flow_key *key, int depth, + struct sw_flow_actions **sfa); + +static int validate_and_copy_sample(const struct nlattr *attr, + const struct sw_flow_key *key, int depth, + struct sw_flow_actions **sfa) { const struct nlattr *attrs[OVS_SAMPLE_ATTR_MAX + 1]; const struct nlattr *probability, *actions; const struct nlattr *a; - int rem; + int rem, start, err, st_acts; memset(attrs, 0, sizeof(attrs)); nla_for_each_nested(a, attr, rem) { @@ -451,7 +518,26 @@ static int validate_sample(const struct nlattr *attr, actions = attrs[OVS_SAMPLE_ATTR_ACTIONS]; if (!actions || (nla_len(actions) && nla_len(actions) < NLA_HDRLEN)) return -EINVAL; - return validate_actions(actions, key, depth + 1); + + /* validation done, copy sample action. */ + start = add_nested_action_start(sfa, OVS_ACTION_ATTR_SAMPLE); + if (start < 0) + return start; + err = add_action(sfa, OVS_SAMPLE_ATTR_PROBABILITY, nla_data(probability), sizeof(u32)); + if (err) + return err; + st_acts = add_nested_action_start(sfa, OVS_SAMPLE_ATTR_ACTIONS); + if (st_acts < 0) + return st_acts; + + err = validate_and_copy_actions(actions, key, depth + 1, sfa); + if (err) + return err; + + add_nested_action_end(*sfa, st_acts); + add_nested_action_end(*sfa, start); + + return 0; } static int validate_tp_port(const struct sw_flow_key *flow_key) @@ -467,8 +553,30 @@ static int validat
Re: [ovs-dev] [PATCH] ofproto: Optimise OpenFlow flow expiry
On Tue, Jan 15, 2013 at 01:20:57PM +0900, Simon Horman wrote: > Optimise OpenFlow flow expiry by placing expirable flows on a list. > This optimises scanning of flows for expiry in two ways: > > * Empirically list traversal appears faster than the code it replaces. > > With 1,000,000 flows present an otherwise idle system I observed CPU > utilisation of around 20% with the existing code but around 10% with > this new code. > > * Flows that will never expire are not traversed. > > This addresses a case seen in the field. This version looks better. I still have a few comments, but before that, may I ask a little bit about the situation in which the performance improvement was observed? In this situation, about how many of the 1,000,000 flows were actually expirable, that is, had either a hard timeout or an idle timeout? That is, is the performance improvement due more to the first or the second bullet you list above? If none of the flows were expirable, then I guess it was the second; if all of them were, then I guess it was the first; and otherwise it is something in between. Basically I'm wondering if we should do something to make flow table traversal faster, independent of expiration. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [murmurhash 4/4] hash: Replace primary hash functions by murmurhash.
I might consider putting hash_3words() near hash_2words() in the header file. Mostly just for consistency sake. In hash_3words(), you're finishing with a byte count of 8 instead 12. I may be misinterpreting what that value is for though. Acked-by: Ethan Jackson On Fri, Dec 14, 2012 at 4:33 PM, Ben Pfaff wrote: > murmurhash is faster than Jenkins and slightly higher quality, so switch to > it for hashing words. > > The best timings I got for hashing for data lengths of the following > numbers of 32-bit words, in seconds per 1,000,000,000 hashes, were: > > words murmurhash Jenkins hash > - -- >1 8.4 10.4 >2 10.3 10.3 >3 11.2 10.7 >4 12.6 18.0 >5 13.9 18.3 >6 15.2 18.7 > > In other words, murmurhash outperforms Jenkins for all input lengths other > than exactly 3 32-bit words (12 bytes). (It's understandable that Jenkins > would have a best case at 12 bytes, because Jenkins works in 12-byte > chunks.) Even in the case where Jenkins is faster, it's only by 5%. On > average within this data set, murmurhash is 15% faster, and for 4-word > input it is 30% faster. > > We retain Jenkins for flow_hash_symmetric_l4() and flow_hash_fields(), > which are cases where the hash value is exposed externally. > > Signed-off-by: Ben Pfaff > --- > lib/automake.mk |2 + > lib/flow.c|5 +- > lib/hash.c| 89 ++ > lib/hash.h| 82 +++--- > lib/jhash.c | 125 > + > lib/jhash.h | 44 +++ > tests/test-hash.c | 27 ++- > 7 files changed, 229 insertions(+), 145 deletions(-) > create mode 100644 lib/jhash.c > create mode 100644 lib/jhash.h > > diff --git a/lib/automake.mk b/lib/automake.mk > index 740f33f..4547198 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -61,6 +61,8 @@ lib_libopenvswitch_a_SOURCES = \ > lib/hmap.h \ > lib/hmapx.c \ > lib/hmapx.h \ > + lib/jhash.c \ > + lib/jhash.h \ > lib/json.c \ > lib/json.h \ > lib/jsonrpc.c \ > diff --git a/lib/flow.c b/lib/flow.c > index 89488e5..314a194 100644 > --- a/lib/flow.c > +++ b/lib/flow.c > @@ -31,6 +31,7 @@ > #include "csum.h" > #include "dynamic-string.h" > #include "hash.h" > +#include "jhash.h" > #include "match.h" > #include "ofpbuf.h" > #include "openflow/openflow.h" > @@ -699,7 +700,7 @@ flow_hash_symmetric_l4(const struct flow *flow, uint32_t > basis) > fields.tp_port = flow->tp_src ^ flow->tp_dst; > } > } > -return hash_bytes(&fields, sizeof fields, basis); > +return jhash_bytes(&fields, sizeof fields, basis); > } > > /* Hashes the portions of 'flow' designated by 'fields'. */ > @@ -710,7 +711,7 @@ flow_hash_fields(const struct flow *flow, enum > nx_hash_fields fields, > switch (fields) { > > case NX_HASH_FIELDS_ETH_SRC: > -return hash_bytes(flow->dl_src, sizeof flow->dl_src, basis); > +return jhash_bytes(flow->dl_src, sizeof flow->dl_src, basis); > > case NX_HASH_FIELDS_SYMMETRIC_L4: > return flow_hash_symmetric_l4(flow, basis); > diff --git a/lib/hash.c b/lib/hash.c > index 8cee5d0..2fad911 100644 > --- a/lib/hash.c > +++ b/lib/hash.c > @@ -18,57 +18,11 @@ > #include > #include "unaligned.h" > > -/* Returns the hash of the 'n' 32-bit words at 'p', starting from 'basis'. > - * 'p' must be properly aligned. */ > -uint32_t > -hash_words(const uint32_t *p, size_t n, uint32_t basis) > -{ > -uint32_t a, b, c; > - > -a = b = c = 0xdeadbeef + (((uint32_t) n) << 2) + basis; > - > -while (n > 3) { > -a += p[0]; > -b += p[1]; > -c += p[2]; > -hash_mix(&a, &b, &c); > -n -= 3; > -p += 3; > -} > - > -switch (n) { > -case 3: > -c += p[2]; > -/* fall through */ > -case 2: > -b += p[1]; > -/* fall through */ > -case 1: > -a += p[0]; > -hash_final(&a, &b, &c); > -/* fall through */ > -case 0: > -break; > -} > -return c; > -} > - > /* Returns the hash of 'a', 'b', and 'c'. */ > uint32_t > hash_3words(uint32_t a, uint32_t b, uint32_t c) > { > -a += 0xdeadbeef; > -b += 0xdeadbeef; > -c += 0xdeadbeef; > -hash_final(&a, &b, &c); > -return c; > -} > - > -/* Returns the hash of 'a' and 'b'. */ > -uint32_t > -hash_2words(uint32_t a, uint32_t b) > -{ > -return hash_3words(a, b, 0); > +return mhash_finish(mhash_add(mhash_add(mhash_add(a, 0), b), c), 8); > } > > /* Returns the hash of the 'n' bytes at 'p', starting from 'basis'. */ > @@ -76,37 +30,30 @@ uint32_t > hash_bytes(const void *p_, size_t n, uint32_t basis) > { > const uint8_t *p = p_; > -
Re: [ovs-dev] [murmurhash 1/4] tests: Fix dependencies on hash function in ofproto-dpif tests.
Acked-by: Ethan Jackson On Tue, Jan 15, 2013 at 9:49 PM, Ben Pfaff wrote: > On Tue, Jan 15, 2013 at 01:16:13PM -0800, Ethan Jackson wrote: >> Is there any reason we couldn't simply convert these tests to use the >> ADD_OF_PORTS macro? > > No. Somehow this macro had passed me by. > > Revised version follows. > > --8<--cut here-->8-- > > From: Ben Pfaff > Date: Tue, 15 Jan 2013 21:48:25 -0800 > Subject: [PATCH] tests: Fix dependencies on hash function in ofproto-dpif > tests. > > These tests relied on luck to ensure that OpenFlow ports received the > expected OpenFlow port numbers. With a different hash function, or (I > expect) on a big-endian architecture, the port numbers were assigned > differently and the tests failed. > > This commit fixes the problem by requesting the specific expected port > numbers explicitly. > > Signed-off-by: Ben Pfaff > --- > tests/ofproto-dpif.at | 104 > ++--- > 1 file changed, 38 insertions(+), 66 deletions(-) > > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > index fd66d24..a14c412 100644 > --- a/tests/ofproto-dpif.at > +++ b/tests/ofproto-dpif.at > @@ -601,15 +601,12 @@ AT_CLEANUP > > > AT_SETUP([ofproto-dpif - mirroring, select_all]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy --\ > -add-port br0 p2 -- set Interface p2 type=dummy --\ > -add-port br0 p3 -- set Interface p3 type=dummy --\ > +OVS_VSWITCHD_START > +ADD_OF_PORTS([br0], 1, 2, 3) > +ovs-vsctl \ > set Bridge br0 mirrors=@m --\ > --id=@p3 get Port p3 --\ > ---id=@m create Mirror name=mymirror \ > -select_all=true output_port=@p3], [<0> > -]) > +--id=@m create Mirror name=mymirror select_all=true output_port=@p3 > > AT_DATA([flows.txt], [dnl > in_port=1 actions=output:2 > @@ -634,15 +631,12 @@ AT_CLEANUP > > > AT_SETUP([ofproto-dpif - mirroring, select_src]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy --\ > -add-port br0 p2 -- set Interface p2 type=dummy --\ > -add-port br0 p3 -- set Interface p3 type=dummy --\ > +OVS_VSWITCHD_START > +ADD_OF_PORTS([br0], 1, 2, 3) > +ovs-vsctl \ > set Bridge br0 mirrors=@m --\ > --id=@p1 get Port p1 -- --id=@p3 get Port p3 --\ > ---id=@m create Mirror name=mymirror \ > -select_src_port=@p1 output_port=@p3], [<0> > -]) > +--id=@m create Mirror name=mymirror select_src_port=@p1 > output_port=@p3 > > AT_DATA([flows.txt], [dnl > in_port=1 actions=output:2 > @@ -665,14 +659,12 @@ OVS_VSWITCHD_STOP > AT_CLEANUP > > AT_SETUP([ofproto-dpif - mirroring, OFPP_NONE ingress port]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy --\ > -add-port br0 p2 -- set Interface p2 type=dummy --\ > +OVS_VSWITCHD_START > +ADD_OF_PORTS([br0], 1, 2) > +ovs-vsctl \ > set Bridge br0 mirrors=@m --\ > --id=@p2 get Port p2 --\ > ---id=@m create Mirror name=mymirror \ > -select_all=true output_port=@p2], [<0> > -]) > +--id=@m create Mirror name=mymirror select_all=true output_port=@p2 > > AT_CHECK([ovs-ofctl add-flow br0 action=output:1]) > > @@ -688,15 +680,12 @@ AT_CLEANUP > > > AT_SETUP([ofproto-dpif - mirroring, select_dst]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1 --\ > -add-port br0 p2 -- set Interface p2 type=dummy ofport_request=2 --\ > -add-port br0 p3 -- set Interface p3 type=dummy ofport_request=3 --\ > +OVS_VSWITCHD_START > +ADD_OF_PORTS([br0], 1, 2, 3) > +ovs-vsctl \ > set Bridge br0 mirrors=@m --\ > --id=@p2 get Port p2 -- --id=@p3 get Port p3 --\ > ---id=@m create Mirror name=mymirror \ > -select_dst_port=@p2 output_port=@p3], [<0> > -]) > +--id=@m create Mirror name=mymirror select_dst_port=@p2 > output_port=@p3 > > AT_DATA([flows.txt], [dnl > in_port=1 actions=output:2 > @@ -721,15 +710,12 @@ AT_CLEANUP > > > AT_SETUP([ofproto-dpif - mirroring, select_vlan]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy --\ > -add-port br0 p2 -- set Interface p2 type=dummy --\ > -add-port br0 p3 -- set Interface p3 type=dummy --\ > +OVS_VSWITCHD_START > +ADD_OF_PORTS([br0], 1, 2, 3) > +ovs-vsctl \ > set Bridge br0 mirrors=@m --\ > --id=@p2 get Port p2 -- --id=@p3 get Port p3 --\ > ---id=@m create Mirror name=mymirror \ > -select_all=true select_vlan=11 output_port=@p3], [<0> > -]) > +--id=@m create Mirror name=mymirror select_all=true select_vlan=11 > output_port=@p3 > > AT_DATA([flows.txt], [dnl > in_port=1, actions=output:2 > @@ -759,15 +745,12 @@ AT_CLEANUP > > > AT_SETUP([ofproto-dpif - mirroring, output_port]) > -OVS_VSWITCHD_START( > - [add-port br0 p1 -- set Interface p1 type=dummy --\ > -
Re: [ovs-dev] [cleanups 02/12] dpif-linux.c: Let the kernel pick a port number if one not requested.
On Jan 15, 2013, at 4:01 PM, Gurucharan Shetty wrote: > If I run the following 3 commands, vswitchd segfaults. > ovs-vsctl add-br br1 > ovs-vsctl add-port br1 gre1 -- set interface gre1 type=gre > options:remote_ip=192.168.1.1 > ovs-vsctl add-port br1 gre2 -- set interface gre2 type=gre > options:remote_ip=192.168.1.1 Thanks, Guru. I'll send out a patch in a bit. --Justin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [patch_ports 1/4] ofproto-dpif: Create single global revalidation state.
On Mon, Jan 14, 2013 at 07:50:14PM -0800, Ethan Jackson wrote: > Before this patch, the "need_revalidate" flag and the > "revalidate_set" tag_set where maintained separately for each > ofproto. This won't work in future patches when a flow table > change in one ofproto can require revalidation in another entirely > separate ofproto. For this reason, this patch makes these flags > global for all ofprotos in ofproto-dpif. > > Signed-off-by: Ethan Jackson I think it would be wise to include some explanatory comments on the new global variables. I think we discussed this at one point but I don't remember the conclusion: did you consider making these per-dpif_backer rather than global? It is pretty rare to have more than one backer, so maybe that is the logic. In making the revalidation logic global you reversed the order of resetting need_revalidate and calling facet_revalidate(). The original order was intentional to ensure that, should revalidating a facet in turn force revalidation, that later revalidation would actually occur. Thanks for the minor simplification in bundle_flush_macs(). ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [patch_ports 2/4] ofproto-dpif: Refresh stats before ovs-appctl dpif/dump-flows.
On Mon, Jan 14, 2013 at 07:50:15PM -0800, Ethan Jackson wrote: > As a matter of convenience, this patch refreshes the statistics > when ovs-appctl dpif/dump-flows is called. Hopefully this will > prevent users from being confused because statistics they were > expecting to see haven't made it from the datapath into the switch > yet. > > Signed-off-by: Ethan Jackson Seems reasonable. I'm mildly concerned about a command to read state actually mutating state behind the scenes, but I don't think it's a big risk in this case. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [patch_ports 3/4] ofproto-dpif: Implement patch ports in userspace.
On Mon, Jan 14, 2013 at 07:50:16PM -0800, Ethan Jackson wrote: > This commit moves responsibility for implementing patch ports from > the datapath to ofproto-dpif. There are two main reasons to do > this. > > The first is a matter of design: ofproto-dpif both has more > information than the datapath, and is better suited to handle the > complexity required to implement patch ports. > > The second is performance. My setup is a virtual machine with two > basic learning bridges connected by patch ports. I used > ovs-benchmark to ping the virtual router IP residing outside the > VM. Over a 60 second run, "ovs-benchmark rate" improves from > 14618.1 to 19311.9 transactions per second, or a 32% improvement. > Similarly, "ovs-benchmark latency" improves from 6ms to 4ms. > > Signed-off-by: Ethan Jackson I get lots of rejects against current master, would you mind rebasing and reposting? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [patch_ports 3/4] ofproto-dpif: Implement patch ports in userspace.
On Tue, Jan 15, 2013 at 01:39:43PM -0800, Ethan Jackson wrote: > I've folded this incremental in. > > --- > FAQ | 12 +++- > NEWS |1 + > 2 files changed, 4 insertions(+), 9 deletions(-) > > diff --git a/FAQ b/FAQ > index 6d14fc8..b9f655b 100644 > --- a/FAQ > +++ b/FAQ > @@ -172,17 +172,11 @@ A: The kernel module in upstream Linux 3.3 and later > does not include > vSwitch distribution instead of the upstream Linux kernel > module. > > - - Patch virtual ports, that is, interfaces with type "patch". > - You can use Linux "veth" devices as a substitute. > - > - We don't have any plans to add patch ports upstream. > - How about if we add some additional explanation, just to make it really explicit. Also, we can update the list of tunnel types and drop the list format since now there's only one bullet. Like this: A: The kernel module in upstream Linux 3.3 and later does not include tunnel virtual ports, that is, interfaces with type "gre", "ipsec_gre", "gre64", "ipsec_gre64", "vxlan", or "capwap". It is possible to create tunnels in Linux and attach them to Open vSwitch as system devices. However, they cannot be dynamically created through the OVSDB protocol or set the tunnel ids as a flow action. Work is in progress in adding tunnel virtual ports to the upstream Linux version of the Open vSwitch kernel module. For now, if you need these features, use the kernel module from the Open vSwitch distribution instead of the upstream Linux kernel module. The upstream kernel module does not include patch ports, but this only matters for Open vSwitch 1.9 and earlier, because Open vSwitch 1.10 and later implement patch ports without using this kernel feature. Otherwise, thanks for the update. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 11:24:42AM -0800, Jesse Gross wrote: > On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: > > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > >> index 5e32965..b421753 100644 > >> --- a/include/linux/openvswitch.h > >> +++ b/include/linux/openvswitch.h > >> @@ -282,6 +282,7 @@ enum ovs_key_attr { > >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > >> __OVS_KEY_ATTR_MAX > > > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > > another value that we're already aiming to get upstream. I'd suggest a > > value of 62. > > > > Jesse, does that sound reasonable to you? Another alternative would be > > to avoid modifying entirely, one way or another, > > until we have real kernel support, which would be a little extra work. > > It's not ideal but I think it's fine for the time being. > Incidentally, Pravin is working on something similar from the kernel > side to allow transformation of Netlink tunnel attributes into > something more efficient for packet processing. > > To ensure that there is no leakage, I would do two things: wrap all > MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't > have any handling code into odp-util.c. That way if another new > attribute later starts using this temporary value then an old version > of OVS userspace will ignore it rather than trying to interprete it as > MPLS. > > If we do that then we can just put the MPLS attribute directly after > IPV4_TUNNEL and new attributes would be inserted in between. The MPLS > value could change but that shouldn't matter because there the value > would be entirely internal. This seems easier to deal with than > allocating at both ends. > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > For the PUSH/POP actions I think we also have the same issue with > inserting in the middle of the list and can use the same solution. > In the comment about the EtherType above struct ovs_action_push_mpls > I would also say that non-MPLS EtherTypes will get rejected rather > than result in a corrupt packet since that seems like better > behavior. All this sounds reasonable to me. Simon, will you take care of it? Thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] dpif-linux: Fix segfault when a port already exists.
Commit 78a2d59c (dpif-linux.c: Let the kernel pick a port number if one not requested.) changed the logic for port assignment, but didn't properly handle some error conditions. An attempt to add a tunnel port that already exists would lead to a segfault. This commit fixes the logic to stop processing and return an error. Reported-by: Gurucharan Shetty Signed-off-by: Justin Pettit --- lib/dpif-linux.c |8 +--- 1 files changed, 5 insertions(+), 3 deletions(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index 63d7afb..a8544d7 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -475,9 +475,11 @@ dpif_linux_port_add(struct dpif *dpif_, struct netdev *netdev, *port_nop = reply.port_no; VLOG_DBG("%s: assigning port %"PRIu32" to netlink pid %"PRIu32, dpif_name(dpif_), reply.port_no, upcall_pid); -} else if (error == EBUSY && *port_nop != UINT32_MAX) { -VLOG_INFO("%s: requested port %"PRIu32" is in use", - dpif_name(dpif_), *port_nop); +} else { +if (error == EBUSY && *port_nop != UINT32_MAX) { +VLOG_INFO("%s: requested port %"PRIu32" is in use", + dpif_name(dpif_), *port_nop); +} nl_sock_destroy(sock); ofpbuf_delete(buf); return error; -- 1.7.5.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [ovs-assert 4/4] Makefile.am: add check that is not used from unexpected files.
Thanks for all the reviews. I applied your comments from patch 1 (in obvious fashion) and will soon push this to master. On Tue, Jan 15, 2013 at 11:27:58AM -0800, Ethan Jackson wrote: > Acked-by: Ethan Jackson > > On Wed, Nov 7, 2012 at 10:48 AM, Ben Pfaff wrote: > > In general, with a few specific exceptions, ovs_assert is now preferred > > over assert, so this commit adds a check for that in the top-level > > Makefile. > > > > Signed-off-by: Ben Pfaff > > --- > > Makefile.am | 13 + > > configure.ac |1 + > > lib/vlog.c |5 + > > lib/worker.c |5 + > > 4 files changed, 24 insertions(+), 0 deletions(-) > > > > diff --git a/Makefile.am b/Makefile.am > > index b71ca1f..56fe3bb 100644 > > --- a/Makefile.am > > +++ b/Makefile.am > > @@ -180,6 +180,19 @@ rate-limit-check: > > fi > > .PHONY: rate-limit-check > > > > +# Check that assert.h is not used outside a whitelist of files. > > +ALL_LOCAL += check-assert-h-usage > > +check-assert-h-usage: > > + @if test -e $(srcdir)/.git && (git --version) >/dev/null 2>&1 && \ > > + (cd $(srcdir) && git --no-pager grep -l -E '[<]assert.h[>]') | \ > > + $(EGREP) -v '^lib/(sflow_receiver|vlog|worker).c$$|^tests/'; \ > > + then \ > > + echo "Files listed above unexpectedly #include > > <""assert.h"">."; \ > > + echo "Please use ovs_assert (from util.h) instead of assert."; \ > > + exit 1; \ > > +fi > > +.PHONY: check-assert-h-usage > > + > > if HAVE_GROFF > > ALL_LOCAL += manpage-check > > manpage-check: $(man_MANS) $(dist_man_MANS) $(noinst_man_MANS) > > diff --git a/configure.ac b/configure.ac > > index 32940a5..b85cf06 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -27,6 +27,7 @@ AC_PROG_CPP > > AC_PROG_RANLIB > > AC_PROG_MKDIR_P > > AC_PROG_FGREP > > +AC_PROG_EGREP > > > > AC_ARG_VAR([PERL], [path to Perl interpreter]) > > AC_PATH_PROG([PERL], perl, no) > > diff --git a/lib/vlog.c b/lib/vlog.c > > index 0bd9bd1..600c4b6 100644 > > --- a/lib/vlog.c > > +++ b/lib/vlog.c > > @@ -40,6 +40,11 @@ > > > > VLOG_DEFINE_THIS_MODULE(vlog); > > > > +/* ovs_assert() logs the assertion message, so using ovs_assert() in this > > + * source file could cause recursion. */ > > +#undef ovs_assert > > +#define ovs_assert use_assert_instead_of_ovs_assert_in_this_module > > + > > /* Name for each logging level. */ > > static const char *level_names[VLL_N_LEVELS] = { > > #define VLOG_LEVEL(NAME, SYSLOG_LEVEL) #NAME, > > diff --git a/lib/worker.c b/lib/worker.c > > index f2b896e..58ee85b 100644 > > --- a/lib/worker.c > > +++ b/lib/worker.c > > @@ -37,6 +37,11 @@ > > > > VLOG_DEFINE_THIS_MODULE(worker); > > > > +/* ovs_assert() logs the assertion message and logging sometimes goes > > through a > > + * worker, so using ovs_assert() in this source file could cause > > recursion. */ > > +#undef ovs_assert > > +#define ovs_assert use_assert_instead_of_ovs_assert_in_this_module > > + > > /* Header for an RPC request. */ > > struct worker_request { > > size_t request_len; /* Length of the payload in bytes. */ > > -- > > 1.7.2.5 > > > > ___ > > dev mailing list > > dev@openvswitch.org > > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [murmurhash 3/4] hash: Change mhash_finish() data measurement from words to bytes.
On Fri, Dec 14, 2012 at 04:33:17PM -0800, Ben Pfaff wrote: > murmurhash includes an xor with the number of bytes hashed in its finishing > step. Until now, we've only had murmurhash for full words, but an upcoming > commit adds murmurhash for bytes, so the finishing function will need to > take a count of bytes instead. > > Signed-off-by: Ben Pfaff I see a review for patches 1, 2, and 4, but not this patch (3). Did you look at this patch? Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [murmurhash 4/4] hash: Replace primary hash functions by murmurhash.
On Wed, Jan 16, 2013 at 02:24:56PM -0800, Ethan Jackson wrote: > I might consider putting hash_3words() near hash_2words() in the > header file. Mostly just for consistency sake. I put the prototypes nearby at the top of the file, but hash_3words() generated enough code that it didn't look worthwhile to inline it, so the definition is in the .c file. The prototypes did look like they could use some rearrangement, though. How about this: uint32_t hash_words(const uint32_t data[], size_t n_words, uint32_t basis); uint32_t hash_bytes(const void *, size_t n_bytes, uint32_t basis); static inline uint32_t hash_int(uint32_t x, uint32_t basis); static inline uint32_t hash_2words(uint32_t, uint32_t); uint32_t hash_3words(uint32_t, uint32_t, uint32_t); static inline uint32_t hash_boolean(bool x, uint32_t basis); uint32_t hash_double(double, uint32_t basis); static inline uint32_t hash_pointer(const void *, uint32_t basis); static inline uint32_t hash_string(const char *, uint32_t basis); > In hash_3words(), you're finishing with a byte count of 8 instead 12. > I may be misinterpreting what that value is for though. You're right. I changed it to 12. > Acked-by: Ethan Jackson Here's the new version in case you want to look. I only made the changes above --8<--cut here-->8-- From: Ben Pfaff Date: Wed, 16 Jan 2013 16:14:42 -0800 Subject: [PATCH] hash: Replace primary hash functions by murmurhash. murmurhash is faster than Jenkins and slightly higher quality, so switch to it for hashing words. The best timings I got for hashing for data lengths of the following numbers of 32-bit words, in seconds per 1,000,000,000 hashes, were: words murmurhash Jenkins hash - -- 1 8.4 10.4 2 10.3 10.3 3 11.2 10.7 4 12.6 18.0 5 13.9 18.3 6 15.2 18.7 In other words, murmurhash outperforms Jenkins for all input lengths other than exactly 3 32-bit words (12 bytes). (It's understandable that Jenkins would have a best case at 12 bytes, because Jenkins works in 12-byte chunks.) Even in the case where Jenkins is faster, it's only by 5%. On average within this data set, murmurhash is 15% faster, and for 4-word input it is 30% faster. We retain Jenkins for flow_hash_symmetric_l4() and flow_hash_fields(), which are cases where the hash value is exposed externally. This commit appears to improve "ovs-benchmark rate" results slightly by a few hundred connections per second (under 1%), when used with an NVP controller. Signed-off-by: Ben Pfaff --- lib/automake.mk |2 + lib/flow.c|5 +- lib/hash.c| 91 +-- lib/hash.h| 134 ++--- lib/jhash.c | 125 + lib/jhash.h | 44 + tests/test-hash.c | 27 ++- 7 files changed, 259 insertions(+), 169 deletions(-) create mode 100644 lib/jhash.c create mode 100644 lib/jhash.h diff --git a/lib/automake.mk b/lib/automake.mk index 740f33f..4547198 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -61,6 +61,8 @@ lib_libopenvswitch_a_SOURCES = \ lib/hmap.h \ lib/hmapx.c \ lib/hmapx.h \ + lib/jhash.c \ + lib/jhash.h \ lib/json.c \ lib/json.h \ lib/jsonrpc.c \ diff --git a/lib/flow.c b/lib/flow.c index 05bc269..2a3dd3d 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -30,6 +30,7 @@ #include "csum.h" #include "dynamic-string.h" #include "hash.h" +#include "jhash.h" #include "match.h" #include "ofpbuf.h" #include "openflow/openflow.h" @@ -722,7 +723,7 @@ flow_hash_symmetric_l4(const struct flow *flow, uint32_t basis) fields.tp_port = flow->tp_src ^ flow->tp_dst; } } -return hash_bytes(&fields, sizeof fields, basis); +return jhash_bytes(&fields, sizeof fields, basis); } /* Hashes the portions of 'flow' designated by 'fields'. */ @@ -733,7 +734,7 @@ flow_hash_fields(const struct flow *flow, enum nx_hash_fields fields, switch (fields) { case NX_HASH_FIELDS_ETH_SRC: -return hash_bytes(flow->dl_src, sizeof flow->dl_src, basis); +return jhash_bytes(flow->dl_src, sizeof flow->dl_src, basis); case NX_HASH_FIELDS_SYMMETRIC_L4: return flow_hash_symmetric_l4(flow, basis); diff --git a/lib/hash.c b/lib/hash.c index 8cee5d0..e954d78 100644 --- a/lib/hash.c +++ b/lib/hash.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2012 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2012, 2013 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -18,57 +18,11 @@ #include #include "unaligned.h" -/* Returns the hash o
Re: [ovs-dev] [PATCH] dpif-linux: Fix segfault when a port already exists.
On Wed, Jan 16, 2013 at 03:59:23PM -0800, Justin Pettit wrote: > Commit 78a2d59c (dpif-linux.c: Let the kernel pick a port number if one > not requested.) changed the logic for port assignment, but didn't > properly handle some error conditions. An attempt to add a tunnel port > that already exists would lead to a segfault. This commit fixes the > logic to stop processing and return an error. > > Reported-by: Gurucharan Shetty > Signed-off-by: Justin Pettit This time for sure! Looks good, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 11:24:42AM -0800, Jesse Gross wrote: > On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: > > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > >> index 5e32965..b421753 100644 > >> --- a/include/linux/openvswitch.h > >> +++ b/include/linux/openvswitch.h > >> @@ -282,6 +282,7 @@ enum ovs_key_attr { > >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > >> __OVS_KEY_ATTR_MAX > > > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > > another value that we're already aiming to get upstream. I'd suggest a > > value of 62. > > > > Jesse, does that sound reasonable to you? Another alternative would be > > to avoid modifying entirely, one way or another, > > until we have real kernel support, which would be a little extra work. > > It's not ideal but I think it's fine for the time being. > Incidentally, Pravin is working on something similar from the kernel > side to allow transformation of Netlink tunnel attributes into > something more efficient for packet processing. > > To ensure that there is no leakage, I would do two things: wrap all > MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't > have any handling code into odp-util.c. That way if another new > attribute later starts using this temporary value then an old version > of OVS userspace will ignore it rather than trying to interprete it as > MPLS. OVS_KEY_ATTR_MPLS is used in userspace: * lib/odp-util.c uses it in flow key handling routines. In particular in: - ovs_key_attr_to_string() - odp_flow_key_attr_len() - format_odp_key_attr() - parse_odp_key_attr() - odp_flow_key_from_flow() - parse_l3_onward() - commit_mpls_action() [new] * lib/dpif-netdev.c uses it to allow the user-space datapath to handle MPLS. In particular in execute_set_action(). > If we do that then we can just put the MPLS attribute directly after > IPV4_TUNNEL and new attributes would be inserted in between. The MPLS > value could change but that shouldn't matter because there the value > would be entirely internal. This seems easier to deal with than > allocating at both ends. Sure, I'll move OVS_KEY_ATTR_MPLS to after OVS_KEY_ATTR_IPV4_TUNNEL. > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > For the PUSH/POP actions I think we also have the same issue with > inserting in the middle of the list and can use the same solution. I'm unsure what "inserting in the middle of the list" refers to. However, when a flow key is created it should be done so in such away that there may be at most one element from the following list present: - OVS_ACTION_SET/OVS_KEY_ATTR_MPLS (previously OVS_ACTION_ATTR_SET_MPLS) - OVS_ACTION_ATTR_PUSH_MPLS - OVS_ACTION_ATTR_POP_MPLS I don't believe that the above is enforced anywhere, however it should be the way the flow key creation code currently works. > In the comment about the EtherType above struct ovs_action_push_mpls I > would also say that non-MPLS EtherTypes will get rejected rather than > result in a corrupt packet since that seems like better behavior. I believe that is the behaviour in the code. I'll double check and add the comment as you suggest. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 03:49:00PM -0800, Ben Pfaff wrote: > On Wed, Jan 16, 2013 at 11:24:42AM -0800, Jesse Gross wrote: > > On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: > > > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > > >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > > >> index 5e32965..b421753 100644 > > >> --- a/include/linux/openvswitch.h > > >> +++ b/include/linux/openvswitch.h > > >> @@ -282,6 +282,7 @@ enum ovs_key_attr { > > >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > > >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > > >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > > >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > > >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > > >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > > >> __OVS_KEY_ATTR_MAX > > > > > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > > > another value that we're already aiming to get upstream. I'd suggest a > > > value of 62. > > > > > > Jesse, does that sound reasonable to you? Another alternative would be > > > to avoid modifying entirely, one way or another, > > > until we have real kernel support, which would be a little extra work. > > > > It's not ideal but I think it's fine for the time being. > > Incidentally, Pravin is working on something similar from the kernel > > side to allow transformation of Netlink tunnel attributes into > > something more efficient for packet processing. > > > > To ensure that there is no leakage, I would do two things: wrap all > > MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't > > have any handling code into odp-util.c. That way if another new > > attribute later starts using this temporary value then an old version > > of OVS userspace will ignore it rather than trying to interprete it as > > MPLS. > > > > If we do that then we can just put the MPLS attribute directly after > > IPV4_TUNNEL and new attributes would be inserted in between. The MPLS > > value could change but that shouldn't matter because there the value > > would be entirely internal. This seems easier to deal with than > > allocating at both ends. > > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > > > For the PUSH/POP actions I think we also have the same issue with > > inserting in the middle of the list and can use the same solution. > > In the comment about the EtherType above struct ovs_action_push_mpls > > I would also say that non-MPLS EtherTypes will get rejected rather > > than result in a corrupt packet since that seems like better > > behavior. > > All this sounds reasonable to me. Simon, will you take care of it? Yes, of course. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Thu, Jan 17, 2013 at 09:27:03AM +0900, Simon Horman wrote: > > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > > > For the PUSH/POP actions I think we also have the same issue with > > inserting in the middle of the list and can use the same solution. > > I'm unsure what "inserting in the middle of the list" refers to. I think Jesse just means that you shouldn't add a new enum value in a way that changes existing enum values. (Took me a minute too.) ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] dpif-linux: Fix segfault when a port already exists.
On Wed, Jan 16, 2013 at 3:59 PM, Justin Pettit wrote: > Commit 78a2d59c (dpif-linux.c: Let the kernel pick a port number if one > not requested.) changed the logic for port assignment, but didn't > properly handle some error conditions. An attempt to add a tunnel port > that already exists would lead to a segfault. This commit fixes the > logic to stop processing and return an error. > > Reported-by: Gurucharan Shetty > Signed-off-by: Justin Pettit > --- > lib/dpif-linux.c |8 +--- > 1 files changed, 5 insertions(+), 3 deletions(-) > > diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c > index 63d7afb..a8544d7 100644 > --- a/lib/dpif-linux.c > +++ b/lib/dpif-linux.c > @@ -475,9 +475,11 @@ dpif_linux_port_add(struct dpif *dpif_, struct netdev > *netdev, > *port_nop = reply.port_no; > VLOG_DBG("%s: assigning port %"PRIu32" to netlink pid %"PRIu32, > dpif_name(dpif_), reply.port_no, upcall_pid); > -} else if (error == EBUSY && *port_nop != UINT32_MAX) { > -VLOG_INFO("%s: requested port %"PRIu32" is in use", > - dpif_name(dpif_), *port_nop); > +} else { > +if (error == EBUSY && *port_nop != UINT32_MAX) { > +VLOG_INFO("%s: requested port %"PRIu32" is in use", > + dpif_name(dpif_), *port_nop); > +} > nl_sock_destroy(sock); > ofpbuf_delete(buf); > return error; > -- If the kernel returns success but port_nop is UINT32_MAX, I think we will still segfault. That probably is not an issue since the kernel is not supposed to do that? > 1.7.5.4 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ofproto: Optimise OpenFlow flow expiry
On Wed, Jan 16, 2013 at 01:59:21PM -0800, Ben Pfaff wrote: > On Tue, Jan 15, 2013 at 01:20:57PM +0900, Simon Horman wrote: > > Optimise OpenFlow flow expiry by placing expirable flows on a list. > > This optimises scanning of flows for expiry in two ways: > > > > * Empirically list traversal appears faster than the code it replaces. > > > > With 1,000,000 flows present an otherwise idle system I observed CPU > > utilisation of around 20% with the existing code but around 10% with > > this new code. > > > > * Flows that will never expire are not traversed. > > > > This addresses a case seen in the field. > > This version looks better. I still have a few comments, but before > that, may I ask a little bit about the situation in which the > performance improvement was observed? In this situation, about how > many of the 1,000,000 flows were actually expirable, that is, had > either a hard timeout or an idle timeout? That is, is the performance > improvement due more to the first or the second bullet you list above? > If none of the flows were expirable, then I guess it was the second; > if all of them were, then I guess it was the first; and otherwise it > is something in between. > > Basically I'm wondering if we should do something to make flow table > traversal faster, independent of expiration. Hi Ben, the primary aim of this patch was to address a performance issue that was noticed when inserting 100,000 flows none of which were expirable. I have been told this is representative of an expected use-case. During my testing I used 1,000,000 flows instead of 100,000 in order to make the CPU utilisation more pronounced and easier to observe. In the course of my testing I tested 1,000,000 flows none of which were expirable and in that case CPU utilisation was dramatically reduced to approximately 0. This seems to be a good outcome for the use-case originally reported. In the course of testing I also tested 1,000,000 flows all of which were expirable. This was primarily to see if there were any regressions. In the course of this test I noticed that there seemed to be some reduction in CPU utilisation. However this was just a side effect and not an aim of my work. I should have placed it as the second bullet in my commit message and noted that it was a side effect. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 04:31:31PM -0800, Ben Pfaff wrote: > On Thu, Jan 17, 2013 at 09:27:03AM +0900, Simon Horman wrote: > > > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > > > > > For the PUSH/POP actions I think we also have the same issue with > > > inserting in the middle of the list and can use the same solution. > > > > I'm unsure what "inserting in the middle of the list" refers to. > > I think Jesse just means that you shouldn't add a new enum value in > a way that changes existing enum values. (Took me a minute too.) Oh yes, of course. I'll move the PUSH/POP values to the end. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] Question regarding porting ovs to hardware switch.
Hi list, Our company (a switch silicon company) is now developping a hardsware OpenFlow switch, we love OVS architechure and have ported it succesfully. Because OpenFlow standard is in still under development and OVS is under development too, we are now focusing implementing OpenFlow spec 1.1+, so there are some questions arised during our developping: Q1: Can we contribute our modification for implementing OpenFlow spec 1.1+ message logic without the corresponding dpif logic? because we are developping hardware switch, we will not developping features that support dpif, let's take OpenFlow 1.3 Group Mod message as an example. Q2: Is it possible that we commit our ofproto provider (silicon bound) to OVS to serve as a reference for hardware switch vendor who have interest on porting OVS to hardware switch? We are glad to hear the feedback from OVS community, Thanks a lot. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Question regarding porting ovs to hardware switch.
On Thu, Jan 17, 2013 at 10:18:14AM +0800, ?? wrote: > Q1: Can we contribute our modification for implementing OpenFlow spec > 1.1+ message logic without the corresponding dpif logic? because we are > developping hardware switch, we will not developping features that support > dpif, let's take OpenFlow 1.3 Group Mod message as an example. If the code you have is a starting point for generic group_mod support, then we'd take it in that spirit. It is likely that we could not actually apply the change to the master branch without adding basic support to the ofproto-dpif provider. > Q2: Is it possible that we commit our ofproto provider (silicon bound) to > OVS to serve as a reference for hardware switch vendor who have interest on > porting OVS to hardware switch? That would be awesome. Yes, please, contribute it. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] v2 datapath: More flexible kernel/userspace tunneling attribute.
On Wed, Jan 16, 2013 at 1:54 PM, Pravin B Shelar wrote: > diff --git a/datapath/datapath.c b/datapath/datapath.c > index ed69af8..4ed40e2 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > +static struct nlattr *reserve_sfa_size(struct sw_flow_actions **sfa, int > attr_len) > +{ > + > + struct sw_flow_actions *new; > + struct nlattr *a; > + > + if (NLA_ALIGN(attr_len) <= (ksize(*sfa) - (*sfa)->actions_len)) > + goto out; > + > + if (ksize(*sfa) * 2 > MAX_ACTIONS_BUFSIZE) > + return ERR_PTR(-EMSGSIZE); It's possible that the current size is smaller than MAX_ACTIONS_BUFSIZE but 2 * size is larger. This probably is not likely because kmalloc will round up to a power of two and MAX_ACTIONS_BUFSIZE is a power of two but I'm not sure that we want to implicitly assume that. > @@ -716,16 +850,15 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, > struct genl_info *info) > err = PTR_ERR(acts); > if (IS_ERR(acts)) > goto err_flow_free; > + > + err = validate_and_copy_actions(a[OVS_PACKET_ATTR_ACTIONS], > &flow->key, 0, &acts); > rcu_assign_pointer(flow->sf_acts, acts); > + if (err) > + goto err_flow_free; I would probably put the error handler before continuing on with the rcu_assign_pointer call. > +static int actions_to_attr(const struct nlattr *attr, int len, struct > sk_buff *skb) > +{ > + const struct nlattr *a; > + int rem, err; > + > + nla_for_each_attr(a, attr, len, rem) { > + bool skip_copy; > + int type = nla_type(a); > + > + skip_copy = false; > + switch (type) { > + case OVS_ACTION_ATTR_SET: > + err = set_tun_action_to_attr(a, skb, &skip_copy); The name is a little strange given that we call it unconditionally for all set actions. > @@ -951,28 +1179,32 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff > *skb, struct genl_info *info) > > /* Validate actions. */ > if (a[OVS_FLOW_ATTR_ACTIONS]) { > - error = validate_actions(a[OVS_FLOW_ATTR_ACTIONS], &key, 0); > - if (error) > + acts = > ovs_flow_actions_alloc(nla_len(a[OVS_FLOW_ATTR_ACTIONS])); > + error = PTR_ERR(acts); > + if (IS_ERR(acts)) > goto error; > + > + error = validate_and_copy_actions(a[OVS_FLOW_ATTR_ACTIONS], > &key, 0, &acts); > + if (error) { > + goto err_kfree; > + } We don't need the braces around this error condition. > } else if (info->genlhdr->cmd == OVS_FLOW_CMD_NEW) { > error = -EINVAL; > - goto error; > + goto err_kfree; I don't we need to call err_kfree in this case because we didn't actually allocate anything. > diff --git a/datapath/flow.c b/datapath/flow.c > index 63eef77..49982f0 100644 > --- a/datapath/flow.c > +++ b/datapath/flow.c > +int ipv4_tun_from_nlattr(const struct nlattr *attr, > +struct ovs_key_ipv4_tunnel *tun_key) > +{ > + struct nlattr *a; > + int rem; > + > + memset(tun_key, 0, sizeof(*tun_key)); > + > + nla_for_each_nested(a, attr, rem) { > + int type = nla_type(a); > + static const u32 ovs_tunnel_key_lens[OVS_TUNNEL_MAX + 1] = { > + [OVS_TUNNEL_ID] = sizeof(u64), > + [OVS_TUNNEL_IPV4_SRC] = sizeof(u32), > + [OVS_TUNNEL_IPV4_DST] = sizeof(u32), > + [OVS_TUNNEL_TOS] = 1, > + [OVS_TUNNEL_TTL] = 1, > + [OVS_TUNNEL_FLAGS_DONT_FRAGMENT] = 0, > + [OVS_TUNNEL_FLAGS_CSUM] = 0, > + }; > + > + if (type > OVS_TUNNEL_MAX || > + ovs_tunnel_key_lens[type] != nla_len(a)) > + return -EINVAL; > + > + switch (type) { > + case OVS_TUNNEL_ID: > + memcpy(&tun_key->tun_id, nla_data(a), sizeof(__be64)); > + tun_key->tun_flags |= OVS_TNL_F_KEY; > + break; > + case OVS_TUNNEL_IPV4_SRC: > + memcpy(&tun_key->ipv4_src, nla_data(a), > sizeof(__be32)); > + break; > + case OVS_TUNNEL_IPV4_DST: > + memcpy(&tun_key->ipv4_dst, nla_data(a), > sizeof(__be32)); Can't we use nla_get_X for these types? > + if (rem > 0) > + return -EINVAL; > + > + if (!tun_key->ipv4_dst) > + return -EINVAL; > + > + if (!tun_key->ipv4_ttl) > + return -EINVAL; I would distinguish between TTL of zero and not set. If TTL is zero is explicitly asked for then I think it's fine to allow but we might want to create a different default later. > +int ipv4_t
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 4:27 PM, Simon Horman wrote: > On Wed, Jan 16, 2013 at 11:24:42AM -0800, Jesse Gross wrote: >> On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: >> > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: >> >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h >> >> index 5e32965..b421753 100644 >> >> --- a/include/linux/openvswitch.h >> >> +++ b/include/linux/openvswitch.h >> >> @@ -282,6 +282,7 @@ enum ovs_key_attr { >> >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ >> >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ >> >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ >> >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ >> >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ >> >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ >> >> __OVS_KEY_ATTR_MAX >> > >> > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of >> > another value that we're already aiming to get upstream. I'd suggest a >> > value of 62. >> > >> > Jesse, does that sound reasonable to you? Another alternative would be >> > to avoid modifying entirely, one way or another, >> > until we have real kernel support, which would be a little extra work. >> >> It's not ideal but I think it's fine for the time being. >> Incidentally, Pravin is working on something similar from the kernel >> side to allow transformation of Netlink tunnel attributes into >> something more efficient for packet processing. >> >> To ensure that there is no leakage, I would do two things: wrap all >> MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't >> have any handling code into odp-util.c. That way if another new >> attribute later starts using this temporary value then an old version >> of OVS userspace will ignore it rather than trying to interprete it as >> MPLS. > > OVS_KEY_ATTR_MPLS is used in userspace: > > * lib/odp-util.c uses it in flow key handling routines. > In particular in: > - ovs_key_attr_to_string() > - odp_flow_key_attr_len() > - format_odp_key_attr() > - parse_odp_key_attr() > - odp_flow_key_from_flow() > - parse_l3_onward() > - commit_mpls_action() [new] > > * lib/dpif-netdev.c uses it to allow the user-space datapath to handle > MPLS. In particular in execute_set_action(). It's the usage in the userspace/kernel interfaces that I'm concerned about. If we're saying that this is essentially an internal identifier to userspace then there's no benefit to having the code that communicates that value to the kernel at this time. If we have it but don't use then it's possible that there will be a conflict in the future if there is a different user of this temporary identifier. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 4:31 PM, Ben Pfaff wrote: > On Thu, Jan 17, 2013 at 09:27:03AM +0900, Simon Horman wrote: >> > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using >> > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. >> > >> > For the PUSH/POP actions I think we also have the same issue with >> > inserting in the middle of the list and can use the same solution. >> >> I'm unsure what "inserting in the middle of the list" refers to. > > I think Jesse just means that you shouldn't add a new enum value in > a way that changes existing enum values. (Took me a minute too.) Right, sorry. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] Question regarding porting ovs to hardware switch.
Hi Ben, Thanks a lot for your quick reply, that provider much information for us and we can do our work accordingly. 2013/1/17 Ben Pfaff > On Thu, Jan 17, 2013 at 10:18:14AM +0800, ?? wrote: > > Q1: Can we contribute our modification for implementing OpenFlow spec > > 1.1+ message logic without the corresponding dpif logic? because we are > > developping hardware switch, we will not developping features that > support > > dpif, let's take OpenFlow 1.3 Group Mod message as an example. > > If the code you have is a starting point for generic group_mod support, > then we'd take it in that spirit. It is likely that we could not > actually apply the change to the master branch without adding basic > support to the ofproto-dpif provider. > > > Q2: Is it possible that we commit our ofproto provider (silicon bound) > to > > OVS to serve as a reference for hardware switch vendor who have interest > on > > porting OVS to hardware switch? > > That would be awesome. Yes, please, contribute it. > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 07:04:06PM -0800, Jesse Gross wrote: > On Wed, Jan 16, 2013 at 4:27 PM, Simon Horman wrote: > > On Wed, Jan 16, 2013 at 11:24:42AM -0800, Jesse Gross wrote: > >> On Tue, Jan 15, 2013 at 11:05 PM, Ben Pfaff wrote: > >> > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > >> >> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > >> >> index 5e32965..b421753 100644 > >> >> --- a/include/linux/openvswitch.h > >> >> +++ b/include/linux/openvswitch.h > >> >> @@ -282,6 +282,7 @@ enum ovs_key_attr { > >> >> OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > >> >> OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > >> >> OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > >> >> + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > >> >> OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > >> >> OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > >> >> __OVS_KEY_ATTR_MAX > >> > > >> > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > >> > another value that we're already aiming to get upstream. I'd suggest a > >> > value of 62. > >> > > >> > Jesse, does that sound reasonable to you? Another alternative would be > >> > to avoid modifying entirely, one way or another, > >> > until we have real kernel support, which would be a little extra work. > >> > >> It's not ideal but I think it's fine for the time being. > >> Incidentally, Pravin is working on something similar from the kernel > >> side to allow transformation of Netlink tunnel attributes into > >> something more efficient for packet processing. > >> > >> To ensure that there is no leakage, I would do two things: wrap all > >> MPLS definitions in #ifndef __KERNEL__ blocks and check that we don't > >> have any handling code into odp-util.c. That way if another new > >> attribute later starts using this temporary value then an old version > >> of OVS userspace will ignore it rather than trying to interprete it as > >> MPLS. > > > > OVS_KEY_ATTR_MPLS is used in userspace: > > > > * lib/odp-util.c uses it in flow key handling routines. > > In particular in: > > - ovs_key_attr_to_string() > > - odp_flow_key_attr_len() > > - format_odp_key_attr() > > - parse_odp_key_attr() > > - odp_flow_key_from_flow() > > - parse_l3_onward() > > - commit_mpls_action() [new] > > > > * lib/dpif-netdev.c uses it to allow the user-space datapath to handle > > MPLS. In particular in execute_set_action(). > > It's the usage in the userspace/kernel interfaces that I'm concerned > about. If we're saying that this is essentially an internal > identifier to userspace then there's no benefit to having the code > that communicates that value to the kernel at this time. If we have > it but don't use then it's possible that there will be a conflict in > the future if there is a different user of this temporary identifier. My understanding is that fundamentally OVS_KEY_ATTR_MPLS is used for as part of the userspace/kernel interface and that thus code on both sides uses it. Likewise for OVS_ACTION_ATTR_PUSH_MPLS and OVS_ACTION_ATTR_POP_MPLS. Perhaps a good approach is to use a values towards the end of the possible space for OVS_KEY_ATTR_MPLS, OVS_ACTION_ATTR_PUSH_MPLS and OVS_ACTION_ATTR_POP_MPLS, as has been done for OVS_KEY_ATTR_TUN_ID, to reduce the likely hood of a clash with another identifier in the short to medium term. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Wed, Jan 16, 2013 at 05:38:08PM +0900, Simon Horman wrote: > On Tue, Jan 15, 2013 at 11:05:45PM -0800, Ben Pfaff wrote: > > On Tue, Jan 08, 2013 at 02:46:02PM +0900, Simon Horman wrote: > > > diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > > > index 5e32965..b421753 100644 > > > --- a/include/linux/openvswitch.h > > > +++ b/include/linux/openvswitch.h > > > @@ -282,6 +282,7 @@ enum ovs_key_attr { > > > OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */ > > > OVS_KEY_ATTR_ND,/* struct ovs_key_nd */ > > > OVS_KEY_ATTR_SKB_MARK, /* u32 skb mark */ > > > + OVS_KEY_ATTR_MPLS, /* struct ovs_key_mpls */ > > > OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */ > > > OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */ > > > __OVS_KEY_ATTR_MAX > > > > I'm pretty sure we shouldn't be inserting OVS_KEY_ATTR_MPLS in front of > > another value that we're already aiming to get upstream. I'd suggest a > > value of 62. > > > > Jesse, does that sound reasonable to you? Another alternative would be > > to avoid modifying entirely, one way or another, > > until we have real kernel support, which would be a little extra work. > > I'll use 62 unless I hear otherwise from Jesse. > > > I'm not sure why we have OVS_ACTION_ATTR_SET_MPLS instead of using > > OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS. > > I'll switch over to using OVS_ACTION_SET+OVS_KEY_ATTR_MPLS, > it seems to be in keeping with the existing code. > > > > > > +/* Action structure for NXAST_PUSH_VLAN/MPLS. */ > > > +struct nx_action_push { > > > +ovs_be16 type; /* OFPAT_PUSH_VLAN/MPLS. */ > > > > The above two comments are wrong, there's no NXAST_PUSH_VLAN. And the > > struct should be named nx_action_push_mpls, not more generically. (This > > must be a leftover from Ravi's initial patch that also added QinQ.) > > > > > +ovs_be16 len; /* Length is 8. */ > > > +ovs_be32 vendor;/* NX_VENDOR_ID. */ > > > +ovs_be16 subtype; /* NXAST_PUSH_MPLS. */ > > > +ovs_be16 ethertype; /* Ethertype */ > > > +uint8_t pad[4]; > > > +}; > > > +OFP_ASSERT(sizeof(struct nx_action_push) == 16); > > > > > @@ -1271,6 +1272,20 @@ dp_netdev_execute_actions(struct dp_netdev *dp, > > > eth_pop_vlan(packet); > > > break; > > > > > > +case OVS_ACTION_ATTR_PUSH_MPLS: { > > > +const struct ovs_action_push_mpls *mpls = nl_attr_get(a); > > > +push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_label); > > > +break; > > > > The similar OVS_ACTION_ATTR_PUSH_VLAN case declares its variable at the > > outer block. Please make the code consistent one way or another on this > > score. I will supply a patch to localise the vlan variable for OVS_ACTION_ATTR_PUSH_VLAN as this approach reduces clutter at the top of the loop and keeps variables close to where they are used. > > The breakdown of code between parse_mpls() and parse_remaining_mpls() > > seems odd. I'd just write one function. > > > > In mf_is_value_valid(), I think that the MFF_MPLS_TC and MFF_MPLS_BOS > > cases should check the u8 member, not be32. Same for > > mf_random_value(). Also htonl won't be needed for u8. > > > > Most of the meta-flow functions present the cases in the order label, > > tc, bos, but mf_set_value() uses a different order. Can you make it > > consistent? > > > > In enum mf_prereqs, it might be best to add an "L2.5" category for MPLS. > > > > This adds a double blank line in nx_put_raw(). (Oh the horror!) > > > > Also in nx_put_raw(), missing space after the comma here: > > +nxm_put_8(b,OXM_OF_MPLS_TC, mpls_lse_to_tc(flow->mpls_lse)); > > > > parse_l3_onward() has a ";;": > > +ovs_be16 dl_type = flow->dl_type;; > > > > parse_l3_onward() could use flow_innermost_dl_type() since that's what > > it's effectively calculating as 'dl_type' (maybe it should be > > 'inner_dl_type' or 'innermost_dl_type'). I'm not so sure about this. flow_innermost_dl_type(), in its current incantation, is used to obtain the innermost dl_type of a struct flow. In other words, it reads flow->encap_dl_type. On the other hand, parse_l3_onward() may write flow->encap_dl_type and never reads it. > > Is there any sense in handling cases in commit_mpls_action() where the > > mpls_depth has to change by more than 1? Should we log an error at > > least? I think that logging is a good approach. I have update the code as follows: if (flow->mpls_depth < base->mpls_depth) { if (base->mpls_depth - flow->mpls_depth > 1) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(10, 10); VLOG_WARN_RL(&rl, "Multiple mpls_pop actions reduced to " " a single mpls_pop action"); } nl_msg_put_be16(odp_actions, OVS_ACTION_ATTR_POP_MPLS, flow->dl_type); } else if (flow->mpls_depth > base->mpls_depth) { struct
[ovs-dev] [PATCH 03/17] odp-util: commit_set_nw_action: use flow's innermost dl_type
Use the innermost dl_type when decoding L3 and L4 data from a packet. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * Use flow_innermost_dl_type helper v2.11 * First post --- lib/odp-util.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/odp-util.c b/lib/odp-util.c index 025664c..c00b5a9 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -2269,14 +2269,16 @@ static void commit_set_nw_action(const struct flow *flow, struct flow *base, struct ofpbuf *odp_actions) { +ovs_be16 dl_type = flow_innermost_dl_type(flow); + /* Check if flow really have an IP header. */ if (!flow->nw_proto) { return; } -if (base->dl_type == htons(ETH_TYPE_IP)) { +if (dl_type == htons(ETH_TYPE_IP)) { commit_set_ipv4_action(flow, base, odp_actions); -} else if (base->dl_type == htons(ETH_TYPE_IPV6)) { +} else if (dl_type == htons(ETH_TYPE_IPV6)) { commit_set_ipv6_action(flow, base, odp_actions); } } -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 17/17] datapath: Allow IP actions for MPLS
Allow copy_ttl_in to function in the case where: a) The outer header is MPLS; b) The next-outer header is IP and; c) The resulting nw ttl will be different from its existing value In this circumstance a set ipv4 action will be used to adjust the TTL but the dl_type of the match will be ETH_P_MPLS_UC or ETH_P_MPLS_MC. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * Use eth_p_mpls helper v2.11 * First post --- datapath/datapath.c |8 1 file changed, 8 insertions(+) diff --git a/datapath/datapath.c b/datapath/datapath.c index adbf5f3..46926b4 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -566,6 +566,10 @@ static int validate_set(const struct nlattr *a, break; case OVS_KEY_ATTR_IPV4: + /* Gratuitous exception for MPLS to allow copy_ttl_in */ + if (eth_p_mpls(flow_key->eth.type)) + break; + if (flow_key->eth.type != htons(ETH_P_IP)) return -EINVAL; @@ -582,6 +586,10 @@ static int validate_set(const struct nlattr *a, break; case OVS_KEY_ATTR_IPV6: + /* Gratuitous exception for MPLS to allow copy_ttl_in */ + if (eth_p_mpls(flow_key->eth.type)) + break; + if (flow_key->eth.type != htons(ETH_P_IPV6)) return -EINVAL; -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 07/17] dpif-netdev: Inner And Outer Flows
This is the user-space datapath component of this change Allow a flow to be visible in two forms, an outer flow and an inner flow. This may occur if the actions allow further decoding of a packet to provide a more fine-grained match. In this case the first-pass coarse-grained match will hit the outer-flow and the second-pass fined-grained match will hit the inner-flow. Inner-flows are not visible to user-space. Rather, they are just an internal representation to handle the case of actions allowing further packet decoding. An inner-flow is associated with its outer-flow and deleted when the outer-flow is deleted. An outer-flow may have more than one inner-flow but an inner-flow may only have one outer-flow. For example: In the case of MPLS, L3 and L4 information may not initially be decoded from the frame as the ethernet type of the frame is an MPLS type and no information is known about the type of the inner frame. However, the type of the inner frame may be provided by an mpls_pop action in which case L3 and L4 information may be decoded providing a finer grained match than is otherwise possible. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * First post --- lib/dpif-netdev.c | 198 +++-- lib/match.c | 17 ++--- 2 files changed, 187 insertions(+), 28 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index a47deb4..1ebb933 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -108,17 +108,52 @@ struct dp_netdev_port { char *type; /* Port type as requested by user. */ }; -/* A flow in dp_netdev's 'flow_table'. */ +#define DP_NETDEV_FLOW_F_NO_DUMP 0x01 + +/* A flow in dp_netdev's 'flow_table'. + * + * Flows are divided into three types: + * Singleton Flow: The match and actions of this flow are self-contained + * and actions are applied to packets which match. + * Outer Flow: The actions of the flow include information that + * may be used to create a more fine-grained match. This match + * is constructed and a second lookup is made for the inner flow that + * matches a packet. + * Inner Flow: This has a more fine-grained match constructed using + * information contained in the actions of the inner flow. + * + * The actions of the inner and outer flow are identical but as a match on + * an outer flow will always result in the lookup of an inner flow the + * matches of an outer flow are never applied. + * + * An outer_flow is denoted by encap_eth_type being set to a non-zero + * value. This element is used as an optimisation to avoid scanning the + * actions of the outer flow each time it matches a packet. + * + * On deletion, outer_flow will delete all its inner flows, which + * are found by iterating the inner_flows element. */ struct dp_netdev_flow { struct hmap_node node; /* Element in dp_netdev's 'flow_table'. */ struct flow key; +struct list inner_flows;/* Inner flows. + * For an inner flow this is its entry + * in the hlist of its outer flow. + * For an outer flow this is the + * list of inner flows. + * For a singleton flow this is unused */ + ovs_be16 encap_dl_type; /* Set to the ethernet type supplied + * by action for outer flows, zero otherwise. */ + /* Statistics. */ long long int used; /* Last used time, in monotonic msecs. */ long long int packet_count; /* Number of packets matched. */ long long int byte_count; /* Number of bytes matched. */ uint8_t tcp_flags; /* Bitwise-OR of seen tcp_flags values. */ +/* Flags */ +uint8_t flags; /* One of DP_NETDEV_FLOW_F_* */ + /* Actions. */ struct nlattr *actions; size_t actions_len; @@ -553,7 +588,7 @@ dpif_netdev_get_max_ports(const struct dpif *dpif OVS_UNUSED) } static void -dp_netdev_free_flow(struct dp_netdev *dp, struct dp_netdev_flow *flow) +dp_netdev_free_flow__(struct dp_netdev *dp, struct dp_netdev_flow *flow) { hmap_remove(&dp->flow_table, &flow->node); free(flow->actions); @@ -561,6 +596,17 @@ dp_netdev_free_flow(struct dp_netdev *dp, struct dp_netdev_flow *flow) } static void +dp_netdev_free_flow(struct dp_netdev *dp, struct dp_netdev_flow *flow) +{ +struct dp_netdev_flow *p, *next; + +LIST_FOR_EACH_SAFE (p, next, inner_flows, &flow->inner_flows) { +dp_netdev_free_flow__(dp, p); +} +dp_netdev_free_flow__(dp, flow); +} + +static void dp_netdev_flow_flush(struct dp_netdev *dp) { struct dp_netdev_flow *flow, *next; @@ -656,6 +702,22 @@ dp_netdev_lookup_flow(const struct dp_netdev *dp, const struct flow *key) return NULL; } +static ovs_be16 +dpif_netdev_actions_allow_l3_extraction(const struct nlattr *actions, +
[ovs-dev] [PATCH 09/17] Add support for set_mpls_ttl action
This adds support for the OpenFlow 1.1+ set_mpls_ttl action. And also adds an NX set_mpls_ttl action. The handling of the TTL modification is entirely handled in userspace. Reviewed-by: Isaku Yamahata Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * Use eth_type_mpls() helper v2.11 * No change v2.10 * Check the size of nx_action_mpls_ttl. nx_action_pop_mpls was being checked due to a cut-and-paste error v2.9 * Update tests for upstream changes v2.8 * No change v2.7 * Encode action as OFP11 action in OFP11+ messages v2.6 * Rebase v2.5 * First post --- include/openflow/nicira-ext.h | 12 lib/ofp-actions.c | 30 lib/ofp-actions.h | 10 +++ lib/ofp-parse.c | 17 +++ lib/ofp-util.def |2 ++ ofproto/ofproto-dpif.c| 12 tests/ofproto-dpif.at | 63 + utilities/ovs-ofctl.8.in |4 +++ 8 files changed, 150 insertions(+) diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h index 61fb97e..5cfed85 100644 --- a/include/openflow/nicira-ext.h +++ b/include/openflow/nicira-ext.h @@ -306,6 +306,7 @@ enum nx_action_subtype { NXAST_WRITE_METADATA, /* struct nx_action_write_metadata */ NXAST_PUSH_MPLS,/* struct nx_action_push_mpls */ NXAST_POP_MPLS, /* struct nx_action_pop_mpls */ +NXAST_SET_MPLS_TTL, /* struct nx_action_ttl */ NXAST_DEC_MPLS_TTL, /* struct nx_action_header */ }; @@ -2237,4 +2238,15 @@ struct nx_action_pop_mpls { }; OFP_ASSERT(sizeof(struct nx_action_pop_mpls) == 16); +/* Action structure for NXAST_SET_MPLS_TTL. */ +struct nx_action_mpls_ttl { +ovs_be16 type; /* OFPAT_POP_MPLS. */ +ovs_be16 len; /* Length is 8. */ +ovs_be32 vendor;/* NX_VENDOR_ID. */ +ovs_be16 subtype; /* NXAST_SET_MPLS_TTL. */ +uint8_t ttl; /* TTL */ +uint8_t pad[5]; +}; +OFP_ASSERT(sizeof(struct nx_action_mpls_ttl) == 16); + #endif /* openflow/nicira-ext.h */ diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c index 2280853..2a6cba6 100644 --- a/lib/ofp-actions.c +++ b/lib/ofp-actions.c @@ -412,6 +412,12 @@ ofpact_from_nxast(const union ofp_action *a, enum ofputil_action_code code, break; } +case OFPUTIL_NXAST_SET_MPLS_TTL: { +struct nx_action_mpls_ttl *nxamt = (struct nx_action_mpls_ttl *)a; +ofpact_put_SET_MPLS_TTL(out)->ttl = nxamt->ttl; +break; +} + case OFPUTIL_NXAST_DEC_MPLS_TTL: ofpact_put_DEC_MPLS_TTL(out); break; @@ -796,6 +802,12 @@ ofpact_from_openflow11(const union ofp_action *a, struct ofpbuf *out) return nxm_reg_load_from_openflow12_set_field( (const struct ofp12_action_set_field *)a, out); +case OFPUTIL_OFPAT11_SET_MPLS_TTL: { +struct ofp11_action_mpls_ttl *oamt = (struct ofp11_action_mpls_ttl *)a; +ofpact_put_SET_MPLS_TTL(out)->ttl = oamt->mpls_ttl; +break; +} + case OFPUTIL_OFPAT11_DEC_MPLS_TTL: ofpact_put_DEC_MPLS_TTL(out); break; @@ -1150,6 +1162,7 @@ ofpact_check__(const struct ofpact *a, const struct flow *flow, int max_ports, } case OFPACT_DEC_TTL: +case OFPACT_SET_MPLS_TTL: case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_SET_QUEUE: @@ -1402,6 +1415,11 @@ ofpact_to_nxast(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_nxast(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_SET_MPLS_TTL: +ofputil_put_NXAST_SET_MPLS_TTL(out)->ttl += ofpact_get_SET_MPLS_TTL(a)->ttl; +break; + case OFPACT_DEC_MPLS_TTL: ofputil_put_NXAST_DEC_MPLS_TTL(out); break; @@ -1577,6 +1595,7 @@ ofpact_to_openflow10(const struct ofpact *a, struct ofpbuf *out) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFPACT_SET_MPLS_TTL: case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_WRITE_METADATA: @@ -1711,6 +1730,11 @@ ofpact_to_openflow11(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_openflow11(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_SET_MPLS_TTL: +ofputil_put_OFPAT11_SET_MPLS_TTL(out)->mpls_ttl += ofpact_get_SET_MPLS_TTL(a)->ttl; +break; + case OFPACT_DEC_MPLS_TTL: ofputil_put_OFPAT11_DEC_MPLS_TTL(out); break; @@ -1860,6 +1884,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, uint16_t port) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFPACT_SET_MPLS_TTL: case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_WRITE_METADATA: @@ -2085,6 +2110,11 @@ ofpact_format(const struct ofpact
[ovs-dev] [PATCH 15/17] datapath: ovs_flow_to_nlattrs: use encap_eth_type
The ethernet type of an encapsulated frame may be obtained from actions. If so it should be used to allow a richer conversion of a flow to nlattrs. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * No change v2.11 * First post --- datapath/datapath.c |4 ++-- datapath/flow.c | 49 +++-- datapath/flow.h |3 ++- 3 files changed, 31 insertions(+), 25 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index 28e629f..824fefb 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -449,7 +449,7 @@ static int queue_userspace_packet(struct net *net, int dp_ifindex, upcall->dp_ifindex = dp_ifindex; nla = nla_nest_start(user_skb, OVS_PACKET_ATTR_KEY); - ovs_flow_to_nlattrs(upcall_info->key, user_skb); + ovs_flow_to_nlattrs(upcall_info->key, user_skb, htons(0)); nla_nest_end(user_skb, nla); if (upcall_info->userdata) @@ -919,7 +919,7 @@ static int ovs_flow_cmd_fill_info(struct sw_flow *flow, struct datapath *dp, nla = nla_nest_start(skb, OVS_FLOW_ATTR_KEY); if (!nla) goto nla_put_failure; - err = ovs_flow_to_nlattrs(&flow->key, skb); + err = ovs_flow_to_nlattrs(&flow->key, skb, htons(0)); if (err) goto error; nla_nest_end(skb, nla); diff --git a/datapath/flow.c b/datapath/flow.c index e599638..31c90b8 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -1383,10 +1383,12 @@ int ovs_flow_metadata_from_nlattrs(struct sw_flow *flow, int key_len, const stru return 0; } -int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) +int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb, + __be16 encap_eth_type) { struct ovs_key_ethernet *eth_key; struct nlattr *nla, *encap; + __be16 eth_type; if (swkey->phy.priority && nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, swkey->phy.priority)) @@ -1436,7 +1438,20 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, swkey->eth.type)) goto nla_put_failure; - if (swkey->eth.type == htons(ETH_P_IP)) { + eth_type = swkey->eth.type; + if (eth_p_mpls(eth_type)) { + struct ovs_key_mpls *mpls_key; + + nla = nla_reserve(skb, OVS_KEY_ATTR_MPLS, sizeof(*mpls_key)); + if (!nla) + goto nla_put_failure; + mpls_key = nla_data(nla); + memset(mpls_key, 0, sizeof(struct ovs_key_mpls)); + mpls_key->mpls_top_label = swkey->mpls.top_label; + eth_type = encap_eth_type; + } + + if (eth_type == htons(ETH_P_IP)) { struct ovs_key_ipv4 *ipv4_key; nla = nla_reserve(skb, OVS_KEY_ATTR_IPV4, sizeof(*ipv4_key)); @@ -1449,7 +1464,7 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) ipv4_key->ipv4_tos = swkey->ip.tos; ipv4_key->ipv4_ttl = swkey->ip.ttl; ipv4_key->ipv4_frag = swkey->ip.frag; - } else if (swkey->eth.type == htons(ETH_P_IPV6)) { + } else if (eth_type == htons(ETH_P_IPV6)) { struct ovs_key_ipv6 *ipv6_key; nla = nla_reserve(skb, OVS_KEY_ATTR_IPV6, sizeof(*ipv6_key)); @@ -1465,8 +1480,8 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) ipv6_key->ipv6_tclass = swkey->ip.tos; ipv6_key->ipv6_hlimit = swkey->ip.ttl; ipv6_key->ipv6_frag = swkey->ip.frag; - } else if (swkey->eth.type == htons(ETH_P_ARP) || - swkey->eth.type == htons(ETH_P_RARP)) { + } else if (eth_type == htons(ETH_P_ARP) || + eth_type == htons(ETH_P_RARP)) { struct ovs_key_arp *arp_key; nla = nla_reserve(skb, OVS_KEY_ATTR_ARP, sizeof(*arp_key)); @@ -1479,19 +1494,9 @@ int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) arp_key->arp_op = htons(swkey->ip.proto); memcpy(arp_key->arp_sha, swkey->ipv4.arp.sha, ETH_ALEN); memcpy(arp_key->arp_tha, swkey->ipv4.arp.tha, ETH_ALEN); - } else if (eth_p_mpls(swkey->eth.type)) { - struct ovs_key_mpls *mpls_key; - - nla = nla_reserve(skb, OVS_KEY_ATTR_MPLS, sizeof(*mpls_key)); - if (!nla) - goto nla_put_failure; - mpls_key = nla_data(nla); - memset(mpls_key, 0, sizeof(struct ovs_key_mpls)); - mpls_key->mpls_top_label = swkey->mpls.top_label; } - if ((swkey->eth.type == htons(ETH_P_IP) || -swkey->eth.type == htons(ETH_P_IPV6)) && + if ((eth_type == htons(ETH_P_
[ovs-dev] [PATCH 0/16 v2.15] Basic MPLS actions and matches
Hi, This series implements basic MPLS actions and matches based on work by Ravi K, Leo Alterman and Yamahata-san. The main limitation of this implementation is that it only supports manipulating the outer-most MPLS label. Key differences between the v2.13 and v2.15: * Addressed review by Ben Pfaff and Jesse Gross In particular: - Use Use OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS instead o Still under discussion - Move OVS_KEY_ATTR_MPLS to after all upstream attributes - Move OVS_ACTION_ATTR_PUSH_MPLS and OVS_ACTION_ATTR_POP_MPLS to after all upstream attributes Key differences between the v2.13 and v2.14: * Reshuffle patches moving kernel datapath patches to the end of the series and allowing the other patches to be applied independently. That is, the first 10 patches of the series may be applied. They do not touch the kernel. And the result will be MPLS support for ovs-vswitchd and the user-space datapath. Key differences between the v2.12 and v2.13: * Rebase * Add Inner and Outer flow support to user-space datapath * Allow MPLS matches for OpenFlow 1.3 * Optimisations suggested by Jarno Rajahalme Key differences between the v2.11 and v2.12: * Rebase * Add eth_type_mpls helper * Add flow_innermost_dl_type helper * calculate MPLS fitness over entire flow - It is sufficient to just calculate the fitness of an MPLS frame once. - Previously a fitness calculation was made based on expectations calculat on L1 and L2 data. However, this may not be correct as it will return ODP_FIT_TOO_MUCH in the case where the key contains L3 attributes. * Pass MPLS lse to push_mpls - When push_mpls() the LSE of the flow is always already known, so pass it to push_mpls and set it directly. This avoids: + The need to re-compose an default LSE, allowing the removal of get_lse() + The need to follow each call to push_mpls() with a call to set_mpls_lse() - This fixes the previously bogus usage of push_mpls in dp_netdev_execute_actions, allowing push_mpls to work in conjunction with the user-space datapath. * Add dl_type parameter to flow_extract_l3_onwards() and thus allow "flow: Split flow_extract" to be applied independently of the rest of this patch-set. Key differences between the v2.10 and v2.11: * Rebase * The last 13 patches in the series have been added to allow handling of actions on MPLS flows that modify any L3+ data. In particular, to allow modification of the IP TTL. This is achieved by adding the notion of inner and outer flows. This works as follows. - A match is decoded from a packet using the existing implementation and this is used to look up the flow. - If the flow includes actions that allow further decoding of the packet - e.g. an MPLS pop action which supplies the ethernet type of the encapsulated frame - then the packet is further decoded and a more fine-grained match is made. This match is then used to lookup the flow again. In such cases the flow added to the datapath by ovs-vswtichd includes the richer match and the actions. The datapath inspects the actions and if there are actions that allow further decoding of the packet it inserts the flow twice, once including only an L1 and L2 match and once including the richer match. The former is the other flow and the latter is the inner flow. Each inner flow has a single outer flow. Outer flows may have more than one inner flow. Inner flows are not exposed to ovs-vswitchd. Inner flows are deleted when their outer flow is deleted. * Check the size of nx_action_mpls_ttl. nx_action_pop_mpls was being checked due to a cut-and-paste error Key differences between the previous post, v2.9, and this post, v2.10: * Rebase * Check the size of nx_action_mpls_ttl. nx_action_pop_mpls was being checked due to a cut-and-paste error Key differences between v2.8 and v2.9 * Rebase and fix tests accordingly * Always update MPLS bos pointer when VLAN pop is performed in kernel datapath. Key differences between v2.7 and v2.8 * Rebase Key differences between v2.6 and v2.7 * When encoding TTL actions (the last 4 patches) for an OpenFlow 1.1+ message, encode them as OpenFlow 1.1+ actions. Previously they were encoded as NX matches. Key differences between v2.5 and v2.6 * Dropped "flow: Set ttl in flow_compose()", this is now present in the master branch * Dropped "nx-match: Do not check pre-requisites for load actions", instead, dl_type changes are tracked to allow correct pre-requisite checking in the presence of push_mpls and pop_mpls actions. Key differences between v2.4 and v2.5 * Various clean-ups and fises to "User-Space MPLS actions and matches" as noted in its changelog. * New patches to add support for TLS actions * Inclusion of Justin Pettit's patch "flow: Set ttl in flow_compose()", which is a dependency of the "Add support for copy_ttl_out action" patch. Key differences between v2.3 and v2.4 * Minor cle
[ovs-dev] [PATCH 08/17] Add support for dec_mpls_ttl action
This adds support for the OpenFlow 1.1+ dec_mpls_ttl action. And also adds an NX dec_mpls_ttl action. The handling of the TTL modification is entirely handled in userspace. Reviewed-by: Isaku Yamahata Signed-off-by: Simon Horman --- v2.15 * Rebase v2.14 * No change v2.13 * No change v2.12 * Use eth_type_mpls() helper v2.11 * No change v2.10 * No change v2.9 * Update tests for upstream changes v2.8 * No change v2.7 * Encode action as OFP11 action in OFP11+ messages v2.6 * Rebase v2.5 * First post --- include/openflow/nicira-ext.h |1 + lib/flow.c|8 lib/ofp-actions.c | 23 +++ lib/ofp-actions.h |1 + lib/ofp-parse.c |5 + lib/ofp-util.def |2 ++ lib/packets.c |2 +- lib/packets.h |1 + ofproto/ofproto-dpif.c| 27 +++ tests/ofproto-dpif.at | 21 + utilities/ovs-ofctl.8.in |9 + 11 files changed, 99 insertions(+), 1 deletion(-) diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h index d6a99ce..61fb97e 100644 --- a/include/openflow/nicira-ext.h +++ b/include/openflow/nicira-ext.h @@ -306,6 +306,7 @@ enum nx_action_subtype { NXAST_WRITE_METADATA, /* struct nx_action_write_metadata */ NXAST_PUSH_MPLS,/* struct nx_action_push_mpls */ NXAST_POP_MPLS, /* struct nx_action_pop_mpls */ +NXAST_DEC_MPLS_TTL, /* struct nx_action_header */ }; /* Header for Nicira-defined actions. */ diff --git a/lib/flow.c b/lib/flow.c index 7550b44..d73802a 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -842,6 +842,14 @@ flow_set_mpls_label(struct flow *flow, ovs_be32 label) set_mpls_lse_label(&flow->mpls_lse, label); } +/* Sets the MPLS TTL that 'flow' matches to 'ttl', which should be in the + * range 0...255. */ +void +flow_set_mpls_ttl(struct flow *flow, uint8_t ttl) +{ +set_mpls_lse_ttl(&flow->mpls_lse, ttl); +} + /* Sets the MPLS TC that 'flow' matches to 'tc', which should be in the * range 0...7. */ void diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c index 660d4e3..2280853 100644 --- a/lib/ofp-actions.c +++ b/lib/ofp-actions.c @@ -412,6 +412,10 @@ ofpact_from_nxast(const union ofp_action *a, enum ofputil_action_code code, break; } +case OFPUTIL_NXAST_DEC_MPLS_TTL: +ofpact_put_DEC_MPLS_TTL(out); +break; + case OFPUTIL_NXAST_POP_MPLS: { struct nx_action_pop_mpls *nxapm = (struct nx_action_pop_mpls *)a; if (eth_type_mpls(nxapm->ethertype)) { @@ -792,6 +796,10 @@ ofpact_from_openflow11(const union ofp_action *a, struct ofpbuf *out) return nxm_reg_load_from_openflow12_set_field( (const struct ofp12_action_set_field *)a, out); +case OFPUTIL_OFPAT11_DEC_MPLS_TTL: +ofpact_put_DEC_MPLS_TTL(out); +break; + case OFPUTIL_OFPAT11_PUSH_MPLS: { struct ofp11_action_push *oap = (struct ofp11_action_push *)a; if (!eth_type_mpls(oap->ethertype)) { @@ -1142,6 +1150,7 @@ ofpact_check__(const struct ofpact *a, const struct flow *flow, int max_ports, } case OFPACT_DEC_TTL: +case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_SET_QUEUE: case OFPACT_POP_QUEUE: @@ -1393,6 +1402,10 @@ ofpact_to_nxast(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_nxast(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_DEC_MPLS_TTL: +ofputil_put_NXAST_DEC_MPLS_TTL(out); +break; + case OFPACT_SET_TUNNEL: ofpact_set_tunnel_to_nxast(ofpact_get_SET_TUNNEL(a), out); break; @@ -1564,6 +1577,7 @@ ofpact_to_openflow10(const struct ofpact *a, struct ofpbuf *out) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_WRITE_METADATA: case OFPACT_SET_QUEUE: @@ -1697,6 +1711,10 @@ ofpact_to_openflow11(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_openflow11(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_DEC_MPLS_TTL: +ofputil_put_OFPAT11_DEC_MPLS_TTL(out); +break; + case OFPACT_WRITE_METADATA: /* OpenFlow 1.1 uses OFPIT_WRITE_METADATA to express this action. */ break; @@ -1842,6 +1860,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, uint16_t port) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFPACT_DEC_MPLS_TTL: case OFPACT_SET_TUNNEL: case OFPACT_WRITE_METADATA: case OFPACT_SET_QUEUE: @@ -2066,6 +2085,10 @@ ofpact_format(const struct ofpact *a, struct ds *s) print_dec_ttl(ofpact_get_DEC_TTL(a), s); break; +case OFPACT_DEC_MPLS_TTL: +ds_put_cstr(s, "dec_mpls_ttl"); +break; + case OF
[ovs-dev] [PATCH 06/17] ofproto: Allow actions richer matches based on actions
If the actions of a flow allow further decoding of L3 and L4 data to provide a richer match then use this richer match. For example: In the case of MPLS, L3 and L4 information may not initially be decoded from the frame as the ethernet type of the frame is an MPLS type and no information is known about the type of the inner frame. However, the type of the inner frame may be provided by an mpls_pop action in which case L3 and L4 information may be decoded providing a finer grained match than is otherwise possible. Signed-off-by: Simon Horman --- v2.15 * No change v2.15 * Rebase for ovs_assert v2.14 * No change v2.13 * Correct indentation in handle_flow_miss_l3_extraction() * Rebase: Pass encap_dl_type argument to flow_extract_l3_onwards() v2.12 * Use eth_type_mpls helper v2.11 * First post --- ofproto/ofproto-dpif.c | 109 1 file changed, 102 insertions(+), 7 deletions(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index ad58b42..f0bc475 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -397,7 +397,8 @@ static void subfacet_update_stats(struct subfacet *, const struct dpif_flow_stats *); static void subfacet_make_actions(struct subfacet *, const struct ofpbuf *packet, - struct ofpbuf *odp_actions); + struct ofpbuf *odp_actions, + const struct flow *flow); static int subfacet_install(struct subfacet *, const struct nlattr *actions, size_t actions_len, struct dpif_flow_stats *, enum slow_path_reason); @@ -3082,12 +3083,14 @@ struct flow_miss { struct list packets; enum dpif_upcall_type upcall_type; uint32_t odp_in_port; +bool singleton; }; struct flow_miss_op { struct dpif_op dpif_op; struct subfacet *subfacet; /* Subfacet */ void *garbage; /* Pointer to pass to free(), NULL if none. */ +void *garbage2; /* Pointer to pass to free(), NULL if none. */ uint64_t stub[1024 / 8];/* Temporary buffer. */ }; @@ -3145,6 +3148,19 @@ process_special(struct ofproto_dpif *ofproto, const struct flow *flow, return 0; } +static bool flow_miss_is_singleton(const struct flow *flow) +{ +/* The decoding of the L3 and L4 components of an MPLS frame + * may only occur if there is a pop_mpls action present in + * which case the ethernet type of the internel frame becomes + * known. For this reason the decoding of the L3 and L4 + * components of an MPLS frame is delayed and the match is + * at this point incomplete. Treat it as a singleton to + * avoid a single miss covering multiple flows + */ +return eth_type_mpls(flow->dl_type); +} + static struct flow_miss * flow_miss_find(struct hmap *todo, const struct flow *flow, uint32_t hash) { @@ -3251,6 +3267,39 @@ actions_allow_l3_extraction(const struct ofpact *ofpacts, size_t ofpacts_len) return htons(0); } +static ovs_be16 +handle_flow_miss_l3_extraction(struct flow_miss *miss, + struct rule_dpif *rule, + struct ofpbuf *packet, + struct flow **flowp, + struct ofpbuf *inner_key) +{ +struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto); +ovs_be16 encap_dl_type; +struct flow *inner_flow; + +encap_dl_type = actions_allow_l3_extraction(rule->up.ofpacts, +rule->up.ofpacts_len); +if (!encap_dl_type) { +return encap_dl_type; +} + +ovs_assert(miss->singleton); + +inner_flow = xmemdup(*flowp, sizeof **flowp); +inner_flow->encap_dl_type = encap_dl_type; +flow_extract_l3_onwards(packet, inner_flow, encap_dl_type); + +ofpbuf_reinit(inner_key, 128); +odp_flow_key_from_flow(inner_key, inner_flow, + ofp_port_to_odp_port(ofproto, inner_flow->in_port), + encap_dl_type); +miss->key = inner_key->data; +miss->key_len = inner_key->size; + +*flowp = inner_flow; +return encap_dl_type; +} /* Handles 'miss', which matches 'rule', without creating a facet or subfacet * or creating any datapath flow. May add an "execute" operation to 'ops' and @@ -3269,20 +3318,32 @@ handle_flow_miss_without_facet(struct flow_miss *miss, struct flow_miss_op *op = &ops[*n_ops]; struct dpif_flow_stats stats; struct ofpbuf odp_actions; +struct flow *flow = &miss->flow; +struct ofpbuf inner_key; +__be16 encap_dl_type; COVERAGE_INC(facet_suppress); ofpbuf_use_stub(&odp_actions, op->stub, sizeof op->stub); +ofpbuf_init(&inner_key, 0); dpif_flow_stats_extract(&miss->flow, packet, no
[ovs-dev] [PATCH 10/17] Add support for copy_ttl_out action
This adds support for the OpenFlow 1.1+ copy_ttl_out action. And also adds an NX copy_ttl_out action. The implementation does not support copying in the case where the outermost header is IP as it is unclear to me that Open vSwtich has a notion of an inner IP header to copy the TLL from. The handling of the TTL modification is entirely handled in userspace. Reviewed-by: Isaku Yamahata Signed-off-by: Simon Horman --- v2.15 * Rebase * Mask off wc.masks.mpls_inner_lse in ofputil_normalize_match__() v2.14 * No change v2.13 * Remove bogus test. copy_ttl_out from IP to MPLS only makes sense if the dl_type of the encapsulated frame is known, which implies that the original frame was IP and an mpls_push action has been applied. v2.12 * No change v2.10 * No change v2.9 * Increment FLOW_WC_SEQ * Update tests for upstream changes v2.8 * No change v2.7 * Encode action as OFP11 action in OFP11+ messages v2.6 * Non-trivial rebase v2.5 * First post --- include/openflow/nicira-ext.h |1 + lib/flow.c| 15 +-- lib/flow.h|7 --- lib/match.c |2 +- lib/nx-match.c|2 +- lib/ofp-actions.c | 23 +++ lib/ofp-actions.h |1 + lib/ofp-parse.c |5 + lib/ofp-util.c|5 +++-- lib/ofp-util.def |2 ++ ofproto/ofproto-dpif.c| 23 +++ tests/ofproto-dpif.at | 23 ++- utilities/ovs-ofctl.8.in |7 +++ 13 files changed, 98 insertions(+), 18 deletions(-) diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h index 5cfed85..c592410 100644 --- a/include/openflow/nicira-ext.h +++ b/include/openflow/nicira-ext.h @@ -306,6 +306,7 @@ enum nx_action_subtype { NXAST_WRITE_METADATA, /* struct nx_action_write_metadata */ NXAST_PUSH_MPLS,/* struct nx_action_push_mpls */ NXAST_POP_MPLS, /* struct nx_action_pop_mpls */ +NXAST_COPY_TTL_OUT, /* struct nx_action_header */ NXAST_SET_MPLS_TTL, /* struct nx_action_ttl */ NXAST_DEC_MPLS_TTL, /* struct nx_action_header */ }; diff --git a/lib/flow.c b/lib/flow.c index d73802a..5181d10 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -98,9 +98,12 @@ parse_mpls(struct ofpbuf *b, struct flow *flow) struct mpls_hdr *mh; while ((mh = ofpbuf_try_pull(b, sizeof *mh))) { -if (flow->mpls_depth++ == 0) { +if (flow->mpls_depth == 0) { flow->mpls_lse = mh->mpls_lse; +} else if (flow->mpls_depth == 1) { +flow->inner_mpls_lse = mh->mpls_lse; } +flow->mpls_depth++; if (mh->mpls_lse & htonl(MPLS_BOS_MASK)) { break; } @@ -510,7 +513,7 @@ flow_zero_wildcards(struct flow *flow, const struct flow_wildcards *wildcards) void flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd) { -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 19); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 20); fmd->tun_id = flow->tunnel.tun_id; fmd->metadata = flow->metadata; @@ -842,14 +845,6 @@ flow_set_mpls_label(struct flow *flow, ovs_be32 label) set_mpls_lse_label(&flow->mpls_lse, label); } -/* Sets the MPLS TTL that 'flow' matches to 'ttl', which should be in the - * range 0...255. */ -void -flow_set_mpls_ttl(struct flow *flow, uint8_t ttl) -{ -set_mpls_lse_ttl(&flow->mpls_lse, ttl); -} - /* Sets the MPLS TC that 'flow' matches to 'tc', which should be in the * range 0...7. */ void diff --git a/lib/flow.h b/lib/flow.h index 3078332..07b9f8b 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -36,7 +36,7 @@ struct ofpbuf; /* This sequence number should be incremented whenever anything involving flows * or the wildcarding of flows changes. This will cause build assertion * failures in places which likely need to be updated. */ -#define FLOW_WC_SEQ 19 +#define FLOW_WC_SEQ 20 #define FLOW_N_REGS 8 BUILD_ASSERT_DECL(FLOW_N_REGS <= NXM_NX_MAX_REGS); @@ -92,6 +92,7 @@ struct flow { is the datapath port number. */ uint32_t skb_mark; /* Packet mark. */ ovs_be32 mpls_lse; /* MPLS label stack entry. */ +ovs_be32 inner_mpls_lse;/* Inner MPLS label stack entry. */ uint16_t mpls_depth;/* Depth of MPLS stack. */ ovs_be16 vlan_tci; /* If 802.1Q, TCI | VLAN_CFI; otherwise 0. */ ovs_be16 dl_type; /* Ethernet frame type. */ @@ -106,7 +107,7 @@ struct flow { uint8_t arp_tha[6]; /* ARP/ND target hardware address. */ uint8_t nw_ttl; /* IP TTL/Hop Limit. */ uint8_t nw_frag;/* FLOW_FRAG_* flags. */ -uint8_t zeros[4]; +uint8_t zeros[0]; }; BUILD_ASSERT_DECL(sizeof(struct flow) % 4 == 0); @@ -114,7 +115,7 @@ BUILD_ASSERT_DECL(sizeof(struct flow) % 4 == 0); /* Remember to u
[ovs-dev] [PATCH 13/17] datapath: Split ovs_flow_extract
Allow repeated extraction of flow from packets Split the L3 and above portion of ovs_flow_extract() out into ovs_flow_extract_l3_onwards() and call ovs_flow_extract_l3_onwards() from ovs_flow_extract(). This is to allow re-extraction of L3 and higher information using an ethernet type supplied by actions, for example the mpls_pop. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * As suggested by Jarno Rajahalme - Update change log to more closely reflect the extent of the changes in the patch v2.12 * Rebase v2.11 * First post --- datapath/flow.c | 156 ++- datapath/flow.h |2 + 2 files changed, 100 insertions(+), 58 deletions(-) diff --git a/datapath/flow.c b/datapath/flow.c index 17f9ecc..28533ef 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -588,7 +588,8 @@ out: } /** - * ovs_flow_extract - extracts a flow key from an Ethernet frame. + * ovs_flow_extract_l3_onwards - extracts l3 and l4 portion of a flow key + * from an Ethernet frame. * @skb: sk_buff that contains the frame, with skb->data pointing to the * Ethernet header * @in_port: port number on which @skb was received. @@ -596,62 +597,27 @@ out: * @key_lenp: length of output flow key * * The caller must ensure that skb->len >= ETH_HLEN. + * The caller must ensure that the rest of the flow is initialised. + * This, ovs_flow_extract_l3_onwards() should be called by or after + * vs_flow_extract(). * * Returns 0 if successful, otherwise a negative errno value. * * Initializes @skb header pointers as follows: * - *- skb->mac_header: the Ethernet header. - * - *- skb->network_header: just past the Ethernet header, or just past the - * VLAN header, to the first byte of the Ethernet payload. - * - *- skb->transport_header: If key->dl_type is ETH_P_IP or ETH_P_IPV6 + *- skb->transport_header: If eth_type is ETH_P_IP or ETH_P_IPV6 * on output, then just past the IP header, if one is present and * of a correct length, otherwise the same as skb->network_header. * For other key->dl_type values it is left untouched. */ -int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key, -int *key_lenp) +int ovs_flow_extract_l3_onwards(struct sk_buff *skb, struct sw_flow_key *key, + int *key_lenp, __be16 eth_type) { int error = 0; - int key_len = SW_FLOW_KEY_OFFSET(eth); - struct ethhdr *eth; - - memset(key, 0, sizeof(*key)); - - key->phy.priority = skb->priority; - if (OVS_CB(skb)->tun_key) - memcpy(&key->tun_key, OVS_CB(skb)->tun_key, sizeof(key->tun_key)); - key->phy.in_port = in_port; - key->phy.skb_mark = skb_get_mark(skb); - - skb_reset_mac_header(skb); - - /* Link layer. We are guaranteed to have at least the 14 byte Ethernet -* header in the linear data area. -*/ - eth = eth_hdr(skb); - memcpy(key->eth.src, eth->h_source, ETH_ALEN); - memcpy(key->eth.dst, eth->h_dest, ETH_ALEN); - - __skb_pull(skb, 2 * ETH_ALEN); - - if (vlan_tx_tag_present(skb)) - key->eth.tci = htons(vlan_get_tci(skb)); - else if (eth->h_proto == htons(ETH_P_8021Q)) - if (unlikely(parse_vlan(skb, key))) - return -ENOMEM; - - key->eth.type = parse_ethertype(skb); - if (unlikely(key->eth.type == htons(0))) - return -ENOMEM; - - skb_reset_network_header(skb); - __skb_push(skb, skb->data - skb_mac_header(skb)); + int key_len = *key_lenp; /* Network layer. */ - if (key->eth.type == htons(ETH_P_IP)) { + if (eth_type == htons(ETH_P_IP)) { struct iphdr *nh; __be16 offset; @@ -710,8 +676,8 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key, } } - } else if ((key->eth.type == htons(ETH_P_ARP) || - key->eth.type == htons(ETH_P_RARP)) && arphdr_ok(skb)) { + } else if ((eth_type == htons(ETH_P_ARP) || + eth_type == htons(ETH_P_RARP)) && arphdr_ok(skb)) { struct arp_eth_header *arp; arp = (struct arp_eth_header *)skb_network_header(skb); @@ -730,18 +696,7 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key, memcpy(key->ipv4.arp.tha, arp->ar_tha, ETH_ALEN); key_len = SW_FLOW_KEY_OFFSET(ipv4.arp); } - } else if (eth_p_mpls(key->eth.type)) { - error = check_header(skb, MPLS_HLEN); - if (unlikely(error)) - goto out; - - key_len = SW_FLOW_KEY_OFFSET(mpls.top_label); - memcpy(&key->mpls.top_label, skb_network_header(skb), MPLS_HLEN); - - /* Update network h
[ovs-dev] [PATCH 16/17] datapath: ovs_flow_used: use encap_eth_type
The ethernet type of an encapsulated frame may be obtained from actions. If so it should be used to correct decoding of L3 and L4 packet data. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * As suggested by Jarno Rajahalme - Initialise encap_eth_type as 0 in ovs_dp_process_received_packet() v2.12 * No change v2.11 * First post flow: Split flow_extract Split the L3 and above portion of flow_extract() out into flow_extract_l3_onwards() and call flow_extract_l3_onwards() from flow_extract(). This is to allow re-extraction of l3 and higher information using flow->encap_dl_type which may be set using information contained in actions. Signed-off-by: Simon Horman --- v2.12 * Rebase * Add dl_type parameter to flow_extract_l3_onwards, allowing this patch to be applied independently of the rest of the MPLS patch-set. v2.11 * First post --- datapath/datapath.c |6 +++--- datapath/flow.c |8 +--- datapath/flow.h |2 +- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/datapath/datapath.c b/datapath/datapath.c index 824fefb..adbf5f3 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -242,6 +242,7 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) struct sw_flow *flow, *outer_flow = NULL; struct dp_stats_percpu *stats; u64 *stats_counter; + __be16 encap_eth_type = 0; int error; stats = this_cpu_ptr(dp->stats_percpu); @@ -266,7 +267,6 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) * allow more information to be included in the flow's match */ if (likely(flow)) { - __be16 encap_eth_type; encap_eth_type = flow->encap_eth_type; if (unlikely(encap_eth_type)) { error = ovs_flow_extract_l3_onwards(skb, &key, &key_len, @@ -298,8 +298,8 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) stats_counter = &stats->n_hit; if (unlikely(outer_flow)) - ovs_flow_used(outer_flow, skb); - ovs_flow_used(OVS_CB(skb)->flow, skb); + ovs_flow_used(outer_flow, htons(0), skb); + ovs_flow_used(OVS_CB(skb)->flow, encap_eth_type, skb); ovs_execute_actions(dp, skb); out: diff --git a/datapath/flow.c b/datapath/flow.c index 31c90b8..770718a 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -180,12 +180,14 @@ static bool icmp6hdr_ok(struct sk_buff *skb) #define TCP_FLAGS_OFFSET 13 #define TCP_FLAG_MASK 0x3f -void ovs_flow_used(struct sw_flow *flow, struct sk_buff *skb) +void ovs_flow_used(struct sw_flow *flow, __be16 encap_eth_type, + struct sk_buff *skb) { u8 tcp_flags = 0; + __be16 eth_type; - if ((flow->key.eth.type == htons(ETH_P_IP) || -flow->key.eth.type == htons(ETH_P_IPV6)) && + eth_type = encap_eth_type ? encap_eth_type : flow->key.eth.type; + if ((eth_type == htons(ETH_P_IP) || eth_type == htons(ETH_P_IPV6)) && flow->key.ip.proto == IPPROTO_TCP && likely(skb->len >= skb_transport_offset(skb) + sizeof(struct tcphdr))) { u8 *tcp = (u8 *)tcp_hdr(skb); diff --git a/datapath/flow.h b/datapath/flow.h index 43accab..969ed13 100644 --- a/datapath/flow.h +++ b/datapath/flow.h @@ -184,7 +184,7 @@ int ovs_flow_extract_l3_onwards(struct sk_buff *, struct sw_flow_key *, int key_lenp[2], __be16 eth_type); int ovs_flow_extract(struct sk_buff *, u16 in_port, struct sw_flow_key *, int *key_lenp); -void ovs_flow_used(struct sw_flow *, struct sk_buff *); +void ovs_flow_used(struct sw_flow *, __be16 encap_eth_type, struct sk_buff *); u64 ovs_flow_used_time(unsigned long flow_jiffies); /* Upper bound on the length of a nlattr-formatted flow key. The longest -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 01/17] dpif-netdev: Limit scope of vlan in dp_netdev_execute_actions
This is to make the code consistent with the proposed MPLS changes and avoid cluttering the top of the NL_ATTR_FOR_EACH_UNSAFE loop. Signed-off-by: Simon Horman --- v2.15 * First post --- lib/dpif-netdev.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 3d9cd85..bec32c3 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1249,7 +1249,6 @@ dp_netdev_execute_actions(struct dp_netdev *dp, unsigned int left; NL_ATTR_FOR_EACH_UNSAFE (a, left, actions, actions_len) { -const struct ovs_action_push_vlan *vlan; int type = nl_attr_type(a); switch ((enum ovs_action_attr) type) { @@ -1261,10 +1260,11 @@ dp_netdev_execute_actions(struct dp_netdev *dp, dp_netdev_action_userspace(dp, packet, key, a); break; -case OVS_ACTION_ATTR_PUSH_VLAN: -vlan = nl_attr_get(a); +case OVS_ACTION_ATTR_PUSH_VLAN: { +const struct ovs_action_push_vlan *vlan = nl_attr_get(a); eth_push_vlan(packet, vlan->vlan_tci); break; +} case OVS_ACTION_ATTR_POP_VLAN: eth_pop_vlan(packet); -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 14/17] datapath: Inner And Outer Flows
Allow a flow to be visible in two forms, an outer flow and an inner flow. This may occur if the actions allow further decoding of a packet to provide a more fine-grained match. In this case the first-pass coarse-grained match will hit the outer-flow and the second-pass fined-grained match will hit the inner-flow. Inner-flows are not visible to user-space. Rather, they are just an internal representation to handle the case of actions allowing further packet decoding. An inner-flow is associated with its outer-flow and deleted when the outer-flow is deleted. An outer-flow may have more than one inner-flow but an inner-flow may only have one outer-flow. For example: In the case of MPLS, L3 and L4 information may not initially be decoded from the frame as the ethernet type of the frame is an MPLS type and no information is known about the type of the inner frame. However, the type of the inner frame may be provided by an mpls_pop action in which case L3 and L4 information may be decoded providing a finer grained match than is otherwise possible. Signed-off-by: Simon Horman --- v2.15 * Rebase v2.14 * No change v2.13 * Merge "datapath: Allow inner_flows" and "datapath: re-lookup a flow if actions allow a more fine grained match" and rename the resulting patch "datapath: Inner and Outer Flows". * As suggested by Jarno Rajahalme - Add encap_eth_type to struct sw_flow and use it to store the ethernet type supplied by actions of a flow if any. This is an optimisation to avoid scanning the actions for each packet. v2.12 * No change v2.11 * First post --- datapath/actions.c | 14 datapath/datapath.c | 222 ++- datapath/datapath.h |1 + datapath/flow.c | 69 +++- datapath/flow.h | 41 +- 5 files changed, 271 insertions(+), 76 deletions(-) diff --git a/datapath/actions.c b/datapath/actions.c index 1bb2c52..62b90b0 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -629,6 +629,20 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, return 0; } +__be16 ovs_actions_allow_l3_extraction(struct sw_flow *flow) +{ + struct sw_flow_actions *acts = rcu_dereference(flow->sf_acts); + const struct nlattr *a; + int rem; + + for (a = acts->actions, rem = acts->actions_len; rem > 0; +a = nla_next(a, &rem)) + if (nla_type(a) == OVS_ACTION_ATTR_POP_MPLS) + return *(__be16 *)nla_data(a); + + return 0; +} + /* We limit the number of times that we pass into execute_actions() * to avoid blowing out the stack in the event that we have a loop. */ #define MAX_LOOPS 5 diff --git a/datapath/datapath.c b/datapath/datapath.c index 617c8ca..28e629f 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -239,7 +239,7 @@ void ovs_dp_detach_port(struct vport *p) void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) { struct datapath *dp = p->dp; - struct sw_flow *flow; + struct sw_flow *flow, *outer_flow = NULL; struct dp_stats_percpu *stats; u64 *stats_counter; int error; @@ -249,6 +249,7 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) if (!OVS_CB(skb)->flow) { struct sw_flow_key key; int key_len; + struct flow_table *table = rcu_dereference(dp->table); /* Extract flow from 'skb' into 'key'. */ error = ovs_flow_extract(skb, p->port_no, &key, &key_len); @@ -258,8 +259,27 @@ void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) } /* Look up flow. */ - flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), - &key, key_len); + flow = ovs_flow_tbl_lookup(table, &key, key_len); + + /* We have a flow? Superb. +* See if the actions in the flow supply information to +* allow more information to be included in the flow's match +*/ + if (likely(flow)) { + __be16 encap_eth_type; + encap_eth_type = flow->encap_eth_type; + if (unlikely(encap_eth_type)) { + error = ovs_flow_extract_l3_onwards(skb, &key, &key_len, + encap_eth_type); + if (unlikely(error)) { + kfree_skb(skb); + return; + } + outer_flow = flow; + flow = ovs_flow_tbl_lookup(table, &key, key_len); + } + } + if (unlikely(!flow)) { struct dp_upcall_info
[ovs-dev] [PATCH 04/17] packet: packet_get_tcp_flags: use flow's innermost dl_type
Use the innermost dl_type when decoding L3 and L4 data from a packet. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * Use flow_innermost_dl_type helper v2.11 * First post --- lib/packets.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/packets.c b/lib/packets.c index 8d10dc8..a52bf6e 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -897,8 +897,8 @@ packet_set_udp_port(struct ofpbuf *packet, ovs_be16 src, ovs_be16 dst) uint8_t packet_get_tcp_flags(const struct ofpbuf *packet, const struct flow *flow) { -if ((flow->dl_type == htons(ETH_TYPE_IP) || - flow->dl_type == htons(ETH_TYPE_IPV6)) && +ovs_be16 dl_type = flow_innermost_dl_type(flow); +if ((dl_type == htons(ETH_TYPE_IP) || dl_type == htons(ETH_TYPE_IPV6)) && flow->nw_proto == IPPROTO_TCP && packet->l7) { const struct tcp_header *tcp = packet->l4; return TCP_FLAGS(tcp->tcp_ctl); -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 05/17] actions: Allow secondary decoding of a flow
Actions may provide information to allow further decoding of a flow. For example: In the case of MPLS, L3 and L4 information may not initially be decoded from the frame as the ethernet type of the frame is an MPLS type and no information is known about the type of the inner frame. However, the type of the inner frame may be provided by an mpls_pop action in which case L3 and L4 information may be decoded providing a finer grained match than is otherwise possible. Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * As suggested by Jarno Rajahalme - Use flow->encap_dl_type instead of of obtaining the value by scanning the actions. v2.12 * No change v2.11 * First post fix --- lib/dpif-netdev.c |5 +++-- lib/odp-util.c | 37 - lib/odp-util.h |2 +- ofproto/ofproto-dpif.c | 36 tests/test-odp.c |2 +- 5 files changed, 57 insertions(+), 25 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 0c41a5a..a47deb4 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -878,7 +878,8 @@ dpif_netdev_flow_dump_next(const struct dpif *dpif, void *state_, struct ofpbuf buf; ofpbuf_use_stack(&buf, &state->keybuf, sizeof state->keybuf); -odp_flow_key_from_flow(&buf, &flow->key, flow->key.in_port); +odp_flow_key_from_flow(&buf, &flow->key, flow->key.in_port, + htons(0)); *key = buf.data; *key_len = buf.size; @@ -1116,7 +1117,7 @@ dp_netdev_output_userspace(struct dp_netdev *dp, const struct ofpbuf *packet, buf = &u->buf; ofpbuf_init(buf, ODPUTIL_FLOW_KEY_BYTES + 2 + packet->size); -odp_flow_key_from_flow(buf, flow, flow->in_port); +odp_flow_key_from_flow(buf, flow, flow->in_port, htons(0)); key_len = buf->size; ofpbuf_pull(buf, key_len); ofpbuf_reserve(buf, 2); diff --git a/lib/odp-util.c b/lib/odp-util.c index c00b5a9..3cd85bd 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -1432,10 +1432,11 @@ odp_to_flow_flags(uint32_t tun_flags) * capable of being expanded to allow for that much space. */ void odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow, - uint32_t odp_in_port) + uint32_t odp_in_port, ovs_be16 encap_dl_type) { struct ovs_key_ethernet *eth_key; size_t encap; +ovs_be16 dl_type; if (flow->skb_priority) { nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority); @@ -1488,7 +1489,18 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow, nl_msg_put_be16(buf, OVS_KEY_ATTR_ETHERTYPE, flow->dl_type); -if (flow->dl_type == htons(ETH_TYPE_IP)) { +dl_type = flow->dl_type; + +if (flow->mpls_depth) { +struct ovs_key_mpls *mpls_key; + +mpls_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_MPLS, +sizeof *mpls_key); +mpls_key->mpls_top_label = flow->mpls_lse; +dl_type = encap_dl_type; +} + +if (dl_type == htons(ETH_TYPE_IP)) { struct ovs_key_ipv4 *ipv4_key; ipv4_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4, @@ -1499,7 +1511,7 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow, ipv4_key->ipv4_tos = flow->nw_tos; ipv4_key->ipv4_ttl = flow->nw_ttl; ipv4_key->ipv4_frag = ovs_to_odp_frag(flow->nw_frag); -} else if (flow->dl_type == htons(ETH_TYPE_IPV6)) { +} else if (dl_type == htons(ETH_TYPE_IPV6)) { struct ovs_key_ipv6 *ipv6_key; ipv6_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV6, @@ -1511,8 +1523,8 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow, ipv6_key->ipv6_tclass = flow->nw_tos; ipv6_key->ipv6_hlimit = flow->nw_ttl; ipv6_key->ipv6_frag = ovs_to_odp_frag(flow->nw_frag); -} else if (flow->dl_type == htons(ETH_TYPE_ARP) || - flow->dl_type == htons(ETH_TYPE_RARP)) { +} else if (dl_type == htons(ETH_TYPE_ARP) || + dl_type == htons(ETH_TYPE_RARP)) { struct ovs_key_arp *arp_key; arp_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_ARP, @@ -1525,16 +1537,7 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow, memcpy(arp_key->arp_tha, flow->arp_tha, ETH_ADDR_LEN); } -if (flow->mpls_depth) { -struct ovs_key_mpls *mpls_key; - -mpls_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_MPLS, -sizeof *mpls_key); -mpls_key->mpls_top_label = flow->mpls_lse; -} - -if ((flow->dl_type == htons(ETH_TYPE_IP) - || flow->dl_type == htons(ETH_TYPE_IPV6)) +if ((dl_type == htons(ETH_TYPE_IP) || dl_type == htons(ETH_TYPE_IPV6)) && !(flow->nw_frag & FLOW_NW_FRAG_LATER)) {
[ovs-dev] [PATCH 12/17] datapath: Add basic MPLS support to kernel
Allow datapath to recognize and extract MPLS labels into flow keys and execute actions which push, pop, and set labels on packets. Based heavily on work by Leo Alterman and Ravi K. Cc: Ravi K Cc: Leo Alterman Reviewed-by: Isaku Yamahata Signed-off-by: Simon Horman --- v2.15 * Use OVS_ACTION_SET to set OVS_KEY_ATTR_MPLS instead of OVS_ACTION_ATTR_SET_MPLS v2.14 * Remove include/linux/openvswitch.h portion which added add new key and action attributes. This now present in "User-Space MPLS actions and matches" which is now a dependency of this patch v2.13 * As suggested by Jarno Rajahalme - Rename mpls_bos element of ovs_skb_cb as l2_size as it is set and used regardless of if an MPLS stack is present or not. Update the name of helper functions and documentation accordingly. - Ensure that skb_cb_mpls_bos() never returns NULL * Correct endieness in eth_p_mpls() v2.12 * Update skb and network header on MPLS extraction in ovs_flow_extract() * Use NULL in skb_cb_mpls_bos() * Add eth_p_mpls helper v2.11 * No change v2.10 * No change v2.9 * datapath: Always update the mpls bos if vlan_pop is successful Regardless of the details of how a successful vlan_pop is achieved, the mpls bos needs to be updated. Without this fix it has been observed that the following results in malformed packets v2.8 * No change v2.7 * Rebase v2.6 * As suggested by Yamahata-san - Do not guard against label == 0 for OVS_ACTION_ATTR_SET_MPLS in validate_actions(). A label of 0 is valid - Remove comment stupulating that if the top_label element of struct sw_flow_key is 0 then there is no MPLS label. An MPLS label of 0 is valid and the correct check if ethertype is ntohs(ETH_TYPE_MPLS) or ntohs(ETH_TYPE_MPLS_MCAST) v2.5 * No change v2.4 * No change v2.3 * s/mpls_stack/mpls_bos/ This is in keeping with the naming used in the OpenFlow 1.3 specification v2.2 * Call skb_reset_mac_header() in skb_cb_set_mpls_stack() eth_hdr(skb) is non-NULL when called in skb_cb_set_mpls_stack(). * Add a call to skb_cb_set_mpls_stack() in ovs_packet_cmd_execute(). I apologise that I have mislaid my notes on this but it avoids a kernel panic. I can investigate again if necessary. * Use struct ovs_action_push_mpls instead of __be16 to decode OVS_ACTION_ATTR_PUSH_MPLS in validate_actions(). This is consistent with the data format for the attribute. * Indentation fix in skb_cb_mpls_stack(). [cosmetic] v2.1 * Manual rebase OVS_ACTION_ATTR_SET_MPLS k fix --- datapath/actions.c | 79 +++ datapath/datapath.c | 57 + datapath/datapath.h |9 ++ datapath/flow.c | 31 datapath/flow.h | 13 + datapath/vport.c|2 ++ 6 files changed, 191 insertions(+) diff --git a/datapath/actions.c b/datapath/actions.c index f638ffc..1bb2c52 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -49,6 +49,64 @@ static int make_writable(struct sk_buff *skb, int write_len) return pskb_expand_head(skb, 0, 0, GFP_ATOMIC); } +static __be16 get_ethertype(const struct sk_buff *skb) +{ + struct ethhdr *hdr = (struct ethhdr *)(skb_cb_mpls_bos(skb) - ETH_HLEN); + return hdr->h_proto; +} + +static void set_ethertype(struct sk_buff *skb, const __be16 ethertype) +{ + struct ethhdr *hdr = (struct ethhdr *)(skb_cb_mpls_bos(skb) - ETH_HLEN); + hdr->h_proto = ethertype; +} + +static int push_mpls(struct sk_buff *skb, const struct ovs_action_push_mpls *mpls) +{ + u32 l2_size; + __be32 *new_mpls_label; + + if (skb_cow_head(skb, MPLS_HLEN) < 0) { + kfree_skb(skb); + return -ENOMEM; + } + + l2_size = skb_cb_l2_size(skb); + skb_push(skb, MPLS_HLEN); + memmove(skb_mac_header(skb) - MPLS_HLEN, skb_mac_header(skb), l2_size); + skb_reset_mac_header(skb); + + new_mpls_label = (__be32 *)(skb_mac_header(skb) + l2_size); + *new_mpls_label = mpls->mpls_label; + + set_ethertype(skb, mpls->mpls_ethertype); + return 0; +} + +static int pop_mpls(struct sk_buff *skb, const __be16 *ethertype) +{ + __be16 current_ethertype = get_ethertype(skb); + if (eth_p_mpls(current_ethertype)) { + u32 l2_size = skb_cb_l2_size(skb); + + memmove(skb_mac_header(skb) + MPLS_HLEN, skb_mac_header(skb), l2_size); + + skb_pull(skb, MPLS_HLEN); + skb_reset_mac_header(skb); + + set_ethertype(skb, *ethertype); + } + return 0; +} + +static int set_mpls(struct sk_buff *skb, const __be32 *mpls_label) +{ + __be16 current_ethertype = get_ethertype(skb); + if (eth_p_mpls(current_ethertype)) + memcpy(skb_cb_mpls_bos(skb), mpls_label, sizeof(__be32)); + return 0; +} + /* remove VLAN header from packet and update csum accordingly. */ static int __pop_vlan_tci(str
[ovs-dev] [PATCH 11/17] Add support for copy_ttl_in action
This adds support for the OpenFlow 1.1+ copy_ttl_in action. And also adds an NX copy_ttl_out action. The implementation does not support copying in the case where the outermost header is IP as it is unclear to me that Open vSwtich has a notion of an inner IP header to copy the TLL from. The implementation does not support copying in the case where the outermost and next-to-otermost header is MPLS due to limitations in the current MPLS implementation which only supports modifying the outermost MPLS header. The handling of the TTL modification is entirely handled in userspace. Reviewed-by: Isaku Yamahata Signed-off-by: Simon Horman --- v2.15 * No change v2.14 * No change v2.13 * No change v2.12 * Rebase * Use eth_type_mpls() helper v2.11 * No change v2.10 * No change v2.9 * Update tests for upstream changes v2.8 * No change v2.7 * Correct typo in changelog * Encode action as OFP11 action in OFP11+ messages v2.6 * Non-trivial rebase v2.5 * First post --- include/openflow/nicira-ext.h |1 + lib/match.c |2 +- lib/ofp-actions.c | 23 +++ lib/ofp-actions.h |1 + lib/ofp-parse.c |5 + lib/ofp-print.c |4 +++- lib/ofp-util.def |2 ++ ofproto/ofproto-dpif.c| 24 ++-- tests/ofproto-dpif.at | 21 + utilities/ovs-ofctl.8.in |5 + 10 files changed, 84 insertions(+), 4 deletions(-) diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h index c592410..780618a 100644 --- a/include/openflow/nicira-ext.h +++ b/include/openflow/nicira-ext.h @@ -306,6 +306,7 @@ enum nx_action_subtype { NXAST_WRITE_METADATA, /* struct nx_action_write_metadata */ NXAST_PUSH_MPLS,/* struct nx_action_push_mpls */ NXAST_POP_MPLS, /* struct nx_action_pop_mpls */ +NXAST_COPY_TTL_IN, /* struct nx_action_header */ NXAST_COPY_TTL_OUT, /* struct nx_action_header */ NXAST_SET_MPLS_TTL, /* struct nx_action_ttl */ NXAST_DEC_MPLS_TTL, /* struct nx_action_header */ diff --git a/lib/match.c b/lib/match.c index a87099a..2e3fb28 100644 --- a/lib/match.c +++ b/lib/match.c @@ -995,7 +995,7 @@ match_format(const struct match *match, struct ds *s, unsigned int priority) if (f->dl_type == htons(ETH_TYPE_ARP) || f->dl_type == htons(ETH_TYPE_RARP)) { ds_put_format(s, "arp_op=%"PRIu8",", f->nw_proto); -} else { +} else if (!eth_type_mpls(f->dl_type)) { ds_put_format(s, "nw_proto=%"PRIu8",", f->nw_proto); } } diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c index 65516cd..ae017d4 100644 --- a/lib/ofp-actions.c +++ b/lib/ofp-actions.c @@ -412,6 +412,10 @@ ofpact_from_nxast(const union ofp_action *a, enum ofputil_action_code code, break; } +case OFPUTIL_NXAST_COPY_TTL_IN: +ofpact_put_COPY_TTL_IN(out); +break; + case OFPUTIL_NXAST_COPY_TTL_OUT: ofpact_put_COPY_TTL_OUT(out); break; @@ -806,6 +810,10 @@ ofpact_from_openflow11(const union ofp_action *a, struct ofpbuf *out) return nxm_reg_load_from_openflow12_set_field( (const struct ofp12_action_set_field *)a, out); +case OFPUTIL_OFPAT11_COPY_TTL_IN: +ofpact_put_COPY_TTL_IN(out); +break; + case OFPUTIL_OFPAT11_COPY_TTL_OUT: ofpact_put_COPY_TTL_OUT(out); break; @@ -1170,6 +1178,7 @@ ofpact_check__(const struct ofpact *a, const struct flow *flow, int max_ports, } case OFPACT_DEC_TTL: +case OFPACT_COPY_TTL_IN: case OFPACT_COPY_TTL_OUT: case OFPACT_SET_MPLS_TTL: case OFPACT_DEC_MPLS_TTL: @@ -1424,6 +1433,10 @@ ofpact_to_nxast(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_nxast(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_COPY_TTL_IN: +ofputil_put_NXAST_COPY_TTL_IN(out); +break; + case OFPACT_COPY_TTL_OUT: ofputil_put_NXAST_COPY_TTL_OUT(out); break; @@ -1608,6 +1621,7 @@ ofpact_to_openflow10(const struct ofpact *a, struct ofpbuf *out) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFPACT_COPY_TTL_IN: case OFPACT_COPY_TTL_OUT: case OFPACT_SET_MPLS_TTL: case OFPACT_DEC_MPLS_TTL: @@ -1744,6 +1758,10 @@ ofpact_to_openflow11(const struct ofpact *a, struct ofpbuf *out) ofpact_dec_ttl_to_openflow11(ofpact_get_DEC_TTL(a), out); break; +case OFPACT_COPY_TTL_IN: +ofputil_put_OFPAT11_COPY_TTL_IN(out); +break; + case OFPACT_COPY_TTL_OUT: ofputil_put_OFPAT11_COPY_TTL_OUT(out); break; @@ -1902,6 +1920,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, uint16_t port) case OFPACT_REG_MOVE: case OFPACT_REG_LOAD: case OFPACT_DEC_TTL: +case OFP
[ovs-dev] freight forwarder & logistics provider shared photos with you
Dear My Friend Nice day, Hyun Young is a leading professional freight forwarder and logistics provider who focus on the shipment from South China to all the world. Hyun Young started freight forwarding operation at Shenzhen in 2004. Based at Shenzhen, our ambition have pushed us forward to expand to other cities in south of China. Now we have capacity of handing shipment to or from all the ports in south of China. Holds while whole - heartedly achieves the best enterprise objective, With the great support of our global agency, we provide services to our customers through process-driven operation team, advanced information system, and strong management team. Glance to our company: 1. Sea Freight, included FCL&LCL; 2. Air Freight; 3. Express, included DHL,UPS,FEDEX,SAGAWA and SCOREJP; 4. Import & Export; 5. Land Transportation. We seek no strongest only more specialized, senior. Your satisfied will be our maximal pride. Shenzhen Hyun Young International Transportation CO.,LTD Jacky Yang Add: Floor 7&8, South Bao’an Road, Luohu District, Shenzhen, Guangdong, China. <>___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev