[ovs-dev] [PATCH RFC] ipv4 tcp: Use fine granularity to increase probe_size for tcp pmtu
A couple of month ago, I proposed a fix for over-MTU-sized vxlan packet loss at link[1], neither by fragmenting the tunnelled vxlan packet, nor pushing back PMTU ICMP need fragmented message is accepted by community. The upstream workaround is by adjusting guest mtu smaller or host mtu bigger, or by making virtio driver auto-tuned guest mtu(no consensus by now). Note, gre tunnel also suffer the over-MTU-sized packet loss. While For TCPv4 case, this issue could be solved by using Packetization Layer Path MTU Discovery which is defined as [3] from commit: 5d424d5a674f ("[TCP]: MTU probing"). echo 1 > /proc/sys/net/ipv4/tcp_mtu_probing One drawback of tcp level mtu probing is:The original strategy is double mss_cache for each probe, this is way too aggressive for over-MTU-sized vxlan packet loss issue from the performance result. Also, the probing is characterized by tcp retransmission, which usual taking 6 seconds from the first drop packet to normal connectivity recovery. By incrementing 25% of original mss_cache each time, performance boost from ~1.3Gbits/s(mss_cache 1024Bytes) to ~1.55Gbits/s( mss_cache 1250Bytes), more generic theme could be used there for other tunnel technology. No sure why tcp level mtu probing got disabled by default, any historic known issues or pitfalls? [1]: http://www.spinics.net/lists/netdev/msg306502.html [2]: http://www.ietf.org/rfc/rfc4821.txt Signed-off-by: Fan Du --- net/ipv4/tcp_output.c |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 20ab06b..ab7e46b 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1856,9 +1856,11 @@ static int tcp_mtu_probe(struct sock *sk) tp->rx_opt.num_sacks || tp->rx_opt.dsack) return -1; - /* Very simple search strategy: just double the MSS. */ + /* Very simple search strategy: +* Increment 25% of orignal MSS forward +*/ mss_now = tcp_current_mss(sk); - probe_size = 2 * tp->mss_cache; + probe_size = (tp->mss_cache + (tp->mss_cache >> 2)); size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache; if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high)) { /* TODO: set timer for probe_converge_event */ -- 1.7.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 07/13] datapath: Support VXLAN Group Policy extension
On 02/03/15 at 05:55pm, Jesse Gross wrote: > On Tue, Feb 3, 2015 at 3:18 PM, Jesse Gross wrote: > > It seems like in general userspace should not allow different > > extensions to share the same datapath port. This perhaps isn't much of > > an issue right now but if you enable an extension on one VXLAN port > > (on the userspace side) and have another that is vanilla then we > > shouldn't silently use the extension for everything. > > One other thought on this: > > Is it really right for the kernel to ignore unknown extensions? It > seems like if the user has requested a particular mode and it just > silently doesn't take effect because the kernel doesn't know about it > that's not really great. Agreed to both statements. Are you OK if I fix this in the upstream kernel first and then backport the changes and in the meantime push these changes after addressing Pravin's feedback? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 11/13] datapath: Allow building against 3.19.x
On 02/03/15 at 10:09am, Pravin Shelar wrote: > When I tried this patch series against mainline 3.19-rc7, I got > following error on insert. > > openvswitch: Unknown symbol rpl_vxlan_xmit_skb (err 0) > openvswitch: Unknown symbol rpl_vxlan_sock_release (err 0) > openvswitch: Unknown symbol rpl_vxlan_sock_add (err 0) Good catch. The ifdef logic in compat/vxlan.c was erroneous so the declarations where done properly but the symbols were missing. I've fixed this and verified on 3.19, 3.16 and 3.12. We desperately need CI for kmod on a kernel matrix. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] mcast_snoop: make mcast_fport_bundle generic
On Tuesday, February 03, 2015 04:07:24 PM Ben Pfaff wrote: > On Tue, Jan 20, 2015 at 01:20:56PM -0200, Flavio Leitner wrote: > > The struct mcast_fport_bundle will be used for ports > > forwarding Reports too, so make it generic. > > > > Signed-off-by: Flavio Leitner > > I apologize that it has taken me a long time to review this patch. As > usual, I have been busy; most recently, I was out the last five days to > an out-of-town corporate event where they kept us busy from 7am to 10pm > or later. Hardly had time to breathe, let along review patches. understandable, I just don't want to miss the 2.4 train. > I need a little bit of help understanding this patch. In particular, > the changes remove a 'vlan' parameter from a lot of function prototypes, > and I don't see anything replace it. Was the VLAN somehow unused > previously? I'd like to understand why it was needed before and somehow > not needed now. Yes, it wasn't used at all so I took it out in the cleanup. fbl ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v2] datapath-windows: accommodate to UFID changes
Current flow commands: new, set, get, del need to respond with a NETLINK error in the case OVS_FLOW_ATTR_KEY is missing. OVS_FLOW_ATTR_KEY is now an optional attribute. Also add OVS_FLOW_ATTR_UFID attribute to the kernel for further use. v2 Remove redundant return code Acked-by: Eitan Eliahu Signed-off-by: Alin Gabriel Serdean --- datapath-windows/ovsext/Flow.c | 25 ++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c index d3de8cc..c5954d8 100644 --- a/datapath-windows/ovsext/Flow.c +++ b/datapath-windows/ovsext/Flow.c @@ -95,7 +95,7 @@ static NTSTATUS OvsDoDumpFlows(OvsFlowDumpInput *dumpInput, /* For Parsing attributes in FLOW_* commands */ const NL_POLICY nlFlowPolicy[] = { -[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = FALSE}, +[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_MASK] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_ACTIONS] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_STATS] = {.type = NL_A_UNSPEC, @@ -103,7 +103,9 @@ const NL_POLICY nlFlowPolicy[] = { .maxLen = sizeof(struct ovs_flow_stats), .optional = TRUE}, [OVS_FLOW_ATTR_TCP_FLAGS] = {NL_A_U8, .optional = TRUE}, -[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE} +[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE}, +[OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = TRUE, + .minLen = sizeof(OvsU128) } }; /* For Parsing nested OVS_FLOW_ATTR_KEY attributes. @@ -308,6 +310,11 @@ OvsFlowNlCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + if ((rc = _MapNlToFlowPut(msgIn, nlAttrs[OVS_FLOW_ATTR_KEY], nlAttrs[OVS_FLOW_ATTR_ACTIONS], nlAttrs[OVS_FLOW_ATTR_CLEAR], &mappedFlow)) @@ -420,7 +427,7 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, PNL_MSG_HDR nlMsgOutHdr = NULL; UINT32 attrOffset = NLMSG_HDRLEN + GENL_HDRLEN + OVS_HDRLEN; PNL_ATTR nlAttrs[__OVS_FLOW_ATTR_MAX]; - +NL_ERROR nlError = NL_ERROR_SUCCESS; OvsFlowGetInput getInput; OvsFlowGetOutput getOutput; NL_BUFFER nlBuf; @@ -455,6 +462,11 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + keyAttrOffset = (UINT32)((PCHAR) nlAttrs[OVS_FLOW_ATTR_KEY] - (PCHAR)nlMsgHdr); @@ -532,6 +544,13 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, *replyLen += NlMsgSize(nlMsgOutHdr); done: +if (nlError != NL_ERROR_SUCCESS) { +POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR) +usrParamsCtx->outputBuffer; +NlBuildErrorMsg(msgIn, msgError, nlError); +*replyLen = msgError->nlMsg.nlmsgLen; +} + return rc; } -- 1.9.5.msysgit.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v3] datapath-windows: accommodate to UFID changes
Current flow commands: new, set, get, del need to respond with a NETLINK error in the case OVS_FLOW_ATTR_KEY is missing. OVS_FLOW_ATTR_KEY is now an optional attribute. Also add OVS_FLOW_ATTR_UFID attribute to the kernel for further use. v3 Fix broken commit v2 Remove redundant return code Acked-by: Eitan Eliahu Signed-off-by: Alin Gabriel Serdean --- datapath-windows/ovsext/Flow.c | 25 ++--- datapath-windows/ovsext/Flow.h | 6 ++ 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c index d3de8cc..c5954d8 100644 --- a/datapath-windows/ovsext/Flow.c +++ b/datapath-windows/ovsext/Flow.c @@ -95,7 +95,7 @@ static NTSTATUS OvsDoDumpFlows(OvsFlowDumpInput *dumpInput, /* For Parsing attributes in FLOW_* commands */ const NL_POLICY nlFlowPolicy[] = { -[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = FALSE}, +[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_MASK] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_ACTIONS] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_STATS] = {.type = NL_A_UNSPEC, @@ -103,7 +103,9 @@ const NL_POLICY nlFlowPolicy[] = { .maxLen = sizeof(struct ovs_flow_stats), .optional = TRUE}, [OVS_FLOW_ATTR_TCP_FLAGS] = {NL_A_U8, .optional = TRUE}, -[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE} +[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE}, +[OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = TRUE, + .minLen = sizeof(OvsU128) } }; /* For Parsing nested OVS_FLOW_ATTR_KEY attributes. @@ -308,6 +310,11 @@ OvsFlowNlCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + if ((rc = _MapNlToFlowPut(msgIn, nlAttrs[OVS_FLOW_ATTR_KEY], nlAttrs[OVS_FLOW_ATTR_ACTIONS], nlAttrs[OVS_FLOW_ATTR_CLEAR], &mappedFlow)) @@ -420,7 +427,7 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, PNL_MSG_HDR nlMsgOutHdr = NULL; UINT32 attrOffset = NLMSG_HDRLEN + GENL_HDRLEN + OVS_HDRLEN; PNL_ATTR nlAttrs[__OVS_FLOW_ATTR_MAX]; - +NL_ERROR nlError = NL_ERROR_SUCCESS; OvsFlowGetInput getInput; OvsFlowGetOutput getOutput; NL_BUFFER nlBuf; @@ -455,6 +462,11 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + keyAttrOffset = (UINT32)((PCHAR) nlAttrs[OVS_FLOW_ATTR_KEY] - (PCHAR)nlMsgHdr); @@ -532,6 +544,13 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, *replyLen += NlMsgSize(nlMsgOutHdr); done: +if (nlError != NL_ERROR_SUCCESS) { +POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR) +usrParamsCtx->outputBuffer; +NlBuildErrorMsg(msgIn, msgError, nlError); +*replyLen = msgError->nlMsg.nlmsgLen; +} + return rc; } diff --git a/datapath-windows/ovsext/Flow.h b/datapath-windows/ovsext/Flow.h index e817bcf..02a72de 100644 --- a/datapath-windows/ovsext/Flow.h +++ b/datapath-windows/ovsext/Flow.h @@ -37,6 +37,12 @@ typedef struct _OvsFlow { NL_ATTR actions[1]; } OvsFlow; +typedef union _OvsU128{ +uint32_t u32[4]; +struct { +uint64_t lo, hi; +} u64; +} OvsU128; typedef struct _OvsLayers { UINT32 l3Ofs; // IPv4, IPv6, ARP, or other L3 header. -- 1.9.5.msysgit.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] OVN architecture
I've read through this thread and although there was a lot of discussion about OVN and integration with OpenStack I'd like to ask if there has been any discussion about what considerations might need to be made tomake OVN easy to utilize with Linux Containers/Namespaces? Use of container technologies like LXC/LXD, Docker, CoreOS's Rocket etc all are growing in importance and popularity but networking containers and/or nested containers across multiple hosts still requires a bit of voodoo. Brian ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] OVN architecture
On Wed, Feb 4, 2015 at 6:40 AM, brian mullan wrote: > I've read through this thread and although there was a lot of discussion > about OVN and > integration with OpenStack I'd like to ask if there has been any discussion > about what > considerations might need to be made tomake OVN easy to utilize with Linux > Containers/Namespaces? Inserting an openvswitch interface inside a linux namespace is straightforward. Once you do that and write some information in OVSDB as defined in IntegrationGuide.md, there should be no difference between a VM and a container for OVN. Having said that, different people would like to use Linux containers with different workflow. So coming up with a general purpose agent to integrate with all of them may be a challenge. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] travis: Add 32 bit (-m32) cross-compile build
Inspired by Ben Pfaff's email on 32 bit build environment, this adds a 32bit build to the travis build matrix to catch alignment and padding issues. The 32 bit build is only enabled for non-DPDK builds as DPDK itself is currently not capable to be compiled with -m32. The build also has SSL disabled as the Ubuntu libssl-devel package is not multiarch compatible on the travis build system. Signed-off-by: Thomas Graf --- .travis.yml| 1 + .travis/build.sh | 8 ++-- .travis/prepare.sh | 1 + 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/.travis.yml b/.travis.yml index 1ffd15a..1838bb2 100644 --- a/.travis.yml +++ b/.travis.yml @@ -9,6 +9,7 @@ env: - OPTS="--disable-ssl" - TESTSUITE=1 KERNEL=3.18.1 - TESTSUITE=1 OPTS="--enable-shared" + - BUILD_ENV="-m32" OPTS="--disable-ssl" - KERNEL=3.17.7 DPDK=1 - KERNEL=3.17.7 DPDK=1 OPTS="--enable-shared" - KERNEL=3.17.7 diff --git a/.travis/build.sh b/.travis/build.sh index 3570992..a8a515b 100755 --- a/.travis/build.sh +++ b/.travis/build.sh @@ -4,6 +4,7 @@ set -o errexit KERNELSRC="" CFLAGS="-Werror" +SPARSE_FLAGS="" EXTRA_OPTS="" function install_kernel() @@ -74,7 +75,7 @@ if [ "$DPDK" ]; then EXTRA_OPTS+="--with-dpdk=./dpdk-$DPDK_VER/build" elif [ $CC != "clang" ]; then # DPDK headers currently trigger sparse errors -CFLAGS="$CFLAGS -Wsparse-error" +SPARSE_FLAGS="$SPARSE_FLAGS -Wsparse-error" fi configure_ovs $EXTRA_OPTS $* @@ -86,8 +87,11 @@ fi if [ $CC = "clang" ]; then make CFLAGS="$CFLAGS -Wno-error=unused-command-line-argument" +elif [[ $BUILD_ENV =~ "-m32" ]]; then +# Disable sparse for 32bit builds on 64bit machine +make CFLAGS="$CFLAGS $BUILD_ENV" else -make CFLAGS="$CFLAGS" C=1 +make CFLAGS="$CFLAGS $BUILD_ENV $SPARSE_FLAGS" C=1 fi if [ $TESTSUITE ] && [ $CC != "clang" ]; then diff --git a/.travis/prepare.sh b/.travis/prepare.sh index f8bd0a1..a78282b 100755 --- a/.travis/prepare.sh +++ b/.travis/prepare.sh @@ -2,6 +2,7 @@ sudo apt-get update -qq sudo apt-get install -qq libssl-dev llvm-dev +sudo apt-get install -qq gcc-multilib git clone git://git.kernel.org/pub/scm/devel/sparse/chrisl/sparse.git cd sparse && make && sudo make install PREFIX=/usr && cd .. -- 1.9.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 0/6 v4] Datapath sync & VXLAN-GBP
v4: - Pushed already ACKed patches if no dependencies remained - Addressed Pravin's feedback: - Fixed missing symbols if VXLAN compat stack is needed - Only set TUNNEL_VXLAN_OPT if VXLAN metadata was found in the packet - Addressed Ben's feedback (Fixed by Madhu Challa): - Fixed manual page formatting for html/pdf output - Fixed hex formatting for tun_gbp_flags - Fixed padding of flow_tnl for 32 bit and fixed wrong byte hole comment Note: Jesse raised an excellent point WRT to shared datapath ports which should be fixed, it was agreed to fix it in the upstream kernel first and then backport the fix. v3: - Adressed Pravin's feedback: - Use #ifdef to check for skb_vlan_tag_present instead of autoconf - Include base don USE_KERNEL_TUNNEL_API instead of separate autoconf check - Removed now used HAVE_VXLAN_XMIT_SKB_XNET_ARG check - New patch to account for xnet attribtue in upstream vxlan_xmit_skb() to keep code more in sync - New patch to account for new vxflags and removed vxlan_sock attribute to vxlan_xmit_skb() as changed in net-next in by Tom. v2: - Documentation in vswitch.xml and section in ovs-ofctl(8) - New test in ovs-ofctl.at to check OXM/NXM availability This series accounts for various upstream kernel changes and then introduces VXLAN-GBP: Introduces support for the group policy extension to the VXLAN virtual port. The extension is disabled by default and only enabled if the user has provided the respective configuration. ovs-vsctl add-port br0 vxlan0 -- \ set Interface vxlan0 type=vxlan options:exts=gbp The configuration interface to enable the extension is based on a new attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION which can carry additional extensions as needed in the future. The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) internally just like Geneve options but transported as nested Netlink attributes to user space. Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. Madhu Challa (1): ofproto: Add NXM_NX_TUN_GBP_ID and NXM_NX_TUN_GBP_FLAGS Thomas Graf (5): datapath: Fix missing symbols when required to use own VXLAN stack datapath: Support VXLAN Group Policy extension datapath: Account for "vxlan: Eliminate dependency on UDP socket in transmit path" datapath: Allow building against 3.19.x tunnel: Provide framework for tunnel extensions for VXLAN-GBP and others FAQ.md| 1 + NEWS | 3 +- acinclude.m4 | 4 +- datapath/Modules.mk | 3 +- datapath/flow_netlink.c | 114 +++--- datapath/linux/compat/gso.c | 7 +- datapath/linux/compat/gso.h | 5 +- datapath/linux/compat/include/linux/openvswitch.h | 10 ++ datapath/linux/compat/include/net/ip_tunnels.h| 16 ++- datapath/linux/compat/include/net/vxlan.h | 14 ++- datapath/linux/compat/vxlan.c | 17 ++-- datapath/vport-geneve.c | 6 +- datapath/vport-vxlan.c| 84 +++- datapath/vport-vxlan.h| 11 +++ lib/dpif-netlink.c| 20 +++- lib/flow.c| 16 +-- lib/flow.h| 6 +- lib/match.c | 37 ++- lib/match.h | 4 + lib/meta-flow.c | 36 +++ lib/meta-flow.h | 28 ++ lib/netdev-vport.c| 18 lib/netdev.h | 2 + lib/nx-match.c| 6 +- lib/odp-util.c| 36 ++- lib/odp-util.h| 3 +- lib/ofp-print.c | 9 ++ lib/ofp-util.c| 10 +- lib/packets.h | 3 + ofproto/ofproto-dpif-xlate.c | 2 +- tests/odp.at | 16 +-- tests/ofproto.at | 6 +- tests/ovs-ofctl.at| 4 + tests/tunnel.at | 82 utilities/ovs-ofctl.8.in | 34 +++ vswitchd/vswitch.xml | 20 36 files changed, 586 insertions(+), 107 deletions(-) crea
[ovs-dev] [PATCH 1/6] datapath: Fix missing symbols when required to use own VXLAN stack
Fixes an insufficient ifdef in compat/vxlan.c which caused required symbols not to be included in the build. The declarations were properly enabled so the build would succeed but the module would spit missing symbols when being inserted. The fix uses a new define USE_UPSTREAM_VXLAN which is set in the compat header as required. This centralizes the decision when to include VXLAN compat code to a single place which eases further changes. Reported-by: Pravin Shelar Signed-off-by: Thomas Graf --- datapath/linux/compat/gso.c | 7 --- datapath/linux/compat/gso.h | 5 +++-- datapath/linux/compat/include/net/vxlan.h | 2 ++ datapath/linux/compat/vxlan.c | 7 +-- 4 files changed, 14 insertions(+), 7 deletions(-) diff --git a/datapath/linux/compat/gso.c b/datapath/linux/compat/gso.c index 13e8f4e..b20ad8a 100644 --- a/datapath/linux/compat/gso.c +++ b/datapath/linux/compat/gso.c @@ -17,7 +17,6 @@ */ #include -#if LINUX_VERSION_CODE < KERNEL_VERSION(3,16,0) #include #include @@ -53,6 +52,7 @@ MODULE_PARM_DESC(vlan_tso, "Enable TSO for VLAN packets"); #define vlan_tso true #endif +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,16,0) static bool dev_supports_vlan_tx(struct net_device *dev) { #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,37) @@ -295,7 +295,8 @@ int rpl_ip_local_out(struct sk_buff *skb) } #endif /* 3.16 */ -#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0) +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0) || \ + !defined USE_UPSTREAM_VXLAN struct sk_buff *ovs_iptunnel_handle_offloads(struct sk_buff *skb, bool csum_help, void (*fix_segment)(struct sk_buff *)) @@ -344,4 +345,4 @@ error: kfree_skb(skb); return ERR_PTR(err); } -#endif /* 3.12 */ +#endif /* 3.12 || !USE_UPSTREAM_VXLAN */ diff --git a/datapath/linux/compat/gso.h b/datapath/linux/compat/gso.h index 1009892..023d6d3 100644 --- a/datapath/linux/compat/gso.h +++ b/datapath/linux/compat/gso.h @@ -2,7 +2,8 @@ #define __LINUX_GSO_WRAPPER_H #include -#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0) +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,12,0) || \ + !defined USE_UPSTREAM_VXLAN #include #include @@ -71,7 +72,7 @@ struct sk_buff *ovs_iptunnel_handle_offloads(struct sk_buff *skb, gso_fix_segment_t fix_segment); -#endif /* 3.12 */ +#endif /* 3.12 || !USE_UPSTREAM_VXLAN */ #if LINUX_VERSION_CODE < KERNEL_VERSION(3,16,0) #define ip_local_out rpl_ip_local_out diff --git a/datapath/linux/compat/include/net/vxlan.h b/datapath/linux/compat/include/net/vxlan.h index 643f837..478d49d 100644 --- a/datapath/linux/compat/include/net/vxlan.h +++ b/datapath/linux/compat/include/net/vxlan.h @@ -78,6 +78,8 @@ struct vxlanhdr_gbp { #endif #ifdef HAVE_VXLAN_METADATA +#define USE_UPSTREAM_VXLAN + static inline int rpl_vxlan_xmit_skb(struct vxlan_sock *vs, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, diff --git a/datapath/linux/compat/vxlan.c b/datapath/linux/compat/vxlan.c index dbccd3c..9342a40 100644 --- a/datapath/linux/compat/vxlan.c +++ b/datapath/linux/compat/vxlan.c @@ -18,6 +18,8 @@ * This code is derived from kernel vxlan module. */ +#ifndef USE_UPSTREAM_VXLAN + #include #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt @@ -58,13 +60,14 @@ #include "datapath.h" #include "gso.h" #include "vlan.h" -#ifndef USE_KERNEL_TUNNEL_API +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0) /* VXLAN protocol header */ struct vxlanhdr { __be32 vx_flags; __be32 vx_vni; }; +#endif /* Callback from net/ipv4/udp.c to receive packets */ static int vxlan_udp_encap_recv(struct sock *sk, struct sk_buff *skb) @@ -369,4 +372,4 @@ void vxlan_sock_release(struct vxlan_sock *vs) queue_work(system_wq, &vs->del_work); } -#endif /* 3.12 */ +#endif /* !USE_UPSTREAM_VXLAN */ -- 1.9.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 2/6] datapath: Support VXLAN Group Policy extension
Upstream commit: openvswitch: Support VXLAN Group Policy extension Introduces support for the group policy extension to the VXLAN virtual port. The extension is disabled by default and only enabled if the user has provided the respective configuration. ovs-vsctl add-port br0 vxlan0 -- \ set Interface vxlan0 type=vxlan options:exts=gbp The configuration interface to enable the extension is based on a new attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION which can carry additional extensions as needed in the future. The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) internally just like Geneve options but transported as nested Netlink attributes to user space. Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller Upstream: 1dd144 ("openvswitch: Support VXLAN Group Policy extension") Signed-off-by: Thomas Graf --- datapath/Modules.mk | 3 +- datapath/flow_netlink.c | 114 +++--- datapath/linux/compat/include/linux/openvswitch.h | 10 ++ datapath/linux/compat/include/net/ip_tunnels.h| 16 ++- datapath/vport-geneve.c | 6 +- datapath/vport-vxlan.c| 82 +++- datapath/vport-vxlan.h| 11 +++ 7 files changed, 223 insertions(+), 19 deletions(-) create mode 100644 datapath/vport-vxlan.h diff --git a/datapath/Modules.mk b/datapath/Modules.mk index cca4887..33f9dd9 100644 --- a/datapath/Modules.mk +++ b/datapath/Modules.mk @@ -30,7 +30,8 @@ openvswitch_headers = \ vlan.h \ vport.h \ vport-internal_dev.h \ - vport-netdev.h + vport-netdev.h \ + vport-vxlan.h openvswitch_extras = \ README.md diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c index 6cd5391..34edcfe 100644 --- a/datapath/flow_netlink.c +++ b/datapath/flow_netlink.c @@ -49,6 +49,7 @@ #include "datapath.h" #include "flow.h" #include "flow_netlink.h" +#include "vport-vxlan.h" struct ovs_len_tbl { int len; @@ -268,6 +269,9 @@ size_t ovs_tun_key_attr_size(void) + nla_total_size(0)/* OVS_TUNNEL_KEY_ATTR_CSUM */ + nla_total_size(0)/* OVS_TUNNEL_KEY_ATTR_OAM */ + nla_total_size(256) /* OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS */ + /* OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS is mutually exclusive with +* OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS and covered by it. +*/ + nla_total_size(2)/* OVS_TUNNEL_KEY_ATTR_TP_SRC */ + nla_total_size(2); /* OVS_TUNNEL_KEY_ATTR_TP_DST */ } @@ -308,6 +312,7 @@ static const struct ovs_len_tbl ovs_tunnel_key_lens[OVS_TUNNEL_KEY_ATTR_MAX + 1] [OVS_TUNNEL_KEY_ATTR_TP_DST]= { .len = sizeof(u16) }, [OVS_TUNNEL_KEY_ATTR_OAM] = { .len = 0 }, [OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS] = { .len = OVS_ATTR_NESTED }, + [OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS]= { .len = OVS_ATTR_NESTED }, }; /* The size of the argument for each %OVS_KEY_ATTR_* Netlink attribute. */ @@ -459,6 +464,41 @@ static int genev_tun_opt_from_nlattr(const struct nlattr *a, return 0; } +static const struct nla_policy vxlan_opt_policy[OVS_VXLAN_EXT_MAX + 1] = { + [OVS_VXLAN_EXT_GBP] = { .type = NLA_U32 }, +}; + +static int vxlan_tun_opt_from_nlattr(const struct nlattr *a, +struct sw_flow_match *match, bool is_mask, +bool log) +{ + struct nlattr *tb[OVS_VXLAN_EXT_MAX+1]; + unsigned long opt_key_offset; + struct ovs_vxlan_opts opts; + int err; + + BUILD_BUG_ON(sizeof(opts) > sizeof(match->key->tun_opts)); + + err = nla_parse_nested(tb, OVS_VXLAN_EXT_MAX, a, vxlan_opt_policy); + if (err < 0) + return err; + + memset(&opts, 0, sizeof(opts)); + + if (tb[OVS_VXLAN_EXT_GBP]) + opts.gbp = nla_get_u32(tb[OVS_VXLAN_EXT_GBP]); + + if (!is_mask) + SW_FLOW_KEY_PUT(match, tun_opts_len, sizeof(opts), false); + else + SW_FLOW_KEY_PUT(match, tun_opts_len, 0xff, true); + + opt_key_offset = TUN_METADATA_OFFSET(sizeof(opts)); + SW_FLOW_KEY_MEMCPY_OFFSET(match, opt_key_offset, &opts, sizeof(opts), + is_mask); + return 0; +} + static int ipv4_tun_from_nlattr(const struct nlattr *attr, struct sw_flow_match *match, bool is_mask, bool log) @@ -467,6 +507,7 @@ static in
[ovs-dev] [PATCH 3/6] datapath: Account for "vxlan: Eliminate dependency on UDP socket in transmit path"
Excludes VXLAN_F_REMCSUM_TX bits as OVS currently doesn't support it. Upstream commit: vxlan: Eliminate dependency on UDP socket in transmit path In the vxlan transmit path there is no need to reference the socket for a tunnel which is needed for the receive side. We do, however, need the vxlan_dev flags. This patch eliminate references to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the vxlan_sock flags. Signed-off-by: Tom Herbert Signed-off-by: David S. Miller Upstream: af33c1adae1e ("vxlan: Eliminate dependency on UDP socket in transmit path") Signed-off-by: Thomas Graf Acked-by: Pravin B Shelar --- datapath/linux/compat/include/net/vxlan.h | 12 datapath/linux/compat/vxlan.c | 10 +- datapath/vport-vxlan.c| 2 +- 3 files changed, 14 insertions(+), 10 deletions(-) diff --git a/datapath/linux/compat/include/net/vxlan.h b/datapath/linux/compat/include/net/vxlan.h index 478d49d..d30de31 100644 --- a/datapath/linux/compat/include/net/vxlan.h +++ b/datapath/linux/compat/include/net/vxlan.h @@ -77,6 +77,10 @@ struct vxlanhdr_gbp { #define VXLAN_F_GBP0x800 #endif +#ifndef VXLAN_F_RCV_FLAGS +#define VXLAN_F_RCV_FLAGS VXLAN_F_GBP +#endif + #ifdef HAVE_VXLAN_METADATA #define USE_UPSTREAM_VXLAN @@ -84,15 +88,15 @@ static inline int rpl_vxlan_xmit_skb(struct vxlan_sock *vs, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - struct vxlan_metadata *md, bool xnet) + struct vxlan_metadata *md, bool xnet, u32 vxflags) { if (skb_is_gso(skb) && skb_is_encapsulated(skb)) { kfree_skb(skb); return -ENOSYS; } - return vxlan_xmit_skb(vs, rt, skb, src, dst, tos, ttl, df, - src_port, dst_port, md, xnet); + return vxlan_xmit_skb(rt, skb, src, dst, tos, ttl, df, + src_port, dst_port, md, xnet, vxflags); } #define vxlan_xmit_skb rpl_vxlan_xmit_skb @@ -134,7 +138,7 @@ int vxlan_xmit_skb(struct vxlan_sock *vs, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - struct vxlan_metadata *md, bool xnet); + struct vxlan_metadata *md, bool xnet, u32 vxflags); #define vxlan_src_port rpl_vxlan_src_port __be16 vxlan_src_port(__u16 port_min, __u16 port_max, struct sk_buff *skb); diff --git a/datapath/linux/compat/vxlan.c b/datapath/linux/compat/vxlan.c index 9342a40..9d70611 100644 --- a/datapath/linux/compat/vxlan.c +++ b/datapath/linux/compat/vxlan.c @@ -209,7 +209,7 @@ static struct sk_buff *handle_offloads(struct sk_buff *skb) return ovs_iptunnel_handle_offloads(skb, false, vxlan_gso); } -static void vxlan_build_gbp_hdr(struct vxlanhdr *vxh, struct vxlan_sock *vs, +static void vxlan_build_gbp_hdr(struct vxlanhdr *vxh, u32 vxflags, struct vxlan_metadata *md) { struct vxlanhdr_gbp *gbp; @@ -230,7 +230,7 @@ int vxlan_xmit_skb(struct vxlan_sock *vs, struct rtable *rt, struct sk_buff *skb, __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df, __be16 src_port, __be16 dst_port, - struct vxlan_metadata *md, bool xnet) + struct vxlan_metadata *md, bool xnet, u32 vxflags) { struct vxlanhdr *vxh; struct udphdr *uh; @@ -263,8 +263,8 @@ int vxlan_xmit_skb(struct vxlan_sock *vs, vxh->vx_flags = htonl(VXLAN_HF_VNI); vxh->vx_vni = md->vni; - if (vs->flags & VXLAN_F_GBP) - vxlan_build_gbp_hdr(vxh, vs, md); + if (vxflags & VXLAN_F_GBP) + vxlan_build_gbp_hdr(vxh, vxflags, md); __skb_push(skb, sizeof(*uh)); skb_reset_transport_header(skb); @@ -344,7 +344,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port, } vs->rcv = rcv; vs->data = data; - vs->flags = flags; + vs->flags = (flags & VXLAN_F_RCV_FLAGS); /* Disable multicast loopback */ inet_sk(sk)->mc_loop = 0; diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c index 3dcd69c..7fcb88a 100644 --- a/datapath/vport-vxlan.c +++ b/datapath/vport-vxlan.c @@ -259,7 +259,7 @@ static int vxlan_tnl_send(struct vport *vport, struct sk_buff *skb) tun_key->ipv4_tos, tun_key->ipv4_ttl, df, src_port, dst_port, -&md, false); +
[ovs-dev] [PATCH 4/6] datapath: Allow building against 3.19.x
Signed-off-by: Thomas Graf --- FAQ.md | 1 + NEWS | 2 +- acinclude.m4 | 4 ++-- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/FAQ.md b/FAQ.md index 02fe2d3..2de4936 100644 --- a/FAQ.md +++ b/FAQ.md @@ -147,6 +147,7 @@ A: The following table lists the Linux kernel versions against which the |2.0.x | 2.6.32 to 3.10 |2.1.x | 2.6.32 to 3.11 |2.3.x | 2.6.32 to 3.14 +|2.4.x | 2.6.32 to 3.19 Open vSwitch userspace should also work with the Linux kernel module built into Linux 3.3 and later. diff --git a/NEWS b/NEWS index 38a6fdf..adfa1d1 100644 --- a/NEWS +++ b/NEWS @@ -57,7 +57,7 @@ Post-v2.3.0 - Added support for DPDK Tunneling. VXLAN and GRE are supported protocols. This is generic tunneling mechanism for userspace datapath. - Support for multicast snooping (IGMPv1 and IGMPv2) - - Support for Linux kernels up to 3.18.x + - Support for Linux kernels up to 3.19.x - The documentation now use the term 'destination' to mean one of syslog, console or file for vlog logging instead of the previously used term 'facility'. diff --git a/acinclude.m4 b/acinclude.m4 index cb3e148..c2f45ce 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -134,10 +134,10 @@ AC_DEFUN([OVS_CHECK_LINUX], [ AC_MSG_RESULT([$kversion]) if test "$version" -ge 3; then - if test "$version" = 3 && test "$patchlevel" -le 18; then + if test "$version" = 3 && test "$patchlevel" -le 19; then : # Linux 3.x else - AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version newer than 3.18.x is not supported (please refer to the FAQ for advice)]) + AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version newer than 3.19.x is not supported (please refer to the FAQ for advice)]) fi else if test "$version" -le 1 || test "$patchlevel" -le 5 || test "$sublevel" -le 31; then -- 1.9.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 6/6] ofproto: Add NXM_NX_TUN_GBP_ID and NXM_NX_TUN_GBP_FLAGS
From: Madhu Challa Introduces two new NXMs to represent VXLAN-GBP [0] fields. actions=load:0x10->NXM_NX_TUN_GBP_ID[],NORMAL tun_gbp_id=0x10,actions=drop This enables existing VXLAN tunnels to carry security label information such as a SELinux context to other network peers. The values are carried to/from the datapath using the attribute OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS. [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy-00 Signed-off-by: Madhu Challa --- NEWS | 1 + lib/flow.c | 16 + lib/flow.h | 6 ++-- lib/match.c | 37 +++- lib/match.h | 4 +++ lib/meta-flow.c | 36 +++ lib/meta-flow.h | 28 +++ lib/nx-match.c | 6 +++- lib/odp-util.c | 36 ++- lib/odp-util.h | 3 +- lib/ofp-print.c | 9 + lib/ofp-util.c | 10 +- lib/packets.h| 3 ++ ofproto/ofproto-dpif-xlate.c | 2 +- tests/odp.at | 16 - tests/ofproto.at | 6 ++-- tests/ovs-ofctl.at | 4 +++ tests/tunnel.at | 82 ++-- utilities/ovs-ofctl.8.in | 34 ++ 19 files changed, 274 insertions(+), 65 deletions(-) diff --git a/NEWS b/NEWS index adfa1d1..3ab39ac 100644 --- a/NEWS +++ b/NEWS @@ -61,6 +61,7 @@ Post-v2.3.0 - The documentation now use the term 'destination' to mean one of syslog, console or file for vlog logging instead of the previously used term 'facility'. + - Support for VXLAN Group Policy extension v2.3.0 - 14 Aug 2014 diff --git a/lib/flow.c b/lib/flow.c index b0cb71d..81b36f9 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -119,7 +119,7 @@ struct mf_ctx { * away. Some GCC versions gave warnings on ALWAYS_INLINE, so these are * defined as macros. */ -#if (FLOW_WC_SEQ != 30) +#if (FLOW_WC_SEQ != 31) #define MINIFLOW_ASSERT(X) ovs_assert(X) BUILD_MESSAGE("FLOW_WC_SEQ changed: miniflow_extract() will have runtime " "assertions enabled. Consider updating FLOW_WC_SEQ after " @@ -765,13 +765,15 @@ flow_unwildcard_tp_ports(const struct flow *flow, struct flow_wildcards *wc) void flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd) { -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); fmd->dp_hash = flow->dp_hash; fmd->recirc_id = flow->recirc_id; fmd->tun_id = flow->tunnel.tun_id; fmd->tun_src = flow->tunnel.ip_src; fmd->tun_dst = flow->tunnel.ip_dst; +fmd->gbp_id = flow->tunnel.gbp_id; +fmd->gbp_flags = flow->tunnel.gbp_flags; fmd->metadata = flow->metadata; memcpy(fmd->regs, flow->regs, sizeof fmd->regs); fmd->pkt_mark = flow->pkt_mark; @@ -912,7 +914,7 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc, memset(&wc->masks, 0x0, sizeof wc->masks); /* Update this function whenever struct flow changes. */ -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); if (flow->tunnel.ip_dst) { if (flow->tunnel.flags & FLOW_TNL_F_KEY) { @@ -925,6 +927,8 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc, WC_MASK_FIELD(wc, tunnel.ip_ttl); WC_MASK_FIELD(wc, tunnel.tp_src); WC_MASK_FIELD(wc, tunnel.tp_dst); +WC_MASK_FIELD(wc, tunnel.gbp_id); +WC_MASK_FIELD(wc, tunnel.gbp_flags); } else if (flow->tunnel.tun_id) { WC_MASK_FIELD(wc, tunnel.tun_id); } @@ -1009,7 +1013,7 @@ uint64_t flow_wc_map(const struct flow *flow) { /* Update this function whenever struct flow changes. */ -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); uint64_t map = (flow->tunnel.ip_dst) ? MINIFLOW_MAP(tunnel) : 0; @@ -1061,7 +1065,7 @@ void flow_wildcards_clear_non_packet_fields(struct flow_wildcards *wc) { /* Update this function whenever struct flow changes. */ -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); memset(&wc->masks.metadata, 0, sizeof wc->masks.metadata); memset(&wc->masks.regs, 0, sizeof wc->masks.regs); @@ -1620,7 +1624,7 @@ flow_push_mpls(struct flow *flow, int n, ovs_be16 mpls_eth_type, flow->mpls_lse[0] = set_mpls_lse_values(ttl, tc, 1, htonl(label)); /* Clear all L3 and L4 fields and dp_hash. */ -BUILD_ASSERT(FLOW_WC_SEQ == 30); +BUILD_ASSERT(FLOW_WC_SEQ == 31); memset((char *) flow + FLOW_SEGMENT_2_ENDS_AT, 0, sizeof(struct flow) - FLOW_SEGMENT_2_ENDS_AT); flow->dp_hash = 0; diff --git a/lib/flow.h b/lib/flow.h index dd989ee..f503097 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -38,7 +38,7 @@ struct pkt_metadata; /* This sequence number should be incremented whenever anything
[ovs-dev] [PATCH 5/6] tunnel: Provide framework for tunnel extensions for VXLAN-GBP and others
Supports a new "exts" field in the tunnel configuration which takes a comma separated list of enabled extensions. The only extension supported so far is GBP but this can be used to enable RCO and possibly others as soon as the OVS datapath supports them. Signed-off-by: Thomas Graf Acked-by: Ben Pfaff --- lib/dpif-netlink.c | 20 +--- lib/netdev-vport.c | 18 ++ lib/netdev.h | 2 ++ vswitchd/vswitch.xml | 20 4 files changed, 57 insertions(+), 3 deletions(-) diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 1d8abdf..337ebd6 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -849,10 +849,24 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev, } tnl_cfg = netdev_get_tunnel_config(netdev); -if (tnl_cfg && tnl_cfg->dst_port != 0) { +if (tnl_cfg && (tnl_cfg->dst_port != 0 || tnl_cfg->exts)) { ofpbuf_use_stack(&options, options_stub, sizeof options_stub); -nl_msg_put_u16(&options, OVS_TUNNEL_ATTR_DST_PORT, - ntohs(tnl_cfg->dst_port)); +if (tnl_cfg->dst_port) { +nl_msg_put_u16(&options, OVS_TUNNEL_ATTR_DST_PORT, + ntohs(tnl_cfg->dst_port)); +} +if (tnl_cfg->exts) { +size_t ext_ofs; +int i; + +ext_ofs = nl_msg_start_nested(&options, OVS_TUNNEL_ATTR_EXTENSION); +for (i = 0; i < 32; i++) { +if (tnl_cfg->exts & (1 << i)) { +nl_msg_put_flag(&options, i); +} +} +nl_msg_end_nested(&options, ext_ofs); +} request.options = ofpbuf_data(&options); request.options_len = ofpbuf_size(&options); } diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index 91acabb..9d02f2f 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -532,6 +532,24 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args) !strcmp(node->key, "in_key") || !strcmp(node->key, "out_key")) { /* Handled separately below. */ +} else if (!strcmp(node->key, "exts")) { +char *str = xstrdup(node->value); +char *ext, *save_ptr = NULL; + +tnl_cfg.exts = 0; + +ext = strtok_r(str, ",", &save_ptr); +while (ext) { +if (!strcmp(type, "vxlan") && !strcmp(ext, "gbp")) { +tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GBP); +} else { +VLOG_WARN("%s: unknown extension '%s'", name, ext); +} + +ext = strtok_r(NULL, ",", &save_ptr); +} + +free(str); } else { VLOG_WARN("%s: unknown %s argument '%s'", name, type, node->key); } diff --git a/lib/netdev.h b/lib/netdev.h index c5aa17d..33f301a 100644 --- a/lib/netdev.h +++ b/lib/netdev.h @@ -120,6 +120,8 @@ struct netdev_tunnel_config { ovs_be32 ip_src; ovs_be32 ip_dst; +uint32_t exts; + uint8_t ttl; bool ttl_inherit; diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 37a33a6..2c8dca8 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -1897,6 +1897,26 @@ to false to disable. + + + +Optional. Comma separated list of optional VXLAN extensions to + enable. The following extensions are supported: + + + +gbp: VXLAN-GBP allows to transport the group policy +context of a packet across the VXLAN tunnel to other network +peers. See the field description of tun_gbp_id and +tun_gbp_flags in ovs-ofctl(8) for additional +information. + (https://tools.ietf.org/html/draft-smith-vxlan-group-policy) + + + + + + Only gre and ipsec_gre interfaces support -- 1.9.3 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] travis: Add 32 bit (-m32) cross-compile build
On Wed, Feb 04, 2015 at 04:10:59PM +0100, Thomas Graf wrote: > Inspired by Ben Pfaff's email on 32 bit build environment, this adds a > 32bit build to the travis build matrix to catch alignment and padding > issues. > > The 32 bit build is only enabled for non-DPDK builds as DPDK itself is > currently not capable to be compiled with -m32. > > The build also has SSL disabled as the Ubuntu libssl-devel package is > not multiarch compatible on the travis build system. > > Signed-off-by: Thomas Graf Too bad about sparse. Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] mcast_snoop: make mcast_fport_bundle generic
On Wed, Feb 04, 2015 at 11:07:05AM -0200, Flavio Leitner wrote: > On Tuesday, February 03, 2015 04:07:24 PM Ben Pfaff wrote: > > On Tue, Jan 20, 2015 at 01:20:56PM -0200, Flavio Leitner wrote: > > > The struct mcast_fport_bundle will be used for ports > > > forwarding Reports too, so make it generic. > > > > > > Signed-off-by: Flavio Leitner > > > > I apologize that it has taken me a long time to review this patch. As > > usual, I have been busy; most recently, I was out the last five days to > > an out-of-town corporate event where they kept us busy from 7am to 10pm > > or later. Hardly had time to breathe, let along review patches. > > understandable, I just don't want to miss the 2.4 train. > > > I need a little bit of help understanding this patch. In particular, > > the changes remove a 'vlan' parameter from a lot of function prototypes, > > and I don't see anything replace it. Was the VLAN somehow unused > > previously? I'd like to understand why it was needed before and somehow > > not needed now. > > Yes, it wasn't used at all so I took it out in the cleanup. Oh, that makes sense. I'll take another quick look then. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] mcast_snoop: make mcast_fport_bundle generic
On Wed, Feb 04, 2015 at 09:57:34AM -0800, Ben Pfaff wrote: > On Wed, Feb 04, 2015 at 11:07:05AM -0200, Flavio Leitner wrote: > > On Tuesday, February 03, 2015 04:07:24 PM Ben Pfaff wrote: > > > On Tue, Jan 20, 2015 at 01:20:56PM -0200, Flavio Leitner wrote: > > > > The struct mcast_fport_bundle will be used for ports > > > > forwarding Reports too, so make it generic. > > > > > > > > Signed-off-by: Flavio Leitner > > > > > > I apologize that it has taken me a long time to review this patch. As > > > usual, I have been busy; most recently, I was out the last five days to > > > an out-of-town corporate event where they kept us busy from 7am to 10pm > > > or later. Hardly had time to breathe, let along review patches. > > > > understandable, I just don't want to miss the 2.4 train. > > > > > I need a little bit of help understanding this patch. In particular, > > > the changes remove a 'vlan' parameter from a lot of function prototypes, > > > and I don't see anything replace it. Was the VLAN somehow unused > > > previously? I'd like to understand why it was needed before and somehow > > > not needed now. > > > > Yes, it wasn't used at all so I took it out in the cleanup. > > Oh, that makes sense. I'll take another quick look then. Both patches applied, thank you! ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
On Fri, Jan 23, 2015 at 9:33 AM, Gurucharan Shetty wrote: > On Thu, Jan 22, 2015 at 9:19 PM, Ben Pfaff wrote: >> The travis autobuilder has been pretty awesome for GNU/Linux continuous >> integration. I'd like to have autobuilders for other platforms too. >> After a few minutes searching, I see that appveyor.com offers a similar >> service, with similar Github integration, for Windows builds. If anyone >> has time and inclination to integrate this with OVS, I'd appreciate it. > I will spend some time. I did spend some time on this and was able to compile OVS userspace successfully on Windows without openssl support. It is slow because the free version provides only one CPU. I will be spending some time on openssl support and then will send the patches needed for integration. > > >> ___ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2] datapath-windows: accommodate to UFID changes
hi Alin, Thanks for the patch. From my reading, it seems that you are attempting to add support in the kernel datapath to not handle the UFID attribute gracefully, even though there’s no support for it in the kernel datapath. It would be useful to add a comment in the code to this effect that UFID is not actually supported. Also, IIRC, the userspace code probes the kernel datapath figure out if UFID is supported by adding a flow and seeing if we get a INVALID error code. I see that, you are checking if the KEY attribute is present, and if not, you are returning NL_ERROR_INVAL. This is fine. I was wondering if you got a chance to see if nl_transact() in userspace is able to handle this error code and propagate it up to the dpif-netlink.c code? Also, do you know if there’s a possibility of KEY != NULL && UFID != NULL? If this is the case, then we should return NL_ERROR_INVAL since userspace should not assume that kernel datapath was able to handle UFID != NULL. I had a comment in OvsFlowNlCmdHandler() as indicated below. Looks good otherwise. I should be able to ACK the next version. > /* For Parsing nested OVS_FLOW_ATTR_KEY attributes. > @@ -308,6 +310,11 @@ OvsFlowNlCmdHandler(POVS_USER_PARAMS_CONTEXT > usrParamsCtx, >goto done; > } > > +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { > +nlError = NL_ERROR_INVAL; > +goto done; > +} Looking at the code, it seems that a combination of genlMsgHdr->cmd == OVS_FLOW_CMD_DEL && !(nlAttrs[OVS_FLOW_ATTR_KEY]), will result in flow flush. I guess this logic needs to be updated, to not flush the flows if cmd == OVS_FLOW_CMD_DEL, and nlAttrs[OVS_FLOW_ATTR_KEY] == NULL && nlAttrs[OVS_FLOW_ATTR_UFID] != NULL. /* FLOW_DEL command w/o any key input is a flush case. */ if ((genlMsgHdr->cmd == OVS_FLOW_CMD_DEL) && (!(nlAttrs[OVS_FLOW_ATTR_KEY]) && !(nlAttrs[OVS_FLOW_ATTR_UFID])) { } thanks, -- Nithin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
> On Feb 4, 2015, at 11:42 AM, Gurucharan Shetty wrote: > > On Fri, Jan 23, 2015 at 9:33 AM, Gurucharan Shetty wrote: >> On Thu, Jan 22, 2015 at 9:19 PM, Ben Pfaff wrote: >>> The travis autobuilder has been pretty awesome for GNU/Linux continuous >>> integration. I'd like to have autobuilders for other platforms too. >>> After a few minutes searching, I see that appveyor.com offers a similar >>> service, with similar Github integration, for Windows builds. If anyone >>> has time and inclination to integrate this with OVS, I'd appreciate it. >> I will spend some time. > I did spend some time on this and was able to compile OVS userspace > successfully on Windows without openssl support. It is slow because > the free version provides only one CPU. > I will be spending some time on openssl support and then will send the > patches needed for integration. hi Guru, Thanks for investigating this and the results are encouraging. I had a question regarding the machines that they provide. Have they been installed with the Visual studio packages for building the kernel datapath. We mostly need WDK 8.1. thanks, -- Nithin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
> I had a question regarding the machines that they provide. Have they been > installed with the Visual studio packages for building the kernel datapath. > We mostly need WDK 8.1. These are the pre-installed software in the machine: http://www.appveyor.com/docs/installed-software Do you have what you will need? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] ofproto: Initialize tunnel module earlier.
The tunnel module can get called by the handler threads right after they are started, so we need to call ofproto_tunnel_init() before opening the backer. Late initialization caused a spurious ovs-vswitchd handler thread crash on start-up due to the tunnel module fat_rwlock not being initialized yet. Signed-off-by: Jarno Rajahalme --- ofproto/ofproto-dpif.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index b909fd9..44272f4 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -1272,6 +1272,9 @@ construct(struct ofproto *ofproto_) struct shash_node *node, *next; int error; +/* Tunnel module can get used right after the udpif threads are running. */ +ofproto_tunnel_init(); + error = open_dpif_backer(ofproto->up.type, &ofproto->backer); if (error) { return error; @@ -1289,7 +1292,6 @@ construct(struct ofproto *ofproto_) ofproto->mbridge = mbridge_create(); ofproto->has_bonded_bundles = false; ofproto->lacp_enabled = false; -ofproto_tunnel_init(); ovs_mutex_init_adaptive(&ofproto->stats_mutex); ovs_mutex_init(&ofproto->vsp_mutex); -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
> On Feb 4, 2015, at 12:27 PM, Gurucharan Shetty wrote: > >> I had a question regarding the machines that they provide. Have they been >> installed with the Visual studio packages for building the kernel datapath. >> We mostly need WDK 8.1. > These are the pre-installed software in the machine: > https://urldefense.proofpoint.com/v2/url?u=http-3A__www.appveyor.com_docs_installed-2Dsoftware&d=AwIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=pNHQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80&m=NVbs94iGwtdpwwmAZuR7VDY19i8j6H56IJeu2DzDYpI&s=gIzJQMKwwI84WOcS2N4mvt8k8zmaEHuaGzh84xe6vPE&e= > > > Do you have what you will need? Thanks for the link Guru. I looked at the list. They have WDK that goes with Server 2008, but not Server 2012. So, we won’t be able to build the OVS datapath. I found this link where someone has requested WDK 8.1 support: http://help.appveyor.com/discussions/suggestions/553-request-wdk-81-update I’ll also voice my support for the request. It might help. Do we build Linux datapath today in the public cloud (not appveyor)? -- Nithin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
> I’ll also voice my support for the request. It might help. From other discussions, it looks like they are quite prompt in installing new software. > Do we build Linux datapath today in the public cloud (not appveyor)? We build it. I don't think we let people download it from anywhere after the build. I think, as a first step, we can start with Windows userspace. This will be helpful for Linux developers if they are unsure of whether their change will cause Windows build failure. Since Windows datapath changes are only done by Windows developers, they would make sure that it is not broken. > > -- Nithin ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] appveyor - windows autobuild service
On Wed, Feb 04, 2015 at 11:42:46AM -0800, Gurucharan Shetty wrote: > On Fri, Jan 23, 2015 at 9:33 AM, Gurucharan Shetty wrote: > > On Thu, Jan 22, 2015 at 9:19 PM, Ben Pfaff wrote: > >> The travis autobuilder has been pretty awesome for GNU/Linux continuous > >> integration. I'd like to have autobuilders for other platforms too. > >> After a few minutes searching, I see that appveyor.com offers a similar > >> service, with similar Github integration, for Windows builds. If anyone > >> has time and inclination to integrate this with OVS, I'd appreciate it. > > I will spend some time. > I did spend some time on this and was able to compile OVS userspace > successfully on Windows without openssl support. It is slow because > the free version provides only one CPU. > I will be spending some time on openssl support and then will send the > patches needed for integration. That's great! Thanks for following up. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ofproto: Initialize tunnel module earlier.
On Wed, Feb 04, 2015 at 12:59:04PM -0800, Jarno Rajahalme wrote: > The tunnel module can get called by the handler threads right after > they are started, so we need to call ofproto_tunnel_init() before > opening the backer. > > Late initialization caused a spurious ovs-vswitchd handler thread > crash on start-up due to the tunnel module fat_rwlock not being > initialized yet. > > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/2] mcast_snoop: make mcast_fport_bundle generic
On Wednesday, February 04, 2015 10:54:59 AM Ben Pfaff wrote: > On Wed, Feb 04, 2015 at 09:57:34AM -0800, Ben Pfaff wrote: > > On Wed, Feb 04, 2015 at 11:07:05AM -0200, Flavio Leitner wrote: > > > On Tuesday, February 03, 2015 04:07:24 PM Ben Pfaff wrote: > > > > On Tue, Jan 20, 2015 at 01:20:56PM -0200, Flavio Leitner wrote: > > > > > The struct mcast_fport_bundle will be used for ports > > > > > forwarding Reports too, so make it generic. > > > > > > > > > > Signed-off-by: Flavio Leitner > > > > > > > > I apologize that it has taken me a long time to review this patch. As > > > > usual, I have been busy; most recently, I was out the last five days to > > > > an out-of-town corporate event where they kept us busy from 7am to 10pm > > > > or later. Hardly had time to breathe, let along review patches. > > > > > > understandable, I just don't want to miss the 2.4 train. > > > > > > > I need a little bit of help understanding this patch. In particular, > > > > the changes remove a 'vlan' parameter from a lot of function prototypes, > > > > and I don't see anything replace it. Was the VLAN somehow unused > > > > previously? I'd like to understand why it was needed before and somehow > > > > not needed now. > > > > > > Yes, it wasn't used at all so I took it out in the cleanup. > > > > Oh, that makes sense. I'll take another quick look then. > > Both patches applied, thank you! Thanks a lot! fbl ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] travis: Add 32 bit (-m32) cross-compile build
On 02/04/15 at 09:53am, Ben Pfaff wrote: > On Wed, Feb 04, 2015 at 04:10:59PM +0100, Thomas Graf wrote: > > Inspired by Ben Pfaff's email on 32 bit build environment, this adds a > > 32bit build to the travis build matrix to catch alignment and padding > > issues. > > > > The 32 bit build is only enabled for non-DPDK builds as DPDK itself is > > currently not capable to be compiled with -m32. > > > > The build also has SSL disabled as the Ubuntu libssl-devel package is > > not multiarch compatible on the travis build system. > > > > Signed-off-by: Thomas Graf > > Too bad about sparse. I will investigate it further in a quiet moment. __WORDSIZE was still 64 when compiled with sparse and -m32 which resulted in several errors which check for 64bit using UINTPTR_MAX value. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] netdev-dpdk: Allow changing NON_PMD_CORE_ID for testing purpose.
On Tue, Feb 3, 2015 at 5:54 PM, Alex Wang wrote: > For testing purpose, developers may want to change the NON_PMD_CORE_ID > and use a different core for non-pmd threads. Since the netdev-dpdk > module is hard-coded to assert the non-pmd threads using core 0, such > change will cause abortion of OVS. > > This commit fixes the assertion and allows changing NON_PMD_CORE_ID. > > Signed-off-by: Alex Wang > --- > lib/dpctl.c |2 +- > lib/dpif-netdev.h |1 - > lib/netdev-dpdk.c | 12 ++-- > lib/netdev-dpdk.h |2 ++ > 4 files changed, 9 insertions(+), 8 deletions(-) > > diff --git a/lib/dpctl.c b/lib/dpctl.c > index 4c2614b..125023c 100644 > --- a/lib/dpctl.c > +++ b/lib/dpctl.c > @@ -31,11 +31,11 @@ > #include "dirs.h" > #include "dpctl.h" > #include "dpif.h" > -#include "dpif-netdev.h" > #include "dynamic-string.h" > #include "flow.h" > #include "match.h" > #include "netdev.h" > +#include "netdev-dpdk.h" > #include "netlink.h" > #include "odp-util.h" > #include "ofp-parse.h" > diff --git a/lib/dpif-netdev.h b/lib/dpif-netdev.h > index d811507..410fcfa 100644 > --- a/lib/dpif-netdev.h > +++ b/lib/dpif-netdev.h > @@ -42,7 +42,6 @@ static inline void dp_packet_pad(struct ofpbuf *b) > > #define NR_QUEUE 1 > #define NR_PMD_THREADS 1 > -#define NON_PMD_CORE_ID 0 > > #ifdef __cplusplus > } > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 0ede200..391695f 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -1553,8 +1553,8 @@ pmd_thread_setaffinity_cpu(int cpu) > VLOG_ERR("Thread affinity error %d",err); > return err; > } > -/* lcore_id 0 is reseved for use by non pmd threads. */ > -ovs_assert(cpu); > +/* NON_PMD_CORE_ID is reserved for use by non pmd threads. */ > +ovs_assert(cpu != NON_PMD_CORE_ID); > RTE_PER_LCORE(_lcore_id) = cpu; > > return 0; > @@ -1563,13 +1563,13 @@ pmd_thread_setaffinity_cpu(int cpu) > void > thread_set_nonpmd(void) > { > -/* We have to use 0 to allow non pmd threads to perform certain DPDK > - * operations, like rte_eth_dev_configure(). */ > -RTE_PER_LCORE(_lcore_id) = 0; > +/* We have to use NON_PMD_CORE_ID to allow non-pmd threads to perform > + * certain DPDK operations, like rte_eth_dev_configure(). */ The is not equivalent comment. have you confirmed that rte_eth_dev_configure() works on any core? > +RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID; > } > > static bool > thread_is_pmd(void) > { > -return rte_lcore_id() != 0; > +return rte_lcore_id() != NON_PMD_CORE_ID; > } > diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h > index c24d6da..9a47165 100644 > --- a/lib/netdev-dpdk.h > +++ b/lib/netdev-dpdk.h > @@ -5,6 +5,8 @@ > > struct dpif_packet; > > +#define NON_PMD_CORE_ID 0 > + > #ifdef DPDK_NETDEV > > #include > -- > 1.7.9.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [ovs-discuss] Open vSwitch STT support
On Tue, Feb 3, 2015 at 11:04 PM, Vasiliy Tolstov wrote: > > 2015-02-04 10:04 GMT+03:00 Vasiliy Tolstov : >> >> Does it possible to add to readme test results posted in mailing list >> about vxlan and stt and do this tests also (on the same hardware) for gre, >> listp, geneve? > > > Or openvswitch FAQ I will summarize STT performance in the OVS documentation. > > > -- > Vasiliy Tolstov, > e-mail: v.tols...@selfip.ru > jabber: v...@selfip.ru ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 0/2]
Joe and I have been experimenting with BPF and its application for OVS. This patch shows our attempt to implement ovs actions using eBPF. The kernel changes are against the 'net-next'. The corresponding user space changes will be post next. This patch set implements an BPF action, that has the same interface as current OVS output action. Instead of sending out a packet, it only generates a kernel message. This feature is neither complete nor useful as is. We are mostly interested in comments on: The infrastructure changes to support and running BPF functions, and suggestions on extensions beyond those patches. Andy Zhou (2): BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH openvswitch: implements the BPF_PROG action in datapath include/linux/bpf.h | 2 +- include/uapi/linux/bpf.h | 1 + include/uapi/linux/openvswitch.h | 29 - net/Makefile | 4 +- net/openvswitch/Makefile | 2 + net/openvswitch/actions.c| 30 + net/openvswitch/bpf.c| 87 + net/openvswitch/datapath.c | 6 ++- net/openvswitch/flow_netlink.c | 92 +++- net/openvswitch/flow_netlink.h | 8 10 files changed, 254 insertions(+), 7 deletions(-) create mode 100644 net/openvswitch/bpf.c -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 1/2] BPF: add a new BPF program type BPF_PROG_TYPE_OPENVSWITCH
Add a new program type for openvswitch. Implements the BPF verifier for the new type. Signed-off-by: Andy Zhou --- include/linux/bpf.h | 2 +- include/uapi/linux/bpf.h | 1 + include/uapi/linux/openvswitch.h | 29 +- net/Makefile | 4 +- net/openvswitch/Makefile | 2 + net/openvswitch/bpf.c| 87 6 files changed, 122 insertions(+), 3 deletions(-) create mode 100644 net/openvswitch/bpf.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index bbfceb7..2e71cc2 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -99,7 +99,7 @@ enum bpf_access_type { struct bpf_verifier_ops { /* return eBPF function prototype for verification */ - const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id); + const struct bpf_func_proto *(*get_func_proto)(int func_id); /* return true if 'size' wide access at offset 'off' within bpf_context * with 'type' (read or write) is allowed diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 45da7ec..a9a6b24 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -118,6 +118,7 @@ enum bpf_map_type { enum bpf_prog_type { BPF_PROG_TYPE_UNSPEC, BPF_PROG_TYPE_SOCKET_FILTER, + BPF_PROG_TYPE_OPENVSWITCH, }; /* flags for BPF_MAP_UPDATE_ELEM command */ diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index 7a8785a..92c 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -1,6 +1,6 @@ /* - * Copyright (c) 2007-2013 Nicira, Inc. + * Copyright (c) 2007-2015 Nicira, Inc. * * This program is free software; you can redistribute it and/or * modify it under the terms of version 2 of the GNU General Public @@ -569,6 +569,17 @@ struct ovs_action_push_vlan { __be16 vlan_tci;/* 802.1Q TCI (VLAN ID and priority). */ }; +/** + * struct ovs_action_bpf_prog - %OVS_ACTION_ATTR_BPF_PROG action argument. + * + * XXX The argument size is fixed for now. + */ +struct ovs_action_bpf_prog { + __be32 prog_fd; + __be32 arg0; + __be32 arg1; +}; + /* Data path hash algorithm for computing Datapath hash. * * The algorithm type only specifies the fields in a flow @@ -631,10 +642,26 @@ enum ovs_action_attr { OVS_ACTION_ATTR_HASH, /* struct ovs_action_hash. */ OVS_ACTION_ATTR_PUSH_MPLS,/* struct ovs_action_push_mpls. */ OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */ + OVS_ACTION_ATTR_SET_MASKED, /* place holder */ + OVS_ACTION_ATTR_BPF_PROG, /* strcut ovs_action_bpf_prog */ __OVS_ACTION_ATTR_MAX }; #define OVS_ACTION_ATTR_MAX (__OVS_ACTION_ATTR_MAX - 1) +/* integer value in 'imm' field of BPF_CALL instruction selects which OVS helper + * function eBPF program intends to call + */ +enum ovs_bpf_func_id { + OVS_BPF_FUNC_unspec, + OVS_BPF_FUNC_output, /* int ovs_bpf_output(ctxt) */ + __OVS_BPF_FUNC_MAX_ID, +}; + +struct ovs_bpf_action_ctxt { + void *skb; + u32 arg0; + u32 arg1; +}; #endif /* _LINUX_OPENVSWITCH_H */ diff --git a/net/Makefile b/net/Makefile index 38704bd..7e92dea 100644 --- a/net/Makefile +++ b/net/Makefile @@ -67,7 +67,9 @@ obj-$(CONFIG_DNS_RESOLVER)+= dns_resolver/ obj-$(CONFIG_CEPH_LIB) += ceph/ obj-$(CONFIG_BATMAN_ADV) += batman-adv/ obj-$(CONFIG_NFC) += nfc/ -obj-$(CONFIG_OPENVSWITCH) += openvswitch/ +ifneq ($(CONFIG_OPENVSWITCH),) +obj-y += openvswitch/ +endif obj-$(CONFIG_VSOCKETS) += vmw_vsock/ obj-$(CONFIG_NET_MPLS_GSO) += mpls/ obj-$(CONFIG_HSR) += hsr/ diff --git a/net/openvswitch/Makefile b/net/openvswitch/Makefile index 91b9478..5a3a2b7 100644 --- a/net/openvswitch/Makefile +++ b/net/openvswitch/Makefile @@ -18,3 +18,5 @@ openvswitch-y := \ obj-$(CONFIG_OPENVSWITCH_GENEVE)+= vport-geneve.o obj-$(CONFIG_OPENVSWITCH_VXLAN)+= vport-vxlan.o obj-$(CONFIG_OPENVSWITCH_GRE) += vport-gre.o + +obj-y += bpf.o diff --git a/net/openvswitch/bpf.c b/net/openvswitch/bpf.c new file mode 100644 index 000..8a33e93 --- /dev/null +++ b/net/openvswitch/bpf.c @@ -0,0 +1,87 @@ +/* Copyright (c) 2015 Nicira Inc. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of version 2 of the GNU General Public + * License as published by the Free Software Foundation. + */ +#include +#include +#include +#include +#include +#include + +static u64 bpf_helper_output(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5) +{ + struct sk_buff *skb = (struct sk_buff *) (unsigned long) r1; + uint32_t port = (uint32_t) (unsigned long) r2; + + printk("helper output %p to port %d\n", skb, port); + return 0; +} + +struct bpf_func_proto bpf_helper_output_proto = { + .func = bpf_helper_outp
[ovs-dev] [RFC: add openvswitch actions using BPF 0/9]
User space changes of BPF OVS action agains curret OVS master. Andy Zhou (9): hack: Do not compile datapath odp: add a new ODP action: OVS_ACTION_ATTR_BPF_PROG tests: add a OVS_ACTION_ATTR_BPF_PROG ation unit test case autoconf: support -with-llc options bpf: add the first BPF program. lib: import into libbpf to ovs/lib ofproto-dpif: Add eBPF program loader and runtime infrasturcure. ofproto-dpif: Add datapath eBPF support detection ofproto-dpif-xlate: generate BPF output action (Hack) INSTALL.BPF.md| 42 ++ Makefile.am | 5 +- acinclude.m4 | 18 +- bpf/automake.mk | 27 ++ bpf/bpf-shared.h | 12 + bpf/ovs-actions.c | 13 + bpf/ovs-bpf-helpers.h | 35 ++ configure.ac | 2 + datapath/linux/compat/include/linux/openvswitch.h | 29 +- lib/automake.mk | 22 ++ lib/dpif-netdev.c | 1 + lib/dpif.c| 1 + lib/libbpf.c | 100 + lib/libbpf.h | 185 + lib/odp-execute.c | 1 + lib/odp-util.c| 33 ++ ofproto/automake.mk | 2 + ofproto/ofproto-dpif-bpf.c| 454 ++ ofproto/ofproto-dpif-bpf.h| 42 ++ ofproto/ofproto-dpif-xlate.c | 18 +- ofproto/ofproto-dpif.c| 21 + tests/odp.at | 1 + vswitchd/automake.mk | 5 + 23 files changed, 1064 insertions(+), 5 deletions(-) create mode 100644 INSTALL.BPF.md create mode 100644 bpf/automake.mk create mode 100644 bpf/bpf-shared.h create mode 100644 bpf/ovs-actions.c create mode 100644 bpf/ovs-bpf-helpers.h create mode 100644 lib/libbpf.c create mode 100644 lib/libbpf.h create mode 100644 ofproto/ofproto-dpif-bpf.c create mode 100644 ofproto/ofproto-dpif-bpf.h -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 2/2] openvswitch: implements the BPF_PROG action in datapath
BPF_PROG action allows an action to be implemented in eBPF language and downloaded by the userspace at runtime. Signed-off-by: Andy Zhou --- net/openvswitch/actions.c | 30 ++ net/openvswitch/datapath.c | 6 ++- net/openvswitch/flow_netlink.c | 92 +- net/openvswitch/flow_netlink.h | 8 4 files changed, 132 insertions(+), 4 deletions(-) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index b4cffe6..29e9171 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -38,8 +38,12 @@ #include "datapath.h" #include "flow.h" +#include "flow_netlink.h" #include "vport.h" +typedef int (*ovs_bpf_func_t)(const struct ovs_bpf_action_ctxt *, + const struct bpf_insn *); + static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, struct sw_flow_key *key, const struct nlattr *attr, int len); @@ -747,6 +751,28 @@ static int execute_recirc(struct datapath *dp, struct sk_buff *skb, return 0; } +static int execute_bpf(struct sk_buff *skb, struct sw_flow_key *key, + const struct nlattr *a) +{ + struct ovs_action_bpf_runtime *rt; + struct bpf_prog *prog; + struct ovs_bpf_action_ctxt ctxt; + ovs_bpf_func_t ovs_bpf_func; + int err; + + rt = nla_data(a); + prog = rt->prog; + + /* Build the BPF program runtime context. */ + ctxt.skb = (void *)skb; + ctxt.arg0 = rt->arg0; + ctxt.arg1 = rt->arg1; + + ovs_bpf_func = (ovs_bpf_func_t)(prog->bpf_func); + err = ovs_bpf_func(&ctxt, prog->insnsi); + return err; +} + /* Execute a list of actions against 'skb'. */ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, struct sw_flow_key *key, @@ -814,6 +840,10 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, } break; + case OVS_ACTION_ATTR_BPF_PROG: + err = execute_bpf(skb, key, a); + break; + case OVS_ACTION_ATTR_SET: err = execute_set_action(skb, key, nla_data(a)); break; diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c index ae5e77c..810a0bf 100644 --- a/net/openvswitch/datapath.c +++ b/net/openvswitch/datapath.c @@ -699,6 +699,8 @@ static size_t ovs_flow_cmd_msg_size(const struct sw_flow_actions *acts, len += nla_total_size(ovs_key_attr_size()); /* OVS_FLOW_ATTR_ACTIONS */ + /* XXX this logic needs to be fixed to accommodate BPF_PROG action +* will expand the run time action size. */ if (should_fill_actions(ufid_flags)) len += nla_total_size(acts->actions_len); @@ -1017,7 +1019,7 @@ err_unlock_ovs: ovs_unlock(); kfree_skb(reply); err_kfree_acts: - kfree(acts); + free_flow_actions(acts); err_kfree_flow: ovs_flow_free(new_flow, false); error: @@ -1152,7 +1154,7 @@ err_unlock_ovs: ovs_unlock(); kfree_skb(reply); err_kfree_acts: - kfree(acts); + free_flow_actions(acts); error: return error; } diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 8b9a612..466e85f 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -1546,11 +1547,38 @@ static struct sw_flow_actions *nla_alloc_flow_actions(int size, bool log) return sfa; } +void free_flow_actions(struct sw_flow_actions *sf_acts) +{ + const struct nlattr *a; + int rem; + struct ovs_action_bpf_runtime *rt; + + nla_for_each_attr(a, sf_acts->actions, sf_acts->actions_len, rem) { + int type = nla_type(a); + + switch (type) { + case OVS_ACTION_ATTR_BPF_PROG: + rt = nla_data(a); + bpf_prog_put(rt->prog); + break; + } + } + + kfree(sf_acts); +} + +static void free_flow_actions_rcu(struct rcu_head *head) +{ + struct sw_flow_actions *acts = container_of(head, struct sw_flow_actions, rcu); + + free_flow_actions(acts); +} + /* Schedules 'sf_acts' to be freed after the next RCU grace period. * The caller must hold rcu_read_lock for this to be sensible. */ void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts) { - kfree_rcu(sf_acts, rcu); + call_rcu(&sf_acts->rcu, free_flow_actions_rcu); } static struct nlattr *reserve_sfa_size(struct sw_flow_actions **sfa, @@ -1695,6 +1723,37 @@ static int validate_and_copy_sample(const struct nlattr *attr, return 0; } +static int validate_and_copy_bpf(const struct n
[ovs-dev] [RFC: add openvswitch actions using BPF 1/9] hack: Do not compile datapath
The OVS eBPF patch sets are developed against master 'net-next' tree, which is currently targeting for kernel version 3.19. The datapath in OVS mater does not currently support 3.19 kernel. But we should not really using OVS datapath, The changes in the follow patches are only changes in user space programs, which should work with the kernel module built from the 'net-next' tree. This patch disables building of the datapath and disables the configuration time kernel version check. Signed-off-by: Andy Zhou --- Makefile.am | 3 ++- acinclude.m4 | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Makefile.am b/Makefile.am index 3e5e0b2..1363abe 100644 --- a/Makefile.am +++ b/Makefile.am @@ -7,7 +7,8 @@ AUTOMAKE_OPTIONS = foreign subdir-objects ACLOCAL_AMFLAGS = -I m4 -SUBDIRS = datapath +# SUBDIRS = datapath +SUBDIRS = AM_CPPFLAGS = $(SSL_CFLAGS) AM_LDFLAGS = $(SSL_LDFLAGS) diff --git a/acinclude.m4 b/acinclude.m4 index cb3e148..f419056 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -134,7 +134,7 @@ AC_DEFUN([OVS_CHECK_LINUX], [ AC_MSG_RESULT([$kversion]) if test "$version" -ge 3; then - if test "$version" = 3 && test "$patchlevel" -le 18; then + if test "$version" = 3 && test "$patchlevel" -le 19; then : # Linux 3.x else AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version newer than 3.18.x is not supported (please refer to the FAQ for advice)]) -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 2/9] odp: add a new ODP action: OVS_ACTION_ATTR_BPF_PROG
Add a new ODP action and implements the corresponding Netlink message handling of this new attribute. Signed-off-by: Andy Zhou --- datapath/linux/compat/include/linux/openvswitch.h | 29 +++- lib/dpif-netdev.c | 1 + lib/dpif.c| 1 + lib/odp-execute.c | 1 + lib/odp-util.c| 33 +++ 5 files changed, 64 insertions(+), 1 deletion(-) diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index a59e109..7284023 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -569,6 +569,18 @@ struct ovs_action_push_mpls { }; /** + * struct ovs_action_bpf_prog - %OVS_ACTION_ATTR_BPF_PROG action argument. + * + * XXX provides bpf program id and execution context. + * + */ +struct ovs_action_bpf_prog { + __be32 prog_fd; + __be32 arg0; + __be32 arg1; +}; + +/** * struct ovs_action_push_vlan - %OVS_ACTION_ATTR_PUSH_VLAN action argument. * @vlan_tpid: Tag protocol identifier (TPID) to push. * @vlan_tci: Tag control identifier (TCI) to push. The CFI bit must be set @@ -685,7 +697,7 @@ enum ovs_action_attr { * data immediately followed by a mask. * The data must be zero for the unmasked * bits. */ - + OVS_ACTION_ATTR_BPF_PROG, /* struct ovs_action_bpf_prog. */ #ifndef __KERNEL__ OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/ OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */ @@ -695,4 +707,19 @@ enum ovs_action_attr { #define OVS_ACTION_ATTR_MAX (__OVS_ACTION_ATTR_MAX - 1) +/* integer value in 'imm' field of BPF_CALL instruction selects which OVS helper + * function eBPF program intends to call + */ +enum ovs_bpf_func_id { + OVS_BPF_FUNC_unspec, + OVS_BPF_FUNC_output, /* int ovs_bpf_output(ctxt) */ + __OVS_BPF_FUNC_MAX_ID, +}; + +struct ovs_bpf_action_ctxt { + void *skb; + uint32_t arg0; + uint32_t arg1; +}; + #endif /* _LINUX_OPENVSWITCH_H */ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 54bad02..e991716 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -3240,6 +3240,7 @@ dp_execute_cb(void *aux_, struct dpif_packet **packets, int cnt, case OVS_ACTION_ATTR_SAMPLE: case OVS_ACTION_ATTR_HASH: case OVS_ACTION_ATTR_UNSPEC: +case OVS_ACTION_ATTR_BPF_PROG: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); } diff --git a/lib/dpif.c b/lib/dpif.c index 6ecd1d9..8aac7dd 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -1132,6 +1132,7 @@ dpif_execute_helper_cb(void *aux_, struct dpif_packet **packets, int cnt, case OVS_ACTION_ATTR_SET_MASKED: case OVS_ACTION_ATTR_SAMPLE: case OVS_ACTION_ATTR_UNSPEC: +case OVS_ACTION_ATTR_BPF_PROG: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); } diff --git a/lib/odp-execute.c b/lib/odp-execute.c index 98ac18c..ee06362 100644 --- a/lib/odp-execute.c +++ b/lib/odp-execute.c @@ -591,6 +591,7 @@ odp_execute_actions(void *dp, struct dpif_packet **packets, int cnt, bool steal, break; case OVS_ACTION_ATTR_UNSPEC: +case OVS_ACTION_ATTR_BPF_PROG: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); } diff --git a/lib/odp-util.c b/lib/odp-util.c index 37d73a9..1477d13 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -88,6 +88,7 @@ odp_action_len(uint16_t type) case OVS_ACTION_ATTR_SET: return -2; case OVS_ACTION_ATTR_SET_MASKED: return -2; case OVS_ACTION_ATTR_SAMPLE: return -2; +case OVS_ACTION_ATTR_BPF_PROG: return sizeof(struct ovs_action_bpf_prog); case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: @@ -185,6 +186,17 @@ format_odp_sample_action(struct ds *ds, const struct nlattr *attr) ds_put_format(ds, "))"); } +static void +format_bpf_prog_action(struct ds *ds, const struct nlattr *a) +{ +struct ovs_action_bpf_prog *bpf_act = (struct ovs_action_bpf_prog *) + nl_attr_get(a); + +ds_put_cstr(ds, "bpf"); +ds_put_format(ds, "(%u,%u,%u)", ntohl(bpf_act->prog_fd), + ntohl(bpf_act->arg0), ntohl(bpf_act->arg1)); +} + static const char * slow_path_reason_to_string(uint32_t reason) { @@ -682,6 +694,9 @@ format_odp_action(struct ds *ds, const struct nlattr *a) case OVS_ACTION_ATTR_SAMPLE: format_odp_sample_action(ds, a); break; +case OVS_ACTION_ATTR_BPF_PROG: +format_bpf_prog_action(ds, a); +break; case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: default: @@ -1006,6 +1021,24 @@ parse_odp_action(const char *s, const struct simap *port_names,
[ovs-dev] [RFC: add openvswitch actions using BPF 3/9] tests: add a OVS_ACTION_ATTR_BPF_PROG ation unit test case
Test parsing and formating of the newly added action. Signed-off-by: Andy Zhou --- tests/odp.at | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/odp.at b/tests/odp.at index 8f96c6a..29a977f 100644 --- a/tests/odp.at +++ b/tests/odp.at @@ -282,6 +282,7 @@ tnl_pop(4) tnl_push(tnl_port(4),header(size=42,type=3,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x40),gre((flags=0x20,proto=0x6558),key=0x1e241)),out_port(1)) tnl_push(tnl_port(4),header(size=46,type=3,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=47,tos=0,ttl=64,frag=0x40),gre((flags=0xa0,proto=0x6558),csum=0x0,key=0x1e241)),out_port(1)) tnl_push(tnl_port(6),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=f8:bc:12:46:58:e0,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x40),udp(src=0,dst=4789),vxlan(flags=0x800,vni=0x1c700)),out_port(1)) +bpf(1,1,1) ]) AT_CHECK_UNQUOTED([ovstest test-odp parse-actions < actions.txt], [0], [`cat actions.txt` -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 4/9] autoconf: support -with-llc options
Add a configuration option to allow specifing the LLVM backend module for compiling BPF C source code into BPF instruction sets. See INSTALL.BPF.md for usgae information. Signed-off-by: Joe Stringer Signed-off-by: Andy Zhou --- INSTALL.BPF.md | 42 ++ Makefile.am| 1 + acinclude.m4 | 16 configure.ac | 2 ++ 4 files changed, 61 insertions(+) create mode 100644 INSTALL.BPF.md diff --git a/INSTALL.BPF.md b/INSTALL.BPF.md new file mode 100644 index 000..bc1f5ad --- /dev/null +++ b/INSTALL.BPF.md @@ -0,0 +1,42 @@ +# Dependencies + +- Need Clang + llvm-dev version 3.X for any (2 <= X <= 4) +- http://llvm.org/apt/ + +apt-get install libelf-dev clang-3.4 llvm-3.4-dev + +# Build LLVM plugin + +*[Upstream] git://git.kernel.org/pub/scm/linux/kernel/git/ast/bpf* + +git clone git://github.com/joestringer/linux +NN=$PWD/linux + +cd $NN/tools/bpf/llvm/bld +make LLVM_CONFIG=`which llvm-config-3.4` + +# Configure OVS to use BPF LLC + +OVS=/path/to/openvswitch +cd $OVS +./configure --with-llc=$NN/tools/bpf/llvm/bld/Debug+Asserts/bin/llc + +# Build OVS and BPF module + +make +cd datapath/bpf +make + +# Load BPF module + +cd $NN/ +make M=samples/bpf +samples/bpf/simple_load $OVS/datapath/bpf/simple.bpf + +Note down which fd was used for the program. This will be used in next step. + +# Install flow to use BPF + +$OVS/utilities/ovs-dpctl add-flow "eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00)" "bpf(, 0)" + +Success? diff --git a/Makefile.am b/Makefile.am index 1363abe..ee8fe03 100644 --- a/Makefile.am +++ b/Makefile.am @@ -71,6 +71,7 @@ docs = \ DESIGN.md \ FAQ.md \ INSTALL.md \ + INSTALL.BPF.md \ INSTALL.Debian.md \ INSTALL.Docker.md \ INSTALL.DPDK.md \ diff --git a/acinclude.m4 b/acinclude.m4 index f419056..e3b5356 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -157,6 +157,22 @@ AC_DEFUN([OVS_CHECK_LINUX], [ AM_CONDITIONAL(LINUX_ENABLED, test -n "$KBUILD") ]) +dnl OVS_CHECK_BPF +dnl +dnl Check clang set ups for compiling BPF +AC_DEFUN([OVS_CHECK_LLC], [ + AC_ARG_WITH([llc], + [AC_HELP_STRING([--with-llc=/path/to/llc], + [Specify the bpf clang backend binrary])]) + + if (test X"$with_llc" != X); then +LLC=$with_llc + if (test ! -x $LLC); then + AC_MSG_ERROR([BPF llc backend is configured but not found ]) + fi + fi +]) + dnl OVS_CHECK_DPDK dnl dnl Configure DPDK source tree diff --git a/configure.ac b/configure.ac index d2d02ca..7510d56 100644 --- a/configure.ac +++ b/configure.ac @@ -160,8 +160,10 @@ OVS_ENABLE_SPARSE AC_ARG_VAR(KARCH, [Kernel Architecture String]) AC_SUBST(KARCH) +AC_ARG_VAR(LLC, [LLVM bpf backend processor]) OVS_CHECK_LINUX OVS_CHECK_DPDK +OVS_CHECK_LLC OVS_CHECK_PRAGMA_MESSAGE AC_SUBST([OVS_CFLAGS]) AC_SUBST([OVS_LDFLAGS]) -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 0/2] XenServer: Improve XAPI plugin
This patch prevents wiping flows if the xen pool's controller is not the first controller in the `get-manger` output. It also updates the quoting to be to be consistant and follows PEP8 guidelines. .../etc_xapi.d_plugins_openvswitch-cfg-update | 163 +++- 1 file changed, 88 insertions(+), 75 deletions(-) ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 5/9] bpf: add the first BPF program.
Added a BPF implementation of output action. The intention is to have all OVS related BPF programs live under the 'bpf' folder. Signed-off-by: Andy Zhou --- Makefile.am | 1 + bpf/automake.mk | 27 +++ bpf/bpf-shared.h | 12 bpf/ovs-actions.c | 13 + bpf/ovs-bpf-helpers.h | 35 +++ 5 files changed, 88 insertions(+) create mode 100644 bpf/automake.mk create mode 100644 bpf/bpf-shared.h create mode 100644 bpf/ovs-actions.c create mode 100644 bpf/ovs-bpf-helpers.h diff --git a/Makefile.am b/Makefile.am index ee8fe03..dce9056 100644 --- a/Makefile.am +++ b/Makefile.am @@ -354,6 +354,7 @@ dist-docs: include m4/automake.mk include lib/automake.mk +include bpf/automake.mk include ofproto/automake.mk include utilities/automake.mk include tests/automake.mk diff --git a/bpf/automake.mk b/bpf/automake.mk new file mode 100644 index 000..20fc776 --- /dev/null +++ b/bpf/automake.mk @@ -0,0 +1,27 @@ +# Copyright (C) 2015 Nicira, Inc. +# +# Copying and distribution of this file, with or without modification, +# are permitted in any medium without royalty provided the copyright +# notice and this notice are preserved. This file is offered as-is, +# without warranty of any kind. + +if LINUX +sbin_PROGRAMS += bpf/ovs-actions.bpf + + +EXTRA_DIST += $(srcdir)/bpf/ovs-bpf-helpers.h \ + $(srcdir)/bpf/bpf-shared.h \ + $(srcdir)/bpf/ovs-actions.c + +DEP_FILES = $(srcdir)/bpf/ovs-bpf-helpers.h \ +$(srcdir)/bpf/bpf-shared.h \ + $(srcdir)/datapath/linux/compat/include/linux/openvswitch.h + +BPF_INCLUDES=-I. -I$(srcdir)/datapath/linux/compat/include -I/usr/include + +bpf/ovs-actions.bpf: $(srcdir)/bpf/ovs-actions.c $(DEP_FILES) + $(AM_V_GEN)clang -DHAVE_CONFIG_H $(BPF_INCLUDES) $(NOSTDINC_FLAGS) \ + $(AM_CFLAGS) $(EXTRA_CFLAGS) -Wno-unused-value -Wno-pointer-sign \ + -O2 -emit-llvm -c $< -o -| $(LLC) -filetype=obj -o $@ + +endif diff --git a/bpf/bpf-shared.h b/bpf/bpf-shared.h new file mode 100644 index 000..a710c3c --- /dev/null +++ b/bpf/bpf-shared.h @@ -0,0 +1,12 @@ + +/* Shared data structures between bpf and C programs. */ + +/* a helper structure used by eBPF C program + * to describe map attributes to elf_bpf loader + */ +struct bpf_map_def { + unsigned int type; + unsigned int key_size; + unsigned int value_size; + unsigned int max_entries; +}; diff --git a/bpf/ovs-actions.c b/bpf/ovs-actions.c new file mode 100644 index 000..05a0a71 --- /dev/null +++ b/bpf/ovs-actions.c @@ -0,0 +1,13 @@ +#include +#include "ovs-bpf-helpers.h" + +int output_action(struct ovs_bpf_action_ctxt *ctxt); + +SEC("ovs/output") +int +output_action(struct ovs_bpf_action_ctxt *ctxt) +{ + return ovs_bpf_helper_output(ctxt->skb, ctxt->arg0); +} + +char _license[] SEC("license") = "GPL"; diff --git a/bpf/ovs-bpf-helpers.h b/bpf/ovs-bpf-helpers.h new file mode 100644 index 000..ba10084 --- /dev/null +++ b/bpf/ovs-bpf-helpers.h @@ -0,0 +1,35 @@ +#ifndef __OVS_BPF_HELPERS_H +#define __OVS_BPF_HELPERS_H + +#include +#include +struct sk_buff; + +/* helper macro to place programs, maps, license in + * different sections in elf_bpf file. Section names + * are interpreted by elf_bpf loader + */ +#define SEC(NAME) __attribute__((section(NAME), used)) + +/* helper functions called from eBPF programs written in C */ +static void *(*bpf_map_lookup_elem)(void *map, void *key) = + (void *) OVS_BPF_FUNC_map_lookup_elem; +static int (*bpf_map_update_elem)(void *map, void *key, void *value, + unsigned long long flags) = + (void *) OVS_BPF_FUNC_map_update_elem; +static int (*bpf_map_delete_elem)(void *map, void *key) = + (void *) OVS_BPF_FUNC_map_delete_elem; +static int (*ovs_bpf_helper_output)(struct sk_buff *skb, uint32_t out_port) = + (void *) OVS_BPF_FUNC_output; + +/* llvm builtin functions that eBPF C program may use to + * emit BPF_LD_ABS and BPF_LD_IND instructions + */ +unsigned long long load_byte(void *skb, +unsigned long long off) asm("llvm.bpf.load.byte"); +unsigned long long load_half(void *skb, +unsigned long long off) asm("llvm.bpf.load.half"); +unsigned long long load_word(void *skb, +unsigned long long off) asm("llvm.bpf.load.word"); + +#endif -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 7/9] ofproto-dpif: Add eBPF program loader and runtime infrasturcure.
Added a eBPF program loader in ofproto-dpif layer. It only loades the OVS programs. Signed-off-by: Andy Zhou --- ofproto/automake.mk| 2 + ofproto/ofproto-dpif-bpf.c | 454 + ofproto/ofproto-dpif-bpf.h | 42 + vswitchd/automake.mk | 5 + 4 files changed, 503 insertions(+) create mode 100644 ofproto/ofproto-dpif-bpf.c create mode 100644 ofproto/ofproto-dpif-bpf.h diff --git a/ofproto/automake.mk b/ofproto/automake.mk index 0058ff3..86719de 100644 --- a/ofproto/automake.mk +++ b/ofproto/automake.mk @@ -28,6 +28,8 @@ ofproto_libofproto_la_SOURCES = \ ofproto/ofproto.h \ ofproto/ofproto-dpif.c \ ofproto/ofproto-dpif.h \ + ofproto/ofproto-dpif-bpf.c \ + ofproto/ofproto-dpif-bpf.h \ ofproto/ofproto-dpif-ipfix.c \ ofproto/ofproto-dpif-ipfix.h \ ofproto/ofproto-dpif-mirror.c \ diff --git a/ofproto/ofproto-dpif-bpf.c b/ofproto/ofproto-dpif-bpf.c new file mode 100644 index 000..12f6766 --- /dev/null +++ b/ofproto/ofproto-dpif-bpf.c @@ -0,0 +1,454 @@ +/* + * Copyright (c) 2015 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "simap.h" +#include "util.h" +#include "dirs.h" +#include "unixctl.h" +#include "dynamic-string.h" +#include "ovs-atomic.h" +#include "openvswitch/thread.h" +#include "openvswitch/vlog.h" +#include "ofproto-dpif-bpf.h" +#include "libbpf.h" +#include "bpf/bpf-shared.h" + +#define DEFAULT_BPF_ACTION_FILE "ovs-actions.bpf" +#define BPF_MAP_STUB_SIZE 32 + +VLOG_DEFINE_THIS_MODULE(bpf); + +struct bpf_info { +struct simap bpf_maps; +struct simap bpf_progs; +}; + +static struct bpf_info ovs_bpfs__ = { +SIMAP_INITIALIZER(&ovs_bpfs__.bpf_maps), +SIMAP_INITIALIZER(&ovs_bpfs__.bpf_progs), +}; + +static bool bpf_enable__ = false; +static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER; +static struct bpf_info *const ovs_bpfs OVS_GUARDED_BY(mutex) = &ovs_bpfs__; + +#define BPF_UNIXCTL_CHECK(conn) \ +if (bpf_unixctl_check_enable(conn) == -1) return + +/* + * Get the default eBPF action file name. + * + * Returns the file name. Caller is responsible for free the memory. */ +static char * +bpf_default_path(void) +{ +return xasprintf("%s/%s", ovs_pkgdatadir(), DEFAULT_BPF_ACTION_FILE); +} + +static int +bpf_unixctl_check_enable(struct unixctl_conn *conn) +{ +if (!bpf_enable__) { +unixctl_command_reply(conn, "Current datapath doest not support eBPF actions.\n"); +return -1; +} + +return 0; +} + +static void +add_map(char *map_name, int fd) +{ +ovs_mutex_lock(&mutex); +simap_put(&ovs_bpfs->bpf_maps, map_name, fd); +ovs_mutex_unlock(&mutex); +} + +static void +add_prog(char *prog_name, int fd) +{ +ovs_mutex_lock(&mutex); +simap_put(&ovs_bpfs->bpf_progs, prog_name, fd); +ovs_mutex_unlock(&mutex); +} + +static int +is_ovs_section(const char *shname) { + return !memcmp(shname, "ovs", 3); +} + +static int +load_maps(struct bpf_map_def *maps, int len, int map_fds[]) +{ +int i, fd; + +for (i = 0; i < len / sizeof(struct bpf_map_def); i++) { +char *name; + +fd = bpf_create_map(maps[i].type, maps[i].key_size, +maps[i].value_size, maps[i].max_entries); + +if (fd < 0) { +return fd; +} + +name = xasprintf("map%d(type[%u],ks[%u],vs[%u],max[%u])", + i, maps[i].type, maps[i].key_size, + maps[i].value_size, maps[i].max_entries); + +map_fds[i] = fd; +add_map(name, fd); +free(name); +} + +return 0; +} + +static int +load_prog(int type, struct bpf_insn *prog, int size, char *license) +{ +return bpf_prog_load(type, prog, size, license); +} + +static int +parse_relo_and_apply(Elf_Data *data, Elf_Data *symbols, + GElf_Shdr *shdr, struct bpf_insn *insn, int map_fds[]) +{ +int i, nrels; + +nrels = shdr->sh_size / shdr->sh_entsize; + +for (i = 0; i < nrels; i++) { +GElf_Sym sym; +GElf_Rel rel; +unsigned int insn_idx; + +gelf_getrel(data, i, &rel); + +insn_idx = rel.r_offset / sizeof(struct bpf_insn); + +gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym); + +if (insn[insn_idx].
[ovs-dev] [RFC: add openvswitch actions using BPF 6/9] lib: import into libbpf to ovs/lib
Those files are basically imported from 'net-next: sample/bpf' directory. This is a stop gap measure unil libbpf is upstreamed. Signed-off-by: Andy Zhou --- lib/automake.mk | 22 +++ lib/libbpf.c| 100 ++ lib/libbpf.h| 185 3 files changed, 307 insertions(+) create mode 100644 lib/libbpf.c create mode 100644 lib/libbpf.h diff --git a/lib/automake.mk b/lib/automake.mk index 2a5844b..d9488c0 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -317,6 +317,28 @@ lib_libsflow_la_CFLAGS += -Wno-unused-parameter endif if LINUX +lib_LTLIBRARIES += lib/libbpf.la +lib_libbpf_la_LDFLAGS = \ +-version-info $(LT_CURRENT):$(LT_REVISION):$(LT_AGE) \ + -Wl,--version-script=$(top_builddir)/lib/libbpf.sym \ +$(AM_LDFLAGS) +lib_libbpf_la_SOURCES = \ + lib/libbpf.h \ + lib/libbpf.c +lib_libbpf_la_CPPFLAGS = $(AM_CPPFLAGS) +# GCC Bug 60784 affects both GCC 4.8 and GCC 4.9. -Wmissing-field-initializers will +# give spurious warning when compiling libbpf. Turn it off for now. +lib_libbpf_la_CFLAGS = $(filter-out -Wmissing-field-initializers, $(AM_CFLAGS)) +lib_libbpf_la_CFLAGS += -Wno-missing-field-initializers +if HAVE_WNO_UNUSED +lib_libbpf_la_CFLAGS += -Wno-unused +endif +if HAVE_WNO_UNUSED_PARAMETER +lib_libbpf_la_CFLAGS += -Wno-unused-parameter +endif +endif + +if LINUX lib_libopenvswitch_la_SOURCES += \ lib/dpif-netlink.c \ lib/dpif-netlink.h \ diff --git a/lib/libbpf.c b/lib/libbpf.c new file mode 100644 index 000..045bdba --- /dev/null +++ b/lib/libbpf.c @@ -0,0 +1,100 @@ +/* eBPF mini library */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "libbpf.h" + +#include + +static __u64 ptr_to_u64(void *ptr) +{ +return (__u64) (unsigned long) ptr; +} + +int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size, + int max_entries) +{ +union bpf_attr attr = { +.map_type = map_type, +.key_size = key_size, +.value_size = value_size, +.max_entries = max_entries, +}; + +return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)); +} + +int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags) +{ +union bpf_attr attr = { +.map_fd = fd, +.key = ptr_to_u64(key), +.value = ptr_to_u64(value), +.flags = flags, +}; + +return syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)); +} + +int bpf_lookup_elem(int fd, void *key, void *value) +{ +union bpf_attr attr = { +.map_fd = fd, +.key = ptr_to_u64(key), +.value = ptr_to_u64(value), +}; + +return syscall(__NR_bpf, BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr)); +} + +int bpf_delete_elem(int fd, void *key) +{ +union bpf_attr attr = { +.map_fd = fd, +.key = ptr_to_u64(key), +}; + +return syscall(__NR_bpf, BPF_MAP_DELETE_ELEM, &attr, sizeof(attr)); +} + +int bpf_get_next_key(int fd, void *key, void *next_key) +{ +union bpf_attr attr = { +.map_fd = fd, +.key = ptr_to_u64(key), +.next_key = ptr_to_u64(next_key), +}; + +return syscall(__NR_bpf, BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr)); +} + +#define ROUND_UP(x, n) (((x) + (n) - 1u) & ~((n) - 1u)) + +char bpf_log_buf[LOG_BUF_SIZE]; + +int bpf_prog_load(enum bpf_prog_type prog_type, + const struct bpf_insn *insns, int prog_len, + const char *license) +{ +union bpf_attr attr = { +.prog_type = prog_type, +.insns = ptr_to_u64((void *) insns), +.insn_cnt = prog_len / sizeof(struct bpf_insn), +.license = ptr_to_u64((void *) license), +.log_buf = ptr_to_u64(bpf_log_buf), +.log_size = LOG_BUF_SIZE, +.log_level = 1, +}; + +bpf_log_buf[0] = 0; + +return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); +} diff --git a/lib/libbpf.h b/lib/libbpf.h new file mode 100644 index 000..fe6f31e --- /dev/null +++ b/lib/libbpf.h @@ -0,0 +1,185 @@ +/* eBPF mini library */ +#ifndef __LIBBPF_H +#define __LIBBPF_H + +struct bpf_insn; + +int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size, + int max_entries); +int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags); +int bpf_lookup_elem(int fd, void *key, void *value); +int bpf_delete_elem(int fd, void *key); +int bpf_get_next_key(int fd, void *key, void *next_key); + +int bpf_prog_load(enum bpf_prog_type prog_type, + const struct bpf_insn *insns, int insn_len, + const char *license); + +#define LOG_BUF_SIZE 65536 +extern char bpf_log_buf[LOG_BUF_SIZE]; + +/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */ + +#define BPF_ALU64_REG(OP, DST, SRC) \ +((struct bpf_insn) {\ +.code = BPF_ALU64 | BPF_OP(OP) |
[ovs-dev] [PATCH 2/2] XenServer: Don't reset on xe-toolstack-restart
With XenServer only 1 manager is configured in the pool, which may not be the first manager returned from `get-manager` as it returns in lexicographical order. Signed-off-by: Jason Kölker --- .../etc_xapi.d_plugins_openvswitch-cfg-update | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update index aeaa1e7..a66becc 100755 --- a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update +++ b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update @@ -59,10 +59,11 @@ def update(session, args): raise XenAPIPlugin.Failure('MORE_THAN_ONE_POOL_FOR_HOST', []) new_controller = False pool = session.xenapi.pool.get_record(pools[0]) -controller = pool.get('vswitch_controller', '') +controller = pool.get('vswitch_controller') ret_str = '' -currentController = vswitchCurrentController() -if controller == '' and currentController != '': +currentControllers = vswitchCurrentControllers() + +if not controller and currentControllers: delete_cacert() try: emergency_reset(session, None) @@ -70,7 +71,7 @@ def update(session, args): pass removeControllerCfg() ret_str += 'Successfully removed controller config. ' -elif controller != currentController: +elif controller not in currentControllers: delete_cacert() try: emergency_reset(session, None) @@ -194,14 +195,13 @@ def update(session, args): return 'No change to configuration' -def vswitchCurrentController(): +def vswitchCurrentControllers(): controller = vswitchCfgQuery(['get-manager']) -if controller == '': -return controller -if len(controller) < 4 or controller[0:4] != 'ssl:': -return controller -else: -return controller.split(':')[1] +if not controller: +return + +return [c.lstrip('ssl:').split(':')[0] +for c in controller.split('\n')] def removeControllerCfg(): -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [RFC: add openvswitch actions using BPF 8/9] ofproto-dpif: Add datapath eBPF support detection
First cut in added datpath eBPF support detector. This feature has not been fully developed. The stub implementation simply assumes the datpath support eBPF. Signed-off-by: Andy Zhou Current interface only support 'yes' or 'no' detection. Should we consider 'support level'? --- ofproto/ofproto-dpif.c | 21 + 1 file changed, 21 insertions(+) diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index b909fd9..4bcd034 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -58,6 +58,7 @@ #include "ofproto-dpif-sflow.h" #include "ofproto-dpif-upcall.h" #include "ofproto-dpif-xlate.h" +#include "ofproto-dpif-bpf.h" #include "poll-loop.h" #include "ovs-rcu.h" #include "ovs-router.h" @@ -293,6 +294,9 @@ struct dpif_backer { * actions. */ bool masked_set_action; +/* True if the datapath supports bpf actions */ +bool enable_bpf_action; + /* Maximum number of MPLS label stack entries that the datapath supports * in a match */ size_t max_mpls_depth; @@ -909,6 +913,7 @@ static bool check_variable_length_userdata(struct dpif_backer *backer); static size_t check_max_mpls_depth(struct dpif_backer *backer); static bool check_recirc(struct dpif_backer *backer); static bool check_ufid(struct dpif_backer *backer); +static bool check_bpf_actions(struct dpif_backer *backer); static bool check_masked_set_action(struct dpif_backer *backer); static int @@ -1007,6 +1012,7 @@ open_dpif_backer(const char *type, struct dpif_backer **backerp) backer->max_mpls_depth = check_max_mpls_depth(backer); backer->masked_set_action = check_masked_set_action(backer); backer->enable_ufid = check_ufid(backer); +backer->enable_bpf_action = check_bpf_actions(backer); backer->rid_pool = recirc_id_pool_create(); ovs_mutex_init(&backer->recirc_mutex); cmap_init(&backer->recirc_map); @@ -1032,6 +1038,9 @@ open_dpif_backer(const char *type, struct dpif_backer **backerp) backer->variable_length_userdata = check_variable_length_userdata(backer); backer->dp_version_string = dpif_get_dp_version(backer->dpif); +/* Load eBPF actions if datapath supports it. */ +error = ofproto_dpif_bpf_init(backer->enable_bpf_action); + return error; } @@ -1110,7 +1119,19 @@ check_ufid(struct dpif_backer *backer) return enable_ufid; } +/* Tests whether 'dpif' supports eBPF base flow actions. + * + * Returns true if 'dpif' supports eBPF based actions. + */ +static bool +check_bpf_actions( struct dpif_backer *backer OVS_UNUSED) +{ +/* XXX */ +return true; +} + /* Tests whether 'backer''s datapath supports variable-length + * * OVS_USERSPACE_ATTR_USERDATA in OVS_ACTION_ATTR_USERSPACE actions. We need * to disable some features on older datapaths that don't support this * feature. -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH 1/2] XenServer: PEP8 Cleanup for openvswitch-cfg-update
Signed-off-by: Jason Kölker --- .../etc_xapi.d_plugins_openvswitch-cfg-update | 147 +++-- 1 file changed, 80 insertions(+), 67 deletions(-) diff --git a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update index 2df838a..aeaa1e7 100755 --- a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update +++ b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update @@ -1,7 +1,7 @@ #!/usr/bin/env python # # xapi plugin script to update the cache of configuration items in the -# ovs-vswitchd configuration that are managed in the xapi database when +# ovs-vswitchd configuration that are managed in the xapi database when # integrated with Citrix management tools. # Copyright (C) 2009, 2010, 2011, 2012, 2013 Nicira, Inc. @@ -22,15 +22,15 @@ # TBD: the system in a bad state if anything goes wrong. import XenAPIPlugin -import XenAPI import os import subprocess import syslog import re -vsctl="/usr/bin/ovs-vsctl" -ofctl="/usr/bin/ovs-ofctl" -cacert_filename="/etc/openvswitch/vswitchd.cacert" +vsctl = '/usr/bin/ovs-vsctl' +ofctl = '/usr/bin/ovs-ofctl' +cacert_filename = '/etc/openvswitch/vswitchd.cacert' + # Delete the CA certificate, so that we go back to boot-strapping mode def delete_cacert(): @@ -40,35 +40,36 @@ def delete_cacert(): # Ignore error if file doesn't exist pass + def update(session, args): # Refresh bridge network UUIDs in case this host joined or left a pool. -script = "/opt/xensource/libexec/interface-reconfigure" +script = '/opt/xensource/libexec/interface-reconfigure' try: -retval = subprocess.call([script, "rewrite"]) +retval = subprocess.call([script, 'rewrite']) if retval != 0: -syslog.syslog("%s exited with status %d" % (script, retval)) +syslog.syslog('%s exited with status %d' % (script, retval)) except OSError, e: -syslog.syslog("%s: failed to execute (%s)" % (script, e.strerror)) +syslog.syslog('%s: failed to execute (%s)' % (script, e.strerror)) pools = session.xenapi.pool.get_all() # We assume there is only ever one pool... if len(pools) == 0: -raise XenAPIPlugin.Failure("NO_POOL_FOR_HOST", []) +raise XenAPIPlugin.Failure('NO_POOL_FOR_HOST', []) if len(pools) > 1: -raise XenAPIPlugin.Failure("MORE_THAN_ONE_POOL_FOR_HOST", []) +raise XenAPIPlugin.Failure('MORE_THAN_ONE_POOL_FOR_HOST', []) new_controller = False pool = session.xenapi.pool.get_record(pools[0]) -controller = pool.get("vswitch_controller", "") -ret_str = "" +controller = pool.get('vswitch_controller', '') +ret_str = '' currentController = vswitchCurrentController() -if controller == "" and currentController != "": +if controller == '' and currentController != '': delete_cacert() try: emergency_reset(session, None) except: pass removeControllerCfg() -ret_str += "Successfully removed controller config. " +ret_str += 'Successfully removed controller config. ' elif controller != currentController: delete_cacert() try: @@ -77,10 +78,10 @@ def update(session, args): pass setControllerCfg(controller) new_controller = True -ret_str += "Successfully set controller to %s. " % controller +ret_str += 'Successfully set controller to %s. ' % controller try: -pool_fail_mode = pool["other_config"]["vswitch-controller-fail-mode"] +pool_fail_mode = pool['other_config']['vswitch-controller-fail-mode'] except KeyError, e: pool_fail_mode = None @@ -99,7 +100,8 @@ def update(session, args): host_mgmt_device = None pool_mgmt_macs = {} if new_controller: -recs = session.xenapi.PIF.get_all_records_where('field "management"="true"') +query = 'field "management"="true"' +recs = session.xenapi.PIF.get_all_records_where(query) for rec in recs.itervalues(): pool_mgmt_macs[rec.get('MAC')] = rec.get('device') @@ -136,11 +138,12 @@ def update(session, args): dib_changed = True # Change bridge fail_mode if XAPI state differs from OVS state. -bridge_fail_mode = vswitchCfgQuery(["get", "Bridge", -bridge, "fail_mode"]).strip('[]"') +bridge_fail_mode = vswitchCfgQuery(['get', 'Bridge', +bridge, 'fail_mode']).strip('[]"') try: -fail_mode = bton[bridge]["other_config"]["vswitch-controller-fail-mode"] +other_config = bton[bridge]['other_config'] +fail_mode = other_config['vswitch-controller-fail-mode'] except KeyError, e: fail_mode = None @@ -152,98 +155,108 @@ def update(session, args): if bridge_fail_mode != fail_mode: vswitchCfgMod(['--',
[ovs-dev] [RFC: add openvswitch actions using BPF 9/9] ofproto-dpif-xlate: generate BPF output action (Hack)
This is a hack in xlate code to always generate both BPF output and the original output action. Since the currnet BPF output 'action' simply generate a kenrel log message about the output port number. With this patch, Every time datapath output a packet, it also generate log about the port number in the kernel log. Signed-off-by: Andy Zhou --- ofproto/ofproto-dpif-xlate.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 0786513..41a7502 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -15,6 +15,7 @@ #include #include "ofproto/ofproto-dpif-xlate.h" +#include "ofproto/ofproto-dpif-bpf.h" #include #include @@ -2833,14 +2834,27 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, OVS_ACTION_ATTR_TUNNEL_POP, odp_tnl_port); } else { +int fd; +struct ovs_action_bpf_prog bpf_output; + +fd = ofproto_dpif_bpf_lookup("ovs/output"); + +if (fd > 0) { +bpf_output.prog_fd = htonl(fd); +bpf_output.arg0 = htonl(out_port); +bpf_output.arg1 = htonl(0); +nl_msg_push_unspec(ctx->xout->odp_actions, + OVS_ACTION_ATTR_BPF_PROG, + &bpf_output, sizeof(bpf_output)); +} /* Tunnel push-pop action is not compatible with * IPFIX action. */ add_ipfix_output_action(ctx, out_port); nl_msg_put_odp_port(ctx->xout->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port); - } - } +} +} } ctx->sflow_odp_port = odp_port; -- 1.9.1 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] netdev-dpdk: Allow changing NON_PMD_CORE_ID for testing purpose.
Hey Daniele, Do you still remember why you mentioned: "/* We have to use 0 to allow non pmd threads to perform certain DPDK * operations, like rte_eth_dev_configure(). */ " in your commit: db73f716 (netdev-dpdk: Fix race condition with DPDK mempools in non pmd threads) This posted commit works during my manual test. And the rte_eth_dev_configure() in dpdk-1.7.1 does not require the caller lcore id to be 0. But Pravin mentioned that you may know more about why use 0 for non-pmd threads (to prevent crash?). Could you share some thoughts? Thanks, Alex Wang, On Wed, Feb 4, 2015 at 2:14 PM, Pravin Shelar wrote: > On Tue, Feb 3, 2015 at 5:54 PM, Alex Wang wrote: > > For testing purpose, developers may want to change the NON_PMD_CORE_ID > > and use a different core for non-pmd threads. Since the netdev-dpdk > > module is hard-coded to assert the non-pmd threads using core 0, such > > change will cause abortion of OVS. > > > > This commit fixes the assertion and allows changing NON_PMD_CORE_ID. > > > > Signed-off-by: Alex Wang > > --- > > lib/dpctl.c |2 +- > > lib/dpif-netdev.h |1 - > > lib/netdev-dpdk.c | 12 ++-- > > lib/netdev-dpdk.h |2 ++ > > 4 files changed, 9 insertions(+), 8 deletions(-) > > > > diff --git a/lib/dpctl.c b/lib/dpctl.c > > index 4c2614b..125023c 100644 > > --- a/lib/dpctl.c > > +++ b/lib/dpctl.c > > @@ -31,11 +31,11 @@ > > #include "dirs.h" > > #include "dpctl.h" > > #include "dpif.h" > > -#include "dpif-netdev.h" > > #include "dynamic-string.h" > > #include "flow.h" > > #include "match.h" > > #include "netdev.h" > > +#include "netdev-dpdk.h" > > #include "netlink.h" > > #include "odp-util.h" > > #include "ofp-parse.h" > > diff --git a/lib/dpif-netdev.h b/lib/dpif-netdev.h > > index d811507..410fcfa 100644 > > --- a/lib/dpif-netdev.h > > +++ b/lib/dpif-netdev.h > > @@ -42,7 +42,6 @@ static inline void dp_packet_pad(struct ofpbuf *b) > > > > #define NR_QUEUE 1 > > #define NR_PMD_THREADS 1 > > -#define NON_PMD_CORE_ID 0 > > > > #ifdef __cplusplus > > } > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > > index 0ede200..391695f 100644 > > --- a/lib/netdev-dpdk.c > > +++ b/lib/netdev-dpdk.c > > @@ -1553,8 +1553,8 @@ pmd_thread_setaffinity_cpu(int cpu) > > VLOG_ERR("Thread affinity error %d",err); > > return err; > > } > > -/* lcore_id 0 is reseved for use by non pmd threads. */ > > -ovs_assert(cpu); > > +/* NON_PMD_CORE_ID is reserved for use by non pmd threads. */ > > +ovs_assert(cpu != NON_PMD_CORE_ID); > > > RTE_PER_LCORE(_lcore_id) = cpu; > > > > return 0; > > @@ -1563,13 +1563,13 @@ pmd_thread_setaffinity_cpu(int cpu) > > void > > thread_set_nonpmd(void) > > { > > -/* We have to use 0 to allow non pmd threads to perform certain DPDK > > - * operations, like rte_eth_dev_configure(). */ > > -RTE_PER_LCORE(_lcore_id) = 0; > > +/* We have to use NON_PMD_CORE_ID to allow non-pmd threads to > perform > > + * certain DPDK operations, like rte_eth_dev_configure(). */ > > > The is not equivalent comment. have you confirmed that > rte_eth_dev_configure() works on any core? > > > +RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID; > > } > > > > static bool > > thread_is_pmd(void) > > { > > -return rte_lcore_id() != 0; > > +return rte_lcore_id() != NON_PMD_CORE_ID; > > } > > diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h > > index c24d6da..9a47165 100644 > > --- a/lib/netdev-dpdk.h > > +++ b/lib/netdev-dpdk.h > > @@ -5,6 +5,8 @@ > > > > struct dpif_packet; > > > > +#define NON_PMD_CORE_ID 0 > > + > > #ifdef DPDK_NETDEV > > > > #include > > -- > > 1.7.9.5 > > > > ___ > > dev mailing list > > dev@openvswitch.org > > http://openvswitch.org/mailman/listinfo/dev > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/6] datapath: Support VXLAN Group Policy extension
On Wed, Feb 4, 2015 at 7:45 AM, Thomas Graf wrote: > Upstream commit: > openvswitch: Support VXLAN Group Policy extension > > Introduces support for the group policy extension to the VXLAN virtual > port. The extension is disabled by default and only enabled if the user > has provided the respective configuration. > > ovs-vsctl add-port br0 vxlan0 -- \ > set Interface vxlan0 type=vxlan options:exts=gbp > > The configuration interface to enable the extension is based on a new > attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION > which can carry additional extensions as needed in the future. > > The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) > internally just like Geneve options but transported as nested Netlink > attributes to user space. > > Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the > binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. > > The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing > OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. > > Signed-off-by: Thomas Graf > Signed-off-by: David S. Miller > > Upstream: 1dd144 ("openvswitch: Support VXLAN Group Policy extension") > Signed-off-by: Thomas Graf > --- > datapath/Modules.mk | 3 +- > datapath/flow_netlink.c | 114 > +++--- > datapath/linux/compat/include/linux/openvswitch.h | 10 ++ > datapath/linux/compat/include/net/ip_tunnels.h| 16 ++- > datapath/vport-geneve.c | 6 +- > datapath/vport-vxlan.c| 82 +++- > datapath/vport-vxlan.h| 11 +++ > 7 files changed, 223 insertions(+), 19 deletions(-) > create mode 100644 datapath/vport-vxlan.h > ... > + > + flags = TUNNEL_KEY; > + vxlan_port = vxlan_vport(vport); > + if (vxlan_port->exts & VXLAN_F_GBP && md->gbp) > + flags |= TUNNEL_VXLAN_OPT; > why not just check for md->gpb? I do not see value in storing vxlan_port->ext. This is redundant information, we can just look at vxlan_sock->flags in vxlan_get_options() to generate vxlan port configuration. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v2] net: openvswitch: Support masked set actions.
OVS userspace already probes the openvswitch kernel module for OVS_ACTION_ATTR_SET_MASKED support. This patch adds the kernel module implementation of masked set actions. The existing set action sets many fields at once. When only a subset of the IP header fields, for example, should be modified, all the IP fields need to be exact matched so that the other field values can be copied to the set action. A masked set action allows modification of an arbitrary subset of the supported header bits without requiring the rest to be matched. Masked set action is now supported for all writeable key types, except for the tunnel key. The set tunnel action is an exception as any input tunnel info is cleared before action processing starts, so there is no tunnel info to mask. The kernel module converts all (non-tunnel) set actions to masked set actions. This makes action processing more uniform, and results in less branching and duplicating the action processing code. When returning actions to userspace, the set actions that were converted to masked set actions are converted back to normal set actions. We use a kernel internal action code to be able to tell the userspace provided and converted masked set actions apart. Signed-off-by: Jarno Rajahalme --- v2: Fixed checkpatch warnigns and errors, rebase. include/uapi/linux/openvswitch.h | 22 ++- net/openvswitch/actions.c| 373 -- net/openvswitch/flow_netlink.c | 165 ++--- 3 files changed, 400 insertions(+), 160 deletions(-) diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index 7a8785a..bbd49a0 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -599,6 +599,12 @@ struct ovs_action_hash { * @OVS_ACTION_ATTR_SET: Replaces the contents of an existing header. The * single nested %OVS_KEY_ATTR_* attribute specifies a header to modify and its * value. + * @OVS_ACTION_ATTR_SET_MASKED: Replaces the contents of an existing header. A + * nested %OVS_KEY_ATTR_* attribute specifies a header to modify, its value, + * and a mask. For every bit set in the mask, the corresponding bit value + * is copied from the value to the packet header field, rest of the bits are + * left unchanged. The non-masked value bits must be passed in as zeroes. + * Masking is not supported for the %OVS_KEY_ATTR_TUNNEL attribute. * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q header onto the * packet. * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q header off the packet. @@ -617,6 +623,9 @@ struct ovs_action_hash { * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all * fields within a header are modifiable, e.g. the IPv4 protocol and fragment * type may not be changed. + * + * @OVS_ACTION_ATTR_SET_TO_MASKED: Kernel internal masked set action translated + * from the @OVS_ACTION_ATTR_SET. */ enum ovs_action_attr { @@ -631,8 +640,19 @@ enum ovs_action_attr { OVS_ACTION_ATTR_HASH, /* struct ovs_action_hash. */ OVS_ACTION_ATTR_PUSH_MPLS,/* struct ovs_action_push_mpls. */ OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */ + OVS_ACTION_ATTR_SET_MASKED, /* One nested OVS_KEY_ATTR_* including + * data immediately followed by a mask. + * The data must be zero for the unmasked + * bits. */ + + __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted + * from userspace. */ - __OVS_ACTION_ATTR_MAX +#ifdef __KERNEL__ + OVS_ACTION_ATTR_SET_TO_MASKED, /* Kernel module internal masked + * set action converted from + * OVS_ACTION_ATTR_SET. */ +#endif }; #define OVS_ACTION_ATTR_MAX (__OVS_ACTION_ATTR_MAX - 1) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index b4cffe6..b491c1c 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -185,10 +185,15 @@ static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key, return 0; } -static int set_mpls(struct sk_buff *skb, struct sw_flow_key *key, - const __be32 *mpls_lse) +/* 'KEY' must not have any bits set outside of the 'MASK' */ +#define MASKED(OLD, KEY, MASK) ((KEY) | ((OLD) & ~(MASK))) +#define SET_MASKED(OLD, KEY, MASK) ((OLD) = MASKED(OLD, KEY, MASK)) + +static int set_mpls(struct sk_buff *skb, struct sw_flow_key *flow_key, + const __be32 *mpls_lse, const __be32 *mask) { __be32 *stack; + __be32 lse; int err; err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN); @@ -196,14 +201,16 @@ static int set_mpls(struct sk_buff *skb, struct sw_flow_key *key, return err; stack = (__be32 *)skb_mpls_header(skb); + lse = M
Re: [ovs-dev] [PATCH v2] datapath-windows: accommodate to UFID changes
Hey Alin, Thanks for the patch. Please find my comments inline. I have not worked on this code for sometime, so may miss something here. Can you please also let us know the unit testing that you did? Although change is minor, but editing FLOW_KEY* related code may end up breaking some functionality. Thanks. Regards, Ankur From: dev on behalf of Alin Serdean Sent: Wednesday, February 4, 2015 6:19 AM To: dev@openvswitch.org Subject: [ovs-dev] [PATCH v2] datapath-windows: accommodate to UFID changes Current flow commands: new, set, get, del need to respond with a NETLINK error in the case OVS_FLOW_ATTR_KEY is missing. OVS_FLOW_ATTR_KEY is now an optional attribute. Also add OVS_FLOW_ATTR_UFID attribute to the kernel for further use. v2 Remove redundant return code Acked-by: Eitan Eliahu Signed-off-by: Alin Gabriel Serdean --- datapath-windows/ovsext/Flow.c | 25 ++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c index d3de8cc..c5954d8 100644 --- a/datapath-windows/ovsext/Flow.c +++ b/datapath-windows/ovsext/Flow.c @@ -95,7 +95,7 @@ static NTSTATUS OvsDoDumpFlows(OvsFlowDumpInput *dumpInput, /* For Parsing attributes in FLOW_* commands */ const NL_POLICY nlFlowPolicy[] = { -[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = FALSE}, +[OVS_FLOW_ATTR_KEY] = {.type = NL_A_NESTED, .optional = TRUE}, [ANKUR]: Will it not affect the flush case, where we will be getting a DEL operation without FLOW_KEY? [OVS_FLOW_ATTR_MASK] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_ACTIONS] = {.type = NL_A_NESTED, .optional = TRUE}, [OVS_FLOW_ATTR_STATS] = {.type = NL_A_UNSPEC, @@ -103,7 +103,9 @@ const NL_POLICY nlFlowPolicy[] = { .maxLen = sizeof(struct ovs_flow_stats), .optional = TRUE}, [OVS_FLOW_ATTR_TCP_FLAGS] = {NL_A_U8, .optional = TRUE}, -[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE} +[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE}, +[OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = TRUE, + .minLen = sizeof(OvsU128) } }; [ANKUR]: As nithin correctly, mentioned what we are trying to do with OVS_FLOW_ATTR_UFID is not clear? Do we not want to support it or do we want it to be a NO OP? If we want it to be a NO OP then is it going to create some functionality issue (since we'll end up returning success to user space while kernel did not do anything) /* For Parsing nested OVS_FLOW_ATTR_KEY attributes. @@ -308,6 +310,11 @@ OvsFlowNlCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + [ANKUR]: Thanks for adding the check. As nithin mentioned, please confirm that this error code is handled in userspace. if ((rc = _MapNlToFlowPut(msgIn, nlAttrs[OVS_FLOW_ATTR_KEY], nlAttrs[OVS_FLOW_ATTR_ACTIONS], nlAttrs[OVS_FLOW_ATTR_CLEAR], &mappedFlow)) @@ -420,7 +427,7 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, PNL_MSG_HDR nlMsgOutHdr = NULL; UINT32 attrOffset = NLMSG_HDRLEN + GENL_HDRLEN + OVS_HDRLEN; PNL_ATTR nlAttrs[__OVS_FLOW_ATTR_MAX]; - +NL_ERROR nlError = NL_ERROR_SUCCESS; OvsFlowGetInput getInput; OvsFlowGetOutput getOutput; NL_BUFFER nlBuf; @@ -455,6 +462,11 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, goto done; } +if (nlAttrs[OVS_FLOW_ATTR_KEY] == NULL) { +nlError = NL_ERROR_INVAL; +goto done; +} + [ANKUR]: Thanks for adding this. Same comment as previous check. keyAttrOffset = (UINT32)((PCHAR) nlAttrs[OVS_FLOW_ATTR_KEY] - (PCHAR)nlMsgHdr); @@ -532,6 +544,13 @@ _FlowNlGetCmdHandler(POVS_USER_PARAMS_CONTEXT usrParamsCtx, *replyLen += NlMsgSize(nlMsgOutHdr); done: +if (nlError != NL_ERROR_SUCCESS) { +POVS_MESSAGE_ERROR msgError = (POVS_MESSAGE_ERROR) +usrParamsCtx->outputBuffer; +NlBuildErrorMsg(msgIn, msgError, nlError); +*replyLen = msgError->nlMsg.nlmsgLen; +} + return rc; [ANKUR]: Caller of this function is handling the error case. Is there a reason that you added it here. My bad if i am missing something. } -- 1.9.5.msysgit.0 ___ dev mailing list dev@openvswitch.org https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_mailman_listinfo_dev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=RDZsrBxhAiOewDSD-jiF-W03FqHevF2o7haW6eQzmtA&m=p6keu4ALm2BI9PBvGYFt7d_QZOjcdyd7DGsma9v2CjE&s=ii5o_QQckVciSbMVP6CrD8smqvKA7zJZJgrOehuCR6Q&e= ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ofproto: Initialize tunnel module earlier.
Thanks for review, merged with master. Jarno On Feb 4, 2015, at 1:47 PM, Ben Pfaff wrote: > On Wed, Feb 04, 2015 at 12:59:04PM -0800, Jarno Rajahalme wrote: >> The tunnel module can get called by the handler threads right after >> they are started, so we need to call ofproto_tunnel_init() before >> opening the backer. >> >> Late initialization caused a spurious ovs-vswitchd handler thread >> crash on start-up due to the tunnel module fat_rwlock not being >> initialized yet. >> >> Signed-off-by: Jarno Rajahalme > > Acked-by: Ben Pfaff ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] netdev-dpdk: Allow changing NON_PMD_CORE_ID for testing purpose.
Hey Alex, I remember that calling certain DPDK functions (that we use to initialize devices) with '_lcore_id !=0' resulted in an error. If this is not the case anymore with DPDK 1.7.1 (i.e. if you tested this patch with NON_PMD_CORE_ID!=0 and it worked), then I have no objection. I would test this myself, but I don't have access to a DPDK capable system right now. If you want to be more sure about this you can build DPDK with CONFIG_RTE_LIBRTE_MBUF_DEBUG=y and watch for failed assertions. Last thing: we could also avoid using the _lcore_id to set the affinity of a thread (e.g. a thread with '_lcore_id == 1' doesn't necessarily need to be pinned to cpu 1). This would be another way to achieve the same goal, but I prefer your approach, because it is more consistent with DPDK internals. Hope this helps. Let me know if there's anything else about this Thanks, Daniele 2015-02-05 0:14 GMT+01:00 Alex Wang : > Hey Daniele, > > Do you still remember why you mentioned: > > "/* We have to use 0 to allow non pmd threads to perform certain DPDK > * operations, like rte_eth_dev_configure(). */ > " > in your commit: db73f716 (netdev-dpdk: Fix race condition with DPDK > mempools in non pmd threads) > > This posted commit works during my manual test. And the > rte_eth_dev_configure() in dpdk-1.7.1 does not require the caller lcore id > to > be 0. > > But Pravin mentioned that you may know more about why use 0 for non-pmd > threads (to prevent crash?). > > Could you share some thoughts? > > Thanks, > Alex Wang, > > On Wed, Feb 4, 2015 at 2:14 PM, Pravin Shelar wrote: >> >> On Tue, Feb 3, 2015 at 5:54 PM, Alex Wang wrote: >> > For testing purpose, developers may want to change the NON_PMD_CORE_ID >> > and use a different core for non-pmd threads. Since the netdev-dpdk >> > module is hard-coded to assert the non-pmd threads using core 0, such >> > change will cause abortion of OVS. >> > >> > This commit fixes the assertion and allows changing NON_PMD_CORE_ID. >> > >> > Signed-off-by: Alex Wang >> > --- >> > lib/dpctl.c |2 +- >> > lib/dpif-netdev.h |1 - >> > lib/netdev-dpdk.c | 12 ++-- >> > lib/netdev-dpdk.h |2 ++ >> > 4 files changed, 9 insertions(+), 8 deletions(-) >> > >> > diff --git a/lib/dpctl.c b/lib/dpctl.c >> > index 4c2614b..125023c 100644 >> > --- a/lib/dpctl.c >> > +++ b/lib/dpctl.c >> > @@ -31,11 +31,11 @@ >> > #include "dirs.h" >> > #include "dpctl.h" >> > #include "dpif.h" >> > -#include "dpif-netdev.h" >> > #include "dynamic-string.h" >> > #include "flow.h" >> > #include "match.h" >> > #include "netdev.h" >> > +#include "netdev-dpdk.h" >> > #include "netlink.h" >> > #include "odp-util.h" >> > #include "ofp-parse.h" >> > diff --git a/lib/dpif-netdev.h b/lib/dpif-netdev.h >> > index d811507..410fcfa 100644 >> > --- a/lib/dpif-netdev.h >> > +++ b/lib/dpif-netdev.h >> > @@ -42,7 +42,6 @@ static inline void dp_packet_pad(struct ofpbuf *b) >> > >> > #define NR_QUEUE 1 >> > #define NR_PMD_THREADS 1 >> > -#define NON_PMD_CORE_ID 0 >> > >> > #ifdef __cplusplus >> > } >> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >> > index 0ede200..391695f 100644 >> > --- a/lib/netdev-dpdk.c >> > +++ b/lib/netdev-dpdk.c >> > @@ -1553,8 +1553,8 @@ pmd_thread_setaffinity_cpu(int cpu) >> > VLOG_ERR("Thread affinity error %d",err); >> > return err; >> > } >> > -/* lcore_id 0 is reseved for use by non pmd threads. */ >> > -ovs_assert(cpu); >> > +/* NON_PMD_CORE_ID is reserved for use by non pmd threads. */ >> > +ovs_assert(cpu != NON_PMD_CORE_ID); >> >> > RTE_PER_LCORE(_lcore_id) = cpu; >> > >> > return 0; >> > @@ -1563,13 +1563,13 @@ pmd_thread_setaffinity_cpu(int cpu) >> > void >> > thread_set_nonpmd(void) >> > { >> > -/* We have to use 0 to allow non pmd threads to perform certain >> > DPDK >> > - * operations, like rte_eth_dev_configure(). */ >> > -RTE_PER_LCORE(_lcore_id) = 0; >> > +/* We have to use NON_PMD_CORE_ID to allow non-pmd threads to >> > perform >> > + * certain DPDK operations, like rte_eth_dev_configure(). */ >> >> >> The is not equivalent comment. have you confirmed that >> rte_eth_dev_configure() works on any core? >> >> > +RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID; >> > } >> > >> > static bool >> > thread_is_pmd(void) >> > { >> > -return rte_lcore_id() != 0; >> > +return rte_lcore_id() != NON_PMD_CORE_ID; >> > } >> > diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h >> > index c24d6da..9a47165 100644 >> > --- a/lib/netdev-dpdk.h >> > +++ b/lib/netdev-dpdk.h >> > @@ -5,6 +5,8 @@ >> > >> > struct dpif_packet; >> > >> > +#define NON_PMD_CORE_ID 0 >> > + >> > #ifdef DPDK_NETDEV >> > >> > #include >> > -- >> > 1.7.9.5 >> > >> > ___ >> > dev mailing list >> > dev@openvswitch.org >> > http://openvswitch.org/mailman/listinfo/dev > > ___ dev mailing list dev@openvswitch.org http:
Re: [ovs-dev] [PATCH v2] net: openvswitch: Support masked set actions.
On Wed, Feb 4, 2015 at 4:16 PM, Jarno Rajahalme wrote: > OVS userspace already probes the openvswitch kernel module for > OVS_ACTION_ATTR_SET_MASKED support. This patch adds the kernel module > implementation of masked set actions. > > The existing set action sets many fields at once. When only a subset > of the IP header fields, for example, should be modified, all the IP > fields need to be exact matched so that the other field values can be > copied to the set action. A masked set action allows modification of > an arbitrary subset of the supported header bits without requiring the > rest to be matched. > > Masked set action is now supported for all writeable key types, except > for the tunnel key. The set tunnel action is an exception as any > input tunnel info is cleared before action processing starts, so there > is no tunnel info to mask. > > The kernel module converts all (non-tunnel) set actions to masked set > actions. This makes action processing more uniform, and results in > less branching and duplicating the action processing code. When > returning actions to userspace, the set actions that were converted to > masked set actions are converted back to normal set actions. We use a > kernel internal action code to be able to tell the userspace provided > and converted masked set actions apart. > > Signed-off-by: Jarno Rajahalme > --- > v2: Fixed checkpatch warnigns and errors, rebase. > Looks good. Acked-by: Pravin B Shelar ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2] net: openvswitch: Support masked set actions.
On Wed, Feb 4, 2015 at 4:16 PM, Jarno Rajahalme wrote: > OVS userspace already probes the openvswitch kernel module for > OVS_ACTION_ATTR_SET_MASKED support. This patch adds the kernel module > implementation of masked set actions. > > The existing set action sets many fields at once. When only a subset > of the IP header fields, for example, should be modified, all the IP > fields need to be exact matched so that the other field values can be > copied to the set action. A masked set action allows modification of > an arbitrary subset of the supported header bits without requiring the > rest to be matched. > > Masked set action is now supported for all writeable key types, except > for the tunnel key. The set tunnel action is an exception as any > input tunnel info is cleared before action processing starts, so there > is no tunnel info to mask. > > The kernel module converts all (non-tunnel) set actions to masked set > actions. This makes action processing more uniform, and results in > less branching and duplicating the action processing code. When > returning actions to userspace, the set actions that were converted to > masked set actions are converted back to normal set actions. We use a > kernel internal action code to be able to tell the userspace provided > and converted masked set actions apart. > > Signed-off-by: Jarno Rajahalme > --- > v2: Fixed checkpatch warnigns and errors, rebase. I had a few minor comments in my review from December 10th, can you take a look? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH V2 2/2] XenServer: Don't reset on xe-toolstack-restart
With XenServer only 1 manager is configured in the pool, which may not be the first manager returned from `get-manager` as it returns in lexicographical order. V2: Fixes vswitchCurrentControllers() to always return a list Signed-off-by: Jason Kölker --- .../etc_xapi.d_plugins_openvswitch-cfg-update | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update index aeaa1e7..1bceccf 100755 --- a/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update +++ b/xenserver/etc_xapi.d_plugins_openvswitch-cfg-update @@ -59,10 +59,11 @@ def update(session, args): raise XenAPIPlugin.Failure('MORE_THAN_ONE_POOL_FOR_HOST', []) new_controller = False pool = session.xenapi.pool.get_record(pools[0]) -controller = pool.get('vswitch_controller', '') +controller = pool.get('vswitch_controller') ret_str = '' -currentController = vswitchCurrentController() -if controller == '' and currentController != '': +currentControllers = vswitchCurrentControllers() + +if not controller and currentControllers: delete_cacert() try: emergency_reset(session, None) @@ -70,7 +71,7 @@ def update(session, args): pass removeControllerCfg() ret_str += 'Successfully removed controller config. ' -elif controller != currentController: +elif controller not in currentControllers: delete_cacert() try: emergency_reset(session, None) @@ -194,14 +195,11 @@ def update(session, args): return 'No change to configuration' -def vswitchCurrentController(): -controller = vswitchCfgQuery(['get-manager']) -if controller == '': -return controller -if len(controller) < 4 or controller[0:4] != 'ssl:': -return controller -else: -return controller.split(':')[1] +def vswitchCurrentControllers(): +controllers = vswitchCfgQuery(['get-manager']) +return [controller.lstrip('ssl:').split(':')[0] +for controller in controllers.split('\n') +if controller] def removeControllerCfg(): -- 2.1.0 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2] datapath-windows: accommodate to UFID changes
On 4 February 2015 at 16:31, Ankur Sharma wrote: > > @@ -103,7 +103,9 @@ const NL_POLICY nlFlowPolicy[] = { > .maxLen = sizeof(struct ovs_flow_stats), > .optional = TRUE}, > [OVS_FLOW_ATTR_TCP_FLAGS] = {NL_A_U8, .optional = TRUE}, > -[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE} > +[OVS_FLOW_ATTR_USED] = {NL_A_U64, .optional = TRUE}, > +[OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = TRUE, > + .minLen = sizeof(OvsU128) } > }; > > [ANKUR]: As nithin correctly, mentioned what we are trying to do with > OVS_FLOW_ATTR_UFID is not clear? Do we not want to support it or do we want > it to be a NO OP? If we want it to be a NO OP then is it going to create some > functionality issue (since we'll end up returning success to user space while > kernel did not do anything) See check_ufid(), dpif_feature_probe() in userspace. If the kernel silently ignores the UFID attribute during the feature probe, then userspace should detect that and disable the UFID feature. After that, no UFID should be sent to the kernel. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] RFC: Geneve OpenFlow support.
On Fri, Jan 23, 2015 at 10:10 AM, Madhu Challa wrote: > This commit adds support for Geneve flow mods via two nicira extensions. > NXM_NX_TUN_METADATA for being able to match on tun_metadata nxms and > NXAST_RAW_TUN_METADATA for tun_metadata actions. This is great, thanks! > The other side affect of keeping a table is that when flows are deleted > we cannot necesarily remove and resuse the holes in the option space without > additional complexity. I have not yet figured out how to maintain a reference > count of the flows that are using a particular entry. Hmm, this is both nasty and important. I don't a good idea on how to solve this at the moment, hopefully someone else does... > ovs-ofctl is expanded to be able to pass multiple tun_metadata options. > > example: > ovs-ofctl add-flow br0 > "tun_id=0x32,tun_src=12.1.1.5,tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,in_port=1 > actions=output:2" My guess is that we probably need some additional parsing/formatting support in order to make this usable to humans. Parsing is definitely useful for small scale testing (i.e. so we can write a flow that matches on class=0xa, type=0xb, value=0xc without breaking out a calculator). However, probably even more important is formatting because while these flows can obviously be generated by a controller without too much issue, they will be really hard to debug. > Actions > === > A new action tun_metadata is added. Multiple of these actions can appear in > a flowmod. The actions consult the same table to figure out mappings. The biggest concern that I have with this patch is how to handle actions. I think it's probably important to be able to load non-static data into a field - something that is derived from a register or protocol field. Do you have any ideas how this might be possible? I'm not sure that I fully understand the reason that actions need to consult the table since there's no lookup involved. I vaguely remember discussing sorting them to avoid different orders but a table seems a little bit like an overkill. Can you describe it a little more? Can you also update the man pages and add some unit tests? I hope that Ben or Jarno will comment on the metaflow and NXM parts when the time is right, so I mostly focused my comments on the ODP and tun metadata areas. > diff --git a/lib/meta-flow.h b/lib/meta-flow.h > index 4a6c443..4646650 100644 > --- a/lib/meta-flow.h > +++ b/lib/meta-flow.h > @@ -1377,6 +1377,27 @@ enum OVS_PACKED_ENUM mf_field_id { > */ > MFF_ND_TLL, > > +/* > + * "tun_metadata". > + * > + * Encap metadata for tunnels. > + * > + * Each NXM can carry upto 255 bytes of encap metadata when not masked or > + * upto 127 bytes of encap metadata followed by equal length mask when > + * masked. It seems vaguely weird to allow different lengths depending on whether there is a mask or not. I know that this is due to NXM encoding limitations and not something that we're explicitly enforcing but it still seems a little odd. > + * Type: bytestring. > + * Maskable: bitwise. > + * Formatting: tun_metadata. > + * Prerequisites: none. > + * Access: read/write. My guess is that we probably can't usefully write to the field at moment (I think). > diff --git a/lib/odp-util.c b/lib/odp-util.c > index 37d73a9..37e2ec3 100644 > --- a/lib/odp-util.c > +++ b/lib/odp-util.c > #define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type)) > -static int > +static int OVS_UNUSED > parse_geneve_opts(const struct nlattr *attr) If parse_geneve_opts() isn't actually useful, we can just dump it. It was just a placeholder, really. > @@ -1354,12 +1355,11 @@ odp_tun_key_from_attr(const struct nlattr *attr, > struct flow_tnl *tun) > tun->flags |= FLOW_TNL_F_OAM; > break; > case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: { > -if (parse_geneve_opts(a)) { > -return ODP_FIT_ERROR; > -} > /* It is necessary to reproduce options exactly (including order) > * so it's easiest to just echo them back. */ This comment is referring to setting unknown = true, so we can remove it. In the current OVS implementation, everything is echoed back (this wasn't true before). > diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c > index e694fd9..70ecf92 100644 > --- a/lib/ofp-actions.c > +++ b/lib/ofp-actions.c > +/* Action structure for NXAST_TUN_METADATA. */ > +struct nx_action_tun_metadata { > +ovs_be16 type; > +ovs_be16 len; > +ovs_be32 vendor; > +ovs_be16 subtype; > +uint8_t data_len; > +uint8_t zero[4]; > +uint8_t data[256]; > +}; Do we need the separate data_len field? Isn't it implied by the earlier len? > diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c > new file mode 100644 > index 000..3748bdf > --- /dev/null > +++ b/lib/tun-metadata.c > +/* callers responsibility to verify entry->len is same as l