Re: [ovs-dev] [PATCH 3/7] Restore all flow changes by compose_output_action__().

2013-04-23 Thread Rajahalme, Jarno (NSN - FI/Espoo)
On Apr 22, 2013, at 19:16 , ext Ben Pfaff wrote:

> On Mon, Apr 22, 2013 at 09:12:23AM -0700, Ben Pfaff wrote:
>> On Thu, Apr 18, 2013 at 06:07:41PM +0300, Jarno Rajahalme wrote:
>>> This makes sure that output actions leave no changes to any flow fields,
>>> while all explicit set_field actions are retained actross output actions.
>>> 
>>> Signed-off-by: Jarno Rajahalme 
>> 
>> Queued up to apply, thanks.
> 
> I applied these first three commits to master.  I need some further time
> to study the rest.

Thanks, Ben :-)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2.26] datapath: Add basic MPLS support to kernel

2013-04-23 Thread Rajahalme, Jarno (NSN - FI/Espoo)

On Apr 23, 2013, at 4:51 , ext Simon Horman wrote:

> On Mon, Apr 22, 2013 at 01:55:43PM +, Rajahalme, Jarno (NSN - FI/Espoo) 
> wrote:
>> 
>> Here the skb_network_header is changed to point to the L3 header. Is it
>> significant that in some cases (?) mpls_stack_depth may remain at zero,
>> even when a MPLS header was in fact added? (See above).
> 
> With the current code I believe there are the following cases:
> 
> Input: non-MPLS skb: Output: network header and mac_len correspond to the
> beginning of the L3 headers
> Input: MPLS: Output: network header and mac_len correspond to the
> end of the L2 headers.
> 
> In the case of MPLS output the end of the L2 headers and the beginning
> of the L3 headers will differ.
> 
> 
> As far as I know the network header and mac_len only need to correspond to
> the beginning of the L3 header if GSO segmentation will occur (actually,
> some proposed changes to the network stack are required, see "[PATCH 0/2]
> Small Modifications to GSO to allow segmentation of MPLS").  That only
> occurs if the skb is GSO. Which in turn can only occur if the recieved
> packet is non-MPLS. This is because the linux kernel doesn't support
> MPLS offloads on receive (or anywhere else for that matter).
> 
> In the case that we have a non-MPLS skb the stack depth starts at zero and
> is tracked. This is used to update the network header and mac_len.
> Otherwise the stack depth is unknown and the network header and mac_len are
> left as-is, corresponding to the end of the L2 headers.
> 
> Actually, it is possible to tighten up the if clause to be the following,
> as it is only necessary to update the network header and mac_len for GSO skbs.
> 
>   if (mpls_stack_depth && skb_is_gso(skb)) {
>   ...
>   }
> 
> It is possible for us to find and track the MPLS stack depth for all cases
> and to update the network header and mac_len. However I don't think that
> there is any run-time benefit and it seems expensive to find out what the
> original stack depth was - I believe it would require parsing the MPLS
> entire stack for each packet.
> 

Thanks for explaining this.

I think it would be better to keep updating the the network_header and mac_len 
for the Non-MPLS input packets regardless of the GSO status of the skb. It 
would be more consistent and less surprising.

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2.26] datapath: Add basic MPLS support to kernel

2013-04-23 Thread Simon Horman
On Tue, Apr 23, 2013 at 07:41:37AM +, Rajahalme, Jarno (NSN - FI/Espoo) 
wrote:
> 
> On Apr 23, 2013, at 4:51 , ext Simon Horman wrote:
> 
> > On Mon, Apr 22, 2013 at 01:55:43PM +, Rajahalme, Jarno (NSN - FI/Espoo) 
> > wrote:
> >> 
> >> Here the skb_network_header is changed to point to the L3 header. Is it
> >> significant that in some cases (?) mpls_stack_depth may remain at zero,
> >> even when a MPLS header was in fact added? (See above).
> > 
> > With the current code I believe there are the following cases:
> > 
> > Input: non-MPLS skb: Output: network header and mac_len correspond to the
> > beginning of the L3 headers
> > Input: MPLS: Output: network header and mac_len correspond to the
> > end of the L2 headers.
> > 
> > In the case of MPLS output the end of the L2 headers and the beginning
> > of the L3 headers will differ.
> > 
> > 
> > As far as I know the network header and mac_len only need to correspond to
> > the beginning of the L3 header if GSO segmentation will occur (actually,
> > some proposed changes to the network stack are required, see "[PATCH 0/2]
> > Small Modifications to GSO to allow segmentation of MPLS").  That only
> > occurs if the skb is GSO. Which in turn can only occur if the recieved
> > packet is non-MPLS. This is because the linux kernel doesn't support
> > MPLS offloads on receive (or anywhere else for that matter).
> > 
> > In the case that we have a non-MPLS skb the stack depth starts at zero and
> > is tracked. This is used to update the network header and mac_len.
> > Otherwise the stack depth is unknown and the network header and mac_len are
> > left as-is, corresponding to the end of the L2 headers.
> > 
> > Actually, it is possible to tighten up the if clause to be the following,
> > as it is only necessary to update the network header and mac_len for GSO 
> > skbs.
> > 
> > if (mpls_stack_depth && skb_is_gso(skb)) {
> > ...
> > }
> > 
> > It is possible for us to find and track the MPLS stack depth for all cases
> > and to update the network header and mac_len. However I don't think that
> > there is any run-time benefit and it seems expensive to find out what the
> > original stack depth was - I believe it would require parsing the MPLS
> > entire stack for each packet.
> > 
> 
> Thanks for explaining this.
> 
> I think it would be better to keep updating the the network_header and 
> mac_len for the Non-MPLS input packets regardless of the GSO status of the 
> skb. It would be more consistent and less surprising.

I agree entirely that it would be more consistent and less surprising.
But I'm not sure if the cost is worth it.

Jesse, do you have an opinion on this?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
Based on Waldi's RFC at
http://lists.xen.org/archives/html/xen-devel/2012-09/msg00943.html

To use it set vif.default.script="vif-openvswitch" in /etc/xen/xl.conf or use
script=vif-openvswitch in the vif configuration.

Appears to do the right thing for PV and HVM guests (including tap devices)
and with stubdomains.

In order to support VLAN tagging and trunking the "bridge" specified in the
configuration can have a special syntax, that is:

BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]

e.g.
- xenbr0.99
 add the VIF to VLAN99 on xenbr0
- xenbr0:99:100:101
 add the VIF to xenbr0 as a trunk port receiving VLANs 99, 100 & 101

Waldi, can you confirm if I have correctly reverse engineered the syntax from
the regexp please ;-)

Signed-off-by: Ian Campbell 
Signed-off-by: Bastian Blank 
Cc: dev@openvswitch.org
---
v3: Handle the remove case as well to properly support tap devices
v2: Correct syntax description in commitlog.
Remove erroneous xl.conf change.
Added Bastian's S-o-b.
Agreed with Ben Pfaff that the Xen tree is a good home for this script.
---
 tools/hotplug/Linux/Makefile|1 +
 tools/hotplug/Linux/vif-openvswitch |   99 +++
 2 files changed, 100 insertions(+), 0 deletions(-)
 create mode 100644 tools/hotplug/Linux/vif-openvswitch

diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
index 0605559..99bf87f 100644
--- a/tools/hotplug/Linux/Makefile
+++ b/tools/hotplug/Linux/Makefile
@@ -13,6 +13,7 @@ XENCOMMONS_SYSCONFIG = init.d/sysconfig.xencommons
 XEN_SCRIPTS = network-bridge vif-bridge
 XEN_SCRIPTS += network-route vif-route
 XEN_SCRIPTS += network-nat vif-nat
+XEN_SCRIPTS += vif-openvswitch
 XEN_SCRIPTS += vif2
 XEN_SCRIPTS += vif-setup
 XEN_SCRIPTS += block
diff --git a/tools/hotplug/Linux/vif-openvswitch 
b/tools/hotplug/Linux/vif-openvswitch
new file mode 100644
index 000..f30e78c
--- /dev/null
+++ b/tools/hotplug/Linux/vif-openvswitch
@@ -0,0 +1,99 @@
+#!/bin/bash
+#
+# ${XEN_SCRIPT_DIR}/vif-openvswitch
+#
+# Script for configuring a vif in openvswitch mode.
+# The hotplugging system will call this script if it is specified either in
+# the device configuration given to Xend, or the default Xend configuration
+# in ${XEN_CONFIG_DIR}/xend-config.sxp.  If the script is specified in
+# neither of those places, then this script is the default.
+#
+# Usage:
+# vif-openvswitch (add|remove|online|offline)
+#
+# Environment vars:
+# vif vif interface name (required).
+# XENBUS_PATH path to this device's details in the XenStore (required).
+#
+# Read from the store:
+# bridge  openvswitch to add the vif to (required).
+# ip  list of IP networks for the vif, space-separated (optional).
+#
+# up:
+# Enslaves the vif interface to the bridge and adds iptables rules
+# for its ip addresses (if any).
+#
+# down:
+# Removes the vif interface from the bridge and removes the iptables
+# rules for its ip addresses (if any).
+#
+
+dir=$(dirname "$0")
+. "$dir/vif-common.sh"
+
+openvswitch_external_id() {
+local dev=$1
+local key=$2
+local value=$3
+
+echo "-- set interface $dev external-ids:\"$key\"=\"$value\""
+}
+
+openvswitch_external_id_all() {
+local dev=$1
+local frontend_id=$(xenstore_read "$XENBUS_PATH/frontend-id")
+local vm_path=$(xenstore_read "/local/domain/${frontend_id}/vm")
+local name=$(xenstore_read "${vm_path}/name")
+openvswitch_external_id $dev "xen-vm-name" "$name"
+local uuid=$(xenstore_read "${vm_path}/uuid")
+openvswitch_external_id $dev "xen-vm-uuid" "$uuid"
+local mac=$(xenstore_read "$XENBUS_PATH/mac")
+openvswitch_external_id $dev "attached-mac" "$mac"
+}
+
+add_to_openvswitch () {
+local dev=$1
+local bridge="$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")"
+local tag trunk
+
+if [[ $bridge =~ 
^([^.:]+)(\.([[:digit:]]+))?(:([[:digit:]]+(:[[:digit:]]+)*))?$ ]]; then
+bridge="${BASH_REMATCH[1]}"
+tag="${BASH_REMATCH[3]}"
+trunk="${BASH_REMATCH[5]//:/,}"
+else
+fatal "No valid bridge was specified"
+fi
+
+if [ $trunk ]; then
+local trunk_arg="trunk=$trunk"
+fi
+
+if [ $tag ]; then
+local tag_arg="tag=$tag"
+fi
+
+local vif_details="$(openvswitch_external_id_all $dev)"
+
+ovs-vsctl --timeout=30 -- --if-exists del-port $dev -- add-port "$bridge" 
$dev $tag_arg $trunk_arg $vif_details
+ip link set $dev up
+}
+
+case "$command" in
+add|online)
+setup_virtual_bridge_port $dev
+add_to_openvswitch $dev
+;;
+
+remove|offline)
+ovs-vsctl --timeout=30 -- --if-exists del-port $dev
+;;
+esac
+
+if [ "$type_if" = vif ]; then
+handle_iptable
+fi
+
+log debug "Successful vif-openvswitch $command for $dev."
+if [ "$type_if" = vif -a "$command" = "onli

Re: [ovs-dev] [PATCH V2] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
On Mon, 2013-04-22 at 18:20 +0100, Roger Pau Monne wrote:
> On 22/04/13 19:05, Ian Campbell wrote:
> > On Mon, 2013-04-22 at 17:46 +0100, Roger Pau Monne wrote:
> >>> +offline)
> >>
> >> Don't you need a "remove" here?
> >>
> >> remove|offline)
> >>...
> >>
> >> Or it doesn't matter because when the tap device is destroyed everything
> >> is automagically cleaned?
> > 
> > Even if that were the case I wouldn't want to rely on it, so we should
> > try and clean up explicitly.
> > 
> > Is this tap devices which only generates add/remove and not
> > online/offline? Do you happen to know the sequence of events for both
> > types of device?
> 
> It's all in libxl_linux.c, libxl__hotplug_nic in case you want to take a
> look at the implementation, the sequence of events for HVM guests should be:

Ah, yes, I'd forgotten I wasn't worrying about udev anymore.

Anyway, I confirmed that I was leaking the vifX.y-emu devices into the
OVS configuration, since unlike bridges OVS ports are persistent and not
tied to the existence of the actual device. With the change you
suggested things are fine. V3 on its way...

Thanks,
Ian.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Roger Pau Monné
On 23/04/13 10:22, Ian Campbell wrote:
> Based on Waldi's RFC at
> http://lists.xen.org/archives/html/xen-devel/2012-09/msg00943.html
> 
> To use it set vif.default.script="vif-openvswitch" in /etc/xen/xl.conf or use
> script=vif-openvswitch in the vif configuration.
> 
> Appears to do the right thing for PV and HVM guests (including tap devices)
> and with stubdomains.
> 
> In order to support VLAN tagging and trunking the "bridge" specified in the
> configuration can have a special syntax, that is:
> 
>   BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]
> 
> e.g.
> - xenbr0.99
>  add the VIF to VLAN99 on xenbr0
> - xenbr0:99:100:101
>  add the VIF to xenbr0 as a trunk port receiving VLANs 99, 100 & 101
> 
> Waldi, can you confirm if I have correctly reverse engineered the syntax from
> the regexp please ;-)
> 
> Signed-off-by: Ian Campbell 
> Signed-off-by: Bastian Blank 
> Cc: dev@openvswitch.org
> ---
> v3: Handle the remove case as well to properly support tap devices
> v2: Correct syntax description in commitlog.
> Remove erroneous xl.conf change.
> Added Bastian's S-o-b.
> Agreed with Ben Pfaff that the Xen tree is a good home for this script.
> ---
>  tools/hotplug/Linux/Makefile|1 +
>  tools/hotplug/Linux/vif-openvswitch |   99 
> +++
>  2 files changed, 100 insertions(+), 0 deletions(-)
>  create mode 100644 tools/hotplug/Linux/vif-openvswitch
> 
> diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
> index 0605559..99bf87f 100644
> --- a/tools/hotplug/Linux/Makefile
> +++ b/tools/hotplug/Linux/Makefile
> @@ -13,6 +13,7 @@ XENCOMMONS_SYSCONFIG = init.d/sysconfig.xencommons
>  XEN_SCRIPTS = network-bridge vif-bridge
>  XEN_SCRIPTS += network-route vif-route
>  XEN_SCRIPTS += network-nat vif-nat
> +XEN_SCRIPTS += vif-openvswitch
>  XEN_SCRIPTS += vif2
>  XEN_SCRIPTS += vif-setup
>  XEN_SCRIPTS += block
> diff --git a/tools/hotplug/Linux/vif-openvswitch 
> b/tools/hotplug/Linux/vif-openvswitch
> new file mode 100644
> index 000..f30e78c
> --- /dev/null
> +++ b/tools/hotplug/Linux/vif-openvswitch
> @@ -0,0 +1,99 @@
> +#!/bin/bash
> +#
> +# ${XEN_SCRIPT_DIR}/vif-openvswitch
> +#
> +# Script for configuring a vif in openvswitch mode.
> +# The hotplugging system will call this script if it is specified either in
> +# the device configuration given to Xend, or the default Xend configuration
> +# in ${XEN_CONFIG_DIR}/xend-config.sxp.  If the script is specified in
> +# neither of those places, then this script is the default.
> +#
> +# Usage:
> +# vif-openvswitch (add|remove|online|offline)
> +#
> +# Environment vars:
> +# vif vif interface name (required).
> +# XENBUS_PATH path to this device's details in the XenStore (required).
> +#
> +# Read from the store:
> +# bridge  openvswitch to add the vif to (required).
> +# ip  list of IP networks for the vif, space-separated (optional).
> +#
> +# up:
> +# Enslaves the vif interface to the bridge and adds iptables rules
> +# for its ip addresses (if any).
> +#
> +# down:
> +# Removes the vif interface from the bridge and removes the iptables
> +# rules for its ip addresses (if any).
> +#
> +
> +dir=$(dirname "$0")
> +. "$dir/vif-common.sh"
> +
> +openvswitch_external_id() {
> +local dev=$1
> +local key=$2
> +local value=$3
> +
> +echo "-- set interface $dev external-ids:\"$key\"=\"$value\""
> +}
> +
> +openvswitch_external_id_all() {
> +local dev=$1
> +local frontend_id=$(xenstore_read "$XENBUS_PATH/frontend-id")
> +local vm_path=$(xenstore_read "/local/domain/${frontend_id}/vm")
> +local name=$(xenstore_read "${vm_path}/name")
> +openvswitch_external_id $dev "xen-vm-name" "$name"
> +local uuid=$(xenstore_read "${vm_path}/uuid")
> +openvswitch_external_id $dev "xen-vm-uuid" "$uuid"
> +local mac=$(xenstore_read "$XENBUS_PATH/mac")
> +openvswitch_external_id $dev "attached-mac" "$mac"
> +}
> +
> +add_to_openvswitch () {
> +local dev=$1
> +local bridge="$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")"
> +local tag trunk
> +
> +if [[ $bridge =~ 
> ^([^.:]+)(\.([[:digit:]]+))?(:([[:digit:]]+(:[[:digit:]]+)*))?$ ]]; then
> +bridge="${BASH_REMATCH[1]}"
> +tag="${BASH_REMATCH[3]}"
> +trunk="${BASH_REMATCH[5]//:/,}"
> +else
> +fatal "No valid bridge was specified"
> +fi
> +
> +if [ $trunk ]; then
> +local trunk_arg="trunk=$trunk"
> +fi
> +
> +if [ $tag ]; then
> +local tag_arg="tag=$tag"
> +fi
> +
> +local vif_details="$(openvswitch_external_id_all $dev)"
> +
> +ovs-vsctl --timeout=30 -- --if-exists del-port $dev -- add-port 
> "$bridge" $dev $tag_arg $trunk_arg $vif_details
> +ip link set $dev up

Should we check the return value of this two calls, so we can print a
nice e

Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Roger Pau Monné
On 23/04/13 10:22, Ian Campbell wrote:
> Based on Waldi's RFC at
> http://lists.xen.org/archives/html/xen-devel/2012-09/msg00943.html
> 
> To use it set vif.default.script="vif-openvswitch" in /etc/xen/xl.conf or use
> script=vif-openvswitch in the vif configuration.
> 
> Appears to do the right thing for PV and HVM guests (including tap devices)
> and with stubdomains.
> 
> In order to support VLAN tagging and trunking the "bridge" specified in the
> configuration can have a special syntax, that is:
> 
>   BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]
> 
> e.g.
> - xenbr0.99
>  add the VIF to VLAN99 on xenbr0
> - xenbr0:99:100:101
>  add the VIF to xenbr0 as a trunk port receiving VLANs 99, 100 & 101
> 
> Waldi, can you confirm if I have correctly reverse engineered the syntax from
> the regexp please ;-)
> 
> Signed-off-by: Ian Campbell 
> Signed-off-by: Bastian Blank 
> Cc: dev@openvswitch.org
> ---
> v3: Handle the remove case as well to properly support tap devices
> v2: Correct syntax description in commitlog.
> Remove erroneous xl.conf change.
> Added Bastian's S-o-b.
> Agreed with Ben Pfaff that the Xen tree is a good home for this script.
> ---
>  tools/hotplug/Linux/Makefile|1 +
>  tools/hotplug/Linux/vif-openvswitch |   99 
> +++
>  2 files changed, 100 insertions(+), 0 deletions(-)
>  create mode 100644 tools/hotplug/Linux/vif-openvswitch
> 
> diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
> index 0605559..99bf87f 100644
> --- a/tools/hotplug/Linux/Makefile
> +++ b/tools/hotplug/Linux/Makefile
> @@ -13,6 +13,7 @@ XENCOMMONS_SYSCONFIG = init.d/sysconfig.xencommons
>  XEN_SCRIPTS = network-bridge vif-bridge
>  XEN_SCRIPTS += network-route vif-route
>  XEN_SCRIPTS += network-nat vif-nat
> +XEN_SCRIPTS += vif-openvswitch
>  XEN_SCRIPTS += vif2
>  XEN_SCRIPTS += vif-setup
>  XEN_SCRIPTS += block
> diff --git a/tools/hotplug/Linux/vif-openvswitch 
> b/tools/hotplug/Linux/vif-openvswitch
> new file mode 100644
> index 000..f30e78c
> --- /dev/null
> +++ b/tools/hotplug/Linux/vif-openvswitch
> @@ -0,0 +1,99 @@
> +#!/bin/bash
> +#
> +# ${XEN_SCRIPT_DIR}/vif-openvswitch
> +#
> +# Script for configuring a vif in openvswitch mode.
> +# The hotplugging system will call this script if it is specified either in
> +# the device configuration given to Xend, or the default Xend configuration
> +# in ${XEN_CONFIG_DIR}/xend-config.sxp.  If the script is specified in
> +# neither of those places, then this script is the default.
> +#
> +# Usage:
> +# vif-openvswitch (add|remove|online|offline)
> +#
> +# Environment vars:
> +# vif vif interface name (required).
> +# XENBUS_PATH path to this device's details in the XenStore (required).
> +#
> +# Read from the store:
> +# bridge  openvswitch to add the vif to (required).
> +# ip  list of IP networks for the vif, space-separated (optional).
> +#
> +# up:
> +# Enslaves the vif interface to the bridge and adds iptables rules
> +# for its ip addresses (if any).
> +#
> +# down:
> +# Removes the vif interface from the bridge and removes the iptables
> +# rules for its ip addresses (if any).
> +#

Also, it might be interesting to add something like:

check_tools()
{
if ! type ovs-vsctl > /dev/null 2>&1; then
fatal "Unable to find ovs-vsctl tool"
fi
if ! type ip > /dev/null 2>&1; then
fatal "Unable to find ip tool"
fi
}

In order to check that the needed external tools are there.

> +dir=$(dirname "$0")
> +. "$dir/vif-common.sh"
> +
> +openvswitch_external_id() {
> +local dev=$1
> +local key=$2
> +local value=$3
> +
> +echo "-- set interface $dev external-ids:\"$key\"=\"$value\""
> +}
> +
> +openvswitch_external_id_all() {
> +local dev=$1
> +local frontend_id=$(xenstore_read "$XENBUS_PATH/frontend-id")
> +local vm_path=$(xenstore_read "/local/domain/${frontend_id}/vm")
> +local name=$(xenstore_read "${vm_path}/name")
> +openvswitch_external_id $dev "xen-vm-name" "$name"
> +local uuid=$(xenstore_read "${vm_path}/uuid")
> +openvswitch_external_id $dev "xen-vm-uuid" "$uuid"
> +local mac=$(xenstore_read "$XENBUS_PATH/mac")
> +openvswitch_external_id $dev "attached-mac" "$mac"
> +}
> +
> +add_to_openvswitch () {
> +local dev=$1
> +local bridge="$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")"
> +local tag trunk
> +
> +if [[ $bridge =~ 
> ^([^.:]+)(\.([[:digit:]]+))?(:([[:digit:]]+(:[[:digit:]]+)*))?$ ]]; then
> +bridge="${BASH_REMATCH[1]}"
> +tag="${BASH_REMATCH[3]}"
> +trunk="${BASH_REMATCH[5]//:/,}"
> +else
> +fatal "No valid bridge was specified"
> +fi
> +
> +if [ $trunk ]; then
> +local trunk_arg="trunk=$trunk"
> +fi
> +
> +if [ $tag ]; then
> +local tag_arg="

Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
On Tue, 2013-04-23 at 09:46 +0100, Roger Pau Monne wrote:
> > +ovs-vsctl --timeout=30 -- --if-exists del-port $dev -- add-port 
> > "$bridge" $dev $tag_arg $trunk_arg $vif_details
> > +ip link set $dev up
> 
> Should we check the return value of this two calls, so we can print a
> nice error message if something fails?

The bridge variant doesn't, but I suppose that's not much of an
argument. There's a do_or_die helper in xen-hotplug-common.sh which logs
on failure, I suppose that would be a good thing to use?

> > +}
> > +
> > +case "$command" in
> > +add|online)
> > +setup_virtual_bridge_port $dev
> > +add_to_openvswitch $dev
> > +;;
> > +
> > +remove|offline)
> > +ovs-vsctl --timeout=30 -- --if-exists del-port $dev
> 
> Same here.

I think this one wants to be a non-fatal error on failure, do allow as
much cleanup as possible to go ahead (e.g. the iptables rule below). I
think do_without_error is the right answer for this one.

I also noticed that we don't set the link down here. Probably the device
is about to go away but I added it for consistency with the up case, and
with the bridge script...

Ian.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
On Tue, 2013-04-23 at 09:56 +0100, Roger Pau Monne wrote:

> Also, it might be interesting to add something like:
> 
> check_tools()
> {
> if ! type ovs-vsctl > /dev/null 2>&1; then

I think type is a bash-ism but I don't know the portable alternative and
this script uses #!/bin/bash anyhow.

> fatal "Unable to find ovs-vsctl tool"
> fi
> if ! type ip > /dev/null 2>&1; then
> fatal "Unable to find ip tool"
> fi
> }
> 
> In order to check that the needed external tools are there.

Can't hurt I suppose. I made it check only in the online case, since
like I said before I'd like remove to try to do as much as it can.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Bastian Blank
On Tue, Apr 23, 2013 at 10:42:15AM +0100, Ian Campbell wrote:
> On Tue, 2013-04-23 at 09:56 +0100, Roger Pau Monne wrote:
> > Also, it might be interesting to add something like:
> > if ! type ovs-vsctl > /dev/null 2>&1; then
> I think type is a bash-ism but I don't know the portable alternative and
> this script uses #!/bin/bash anyhow.

You may want to use "command -v $bla > /dev/null", it is a bit more
portable.

Bastian

-- 
Star Trek Lives!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH V4] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
Based on Waldi's RFC at
http://lists.xen.org/archives/html/xen-devel/2012-09/msg00943.html

To use it set vif.default.script="vif-openvswitch" in /etc/xen/xl.conf or use
script=vif-openvswitch in the vif configuration.

Appears to do the right thing for PV and HVM guests (including tap devices)
and with stubdomains.

In order to support VLAN tagging and trunking the "bridge" specified in the
configuration can have a special syntax, that is:

BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]

e.g.
- xenbr0.99
 add the VIF to VLAN99 on xenbr0
- xenbr0:99:100:101
 add the VIF to xenbr0 as a trunk port receiving VLANs 99, 100 & 101

Signed-off-by: Ian Campbell 
Signed-off-by: Bastian Blank 
Cc: dev@openvswitch.org
---
v4: Handle errors when adding or removing the device, with the former being
fatal. Tested by chmod -x /usr/bin/ovs-vsctl.
Bring the device down after removing it from the bridge.
Check for ovs-vsctl and ip on the path. Tested by renaming the binary.
v3: Handle the remove case as well to properly support tap devices
v2: Correct syntax description in commitlog.
Remove erroneous xl.conf change.
Added Bastian's S-o-b.
Agreed with Ben Pfaff that the Xen tree is a good home for this script.
---
 tools/hotplug/Linux/Makefile|1 +
 tools/hotplug/Linux/vif-openvswitch |  113 +++
 2 files changed, 114 insertions(+), 0 deletions(-)
 create mode 100644 tools/hotplug/Linux/vif-openvswitch

diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
index 0605559..99bf87f 100644
--- a/tools/hotplug/Linux/Makefile
+++ b/tools/hotplug/Linux/Makefile
@@ -13,6 +13,7 @@ XENCOMMONS_SYSCONFIG = init.d/sysconfig.xencommons
 XEN_SCRIPTS = network-bridge vif-bridge
 XEN_SCRIPTS += network-route vif-route
 XEN_SCRIPTS += network-nat vif-nat
+XEN_SCRIPTS += vif-openvswitch
 XEN_SCRIPTS += vif2
 XEN_SCRIPTS += vif-setup
 XEN_SCRIPTS += block
diff --git a/tools/hotplug/Linux/vif-openvswitch 
b/tools/hotplug/Linux/vif-openvswitch
new file mode 100644
index 000..a8e4fc7
--- /dev/null
+++ b/tools/hotplug/Linux/vif-openvswitch
@@ -0,0 +1,113 @@
+#!/bin/bash
+#
+# ${XEN_SCRIPT_DIR}/vif-openvswitch
+#
+# Script for configuring a vif in openvswitch mode.
+# The hotplugging system will call this script if it is specified either in
+# the device configuration given to Xend, or the default Xend configuration
+# in ${XEN_CONFIG_DIR}/xend-config.sxp.  If the script is specified in
+# neither of those places, then this script is the default.
+#
+# Usage:
+# vif-openvswitch (add|remove|online|offline)
+#
+# Environment vars:
+# vif vif interface name (required).
+# XENBUS_PATH path to this device's details in the XenStore (required).
+#
+# Read from the store:
+# bridge  openvswitch to add the vif to (required).
+# ip  list of IP networks for the vif, space-separated (optional).
+#
+# up:
+# Enslaves the vif interface to the bridge and adds iptables rules
+# for its ip addresses (if any).
+#
+# down:
+# Removes the vif interface from the bridge and removes the iptables
+# rules for its ip addresses (if any).
+#
+
+dir=$(dirname "$0")
+. "$dir/vif-common.sh"
+
+check_tools()
+{
+if ! command -v ovs-vsctl > /dev/null 2>&1; then
+fatal "Unable to find ovs-vsctl tool"
+fi
+if ! command -v ip > /dev/null 2>&1; then
+fatal "Unable to find ip tool"
+fi
+}
+openvswitch_external_id() {
+local dev=$1
+local key=$2
+local value=$3
+
+echo "-- set interface $dev external-ids:\"$key\"=\"$value\""
+}
+
+openvswitch_external_id_all() {
+local dev=$1
+local frontend_id=$(xenstore_read "$XENBUS_PATH/frontend-id")
+local vm_path=$(xenstore_read "/local/domain/${frontend_id}/vm")
+local name=$(xenstore_read "${vm_path}/name")
+openvswitch_external_id $dev "xen-vm-name" "$name"
+local uuid=$(xenstore_read "${vm_path}/uuid")
+openvswitch_external_id $dev "xen-vm-uuid" "$uuid"
+local mac=$(xenstore_read "$XENBUS_PATH/mac")
+openvswitch_external_id $dev "attached-mac" "$mac"
+}
+
+add_to_openvswitch () {
+local dev=$1
+local bridge="$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")"
+local tag trunk
+
+if [[ $bridge =~ 
^([^.:]+)(\.([[:digit:]]+))?(:([[:digit:]]+(:[[:digit:]]+)*))?$ ]]; then
+bridge="${BASH_REMATCH[1]}"
+tag="${BASH_REMATCH[3]}"
+trunk="${BASH_REMATCH[5]//:/,}"
+else
+fatal "No valid bridge was specified"
+fi
+
+if [ $trunk ]; then
+local trunk_arg="trunk=$trunk"
+fi
+
+if [ $tag ]; then
+local tag_arg="tag=$tag"
+fi
+
+local vif_details="$(openvswitch_external_id_all $dev)"
+
+do_or_die ovs-vsctl --timeout=30 \
+-- --if-exists del-port $dev \
+-- add-port "$bridge" $dev $tag_arg $trunk_arg $vif_detail

Re: [ovs-dev] [PATCH V3] hotplug: add openvswitch script

2013-04-23 Thread Ian Campbell
On Tue, 2013-04-23 at 10:58 +0100, Bastian Blank wrote:
> On Tue, Apr 23, 2013 at 10:42:15AM +0100, Ian Campbell wrote:
> > On Tue, 2013-04-23 at 09:56 +0100, Roger Pau Monne wrote:
> > > Also, it might be interesting to add something like:
> > > if ! type ovs-vsctl > /dev/null 2>&1; then
> > I think type is a bash-ism but I don't know the portable alternative and
> > this script uses #!/bin/bash anyhow.
> 
> You may want to use "command -v $bla > /dev/null", it is a bit more
> portable.

Thanks for the hint, V4 incoming...

Ian.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 04/22] use ax_check_openssl.m4 instead of a direct use of pkg-config.

2013-04-23 Thread Ed Maste
On 22 April 2013 09:20, YAMAMOTO Takashi  wrote:
> on some platforms, including NetBSD-6, openssl is available but
> its pkg-config files (*.pc) are not.  according to Ed Maste,
> this is the case for FreeBSD as well.

That's correct.  This should be a nice convenience for anyone building
on FreeBSD from a git checkout.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 15/22] add minimal NetBSD support

2013-04-23 Thread Ed Maste
> Ed, do you want to look over any of these changes?  They affect code
> for which you are a primary author, so if you would prefer to see
> anything done a different way, etc., for maintainability purposes then
> it would be valuable to get that in review.

Yep - I'm looking at them now.  I've reviewed and unit-tested the ones
you've committed so far and will review patches 15 through 22.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 17/22] netdev-bsd: NetBSD: keep a AF_LINK socket open

2013-04-23 Thread Ed Maste
On 22 April 2013 09:20, YAMAMOTO Takashi  wrote:
> this will be used for get_stats and set_etheraddr
>
> Signed-off-by: YAMAMOTO Takashi 
> ---
>  lib/netdev-bsd.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
> index ba351f5..e63ac3f 100644
> --- a/lib/netdev-bsd.c
> +++ b/lib/netdev-bsd.c
> @@ -119,6 +119,11 @@ enum {
>  /* An AF_INET socket (used for ioctl operations). */
>  static int af_inet_sock = -1;
>
> +#if defined(__NetBSD__)
> +/* AF_LINK socket used for netdev_bsd_get_stats and set_etheraddr */
> +static int af_link_sock = -1;
> +#endif /* defined(__NetBSD__) */
> +
>  #define PCAP_SNAPLEN 2048
>
>
> @@ -185,9 +190,17 @@ netdev_bsd_init(void)
>
>  af_inet_sock = socket(AF_INET, SOCK_DGRAM, 0);
>  status = af_inet_sock >= 0 ? 0 : errno;
> -
>  if (status) {
>  VLOG_ERR("failed to create inet socket: %s", strerror(status));
> +return status;
> +}
> +
> +af_link_sock = socket(AF_LINK, SOCK_DGRAM, 0);

Missing #if defined(__NetBSD__) guard here.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 20/22] some NetBSD related documentation changes

2013-04-23 Thread Ed Maste
On 22 April 2013 09:20, YAMAMOTO Takashi  wrote:
>
> Signed-off-by: YAMAMOTO Takashi 

Acked-by: Ed Maste 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 22/22] acinclude.m4: update a comment for NetBSD support

2013-04-23 Thread Ed Maste
On 22 April 2013 09:20, YAMAMOTO Takashi  wrote:
>
> Signed-off-by: YAMAMOTO Takashi 

Acked-by: Ed Maste 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH V4] hotplug: add openvswitch script

2013-04-23 Thread Roger Pau Monné
On 23/04/13 12:00, Ian Campbell wrote:
> Based on Waldi's RFC at
> http://lists.xen.org/archives/html/xen-devel/2012-09/msg00943.html
> 
> To use it set vif.default.script="vif-openvswitch" in /etc/xen/xl.conf or use
> script=vif-openvswitch in the vif configuration.
> 
> Appears to do the right thing for PV and HVM guests (including tap devices)
> and with stubdomains.
> 
> In order to support VLAN tagging and trunking the "bridge" specified in the
> configuration can have a special syntax, that is:
> 
>   BRIDGE_NAME[.VLAN][:TRUNK:TRUNK]
> 
> e.g.
> - xenbr0.99
>  add the VIF to VLAN99 on xenbr0
> - xenbr0:99:100:101
>  add the VIF to xenbr0 as a trunk port receiving VLANs 99, 100 & 101
> 
> Signed-off-by: Ian Campbell 
> Signed-off-by: Bastian Blank 

Acked-by: Roger Pau Monné 

> Cc: dev@openvswitch.org
> ---
> v4: Handle errors when adding or removing the device, with the former being
> fatal. Tested by chmod -x /usr/bin/ovs-vsctl.
> Bring the device down after removing it from the bridge.
> Check for ovs-vsctl and ip on the path. Tested by renaming the binary.
> v3: Handle the remove case as well to properly support tap devices
> v2: Correct syntax description in commitlog.
> Remove erroneous xl.conf change.
> Added Bastian's S-o-b.
> Agreed with Ben Pfaff that the Xen tree is a good home for this script.
> ---
>  tools/hotplug/Linux/Makefile|1 +
>  tools/hotplug/Linux/vif-openvswitch |  113 
> +++
>  2 files changed, 114 insertions(+), 0 deletions(-)
>  create mode 100644 tools/hotplug/Linux/vif-openvswitch
> 
> diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
> index 0605559..99bf87f 100644
> --- a/tools/hotplug/Linux/Makefile
> +++ b/tools/hotplug/Linux/Makefile
> @@ -13,6 +13,7 @@ XENCOMMONS_SYSCONFIG = init.d/sysconfig.xencommons
>  XEN_SCRIPTS = network-bridge vif-bridge
>  XEN_SCRIPTS += network-route vif-route
>  XEN_SCRIPTS += network-nat vif-nat
> +XEN_SCRIPTS += vif-openvswitch
>  XEN_SCRIPTS += vif2
>  XEN_SCRIPTS += vif-setup
>  XEN_SCRIPTS += block
> diff --git a/tools/hotplug/Linux/vif-openvswitch 
> b/tools/hotplug/Linux/vif-openvswitch
> new file mode 100644
> index 000..a8e4fc7
> --- /dev/null
> +++ b/tools/hotplug/Linux/vif-openvswitch
> @@ -0,0 +1,113 @@
> +#!/bin/bash
> +#
> +# ${XEN_SCRIPT_DIR}/vif-openvswitch
> +#
> +# Script for configuring a vif in openvswitch mode.
> +# The hotplugging system will call this script if it is specified either in
> +# the device configuration given to Xend, or the default Xend configuration
> +# in ${XEN_CONFIG_DIR}/xend-config.sxp.  If the script is specified in
> +# neither of those places, then this script is the default.
> +#
> +# Usage:
> +# vif-openvswitch (add|remove|online|offline)
> +#
> +# Environment vars:
> +# vif vif interface name (required).
> +# XENBUS_PATH path to this device's details in the XenStore (required).
> +#
> +# Read from the store:
> +# bridge  openvswitch to add the vif to (required).
> +# ip  list of IP networks for the vif, space-separated (optional).
> +#
> +# up:
> +# Enslaves the vif interface to the bridge and adds iptables rules
> +# for its ip addresses (if any).
> +#
> +# down:
> +# Removes the vif interface from the bridge and removes the iptables
> +# rules for its ip addresses (if any).
> +#
> +
> +dir=$(dirname "$0")
> +. "$dir/vif-common.sh"
> +
> +check_tools()
> +{
> +if ! command -v ovs-vsctl > /dev/null 2>&1; then
> +fatal "Unable to find ovs-vsctl tool"
> +fi
> +if ! command -v ip > /dev/null 2>&1; then
> +fatal "Unable to find ip tool"
> +fi
> +}
> +openvswitch_external_id() {
> +local dev=$1
> +local key=$2
> +local value=$3
> +
> +echo "-- set interface $dev external-ids:\"$key\"=\"$value\""
> +}
> +
> +openvswitch_external_id_all() {
> +local dev=$1
> +local frontend_id=$(xenstore_read "$XENBUS_PATH/frontend-id")
> +local vm_path=$(xenstore_read "/local/domain/${frontend_id}/vm")
> +local name=$(xenstore_read "${vm_path}/name")
> +openvswitch_external_id $dev "xen-vm-name" "$name"
> +local uuid=$(xenstore_read "${vm_path}/uuid")
> +openvswitch_external_id $dev "xen-vm-uuid" "$uuid"
> +local mac=$(xenstore_read "$XENBUS_PATH/mac")
> +openvswitch_external_id $dev "attached-mac" "$mac"
> +}
> +
> +add_to_openvswitch () {
> +local dev=$1
> +local bridge="$(xenstore_read_default "$XENBUS_PATH/bridge" "$bridge")"
> +local tag trunk
> +
> +if [[ $bridge =~ 
> ^([^.:]+)(\.([[:digit:]]+))?(:([[:digit:]]+(:[[:digit:]]+)*))?$ ]]; then
> +bridge="${BASH_REMATCH[1]}"
> +tag="${BASH_REMATCH[3]}"
> +trunk="${BASH_REMATCH[5]//:/,}"
> +else
> +fatal "No valid bridge was specified"
> +fi
> +
> +if [ $tr

Re: [ovs-dev] [PATCH] FAQ: Explain how to drop packets.

2013-04-23 Thread Kyle Mestery (kmestery)
On Apr 22, 2013, at 11:15 PM, Ben Pfaff  wrote:
> This question keeps coming up.
> 
> Signed-off-by: Ben Pfaff 


Looks good to me Ben.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v4] debian: force-reload-kmod while package upgrading.

2013-04-23 Thread Gurucharan Shetty
Currently, when we upgrade openvswitch packages, we do a restart
of userspace daemons automatically. This does not replace the
kernel module.

But almost everytime, we want to use the new kernel module
that comes with the new version. This means that we need to
manually do a "force-reload-kmod". This step, reloads the
kernel module and also restarts the userspace daemons. This gives
us a total of two restarts of userspace daemons. This is quite
expensive in a hypervisor with hundreds of VMs sending real traffic.
This also hurts the controller as it gets two reconnections in a short
amount of time.

With this patch, during a package upgrade, if the kernel module
on disk is different than the one that is loaded, we will
automatically do a force-reload-kmod while openvswitch-switch
is installed. If not, we will just do a "restart" like before.

One can install the kernel package first and then install the userspace
packages in 2 separate steps to enforce a single 'force-reload-kmod'.

If anyone wants to just restart the userspace package instead of
force-reload-kmod, they can set the value of OVS_FORCE_RELOAD_KMOD=no
while installing the package.
Ex: OVS_FORCE_RELOAD_KMOD=no dpkg -i openvswitch-switch*

Signed-off-by: Gurucharan Shetty 
---
 debian/openvswitch-switch.init |   31 ---
 debian/openvswitch-switch.postinst |7 ---
 utilities/ovs-lib.in   |4 
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/debian/openvswitch-switch.init b/debian/openvswitch-switch.init
index d84c1b6..84aa450 100755
--- a/debian/openvswitch-switch.init
+++ b/debian/openvswitch-switch.init
@@ -75,10 +75,35 @@ stop () {
 }
 
 restart () {
-# OVS_RESTART_SAVE_FLOWS can be set by package postinst script.
-if [ "$OVS_RESTART_SAVE_FLOWS" = "yes" ] || \
-   [ "$1" = "--save-flows=yes" ]; then
+# OVS_FORCE_RELOAD_KMOD can be set by package postinst script.
+if [ "$1" = "--save-flows=yes" ] || \
+[ "${OVS_FORCE_RELOAD_KMOD}" = "no" ]; then
 start restart
+elif [ "${OVS_FORCE_RELOAD_KMOD}" = "yes" ]; then
+depmod -a
+
+if [ -e /sys/module/openvswitch ]; then
+LOADED_SRCVERSION=`cat /sys/module/openvswitch/srcversion`
+LOADED_VERSION=`cat /sys/module/openvswitch/version`
+elif [ -e /sys/module/openvswitch_mod ]; then
+LOADED_SRCVERSION=`cat /sys/module/openvswitch_mod/srcversion`
+LOADED_VERSION=`cat /sys/module/openvswitch_mod/version`
+fi
+SRCVERSION=`modinfo -F srcversion openvswitch 2>/dev/null`
+VERSION=`modinfo -F version openvswitch 2>/dev/null`
+
+ovs_ctl_log "Package upgrading:\n"\
+"Loaded version: ${LOADED_VERSION} 
${LOADED_SRCVERSION}.\n"\
+"Version on disk: ${VERSION} ${SRCVERSION}."
+
+# If the kernel module was previously loaded and it is different than
+# the kernel module on disk, then do a 'force-reload-kmod'.
+if [ -n "${LOADED_SRCVERSION}" ] && [ -n "${SRCVERSION}" ] && \
+[ "${SRCVERSION}" != "${LOADED_SRCVERSION}" ]; then
+start force-reload-kmod
+else
+start restart
+fi
 else
 stop
 start
diff --git a/debian/openvswitch-switch.postinst 
b/debian/openvswitch-switch.postinst
index ac6ed65..2464572 100755
--- a/debian/openvswitch-switch.postinst
+++ b/debian/openvswitch-switch.postinst
@@ -49,9 +49,10 @@ esac
 OVS_MISSING_KMOD_OK=yes
 export OVS_MISSING_KMOD_OK
 
-# Save and restore openflow flows during a package upgrade.
-OVS_RESTART_SAVE_FLOWS=yes
-export OVS_RESTART_SAVE_FLOWS
+# force-reload-kmod during upgrade. If a user wants to override this,
+# they can set the variable OVS_FORCE_RELOAD_KMOD=no while installing.
+[ -z "${OVS_FORCE_RELOAD_KMOD}" ] && OVS_FORCE_RELOAD_KMOD=yes || true
+export OVS_FORCE_RELOAD_KMOD
 
 #DEBHELPER#
 
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index f7b5bd4..1684ddc 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -37,6 +37,10 @@ else
 dbdir='@DBDIR@'
 fi
 
+ovs_ctl_log () {
+echo "$@" >> "${logdir}/ovs-ctl.log"
+}
+
 ovs_ctl () {
 case "$@" in
 *"=strace"*)
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] debian: force-reload-kmod while package upgrading.

2013-04-23 Thread Gurucharan Shetty
On Mon, Apr 22, 2013 at 9:26 PM, Ben Pfaff  wrote:

> On Mon, Apr 22, 2013 at 05:10:21PM -0700, Gurucharan Shetty wrote:
> > On Mon, Apr 22, 2013 at 2:19 PM, Ben Pfaff  wrote:
> >
> > > On Wed, Apr 17, 2013 at 03:11:12PM -0700, Gurucharan Shetty wrote:
> > > > Currently, when we upgrade openvswitch packages, we do a restart
> > > > of userspace daemons automatically. This does not replace the
> > > > kernel module.
> > > >
> > > > But almost everytime, we want to use the new kernel module
> > > > that comes with the new version. This means that we need to
> > > > manually do a "force-reload-kmod". This step, reloads the
> > > > kernel module and also restarts the userspace daemons. This gives
> > > > us a total of two restarts of userspace daemons. This is quite
> > > > expensive in a hypervisor with hundreds of VMs sending real traffic.
> > > > This also hurts the controller as it gets two reconnections in a
> short
> > > > amount of time.
> > > >
> > > > With this patch, during a package upgrade, if the kernel module
> > > > on disk is different than the one that is loaded, we will
> > > > automatically do a force-reload-kmod while openvswitch-switch
> > > > is installed. If not, we will just do a "restart" like before.
> > > >
> > > > One can install the kernel package first and then install the
> userspace
> > > > packages in 2 separate steps to enforce a single 'force-reload-kmod'.
> > > >
> > > > If anyone wants to just restart the userspace package instead of
> > > > force-reload-kmod, they can set the value of OVS_FORCE_RELOAD_KMOD=no
> > > > while installing the package.
> > > > Ex: OVS_FORCE_RELOAD_KMOD=no dpkg -i openvswitch-switch*
> > > >
> > > > Signed-off-by: Gurucharan Shetty 
> > >
> > > This looks good.
> > >
> > > The one thing that it makes me wonder is whether we should print (or
> > > just log) anything about what decision we're making and why.  If
> > > something gets screwed up on 1 of 1000 hypervisors, then this might be
> > > valuable information in the post-mortem.  (Customer: "WTF did this
> > > upgrade fail?"  me: "...hmm, looks like you somehow installed a kernel
> > > module a year too old." versus "...hmm, dunno.")  But it all depends
> > > on how likely you think these problems are.
> > >
> >
> > Yes. Logging makes sense. I will send v3. I plan to use
> > /sys/module/openvswitch/version
> > instead of srcversion as logging that gives more meaning.
>
> I agree that it's more meaningful.  But my assumption is that srcversion
> changes whenever one modifies the module (I guess it's a hash of the
> source code?), whereas version might not.  That could be a real surprise
> sometimes.
>
> Perhaps one could consider comparing and logging both version and
> srcversion?
>
Okay. I will compare srcversion. Log both srcversion and version. Sending a
v4.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4] debian: force-reload-kmod while package upgrading.

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 08:37:40AM -0700, Gurucharan Shetty wrote:
> Currently, when we upgrade openvswitch packages, we do a restart
> of userspace daemons automatically. This does not replace the
> kernel module.
> 
> But almost everytime, we want to use the new kernel module
> that comes with the new version. This means that we need to
> manually do a "force-reload-kmod". This step, reloads the
> kernel module and also restarts the userspace daemons. This gives
> us a total of two restarts of userspace daemons. This is quite
> expensive in a hypervisor with hundreds of VMs sending real traffic.
> This also hurts the controller as it gets two reconnections in a short
> amount of time.
> 
> With this patch, during a package upgrade, if the kernel module
> on disk is different than the one that is loaded, we will
> automatically do a force-reload-kmod while openvswitch-switch
> is installed. If not, we will just do a "restart" like before.
> 
> One can install the kernel package first and then install the userspace
> packages in 2 separate steps to enforce a single 'force-reload-kmod'.
> 
> If anyone wants to just restart the userspace package instead of
> force-reload-kmod, they can set the value of OVS_FORCE_RELOAD_KMOD=no
> while installing the package.
> Ex: OVS_FORCE_RELOAD_KMOD=no dpkg -i openvswitch-switch*
> 
> Signed-off-by: Gurucharan Shetty 

Looks good, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] FAQ: Explain how to drop packets.

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 02:10:58PM +, Kyle Mestery (kmestery) wrote:
> On Apr 22, 2013, at 11:15 PM, Ben Pfaff  wrote:
> > This question keeps coming up.
> > 
> > Signed-off-by: Ben Pfaff 
> 
> 
> Looks good to me Ben.

Thanks, applied to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4] debian: force-reload-kmod while package upgrading.

2013-04-23 Thread Gurucharan Shetty
On Tue, Apr 23, 2013 at 9:06 AM, Ben Pfaff  wrote:

> On Tue, Apr 23, 2013 at 08:37:40AM -0700, Gurucharan Shetty wrote:
> > Currently, when we upgrade openvswitch packages, we do a restart
> > of userspace daemons automatically. This does not replace the
> > kernel module.
> >
> > But almost everytime, we want to use the new kernel module
> > that comes with the new version. This means that we need to
> > manually do a "force-reload-kmod". This step, reloads the
> > kernel module and also restarts the userspace daemons. This gives
> > us a total of two restarts of userspace daemons. This is quite
> > expensive in a hypervisor with hundreds of VMs sending real traffic.
> > This also hurts the controller as it gets two reconnections in a short
> > amount of time.
> >
> > With this patch, during a package upgrade, if the kernel module
> > on disk is different than the one that is loaded, we will
> > automatically do a force-reload-kmod while openvswitch-switch
> > is installed. If not, we will just do a "restart" like before.
> >
> > One can install the kernel package first and then install the userspace
> > packages in 2 separate steps to enforce a single 'force-reload-kmod'.
> >
> > If anyone wants to just restart the userspace package instead of
> > force-reload-kmod, they can set the value of OVS_FORCE_RELOAD_KMOD=no
> > while installing the package.
> > Ex: OVS_FORCE_RELOAD_KMOD=no dpkg -i openvswitch-switch*
> >
> > Signed-off-by: Gurucharan Shetty 
>
> Looks good, thanks.
>
Thank you, Ben. I pushed this to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] FAQ: Describe how to connect bridges with patch ports.

2013-04-23 Thread Ben Pfaff
I keep seeing this question.

Signed-off-by: Ben Pfaff 
---
 FAQ |   37 +
 1 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/FAQ b/FAQ
index e7249cf..f8faa2f 100644
--- a/FAQ
+++ b/FAQ
@@ -325,6 +325,43 @@ A: No.  ERSPAN is an undocumented proprietary protocol.  
As an
alternative, Open vSwitch supports mirroring to a GRE tunnel (see
above).
 
+Q: How do I connect two bridges?
+
+A: First, why do you want to do this?  Two connected bridges are not
+   much different from a single bridge, so you might as well just have
+   a single bridge with all your ports on it.
+
+   If you still want to connect two bridges, you can use a pair of
+   patch ports.  The follow example creates bridges br0 and br1, adds
+   eth0 and tap0 to br0, adds tap1 to br1, and then connects br0 and
+   br1 with a pair of patch ports.
+
+   ovs-vsctl add-br br0
+   ovs-vsctl add-port br0 eth0
+   ovs-vsctl add-port br0 tap0
+   ovs-vsctl add-br br1
+   ovs-vsctl add-port br1 tap1
+   ovs-vsctl \
+   -- add-port br0 patch0 \
+   -- set interface patch0 type=patch options:peer=patch1 \
+   -- add-port br1 patch1 \
+   -- set interface patch1 type=patch options:peer=patch0
+
+   Bridges connected with patch ports are much like a single bridge.
+   For instance, if the example above also added eth1 to br1, and both
+   eth0 and eth1 happened to be connected to the same next-hop switch,
+   then you could loop your network just as you would if you added
+   eth0 and eth1 to the same bridge (see the "Configuration Problems"
+   section below for more information).
+
+   If you are using Open vSwitch 1.9 or an earlier version, then you
+   need to be using the kernel module bundled with Open vSwitch rather
+   than the one that is integrated into Linux 3.3 and later, because
+   Open vSwitch 1.9 and earlier versions need kernel support for patch
+   ports.  This also means that in Open vSwitch 1.9 and earlier, patch
+   ports will not work with the userspace datapath, only with the
+   kernel module.
+
 Q: Why are there so many different ways to dump flows?
 
 A: Open vSwitch uses different kinds of flows for different purposes:
-- 
1.7.2.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 15/22] add minimal NetBSD support

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 08:57:28AM -0400, Ed Maste wrote:
> > Ed, do you want to look over any of these changes?  They affect code
> > for which you are a primary author, so if you would prefer to see
> > anything done a different way, etc., for maintainability purposes then
> > it would be valuable to get that in review.
> 
> Yep - I'm looking at them now.  I've reviewed and unit-tested the ones
> you've committed so far and will review patches 15 through 22.

Thanks.

Yamamoto-san, when Ed is done reviewing, will you please repost the
updated patches?  Thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 02/22] deal with platforms where backtrace() is in a different library than libc.

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 01:30:22AM +, YAMAMOTO Takashi wrote:
> hi,
> 
> > On Mon, Apr 22, 2013 at 10:20:00PM +0900, YAMAMOTO Takashi wrote:
> >> execinfo for NetBSD and ubacktrace for uClibc.
> >> i don't know if the latter is relevant to Open vSwitch, though.
> >> 
> >> Signed-off-by: YAMAMOTO Takashi 
> > 
> > I'm pretty sure this change is going to break (or at least reduce
> > functionality of) lib/timeval.c on GNU/Linux because it checks for
> > HAVE_EXECINFO_H and I think that this change removes that check.
> 
> thanks for reviewing.
> 
> it seems HAVE_EXECINFO_H is provided by another execinfo.h check in
> configure.ac.  (why having the check in two places?)
> 
> my change seems to break the "Defines HAVE_BACKTRACE" part, though.
> 
> is there any reason why you check HAVE_EXECINFO_H in lib/timeval.c
> and HAVE_BACKTRACE in lib/backtrace.c?

I didn't realized that we had the check in two places.  I'll take a
closer look soon.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] what is the current state of this?

2013-04-23 Thread christopher barry
This thread discusses correct configuration on debian. It seems to have
basically trailed off. I would like the correct method to create bridges
out of my interfaces, and have them automagically startup when the box
is rebooted. Surprisingly, the answer is still elusive.

http://openvswitch.org/pipermail/discuss/2011-October/005925.html


Thanks,
-C

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH branch-1.4] ofproto-dpif: Fix memory leak sending packets to controller.

2013-04-23 Thread Ben Pfaff
In the case where execute_controller_action() returned true to
handle_flow_miss(), indicating that the packet had been sent to the
controller, nothing ever freed the packet, causing a memory leak.

One plausible solution seemed to be to turn over ownership of the packet to
execute_controller_action(), by passing true instead of false as its
'clone' argument.  Another is to add an else case to the "if" statement
that calls execute_controller_action() to free the packet.

However, neither of these solutions is actually correct, for a subtle
reason.  The block of memory that includes the packet also includes the
key for the flow miss.  At the end of handle_flow_miss(), past the end of
the loop, the key is normally needed to install the flow in the datapath.
Thus, by destroying the packet we potentially corrupt the key, and this
has actually been seen while testing memory leak fixes.

This commit uses a different approach: instead of trying to destroy the
packets one at a time when it becomes possible, it destroys all the packets
together at the end.

Reported-by: Zoltan Kiss 
Signed-off-by: Ben Pfaff 
---
 ofproto/ofproto-dpif.c |   18 ++
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 2949085..c2bcf6d 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2009, 2010, 2011, 2012 Nicira Networks.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2013 Nicira Networks.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -2516,10 +2516,9 @@ handle_flow_miss(struct ofproto_dpif *ofproto, struct 
flow_miss *miss,
miss->key_fitness, miss->key, miss->key_len,
miss->initial_tci);
 
-LIST_FOR_EACH_SAFE (packet, next_packet, list_node, &miss->packets) {
+LIST_FOR_EACH (packet, list_node, &miss->packets) {
 struct dpif_flow_stats stats;
 
-list_remove(&packet->list_node);
 ofproto->n_matches++;
 
 if (facet->rule->up.cr.priority == FAIL_OPEN_PRIORITY) {
@@ -2710,11 +2709,8 @@ handle_miss_upcalls(struct ofproto_dpif *ofproto, struct 
dpif_upcall *upcalls,
 /* Process each element in the to-do list, constructing the set of
  * operations to batch. */
 n_ops = 0;
-HMAP_FOR_EACH_SAFE (miss, next_miss, hmap_node, &todo) {
+HMAP_FOR_EACH (miss, hmap_node, &todo) {
 handle_flow_miss(ofproto, miss, flow_miss_ops, &n_ops);
-ofpbuf_list_delete(&miss->packets);
-hmap_remove(&todo, &miss->hmap_node);
-free(miss);
 }
 assert(n_ops <= ARRAY_SIZE(flow_miss_ops));
 hmap_destroy(&todo);
@@ -2737,7 +2733,6 @@ handle_miss_upcalls(struct ofproto_dpif *ofproto, struct 
dpif_upcall *upcalls,
 if (op->subfacet->actions != execute->actions) {
 free((struct nlattr *) execute->actions);
 }
-ofpbuf_delete((struct ofpbuf *) execute->packet);
 break;
 
 case DPIF_OP_FLOW_PUT:
@@ -2748,6 +2743,13 @@ handle_miss_upcalls(struct ofproto_dpif *ofproto, struct 
dpif_upcall *upcalls,
 break;
 }
 }
+
+HMAP_FOR_EACH_SAFE (miss, next_miss, hmap_node, &todo) {
+ofpbuf_list_delete(&miss->packets);
+hmap_remove(&todo, &miss->hmap_node);
+free(miss);
+}
+hmap_destroy(&todo);
 }
 
 static void
-- 
1.7.2.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/1] [branch-1.4] [ofproto-dpif] Memory leak at specific PACKET_INs

2013-04-23 Thread Ben Pfaff
I posted a fix.  Please review it:
http://openvswitch.org/pipermail/dev/2013-April/026864.html

On Mon, Apr 22, 2013 at 05:13:28PM -0700, Ben Pfaff wrote:
> I think I see the problem.  It is subtle.  I'll write up a fix
> tomorrow morning.
> 
> On Mon, Apr 22, 2013 at 08:45:52PM +0100, Zoltan Kiss wrote:
> > I found one thing which might be related: these flow_del's are
> > called by facet_remove, and before them there are usually an another
> > flow_del called by facet_unexpected, which looks almost the same,
> > except the hardware address look like:
> > sha=01:0a:00:4e:00:00,tha=00:00:ff:ff:ff:ff
> > But the captures never show such ARP packets. Is it possible that
> > these fields get corrupted when they were installed to the datapath?
> > This src-target hw address combination are found often by
> > facet_unexpected, and never appears in the captures.
> > 
> > Regards,
> > 
> > Zoli
> > 
> > On 22/04/13 19:13, Zoltan Kiss wrote:
> > >Hi,
> > >
> > >On 16/04/13 23:53, Zoltan Kiss wrote:
> > >>On 15/04/13 18:15, Ben Pfaff wrote:
> > >>>On Mon, Apr 15, 2013 at 03:59:52PM +0100, Zoltan Kiss wrote:
> > When the packet is sent to the controller due to an userspace rule
> > (and not
> > a kernel-space flow), execute_controller_action is invoked with
> > clone=true,
> > so handle_flow_miss retains ownership of the packet buffer. But if it
> > returns
> > true (which means the packet had only a PACKET_IN action), nothing
> > frees up
> > the buffer.
> > >>>
> > >>>I think you're right.  But in that case, wouldn't it solve the problem
> > >>>in a better way (doing less memory allocation and copying) by passing
> > >>>clone=false, instead of passing clone=true and then freeing the packet
> > >>>in the caller?
> > >>
> > >>It sounds reasonable, and I was thinking about that, but I was worried
> > >>about the side-effects. Now I've tried it, and it seems it cause
> > >>problems indeed. Broadcast ARP packets are causing problem here:
> > >>
> > >>dpif|WARN|Dropped 26 log messages in last 1 seconds (most recently, 1
> > >>seconds ago) due to excessive rate
> > >>dpif|WARN|system@xenbr0: failed to flow_del (No such file or directory)
> > >>in_port(1),eth(src=ab:cd:ef:12:34:56,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.0.0.1,tip=10.0.0.2,op=1,sha=ab:cd:ef:12:34:56,tha=00:00:00:00:00:00)
> > >>
> > >>
> > >>
> > >>These messages are coming continuously if I install that joker rule to
> > >>userspace. And it doesn't happen with the original clone=true version. I
> > >>haven't found out yet why this happens, it shouldn't really change
> > >>anything but the time when the packet is freed.
> > >
> > >I've tried to find out why these warnings come with clone=false, but I
> > >didn't succeed yet. I've checked this code path:
> > >
> > >handle_flow_miss
> > >  execute_controller_action
> > >   send_packet_in_action
> > >connmgr_send_packet_in (this is where clone makes difference as we
> > >pass rw_packet)
> > > schedule_packet_in
> > >  ofputil_encode_packet_in (this is where we either dupe the buffer
> > >or use the original)
> > >  pinsched_send (in my tests there were no pinscheduler involved
> > >   do_send_packet_in (after this the code won't know about the
> > >content of the packet)
> > >
> > >But I couldn't find any place where it did matter whether we clone and
> > >pass a copy (and free the original immediately after
> > >execute_controller_action), or just give away the original buffer. And
> > >frankly, I'm running out of ideas where else to check. Does anyone touch
> > >that buffer in parallel? Any ideas?
> > >
> > >Regards,
> > >
> > >Zoli
> > >
> > >___
> > >dev mailing list
> > >dev@openvswitch.org
> > >http://openvswitch.org/mailman/listinfo/dev
> > 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] what is the current state of this?

2013-04-23 Thread Gurucharan Shetty
On Tue, Apr 23, 2013 at 10:09 AM, christopher barry wrote:

> This thread discusses correct configuration on debian. It seems to have
> basically trailed off. I would like the correct method to create bridges
> out of my interfaces, and have them automagically startup when the box
> is rebooted. Surprisingly, the answer is still elusive.
>
> http://openvswitch.org/pipermail/discuss/2011-October/005925.html

I don't think there is a perfect way that solves every issue.
If you do not plan to use NFS through a OVS interface (Trying to mount NFS
through a OVS interface will fail if NFS starts before OVS), then you can
just use ovs-vsctl to create bridges. The state is maintained across
reboots as long as you do not delete the database (But you cannot
automatically call dhcp or configure static ip addresses).

If you want to configure through /etc/network/interfaces file, there is a
README file in the repo: debian/openvswitch-switch.README.Debian. If you
use branch 1.10, openvswitch startup script automatically reads the
interfaces file and creates bridges. While shutting down, it deletes the
bridges.

Thanks,
Guru



>
>
>
> Thanks,
> -C
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] 答复: ????: ????: [ovs-discuss] [ovs]deleting interfaces connnected to ports results in ovs-vswitchd memory growing

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 01:56:17AM +, pengyi Peng(Yi) wrote:
> I find that when ovs-vswitchd memory is abnormally growing, backlog
> in struct rpc is growing fast too. Almost every time in jsonrpc_run,
> stream_send returns -EAGAIN, the msg holds and waits next time to
> send. The backlog pluses the strlen of next msg. So the backlog can
> be as big as 1228252. But after all ports are deleted, the backlog
> will slowly decrease and in the end be zero again. However, at this
> time, with top command, the memory of ovs-vswitchd doesn't change at
> all. It is still bigger than the memory before all interfaces are
> deleted.

None of the above is unexpected.  In particular, it is normal for
Unixs implementation of malloc() to hold on to allocated memory even
after it has been freed.

> And last night, I tested another situation. 
> First I create 200 ports which try to connect to interfaces that don't exist.
> (ovs-vsctl --timeout=5 --no-wait add-port br-0 tap-no-exist$i 
> vlan_mode=access tag=0 -- set interface tap-no-exist$i type=system)
> 
> Second, I create 1000 ports connected to interfaces that exist.
> ( tunctl -t vif$i
> ifconfig vif$i up
> ovs-vsctl --timeout=5 --no-wait add-port br-0 vif$i vlan_mode=access tag=0 -- 
> set interface vif$i type=system)
> 
> After all ports finish connecting to interfaces, the memory of
> ovs-vswitchd is 530832KB. Today it changes to 52KB. But the
> backlog in struct rpc is still very big(455532592).

It is unusual to see the backlog grow so high.  We haven't observed
that behavior before when we've created and destroyed many interfaces,
or at least this is the first time it's been reported to me.  Can you
figure out why ovs-vswitchd is trying to commit so many transactions?
It would be better for it to do avoid doing that.  Maybe we can do
something to coalesce transactions or to skip some of them.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH net-next v3 1/2] genl: Allow concurrent genl callbacks.

2013-04-23 Thread Pravin B Shelar
All genl callbacks are serialized by genl-mutex. This can become
bottleneck in multi threaded case.
Following patch adds an parameter to genl_family so that a
particular family can get concurrent netlink callback without
genl_lock held.
New rw-sem is used to protect genl callback from genl family unregister.
in case of parallel_ops genl-family read-lock is taken for callbacks and
write lock is taken for register or unregistration for any family.
In case of locked genl family semaphore and gel-mutex is locked for
any openration.

Signed-off-by: Pravin B Shelar 
---
v2-v3:
 - rename lockless flag to parallel_ops
 - simplified genl-dump locking.
v1-v2:
 - No change.
---
 include/net/genetlink.h |1 +
 net/netlink/genetlink.c |  114 +++
 2 files changed, 77 insertions(+), 38 deletions(-)

diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index bdfbe68..93024a4 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -50,6 +50,7 @@ struct genl_family {
unsigned intversion;
unsigned intmaxattr;
boolnetnsok;
+   boolparallel_ops;
int (*pre_doit)(struct genl_ops *ops,
struct sk_buff *skb,
struct genl_info *info);
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 5a55be3..2f72598 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -16,10 +16,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 static DEFINE_MUTEX(genl_mutex); /* serialization of message processing */
+static DECLARE_RWSEM(cb_lock);
 
 void genl_lock(void)
 {
@@ -41,6 +43,18 @@ int lockdep_genl_is_held(void)
 EXPORT_SYMBOL(lockdep_genl_is_held);
 #endif
 
+static void genl_lock_all(void)
+{
+   down_write(&cb_lock);
+   genl_lock();
+}
+
+static void genl_unlock_all(void)
+{
+   genl_unlock();
+   up_write(&cb_lock);
+}
+
 #define GENL_FAM_TAB_SIZE  16
 #define GENL_FAM_TAB_MASK  (GENL_FAM_TAB_SIZE - 1)
 
@@ -144,7 +158,7 @@ int genl_register_mc_group(struct genl_family *family,
BUG_ON(grp->name[0] == '\0');
BUG_ON(memchr(grp->name, '\0', GENL_NAMSIZ) == NULL);
 
-   genl_lock();
+   genl_lock_all();
 
/* special-case our own group */
if (grp == ¬ify_grp)
@@ -213,7 +227,7 @@ int genl_register_mc_group(struct genl_family *family,
 
genl_ctrl_event(CTRL_CMD_NEWMCAST_GRP, grp);
  out:
-   genl_unlock();
+   genl_unlock_all();
return err;
 }
 EXPORT_SYMBOL(genl_register_mc_group);
@@ -255,9 +269,9 @@ static void __genl_unregister_mc_group(struct genl_family 
*family,
 void genl_unregister_mc_group(struct genl_family *family,
  struct genl_multicast_group *grp)
 {
-   genl_lock();
+   genl_lock_all();
__genl_unregister_mc_group(family, grp);
-   genl_unlock();
+   genl_unlock_all();
 }
 EXPORT_SYMBOL(genl_unregister_mc_group);
 
@@ -303,9 +317,9 @@ int genl_register_ops(struct genl_family *family, struct 
genl_ops *ops)
if (ops->policy)
ops->flags |= GENL_CMD_CAP_HASPOL;
 
-   genl_lock();
+   genl_lock_all();
list_add_tail(&ops->ops_list, &family->ops_list);
-   genl_unlock();
+   genl_unlock_all();
 
genl_ctrl_event(CTRL_CMD_NEWOPS, ops);
err = 0;
@@ -334,16 +348,16 @@ int genl_unregister_ops(struct genl_family *family, 
struct genl_ops *ops)
 {
struct genl_ops *rc;
 
-   genl_lock();
+   genl_lock_all();
list_for_each_entry(rc, &family->ops_list, ops_list) {
if (rc == ops) {
list_del(&ops->ops_list);
-   genl_unlock();
+   genl_unlock_all();
genl_ctrl_event(CTRL_CMD_DELOPS, ops);
return 0;
}
}
-   genl_unlock();
+   genl_unlock_all();
 
return -ENOENT;
 }
@@ -373,7 +387,7 @@ int genl_register_family(struct genl_family *family)
INIT_LIST_HEAD(&family->ops_list);
INIT_LIST_HEAD(&family->mcast_groups);
 
-   genl_lock();
+   genl_lock_all();
 
if (genl_family_find_byname(family->name)) {
err = -EEXIST;
@@ -394,7 +408,7 @@ int genl_register_family(struct genl_family *family)
goto errout_locked;
}
 
-   if (family->maxattr) {
+   if (family->maxattr && !family->parallel_ops) {
family->attrbuf = kmalloc((family->maxattr+1) *
sizeof(struct nlattr *), GFP_KERNEL);
if (family->attrbuf == NULL) {
@@ -405,14 +419,14 @@ int genl_register_family(struct genl_family *family)
family->attrbuf = NULL;
 
list_add_tail(&family->family_list, genl_family_chain(family->id));
-   

[ovs-dev] [PATCH net-next v3 2/2] openvswitch: Use parallel_ops genl.

2013-04-23 Thread Pravin B Shelar
OVS locking was recently changed to have private OVS lock which
simplified overall locking.  Therefore there is no need to have
another global genl lock to protect OVS data structures.  Following
patch uses of parallel_ops genl family for OVS.  This also allows
more granual OVS locking using ovs_mutex for protecting OVS data
structures, which gives more concurrencey.  E.g multiple genl
operations OVS_PACKET_CMD_EXECUTE can run in parallel, etc.

Signed-off-by: Pravin B Shelar 
---
v2-v3:
 - No change.
v1-v2:
 - Updated commit msg according comment From Jesse.
---
 net/openvswitch/datapath.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index d2f9f2e..74a5fe6 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -270,7 +270,8 @@ static struct genl_family dp_packet_genl_family = {
.name = OVS_PACKET_FAMILY,
.version = OVS_PACKET_VERSION,
.maxattr = OVS_PACKET_ATTR_MAX,
-   .netnsok = true
+   .netnsok = true,
+   .parallel_ops = true,
 };
 
 int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb,
@@ -836,7 +837,8 @@ static struct genl_family dp_flow_genl_family = {
.name = OVS_FLOW_FAMILY,
.version = OVS_FLOW_VERSION,
.maxattr = OVS_FLOW_ATTR_MAX,
-   .netnsok = true
+   .netnsok = true,
+   .parallel_ops = true,
 };
 
 static struct genl_multicast_group ovs_dp_flow_multicast_group = {
@@ -1269,7 +1271,8 @@ static struct genl_family dp_datapath_genl_family = {
.name = OVS_DATAPATH_FAMILY,
.version = OVS_DATAPATH_VERSION,
.maxattr = OVS_DP_ATTR_MAX,
-   .netnsok = true
+   .netnsok = true,
+   .parallel_ops = true,
 };
 
 static struct genl_multicast_group ovs_dp_datapath_multicast_group = {
@@ -1623,7 +1626,8 @@ static struct genl_family dp_vport_genl_family = {
.name = OVS_VPORT_FAMILY,
.version = OVS_VPORT_VERSION,
.maxattr = OVS_VPORT_ATTR_MAX,
-   .netnsok = true
+   .netnsok = true,
+   .parallel_ops = true,
 };
 
 struct genl_multicast_group ovs_dp_vport_multicast_group = {
-- 
1.7.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 02/22] deal with platforms where backtrace() is in a different library than libc.

2013-04-23 Thread Ben Pfaff
On Tue, Apr 23, 2013 at 09:34:13AM -0700, Ben Pfaff wrote:
> On Tue, Apr 23, 2013 at 01:30:22AM +, YAMAMOTO Takashi wrote:
> > hi,
> > 
> > > On Mon, Apr 22, 2013 at 10:20:00PM +0900, YAMAMOTO Takashi wrote:
> > >> execinfo for NetBSD and ubacktrace for uClibc.
> > >> i don't know if the latter is relevant to Open vSwitch, though.
> > >> 
> > >> Signed-off-by: YAMAMOTO Takashi 
> > > 
> > > I'm pretty sure this change is going to break (or at least reduce
> > > functionality of) lib/timeval.c on GNU/Linux because it checks for
> > > HAVE_EXECINFO_H and I think that this change removes that check.
> > 
> > thanks for reviewing.
> > 
> > it seems HAVE_EXECINFO_H is provided by another execinfo.h check in
> > configure.ac.  (why having the check in two places?)
> > 
> > my change seems to break the "Defines HAVE_BACKTRACE" part, though.
> > 
> > is there any reason why you check HAVE_EXECINFO_H in lib/timeval.c
> > and HAVE_BACKTRACE in lib/backtrace.c?
> 
> I didn't realized that we had the check in two places.  I'll take a
> closer look soon.

I applied this patch to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] timeval: Check for HAVE_BACKTRACE instead of HAVE_EXECINFO_H.

2013-04-23 Thread Ben Pfaff
Other code in the tree uses HAVE_BACKTRACE and then blindly includes
 if it is present, so this doesn't make anything worse.

Once we do that, HAVE_EXECINFO_H has no further users, so this commit also
removes the check for 

Reported-by: YAMAMOTO Takashi 
Signed-off-by: Ben Pfaff 
---
 configure.ac  |2 +-
 lib/timeval.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 79fb46e..9f42941 100644
--- a/configure.ac
+++ b/configure.ac
@@ -61,7 +61,7 @@ OVS_CHECK_STRTOK_R
 AC_CHECK_MEMBERS([struct stat.st_mtim.tv_nsec, struct stat.st_mtimensec],
   [], [], [[#include ]])
 AC_CHECK_FUNCS([mlockall strnlen strsignal getloadavg statvfs setmntent])
-AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h 
execinfo.h])
+AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h])
 
 OVS_CHECK_PKIDIR
 OVS_CHECK_RUNDIR
diff --git a/lib/timeval.c b/lib/timeval.c
index 6e41514..163de1e 100644
--- a/lib/timeval.c
+++ b/lib/timeval.c
@@ -37,7 +37,7 @@
 
 /* backtrace() from  is really useful, but it is not signal safe
  * everywhere, such as on x86-64.  */
-#if HAVE_EXECINFO_H && !defined __x86_64__
+#if HAVE_BACKTRACE && !defined __x86_64__
 #  define USE_BACKTRACE 1
 #  include 
 #else
-- 
1.7.2.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] FAQ: Describe how to connect bridges with patch ports.

2013-04-23 Thread Kyle Mestery (kmestery)
On Apr 23, 2013, at 11:30 AM, Ben Pfaff  wrote:
> I keep seeing this question.
> 
> Signed-off-by: Ben Pfaff 


Looks good to me Ben, very succinct explanation.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] datapath: Use openvswitch_handle_frame hook in >=RHEL6.4 to live side by side with bridging

2013-04-23 Thread Jesse Gross
On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:
> diff --git a/datapath/vport-netdev.c b/datapath/vport-netdev.c
> index f8948fa..c0e3c02 100644
> --- a/datapath/vport-netdev.c
> +++ b/datapath/vport-netdev.c
> @@ -153,10 +161,16 @@ static struct vport *netdev_create(const struct 
> vport_parms *parms)
> }
>
> rtnl_lock();
> +#ifdef HAVE_RHEL_OVS_HOOK
> +   netdev_vport->dev->ax25_ptr = vport;
> +   atomic_inc(&nr_bridges);
> +   openvswitch_handle_frame_hook = netdev_frame_hook;

Shouldn't these assignments use rcu_assign_pointer?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] datapath: Account for RHEL6.4 backports in compat layer

2013-04-23 Thread Jesse Gross
On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:
> diff --git a/datapath/linux/compat/include/asm/percpu.h 
> b/datapath/linux/compat/include/asm/percpu.h
> index 404b937..65bce08 100644
> --- a/datapath/linux/compat/include/asm/percpu.h
> +++ b/datapath/linux/compat/include/asm/percpu.h
> @@ -3,7 +3,7 @@
>
>  #include_next 
>
> -#ifndef this_cpu_ptr
> +#if !defined this_cpu_ptr && !defined HAVE_THIS_CPU_PTR

This one surprises me - I would expect that the check for the
definition should already catch any backports. Can you describe what
the issue is?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next 1/2] net: More fine-grained support for encapsulated GSO features

2013-04-23 Thread Joseph Gasparakis


On Mon, 22 Apr 2013, Simon Horman wrote:

> "net: Add support for hardware-offloaded encapsulation" introduced
> the encapsulation field of struct skb, which when set provides hints
> that GSO should handle an skb that encapsulates a packet.
> 
> This patch adds an encapsulation_features field which provides
> a hint to dev dev_hard_start_xmit() that harware-offload encapsulation
> features should be used. Previously this hint was provided by the
> encapsulation field.
> 
> The other uses of the encapsulation field are left unchanged.
> 
> The two in-tree locations that set the encapsulation have been updated to
> also set encapsulation_field.
> 
> The motivation for this is to provide support segmentation of GSO MPLS skbs.
> This may be necessary when a non-MPLS GSO skb is turned into an MPLS GSO
> skb by Open vSwtich and its MPLS push action.
> 
> In this case it  harware-offload encapsulation features should be used,
> actually to be more exact software segmentation should be selected, but
> other hints provided by the encapsulation field are not applicable.
> 
> Cc: Joseph Gasparakis 
> Cc: Peter P Waskiewicz Jr 
> Cc: Alexander Duyck 
> Signed-off-by: Simon Horman 
> ---
>  include/linux/skbuff.h | 9 -
>  net/core/dev.c | 2 +-
>  net/core/skbuff.c  | 1 +
>  net/ipv4/gre.c | 1 +
>  net/ipv4/ipip.c| 1 +
>  5 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 2e0ced1..d9ec1de 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -494,7 +494,14 @@ struct sk_buff {
>* headers if needed
>*/
>   __u8encapsulation:1;
> - /* 7/9 bit hole (depending on ndisc_nodetype presence) */
> + /* Encapsulation protocol and NIC drivers should use
> +  * this flag to indicate to each other if the skb contains
> +  * encapsulated packet and GSO should use encapsulation features
> +  * instead of standard features for the netdev. This is typically
> +  * a subset of cases where skb->encapsulation is set.
> +  */
> + __u8encapsulation_features:1;
> + /* 6/8 bit hole (depending on ndisc_nodetype presence) */
>   kmemcheck_bitfield_end(flags2);
>  
>  #ifdef CONFIG_NET_DMA
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9e26b8d..53236c5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2498,7 +2498,7 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct 
> net_device *dev,
>* hardware encapsulation features instead of standard
>* features for the netdev
>*/
> - if (skb->encapsulation)
> + if (skb->encapsulation_features)
>   features &= dev->hw_enc_features;
>  
>   if (netif_needs_gso(skb, features)) {
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 898cf5c..f23d136 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -708,6 +708,7 @@ static void __copy_skb_header(struct sk_buff *new, const 
> struct sk_buff *old)
>   new->l4_rxhash  = old->l4_rxhash;
>   new->no_fcs = old->no_fcs;
>   new->encapsulation  = old->encapsulation;
> + new->encapsulation_features = old->encapsulation_features;
>  #ifdef CONFIG_XFRM
>   new->sp = secpath_get(old->sp);
>  #endif
> diff --git a/net/ipv4/gre.c b/net/ipv4/gre.c
> index 0ae998b..8420f29 100644
> --- a/net/ipv4/gre.c
> +++ b/net/ipv4/gre.c
> @@ -157,6 +157,7 @@ static struct sk_buff *gre_gso_segment(struct sk_buff 
> *skb,
>   }
>  
>   skb->encapsulation = 0;
> + skb->encapsulation_features = 0;
>  
>   if (unlikely(!pskb_may_pull(skb, ghl)))
>   goto out;
> diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
> index 77bfcce..a6db3c0 100644
> --- a/net/ipv4/ipip.c
> +++ b/net/ipv4/ipip.c
> @@ -220,6 +220,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, 
> struct net_device *dev)
>   if (likely(!skb->encapsulation)) {
>   skb_reset_inner_headers(skb);
>   skb->encapsulation = 1;
> + skb->encapsulation_features = 1;
>   }
>  
>   ip_tunnel_xmit(skb, dev, tiph);
> 

Any particular reason to introduce skb->encapsulation_features instead of 
using the existing skb->encapsulation? Also I don't see it used in your 
second patch either.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] FAQ: Describe how to connect bridges with patch ports.

2013-04-23 Thread Justin Pettit
On Apr 23, 2013, at 9:30 AM, Ben Pfaff  wrote:

> +   If you still want to connect two bridges, you can use a pair of
> +   patch ports.  The follow example creates bridges br0 and br1, adds

s/follow/following/

Otherwise, looks good to me.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] datapath: Account for RHEL6.4 backports in compat layer

2013-04-23 Thread Thomas Graf

On 04/23/2013 10:42 PM, Jesse Gross wrote:

On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:

diff --git a/datapath/linux/compat/include/asm/percpu.h 
b/datapath/linux/compat/include/asm/percpu.h
index 404b937..65bce08 100644
--- a/datapath/linux/compat/include/asm/percpu.h
+++ b/datapath/linux/compat/include/asm/percpu.h
@@ -3,7 +3,7 @@

  #include_next 

-#ifndef this_cpu_ptr
+#if !defined this_cpu_ptr && !defined HAVE_THIS_CPU_PTR


This one surprises me - I would expect that the check for the
definition should already catch any backports. Can you describe what
the issue is?


The reason for that is that RHEL defines this_cpu_ptr in a different
header file and not in asm/percpu.h. I found what I proposed to be
the easiest workaround for that.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] datapath: Account for RHEL6.4 backports in compat layer

2013-04-23 Thread Jesse Gross
On Tue, Apr 23, 2013 at 2:27 PM, Thomas Graf  wrote:
> On 04/23/2013 10:42 PM, Jesse Gross wrote:
>>
>> On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:
>>>
>>> diff --git a/datapath/linux/compat/include/asm/percpu.h
>>> b/datapath/linux/compat/include/asm/percpu.h
>>> index 404b937..65bce08 100644
>>> --- a/datapath/linux/compat/include/asm/percpu.h
>>> +++ b/datapath/linux/compat/include/asm/percpu.h
>>> @@ -3,7 +3,7 @@
>>>
>>>   #include_next 
>>>
>>> -#ifndef this_cpu_ptr
>>> +#if !defined this_cpu_ptr && !defined HAVE_THIS_CPU_PTR
>>
>>
>> This one surprises me - I would expect that the check for the
>> definition should already catch any backports. Can you describe what
>> the issue is?
>
>
> The reason for that is that RHEL defines this_cpu_ptr in a different
> header file and not in asm/percpu.h. I found what I proposed to be
> the easiest workaround for that.

OK, thanks.

I was about to apply this but it adds a new warning (on a non-RHEL platform):
/home/jgross/openvswitch/datapath/linux/netdevice.c:5:13: warning:
‘can_checksum_protocol’ defined but not used [-Wunused-function]

This happens since the code in netdevice.c was previously under a
single #ifdef and now the blocks can be independently used.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] timeval: Check for HAVE_BACKTRACE instead of HAVE_EXECINFO_H.

2013-04-23 Thread Justin Pettit
Looks good.

--Justin


On Apr 23, 2013, at 11:07 AM, Ben Pfaff  wrote:

> Other code in the tree uses HAVE_BACKTRACE and then blindly includes
>  if it is present, so this doesn't make anything worse.
> 
> Once we do that, HAVE_EXECINFO_H has no further users, so this commit also
> removes the check for 
> 
> Reported-by: YAMAMOTO Takashi 
> Signed-off-by: Ben Pfaff 
> ---
> configure.ac  |2 +-
> lib/timeval.c |2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 79fb46e..9f42941 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -61,7 +61,7 @@ OVS_CHECK_STRTOK_R
> AC_CHECK_MEMBERS([struct stat.st_mtim.tv_nsec, struct stat.st_mtimensec],
>   [], [], [[#include ]])
> AC_CHECK_FUNCS([mlockall strnlen strsignal getloadavg statvfs setmntent])
> -AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h 
> execinfo.h])
> +AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h])
> 
> OVS_CHECK_PKIDIR
> OVS_CHECK_RUNDIR
> diff --git a/lib/timeval.c b/lib/timeval.c
> index 6e41514..163de1e 100644
> --- a/lib/timeval.c
> +++ b/lib/timeval.c
> @@ -37,7 +37,7 @@
> 
> /* backtrace() from  is really useful, but it is not signal safe
>  * everywhere, such as on x86-64.  */
> -#if HAVE_EXECINFO_H && !defined __x86_64__
> +#if HAVE_BACKTRACE && !defined __x86_64__
> #  define USE_BACKTRACE 1
> #  include 
> #else
> -- 
> 1.7.2.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] datapath: Account for RHEL6.4 backports in compat layer

2013-04-23 Thread Thomas Graf

On 04/24/2013 12:06 AM, Jesse Gross wrote:

OK, thanks.

I was about to apply this but it adds a new warning (on a non-RHEL platform):
/home/jgross/openvswitch/datapath/linux/netdevice.c:5:13: warning:
‘can_checksum_protocol’ defined but not used [-Wunused-function]

This happens since the code in netdevice.c was previously under a
single #ifdef and now the blocks can be independently used.


Right, I'll respin the patch and move it into the #ifdef containing
the only user.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] datapath: Use openvswitch_handle_frame hook in >=RHEL6.4 to live side by side with bridging

2013-04-23 Thread Thomas Graf

On 04/23/2013 10:39 PM, Jesse Gross wrote:

On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:

diff --git a/datapath/vport-netdev.c b/datapath/vport-netdev.c
index f8948fa..c0e3c02 100644
--- a/datapath/vport-netdev.c
+++ b/datapath/vport-netdev.c
@@ -153,10 +161,16 @@ static struct vport *netdev_create(const struct 
vport_parms *parms)
 }

 rtnl_lock();
+#ifdef HAVE_RHEL_OVS_HOOK
+   netdev_vport->dev->ax25_ptr = vport;
+   atomic_inc(&nr_bridges);
+   openvswitch_handle_frame_hook = netdev_frame_hook;


Shouldn't these assignments use rcu_assign_pointer?


Are you worried about a missing wmb? The hook is only used if
IFF_OVS_DATAPATH is set on skb->dev and there should be a wmb when
the device is actually added to the list of vports I guess.

I have no problem adding it though.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] datapath: Use openvswitch_handle_frame hook in >=RHEL6.4 to live side by side with bridging

2013-04-23 Thread Jesse Gross
On Tue, Apr 23, 2013 at 3:28 PM, Thomas Graf  wrote:
> On 04/23/2013 10:39 PM, Jesse Gross wrote:
>>
>> On Fri, Apr 19, 2013 at 7:01 AM, Thomas Graf  wrote:
>>>
>>> diff --git a/datapath/vport-netdev.c b/datapath/vport-netdev.c
>>> index f8948fa..c0e3c02 100644
>>> --- a/datapath/vport-netdev.c
>>> +++ b/datapath/vport-netdev.c
>>> @@ -153,10 +161,16 @@ static struct vport *netdev_create(const struct
>>> vport_parms *parms)
>>>  }
>>>
>>>  rtnl_lock();
>>> +#ifdef HAVE_RHEL_OVS_HOOK
>>> +   netdev_vport->dev->ax25_ptr = vport;
>>> +   atomic_inc(&nr_bridges);
>>> +   openvswitch_handle_frame_hook = netdev_frame_hook;
>>
>>
>> Shouldn't these assignments use rcu_assign_pointer?
>
>
> Are you worried about a missing wmb? The hook is only used if
> IFF_OVS_DATAPATH is set on skb->dev and there should be a wmb when
> the device is actually added to the list of vports I guess.
>
> I have no problem adding it though.

I agree that in practice there are other memory barriers that will
protect this. However, since the locks that they are associated with
aren't really protecting this, it makes it more difficult to reason
about than necessary. Since we're using RCU it seems like a good idea
to use the whole mechanism.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next 2/2] net: Loosen constraints for recalculating checksum in skb_segment()

2013-04-23 Thread Jesse Gross
On Mon, Apr 22, 2013 at 7:19 PM, Simon Horman  wrote:
> In the case where a non-MPLS GSO skb becomes an MPLS GSO skb, via
> Open vSwitch's push MPLS action it is desirable to provide segmentation
> in software. In this case the original protocol of the skb may have allowed
> its checksumming to be offloaded but this may no longer be supported now
> the skb is MPLS. Actually it seems to me that this is the likely case.
>
> In order to allow the checksum to be updated in this case loosen
> the rules for recalculating the checksum on in skb_segment().
>
> N.B.: I must confess that I am a little unsure of the details of
> the implementation of skb_checksum(). But I have observed that this
> is necessary as skb_checksum() hits the following:
>
> if (!hsize && i >= nfrags) {
> ...
> fskb = fskb->next;
> ...
> }
> ...
> if (fskb != skb_shinfo(skb)->frag_list)
> ...
>
> Signed-off-by: Simon Horman 

I'm surprised at the need for changes here since neither vlans nor
tunneling protocols needed something similar. Do you know what is
special about MPLS that is triggering this?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2.26] datapath: Add basic MPLS support to kernel

2013-04-23 Thread Jesse Gross
On Tue, Apr 23, 2013 at 12:58 AM, Simon Horman  wrote:
> On Tue, Apr 23, 2013 at 07:41:37AM +, Rajahalme, Jarno (NSN - FI/Espoo) 
> wrote:
>>
>> On Apr 23, 2013, at 4:51 , ext Simon Horman wrote:
>>
>> > On Mon, Apr 22, 2013 at 01:55:43PM +, Rajahalme, Jarno (NSN - 
>> > FI/Espoo) wrote:
>> >>
>> >> Here the skb_network_header is changed to point to the L3 header. Is it
>> >> significant that in some cases (?) mpls_stack_depth may remain at zero,
>> >> even when a MPLS header was in fact added? (See above).
>> >
>> > With the current code I believe there are the following cases:
>> >
>> > Input: non-MPLS skb: Output: network header and mac_len correspond to the
>> > beginning of the L3 headers
>> > Input: MPLS: Output: network header and mac_len correspond to the
>> > end of the L2 headers.
>> >
>> > In the case of MPLS output the end of the L2 headers and the beginning
>> > of the L3 headers will differ.
>> >
>> >
>> > As far as I know the network header and mac_len only need to correspond to
>> > the beginning of the L3 header if GSO segmentation will occur (actually,
>> > some proposed changes to the network stack are required, see "[PATCH 0/2]
>> > Small Modifications to GSO to allow segmentation of MPLS").  That only
>> > occurs if the skb is GSO. Which in turn can only occur if the recieved
>> > packet is non-MPLS. This is because the linux kernel doesn't support
>> > MPLS offloads on receive (or anywhere else for that matter).
>> >
>> > In the case that we have a non-MPLS skb the stack depth starts at zero and
>> > is tracked. This is used to update the network header and mac_len.
>> > Otherwise the stack depth is unknown and the network header and mac_len are
>> > left as-is, corresponding to the end of the L2 headers.
>> >
>> > Actually, it is possible to tighten up the if clause to be the following,
>> > as it is only necessary to update the network header and mac_len for GSO 
>> > skbs.
>> >
>> > if (mpls_stack_depth && skb_is_gso(skb)) {
>> > ...
>> > }
>> >
>> > It is possible for us to find and track the MPLS stack depth for all cases
>> > and to update the network header and mac_len. However I don't think that
>> > there is any run-time benefit and it seems expensive to find out what the
>> > original stack depth was - I believe it would require parsing the MPLS
>> > entire stack for each packet.
>> >
>>
>> Thanks for explaining this.
>>
>> I think it would be better to keep updating the the network_header and 
>> mac_len for the Non-MPLS input packets regardless of the GSO status of the 
>> skb. It would be more consistent and less surprising.
>
> I agree entirely that it would be more consistent and less surprising.
> But I'm not sure if the cost is worth it.
>
> Jesse, do you have an opinion on this?

In general, I would tend to agree with Jarno that keeping this
consistent would be significantly easier to understand. I think the
cost is probably not particularly high.

However, I also think that having different meanings for the layer
pointers inside and outside of OVS is not particularly ideal since it
makes the overall system harder to understand. Using network header
for the start of the MPLS stack might not be great since it means that
we couldn't really take advantage of any actual hardware offloading in
the future. Maybe we could use mac_len for that purpose and that would
keep things more consistent?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev