Re: [ovs-dev] [PATCH 1/3] bond: Pseudo-randomly choose initial slave.
On Mon, Feb 04, 2013 at 07:13:34PM -0800, Ethan Jackson wrote: > Traditionally, (unless balancing was turned off), the bonding code > chose the active slave for all flows which hadn't previously been > allocated to another. This worked fine in theory because traffic > would eventually be balanced away if there was congestion. > > Instead of the historical approach, this patch uses a hash to > choose the initial slave, giving what behaves like a random > allocation. This performs better on switches using a long > rebalance interval, is less confusing to users, and removes a > special case needed for dealing with non-balanced bonds. > > Signed-off-by: Ethan Jackson This doesn't change anything in the common case, only in the case where the first randomly chosen slave (via hmap_random_node()) is disabled. Is that corner case worth worrying about? It can only make a difference in a bond with three or more slaves. If it is worth dealing with, why pass a hash to choose_stb_slave() instead of just a random number? (And why not just use choose_stb_slave() by itself instead of as a second pass?) > --- a/lib/bond.c > +++ b/lib/bond.c > @@ -1442,15 +1442,12 @@ choose_output_slave(const struct bond *bond, const > struct flow *flow, > } > /* Fall Through. */ > case BM_SLB: > -if (!bond_is_balanced(bond)) { > -return choose_stb_slave(bond, bond_hash(bond, flow, vlan)); > -} > e = lookup_bond_entry(bond, flow, vlan); > if (!e->slave || !e->slave->enabled) { > e->slave = CONTAINER_OF(hmap_random_node(&bond->slaves), > struct bond_slave, hmap_node); > if (!e->slave->enabled) { > -e->slave = bond->active_slave; > +e->slave = choose_stb_slave(bond, bond_hash(bond, flow, > vlan)); > } > e->tag = tag_create_random(); > } > -- > 1.7.9.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/3] bond: Remove stable bond mode.
On Mon, Feb 04, 2013 at 07:13:35PM -0800, Ethan Jackson wrote: > Stable bond mode, along with autopath, were trying to implement > functionality close to what we get from the bundle action. > Unfortunately, they are quite clunky, and generally less useful > than bundle, so they're being removed. > > Signed-off-by: Ethan Jackson The comment on the function now called choose_hash_slave() seems rather shrill. I don't think the performance is an issue. choose_hash_slave() is now just trying to choose a random slave, right? Then we can use a simpler algorithm that doesn't need any extra input: /* Randomly chooses and returns an enabled slave. Returns NULL if no slave is * enabled. */ static struct bond_slave * choose_random_slave(const struct bond *bond) { struct bond_slave *best, *slave; int n; n = 0; best = NULL; HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) { if (slave->enabled && !random_range(++n)) { best = slave; } } return best; } Thanks, Ben. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] RFC: Pass more packet and flow key info to userspace.
On Jan 31, 2013, at 3:19 , ext Jesse Gross wrote: > On Wed, Jan 30, 2013 at 3:14 AM, Rajahalme, Jarno (NSN - FI/Espoo) > wrote: >> >> Otherwise not so much use of the layer pointers, but >> dpif_flow_stats_extract() >> does packet_get_tcp_flags(), which needs the layer pointers. >> dpif_flow_stats_extract() is called on each miss packet (but the collected >> tcp_flags are used only with facets, for netflow and facet->tcp_flags). >> Other use is for controller action and special processing (stpp/cfm/lacp). >> >> So it seems that deferred layer pointer setting could be effective only if >> the >> stats extraction would be changed to skip tcp_flags for the cases where they >> are not going to be used (i.e. when not making facets, as might be expected >> under heavy load?). > > That's a good point about the TCP flags. > > I think it's actually not right that we don't use the TCP flags when > we don't create facets under load. Not creating facets is an > optimization and that shouldn't affect the results that are seen > through NetFlow. Fixing this seem non-trivial, as the nf_flow resides within struct facet. Without a facet I do not think there is any memory of specific flows seen, so there is no basis for reporting such flows vie netflow or any other means. The design seems to be that tracking of short flows is not essential under load, see this comment for flow_miss_should_make_facet(): /* Figures out whether a flow that missed in 'ofproto', whose details are in * 'miss', is likely to be worth tracking in detail in userspace and (usually) * installing a datapath flow. The answer is usually "yes" (a return value of * true). However, for short flows the cost of bookkeeping is much higher than * the benefits, so when the datapath holds a large number of flows we impose * some heuristics to decide which flows are likely to be worth tracking. */ If, however, netflow should not be affected, then a facet should be installed for every flow. In that case maybe the installation of kernel flows could be avoided for short flows under load? > > I suspect that there are quite a lot of cases where NetFlow isn't used > (NXAST_FIN_TIMEOUT is the other case that comes to mind that uses the > flags) so it might be worthwhile to turn off collection when those > features aren't in use. It seems possible that netflow is turned on at runtime. Would it be acceptable that netflow data would be missing for the time before netflow was turned on? > >> Also, I noticed that the facets hmap uses the hash from flow_miss, and >> according to the comments that should be flow_hash(flow, 0). However, >> the only place the hash originates in the code is at handle_miss_upcalls(), >> so again, there is no harm in using a kernel provided hash instead. But the >> comments need to be updated if this change is adopted. > > It would probably be good to benchmark the results of these two > changes independently to see which is more significant. > > In general, I'm nervous about putting optimizations into the > userspace/kernel interface because it exposes implementation > information. Keeping a neutral format can actually make it easier to > optimize in the future. In particular, I know that Ben is thinking > about grouping together flows at a coarser granularity than the kernel > uses, which would make passing the hash up not that useful. While I would not see a problem in passing a key hash computed with an undisclosed algorithm up to userspace, I see your point. In my tests it seems that about 1/3 of the benefit is attainable with deferred layer pointer computation within the userspace. Jarno ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 3/3] nicira-ext: Remove the autopath action.
On Mon, Feb 04, 2013 at 07:13:36PM -0800, Ethan Jackson wrote: > The autopath action was attempting to achieve functionality similar > to the bundle action, but was significantly clunkier, more > difficult to understand, more difficult to use, and less reliable. > This patch removes it. > > Signed-off-by: Ethan Jackson Looks good, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Tue, Feb 05, 2013 at 10:13:18AM +0900, Simon Horman wrote: > On Tue, Feb 05, 2013 at 09:00:35AM +0900, Simon Horman wrote: > > On Mon, Feb 04, 2013 at 10:09:20AM -0800, Ben Pfaff wrote: > > > On Fri, Jan 25, 2013 at 04:22:07PM +0900, Simon Horman wrote: > > > > This patch implements use-space datapath and non-datapath code > > > > to match and use the datapath API set out in Leo Alterman's patch > > > > "user-space datapath: Add basic MPLS support to kernel". > > > > > > > > The resulting MPLS implementation supports: > > > > * Pushing a single MPLS label > > > > * Poping a single MPLS label > > > > * Modifying an MPLS lable using set-field or load actions > > > > that act on the label value, tc and bos bit. > > > > * There is no support for manipulating the TTL > > > > this is considered future work. > > > > > > > > The single-level push pop limitation is implemented by processing > > > > push, pop and set-field/load actions in order and discarding information > > > > that would require multiple levels of push/pop to be supported. > > > > > > > > e.g. > > > >push,push -> the first push is discarded > > > >pop,pop -> the first pop is discarded > > > > > > > > This patch is based heavily on work by Ravi K. > > > > > > > > Cc: Ravi K > > > > Reviewed-by: Isaku Yamahata > > > > Signed-off-by: Simon Horman > > > > > > I resolved a few merge conflicts, then I went through this fairly > > > carefully and made a few changes, mostly cosmetic. I'm appending an > > > incremental and then a full revised patch. > > > > > > Here's a little commentary on the less trivial changes I made: > > > > > >- I think that eth_mpls_depth() was capable of counting an MPLS label > > > that was off the end of the packet as part of the depth (for > > > example, if there was one MPLS label without BOS followed by the > > > end of the packet, I believe that it would count this as depth 2). > > > This wasn't a problem for current callers but it seemed like a bad > > > idea, so I changed it. Please check that it looks correct. > > > > > >- I think mf_random_value() should have used ->u8 for TC and BOS so I > > > changed it. > > > > Thanks, both of those changes seem reasonable to me. I'll review > > these and the changes in your diff more thoroughly and get back to you ASAP. > > I have reviewed all the changes and they look good to me. > Please feel free to apply. I can then rebase the remaining patches on > master. I fixed a few indentation errors I noticed at the last minute and applied this to master. Thanks! > > > I have one question before I commit this: it seems that we could support > > > a series of more than one pop_mpls action, why do we limit the number of > > > pops to just one? > > > > I think that I agree with you. I'll take a look and see how practical > > it would be to remove that restriction. > > I'll start looking into this in more depth. But I'd rather handle any > updates as a subsequent change to this patch. Thanks, that sounds fine. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] ovs-ctl: Add support for built-in (non-modular) kernel support.
Reported-by: Cong Wang Signed-off-by: Ben Pfaff --- Cong, will you verify that this makes the init script work OK for your kernel with OVS built-in? Thanks, Ben. utilities/ovs-ctl.in |6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in index 8288e13..63dbbca 100755 --- a/utilities/ovs-ctl.in +++ b/utilities/ovs-ctl.in @@ -35,6 +35,12 @@ insert_openvswitch_mod_if_required () { test -e /sys/module/openvswitch -o -e /sys/module/openvswitch_mod && \ return 0 +# If "ovs-dpctl show" succeeds, even though the kernel module doesn't +# appear to be loaded, then Open vSwitch support is built into the +# kernel instead of modular (or the kernel module is loaded but sysfs +# is not mounted). +ovs-dpctl show -voff >/dev/null 2>&1 && return 0 + # Load openvswitch. If that's successful then we're done. action "Inserting openvswitch module" modprobe openvswitch && return 0 -- 1.7.2.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 1/3] bond: Pseudo-randomly choose initial slave.
Upon reflection, I think you're right. I've just dropped this patch. Ethan On Tue, Feb 5, 2013 at 8:40 AM, Ben Pfaff wrote: > On Mon, Feb 04, 2013 at 07:13:34PM -0800, Ethan Jackson wrote: >> Traditionally, (unless balancing was turned off), the bonding code >> chose the active slave for all flows which hadn't previously been >> allocated to another. This worked fine in theory because traffic >> would eventually be balanced away if there was congestion. >> >> Instead of the historical approach, this patch uses a hash to >> choose the initial slave, giving what behaves like a random >> allocation. This performs better on switches using a long >> rebalance interval, is less confusing to users, and removes a >> special case needed for dealing with non-balanced bonds. >> >> Signed-off-by: Ethan Jackson > > This doesn't change anything in the common case, only in the case where > the first randomly chosen slave (via hmap_random_node()) is disabled. > Is that corner case worth worrying about? It can only make a difference > in a bond with three or more slaves. If it is worth dealing with, why > pass a hash to choose_stb_slave() instead of just a random number? (And > why not just use choose_stb_slave() by itself instead of as a second > pass?) > >> --- a/lib/bond.c >> +++ b/lib/bond.c >> @@ -1442,15 +1442,12 @@ choose_output_slave(const struct bond *bond, const >> struct flow *flow, >> } >> /* Fall Through. */ >> case BM_SLB: >> -if (!bond_is_balanced(bond)) { >> -return choose_stb_slave(bond, bond_hash(bond, flow, vlan)); >> -} >> e = lookup_bond_entry(bond, flow, vlan); >> if (!e->slave || !e->slave->enabled) { >> e->slave = CONTAINER_OF(hmap_random_node(&bond->slaves), >> struct bond_slave, hmap_node); >> if (!e->slave->enabled) { >> -e->slave = bond->active_slave; >> +e->slave = choose_stb_slave(bond, bond_hash(bond, flow, >> vlan)); >> } >> e->tag = tag_create_random(); >> } >> -- >> 1.7.9.5 >> >> ___ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/3] bond: Remove stable bond mode.
Since I've dropped the first patch, I think choose_hash_slave() is no longer necessary. In this version of the patch, I've removed it. --- NEWS |1 + lib/bond.c | 87 +++- lib/bond.h |4 +- ofproto/ofproto-dpif.c | 12 ++ ofproto/ofproto.h |1 - vswitchd/bridge.c | 23 ++-- vswitchd/vswitch.ovsschema |6 +-- vswitchd/vswitch.xml | 26 - 8 files changed, 17 insertions(+), 143 deletions(-) diff --git a/NEWS b/NEWS index 186e1a3..58c0fcc 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,6 @@ post-v1.10.0 - +- Stable bond mode has been removed. v1.10.0 - xx xxx diff --git a/lib/bond.c b/lib/bond.c index 06680ee..ef7d28d 100644 --- a/lib/bond.c +++ b/lib/bond.c @@ -75,9 +75,6 @@ struct bond_slave { struct list bal_node; /* In bond_rebalance()'s 'bals' list. */ struct list entries;/* 'struct bond_entry's assigned here. */ uint64_t tx_bytes; /* Sum across 'tx_bytes' of entries. */ - -/* BM_STABLE specific bonding info. */ -uint32_t stb_id;/* ID used for 'stb_slaves' ordering. */ }; /* A bond, that is, a set of network devices grouped to improve performance or @@ -104,9 +101,6 @@ struct bond { long long int next_rebalance; /* Next rebalancing time. */ bool send_learning_packets; -/* BM_STABLE specific bonding info. */ -tag_type stb_tag; /* Tag associated with this bond. */ - /* Legacy compatibility. */ long long int next_fake_iface_update; /* LLONG_MAX if disabled. */ @@ -147,8 +141,6 @@ bond_mode_from_string(enum bond_mode *balance, const char *s) *balance = BM_TCP; } else if (!strcmp(s, bond_mode_to_string(BM_SLB))) { *balance = BM_SLB; -} else if (!strcmp(s, bond_mode_to_string(BM_STABLE))) { -*balance = BM_STABLE; } else if (!strcmp(s, bond_mode_to_string(BM_AB))) { *balance = BM_AB; } else { @@ -165,8 +157,6 @@ bond_mode_to_string(enum bond_mode balance) { return "balance-tcp"; case BM_SLB: return "balance-slb"; -case BM_STABLE: -return "stable"; case BM_AB: return "active-backup"; } @@ -187,7 +177,6 @@ bond_create(const struct bond_settings *s) bond = xzalloc(sizeof *bond); hmap_init(&bond->slaves); bond->no_slaves_tag = tag_create_random(); -bond->stb_tag = tag_create_random(); bond->next_fake_iface_update = LLONG_MAX; bond_reconfigure(bond, s); @@ -256,12 +245,6 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s) if (bond->balance != s->balance) { bond->balance = s->balance; revalidate = true; - -if (bond->balance == BM_STABLE) { -VLOG_WARN_ONCE("Stable bond mode is deprecated and may be removed" - " in February 2013. Please email" - " dev@openvswitch.org with concerns."); -} } if (bond->basis != s->basis) { @@ -303,17 +286,12 @@ bond_slave_set_netdev__(struct bond_slave *slave, struct netdev *netdev) * bond. If 'slave_' already exists within 'bond' then this function * reconfigures the existing slave. * - * 'stb_id' is used in BM_STABLE bonds to guarantee consistent slave choices - * across restarts and distributed vswitch instances. It should be unique per - * slave, and preferably consistent across restarts and reconfigurations. - * * 'netdev' must be the network device that 'slave_' represents. It is owned * by the client, so the client must not close it before either unregistering * 'slave_' or destroying 'bond'. */ void -bond_slave_register(struct bond *bond, void *slave_, uint32_t stb_id, -struct netdev *netdev) +bond_slave_register(struct bond *bond, void *slave_, struct netdev *netdev) { struct bond_slave *slave = bond_slave_lookup(bond, slave_); @@ -331,11 +309,6 @@ bond_slave_register(struct bond *bond, void *slave_, uint32_t stb_id, bond_enable_slave(slave, netdev_get_carrier(netdev), NULL); } -if (slave->stb_id != stb_id) { -slave->stb_id = stb_id; -bond->bond_revalidate = true; -} - bond_slave_set_netdev__(slave, netdev); free(slave->name); @@ -438,17 +411,12 @@ bond_run(struct bond *bond, struct tag_set *tags, enum lacp_status lacp_status) } if (bond->bond_revalidate) { -bond->bond_revalidate = false; +struct bond_slave *slave; +bond->bond_revalidate = false; bond_entry_reset(bond); -if (bond->balance != BM_STABLE) { -struct bond_slave *slave; - -HMAP_FOR_EACH (slave, hmap_node, &bond->slaves) { -tag_set_add(tags, slave->tag); -} -} else { -tag_set_add(tags, bond->stb_tag); +HMAP_FOR_EACH
Re: [ovs-dev] [test log scan v2 1/6] tests: Fix error path in netflow test.
Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > Otherwise, if the test bails out before finishing, the test-netflow daemon > doesn't get killed and the test hangs. > > Signed-off-by: Ben Pfaff > --- > tests/ofproto-dpif.at |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > index 3eec947..f007381 100644 > --- a/tests/ofproto-dpif.at > +++ b/tests/ofproto-dpif.at > @@ -1010,7 +1010,7 @@ ovs-vsctl \ > --id=@nf create NetFlow targets=\"127.0.0.1:$NETFLOW_PORT\" \ > engine_id=1 engine_type=2 active_timeout=10 add-id-to-interface=false > > -ON_EXIT([kill `test-netflow.pid`]) > +ON_EXIT([kill `cat test-netflow.pid`]) > AT_CHECK([test-netflow --detach --no-chdir --pidfile $NETFLOW_PORT:127.0.0.1 > > netflow.log])AT_CAPTURE_FILE([netflow.log]) > > AT_CHECK([ovs-appctl time/stop]) > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [test log scan v2 2/6] poll-loop: Reduce high-CPU log messages from WARN to INFO.
This has annoyed me for a while, thanks. Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > These can happen occasionally in normal circumstances. > > Signed-off-by: Ben Pfaff > --- > lib/poll-loop.c |4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/poll-loop.c b/lib/poll-loop.c > index fca1dfa..9855306 100644 > --- a/lib/poll-loop.c > +++ b/lib/poll-loop.c > @@ -165,8 +165,8 @@ log_wakeup(const char *where, const struct pollfd > *pollfd, int timeout) > cpu_usage = get_cpu_usage(); > if (VLOG_IS_DBG_ENABLED()) { > level = VLL_DBG; > -} else if (cpu_usage > 50 && !VLOG_DROP_WARN(&rl)) { > -level = VLL_WARN; > +} else if (cpu_usage > 50 && !VLOG_DROP_INFO(&rl)) { > +level = VLL_INFO; > } else { > return; > } > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [test log scan v2 3/6] bond: Reduce log level from WARN to INFO for interface status updates.
Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > An interface coming up or going down isn't a big deal. > > Signed-off-by: Ben Pfaff > --- > lib/bond.c |6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/lib/bond.c b/lib/bond.c > index 06680ee..2e151eb 100644 > --- a/lib/bond.c > +++ b/lib/bond.c > @@ -1302,12 +1302,12 @@ bond_enable_slave(struct bond_slave *slave, bool > enable, struct tag_set *tags) > if (enable != slave->enabled) { > slave->enabled = enable; > if (!slave->enabled) { > -VLOG_WARN("interface %s: disabled", slave->name); > +VLOG_INFO("interface %s: disabled", slave->name); > if (tags) { > tag_set_add(tags, slave->tag); > } > } else { > -VLOG_WARN("interface %s: enabled", slave->name); > +VLOG_INFO("interface %s: enabled", slave->name); > slave->tag = tag_create_random(); > } > > @@ -1512,7 +1512,7 @@ bond_choose_active_slave(struct bond *bond, struct > tag_set *tags) > > bond->send_learning_packets = true; > } else if (old_active_slave) { > -VLOG_WARN_RL(&rl, "bond %s: all interfaces disabled", bond->name); > +VLOG_INFO_RL(&rl, "bond %s: all interfaces disabled", bond->name); > } > } > > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [test log scan v2 4/6] tests: Set explicit bond mode in LACP test.
This seems fine to me. The default bond mode has been active-backup for quite a while, so it may simply make sense to remove this warning in the post 1.10 release. Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > This avoids a log warning: > > bridge|WARN|port bond: Using the default bond_mode active-backup. > Note that in previous versions, the default bond_mode was balance-slb > > This warning is harmless, but I'm trying to add checks for "warn" and > higher severity log messages to the tests, so it makes sense to get rid of > this one. > > Signed-off-by: Ben Pfaff > --- > tests/lacp.at |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/tests/lacp.at b/tests/lacp.at > index 408b1ec..49f3bb1 100644 > --- a/tests/lacp.at > +++ b/tests/lacp.at > @@ -40,7 +40,7 @@ AT_CLEANUP > AT_SETUP([lacp - multi port config]) > OVS_VSWITCHD_START([dnl > add-bond br0 bond p1 p2 --\ > -set Port bond lacp=active \ > +set Port bond lacp=active bond-mode=active-backup \ > other_config:lacp-time="fast" \ > other_config:lacp-system-id=11:22:33:44:55:66 \ > other_config:lacp-system-priority=54321 --\ > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/3] bond: Remove stable bond mode.
On Tue, Feb 05, 2013 at 12:56:43PM -0800, Ethan Jackson wrote: > Since I've dropped the first patch, I think choose_hash_slave() is no longer > necessary. In this version of the patch, I've removed it. Looks good, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [test log scan v2 5/6] timeval: Don't issue poll interval warnings when we warp time.
Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > It's not meaningful in that case. > > Signed-off-by: Ben Pfaff > --- > lib/timeval.c |4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/timeval.c b/lib/timeval.c > index d91305c..4ffb756 100644 > --- a/lib/timeval.c > +++ b/lib/timeval.c > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2008, 2009, 2010, 2011, 2012 Nicira, Inc. > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013 Nicira, Inc. > * > * Licensed under the Apache License, Version 2.0 (the "License"); > * you may not use this file except in compliance with the License. > @@ -516,7 +516,7 @@ log_poll_interval(long long int last_wakeup) > { > long long int interval = time_msec() - last_wakeup; > > -if (interval >= 1000) { > +if (interval >= 1000 && !warp_offset.tv_sec && !warp_offset.tv_nsec) { > const struct rusage *last_rusage = get_recent_rusage(); > struct rusage rusage; > > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [test log scan v2 6/6] ofproto-macros: Fail a test that logs a WARN or higher level message.
Looks like a massive win to me. Acked-by: Ethan Jackson On Fri, Feb 1, 2013 at 10:48 AM, Ben Pfaff wrote: > It is necessary to whitelist a couple of tests that appear to legitimately > have WARN messages. > > Signed-off-by: Ben Pfaff > --- > tests/ofproto-macros.at | 20 +++- > tests/tunnel.at |4 ++-- > 2 files changed, 21 insertions(+), 3 deletions(-) > > diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at > index b9356d3..9a6d5ab 100644 > --- a/tests/ofproto-macros.at > +++ b/tests/ofproto-macros.at > @@ -64,8 +64,26 @@ m4_define([OVS_VSWITCHD_START], > AT_CHECK([ovs-vsctl -- add-br br0 -- set bridge br0 datapath-type=dummy > other-config:datapath-id=fedcba9876543210 > other-config:hwaddr=aa:55:aa:55:00:00 > protocols=[[OpenFlow10,OpenFlow12,OpenFlow13]] fail-mode=secure -- $1 > m4_if([$2], [], [], [| perl $srcdir/uuidfilt.pl])], [0], [$2]) > ]) > > +m4_divert_push([PREPARE_TESTS]) > +check_logs () { > +sed -n "$1 > +/|WARN|/p > +/|ERR|/p > +/|EMER|/p" ovs-vswitchd.log ovsdb-server.log > +} > +m4_divert_pop([PREPARE_TESTS]) > + > +# OVS_VSWITCHD_STOP([WHITELIST]) > +# > +# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log files > +# for messages with severity WARN or higher and signaling an error if any > +# is present. The optional WHITELIST may contain shell-quoted "sed" > +# commands to delete any warnings that are actually expected, e.g.: > +# > +# OVS_VSWITCHD_STOP(["/expected error/d"]) > m4_define([OVS_VSWITCHD_STOP], > - [AT_CHECK([ovs-appctl -t ovs-vswitchd exit]) > + [AT_CHECK([check_logs $1]) > + AT_CHECK([ovs-appctl -t ovs-vswitchd exit]) > AT_CHECK([ovs-appctl -t ovsdb-server exit])]) > > # ADD_OF_PORTS(BRIDGE, OF_PORT[, OF_PORT...]) > diff --git a/tests/tunnel.at b/tests/tunnel.at > index c708b30..55fd5b3 100644 > --- a/tests/tunnel.at > +++ b/tests/tunnel.at > @@ -64,7 +64,7 @@ Invalid flow > ovs-appctl: ovs-vswitchd: server returned an error > ]) > > -OVS_VSWITCHD_STOP > +OVS_VSWITCHD_STOP(["/receive tunnel port not found/d"]) > AT_CLEANUP > > AT_SETUP([tunnel - ECN decapsulation]) > @@ -256,7 +256,7 @@ AT_CHECK([ovs-appctl ofproto/trace br0 > 'tunnel(tun_id=0xf,src=1.1.1.1,dst=2.2.2. > Invalid flow > ovs-appctl: ovs-vswitchd: server returned an error > ]) > -OVS_VSWITCHD_STOP > +OVS_VSWITCHD_STOP(["/receive tunnel port not found/d"]) > AT_CLEANUP > > AT_SETUP([tunnel - key match]) > -- > 1.7.2.5 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 3/3] nicira-ext: Remove the autopath action.
Thanks for the reviews, I'll merge the last two patches of this series shortly. Ethan On Tue, Feb 5, 2013 at 8:55 AM, Ben Pfaff wrote: > On Mon, Feb 04, 2013 at 07:13:36PM -0800, Ethan Jackson wrote: >> The autopath action was attempting to achieve functionality similar >> to the bundle action, but was significantly clunkier, more >> difficult to understand, more difficult to use, and less reliable. >> This patch removes it. >> >> Signed-off-by: Ethan Jackson > > Looks good, thanks. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] RFC: Pass more packet and flow key info to userspace.
On Tue, Feb 5, 2013 at 8:53 AM, Rajahalme, Jarno (NSN - FI/Espoo) wrote > On Jan 31, 2013, at 3:19 , ext Jesse Gross wrote: >> On Wed, Jan 30, 2013 at 3:14 AM, Rajahalme, Jarno (NSN - FI/Espoo) >> wrote: >>> >>> Otherwise not so much use of the layer pointers, but >>> dpif_flow_stats_extract() >>> does packet_get_tcp_flags(), which needs the layer pointers. >>> dpif_flow_stats_extract() is called on each miss packet (but the collected >>> tcp_flags are used only with facets, for netflow and facet->tcp_flags). >>> Other use is for controller action and special processing (stpp/cfm/lacp). >>> >>> So it seems that deferred layer pointer setting could be effective only if >>> the >>> stats extraction would be changed to skip tcp_flags for the cases where they >>> are not going to be used (i.e. when not making facets, as might be expected >>> under heavy load?). >> >> That's a good point about the TCP flags. >> >> I think it's actually not right that we don't use the TCP flags when >> we don't create facets under load. Not creating facets is an >> optimization and that shouldn't affect the results that are seen >> through NetFlow. > > Fixing this seem non-trivial, as the nf_flow resides within struct facet. > Without a > facet I do not think there is any memory of specific flows seen, so there is > no > basis for reporting such flows vie netflow or any other means. The design > seems > to be that tracking of short flows is not essential under load, see this > comment for > flow_miss_should_make_facet(): > > /* Figures out whether a flow that missed in 'ofproto', whose details are in > * 'miss', is likely to be worth tracking in detail in userspace and (usually) > * installing a datapath flow. The answer is usually "yes" (a return value of > * true). However, for short flows the cost of bookkeeping is much higher > than > * the benefits, so when the datapath holds a large number of flows we impose > * some heuristics to decide which flows are likely to be worth tracking. */ > > If, however, netflow should not be affected, then a facet should be installed > for > every flow. In that case maybe the installation of kernel flows could be > avoided > for short flows under load? Not creating facets was intended purely as a performance optimization - i.e. is the cost of installing the flow in the kernel greater or less than the cost of processing packets in userspace? My guess is that the side effects on NetFlow weren't considered since that's not the primary use case of facets. It's possible that we could create a facet in userspace to track NetFlow information but still not install the flow in the kernel. My guess is that this will reduce the benefit of the optimization somewhat since allocating facets has a cost but I don't know by how much. Our facets are also unusually fine grained compared to the flows usually reported by NetFlow. We could consider creating a NetFlow-specific structure, which could be allocated less frequently and only used when NetFlow is enabled. >> >> I suspect that there are quite a lot of cases where NetFlow isn't used >> (NXAST_FIN_TIMEOUT is the other case that comes to mind that uses the >> flags) so it might be worthwhile to turn off collection when those >> features aren't in use. > > It seems possible that netflow is turned on at runtime. Would it be acceptable > that netflow data would be missing for the time before netflow was turned on? Yes, that seems fine to me. >> >>> Also, I noticed that the facets hmap uses the hash from flow_miss, and >>> according to the comments that should be flow_hash(flow, 0). However, >>> the only place the hash originates in the code is at handle_miss_upcalls(), >>> so again, there is no harm in using a kernel provided hash instead. But the >>> comments need to be updated if this change is adopted. >> >> It would probably be good to benchmark the results of these two >> changes independently to see which is more significant. >> >> In general, I'm nervous about putting optimizations into the >> userspace/kernel interface because it exposes implementation >> information. Keeping a neutral format can actually make it easier to >> optimize in the future. In particular, I know that Ben is thinking >> about grouping together flows at a coarser granularity than the kernel >> uses, which would make passing the hash up not that useful. > > While I would not see a problem in passing a key hash computed with an > undisclosed algorithm up to userspace, I see your point. > > In my tests it seems that about 1/3 of the benefit is attainable with deferred > layer pointer computation within the userspace. Ben has been doing work optimizing flow setups, so it might be good for him to comment on this and how it fits in with his current ideas. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH] stream-unix: append ovs_rundir to socket
If socket path specified is relative to ovs_rundir(), append the directory name to in unix_open. Signed-off-by: Pavithra Ramesh --- lib/stream-unix.c | 12 +++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/lib/stream-unix.c b/lib/stream-unix.c index 6ed7648..8764b6c 100644 --- a/lib/stream-unix.c +++ b/lib/stream-unix.c @@ -42,9 +42,19 @@ static int unix_open(const char *name, char *suffix, struct stream **streamp, uint8_t dscp OVS_UNUSED) { -const char *connect_path = suffix; +const char *ovsDir = ovs_rundir(); +char *new_path, *connect_path = suffix; int fd; +if ((!strstr(suffix, ovsDir)) { +/* alloc one byte for '/' and one for null terminator */ +new_path = (char *)xmalloc(strlen(ovsDir) + strlen(suffix) + 2); +strncpy(new_path, ovsDir, strlen(ovsDir)); +new_path[strlen(ovsDir)] = '/'; +new_path[strlen(ovsDir) + 1] = '\0'; +strncat(new_path, suffix, strlen(suffix)); +connect_path = new_path; +} fd = make_unix_socket(SOCK_STREAM, true, NULL, connect_path); if (fd < 0) { VLOG_DBG("%s: connection failed (%s)", connect_path, strerror(-fd)); -- 1.7.0.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] stream-unix: append ovs_rundir to socket
This isn't a full review of the patch, but have you considered using xasprintf() instead of doing the string manipulation by hand? Ethan On Tue, Feb 5, 2013 at 5:07 PM, Pavithra Ramesh wrote: > If socket path specified is relative to ovs_rundir(), > append the directory name to in unix_open. > > Signed-off-by: Pavithra Ramesh > --- > lib/stream-unix.c | 12 +++- > 1 files changed, 11 insertions(+), 1 deletions(-) > > diff --git a/lib/stream-unix.c b/lib/stream-unix.c > index 6ed7648..8764b6c 100644 > --- a/lib/stream-unix.c > +++ b/lib/stream-unix.c > @@ -42,9 +42,19 @@ static int > unix_open(const char *name, char *suffix, struct stream **streamp, >uint8_t dscp OVS_UNUSED) > { > -const char *connect_path = suffix; > +const char *ovsDir = ovs_rundir(); > +char *new_path, *connect_path = suffix; > int fd; > > +if ((!strstr(suffix, ovsDir)) { > +/* alloc one byte for '/' and one for null terminator */ > +new_path = (char *)xmalloc(strlen(ovsDir) + strlen(suffix) + 2); > +strncpy(new_path, ovsDir, strlen(ovsDir)); > +new_path[strlen(ovsDir)] = '/'; > +new_path[strlen(ovsDir) + 1] = '\0'; > +strncat(new_path, suffix, strlen(suffix)); > +connect_path = new_path; > +} > fd = make_unix_socket(SOCK_STREAM, true, NULL, connect_path); > if (fd < 0) { > VLOG_DBG("%s: connection failed (%s)", connect_path, strerror(-fd)); > -- > 1.7.0.4 > > ___ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] stream-unix: append ovs_rundir to socket
Oh, also I think this may introduce a memory leak. I.E. in the case where connect_path is malloced, it never gets freed. Ethan On Tue, Feb 5, 2013 at 5:08 PM, Ethan Jackson wrote: > This isn't a full review of the patch, but have you considered using > xasprintf() instead of doing the string manipulation by hand? > > Ethan > > On Tue, Feb 5, 2013 at 5:07 PM, Pavithra Ramesh wrote: >> If socket path specified is relative to ovs_rundir(), >> append the directory name to in unix_open. >> >> Signed-off-by: Pavithra Ramesh >> --- >> lib/stream-unix.c | 12 +++- >> 1 files changed, 11 insertions(+), 1 deletions(-) >> >> diff --git a/lib/stream-unix.c b/lib/stream-unix.c >> index 6ed7648..8764b6c 100644 >> --- a/lib/stream-unix.c >> +++ b/lib/stream-unix.c >> @@ -42,9 +42,19 @@ static int >> unix_open(const char *name, char *suffix, struct stream **streamp, >>uint8_t dscp OVS_UNUSED) >> { >> -const char *connect_path = suffix; >> +const char *ovsDir = ovs_rundir(); >> +char *new_path, *connect_path = suffix; >> int fd; >> >> +if ((!strstr(suffix, ovsDir)) { >> +/* alloc one byte for '/' and one for null terminator */ >> +new_path = (char *)xmalloc(strlen(ovsDir) + strlen(suffix) + 2); >> +strncpy(new_path, ovsDir, strlen(ovsDir)); >> +new_path[strlen(ovsDir)] = '/'; >> +new_path[strlen(ovsDir) + 1] = '\0'; >> +strncat(new_path, suffix, strlen(suffix)); >> +connect_path = new_path; >> +} >> fd = make_unix_socket(SOCK_STREAM, true, NULL, connect_path); >> if (fd < 0) { >> VLOG_DBG("%s: connection failed (%s)", connect_path, strerror(-fd)); >> -- >> 1.7.0.4 >> >> ___ >> dev mailing list >> dev@openvswitch.org >> http://openvswitch.org/mailman/listinfo/dev ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 01/16] User-Space MPLS actions and matches
On Tue, Feb 05, 2013 at 09:19:06AM -0800, Ben Pfaff wrote: > On Tue, Feb 05, 2013 at 10:13:18AM +0900, Simon Horman wrote: > > On Tue, Feb 05, 2013 at 09:00:35AM +0900, Simon Horman wrote: > > > On Mon, Feb 04, 2013 at 10:09:20AM -0800, Ben Pfaff wrote: > > > > On Fri, Jan 25, 2013 at 04:22:07PM +0900, Simon Horman wrote: > > > > > This patch implements use-space datapath and non-datapath code > > > > > to match and use the datapath API set out in Leo Alterman's patch > > > > > "user-space datapath: Add basic MPLS support to kernel". > > > > > > > > > > The resulting MPLS implementation supports: > > > > > * Pushing a single MPLS label > > > > > * Poping a single MPLS label > > > > > * Modifying an MPLS lable using set-field or load actions > > > > > that act on the label value, tc and bos bit. > > > > > * There is no support for manipulating the TTL > > > > > this is considered future work. > > > > > > > > > > The single-level push pop limitation is implemented by processing > > > > > push, pop and set-field/load actions in order and discarding > > > > > information > > > > > that would require multiple levels of push/pop to be supported. > > > > > > > > > > e.g. > > > > >push,push -> the first push is discarded > > > > >pop,pop -> the first pop is discarded > > > > > > > > > > This patch is based heavily on work by Ravi K. > > > > > > > > > > Cc: Ravi K > > > > > Reviewed-by: Isaku Yamahata > > > > > Signed-off-by: Simon Horman > > > > > > > > I resolved a few merge conflicts, then I went through this fairly > > > > carefully and made a few changes, mostly cosmetic. I'm appending an > > > > incremental and then a full revised patch. > > > > > > > > Here's a little commentary on the less trivial changes I made: > > > > > > > >- I think that eth_mpls_depth() was capable of counting an MPLS label > > > > that was off the end of the packet as part of the depth (for > > > > example, if there was one MPLS label without BOS followed by the > > > > end of the packet, I believe that it would count this as depth 2). > > > > This wasn't a problem for current callers but it seemed like a bad > > > > idea, so I changed it. Please check that it looks correct. > > > > > > > >- I think mf_random_value() should have used ->u8 for TC and BOS so I > > > > changed it. > > > > > > Thanks, both of those changes seem reasonable to me. I'll review > > > these and the changes in your diff more thoroughly and get back to you > > > ASAP. > > > > I have reviewed all the changes and they look good to me. > > Please feel free to apply. I can then rebase the remaining patches on > > master. > > I fixed a few indentation errors I noticed at the last minute and > applied this to master. Thanks! Great, thanks! I'll rebase the remaining patches ASAP and... > > > > I have one question before I commit this: it seems that we could support > > > > a series of more than one pop_mpls action, why do we limit the number of > > > > pops to just one? > > > > > > I think that I agree with you. I'll take a look and see how practical > > > it would be to remove that restriction. > > > > I'll start looking into this in more depth. But I'd rather handle any > > updates as a subsequent change to this patch. > > Thanks, that sounds fine. ... investigate this. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] stream-unix: append ovs_rundir to socket
On Tue, Feb 05, 2013 at 05:07:53PM -0800, Pavithra Ramesh wrote: > If socket path specified is relative to ovs_rundir(), > append the directory name to in unix_open. > > Signed-off-by: Pavithra Ramesh Hi Pavithra, I thought that the goal that we discussed in person was this: if the path is not absolute (does not begin with "/") then interpret it relative to ovs_rundir(). This does something quite different: if the path does not include ovs_rundir(), it adds ovs_rundir() as a prefix. Please also read Ethan's comments. ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ovs-ctl: Add support for built-in (non-modular) kernel support.
On 02/06/2013 03:14 AM, Ben Pfaff wrote: Reported-by: Cong Wang Signed-off-by: Ben Pfaff --- Cong, will you verify that this makes the init script work OK for your kernel with OVS built-in? I got: # ovs-dpctl show -voff ovs-dpctl: processing "off": unknown module "off" # modprobe openvswitch # echo $? 1 # dmesg | grep openv [ 13.368538] openvswitch: Open vSwitch switching datapath I double checked my .config: % grep OPENV .config CONFIG_OPENVSWITCH=y ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] ovs-ctl: Add support for built-in (non-modular) kernel support.
BTW, I modified ovs-ctl.in manually, skipped inserting the kernel modules, ovs service start successfully and everything works fine so far. I am using openvswitch in fedora 17: # rpm -q openvswitch openvswitch-1.4.2-1.fc17.x86_64 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev