[ovs-dev] Is there any way to simulate the tunnel traffic in test cases to test IPFIX feature?

2014-06-03 Thread Wenyu Zhang

I am trying to simulate traffic in test cases to test IPFIX feature. 
  
For normal traffic, I can use "ovs-appctl netdev-dummy/receive " to simulate 
it, just refer to the netflow and sflow cases in tests/ofproto-dpif.at 
And I can capture the IPFIX packets on lo interface as expected. 
  
But for tunnel traffic, I haven't find a workable way to generate this kind of 
traffic.  
For example,  I try to use "ovs-appctl ofproto/trace ovs-dummy" command with 
"-generate" option.  
And we can see that the "sample" actions has been inserted as expected, but 
these actions are not executed, and no IPFIX packets can be captured on lo 
interface. 
  
Anyone knows about the usage of this command? Or other way to generate the 
tunnel traffic to trigger IPFIX sample action? Thanks a lot. 
  
Following is the script to simulate tunnel traffic, but it doesn't work for 
IPFIX. 


AT_SETUP([ofproto-dpif - IPFIX tunnel packet sampling]) 
   OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \ 
   options:remote_ip=1.1.1.1 ofport_request=1\ 
   -- add-port br0 p2 -- set Interface p2 type=gre \ 
   options:local_ip=2.2.2.2 options:remote_ip=1.1.1.1 \ 
   ofport_request=2 \ 
   -- add-port br0 p3 -- set Interface p3 type=gre \ 
  options:remote_ip=2.2.2.2 ofport_request=3]) 
   AT_DATA([flows.txt], [dnl 
actions=IN_PORT 
]) 

  ovs-appctl time/stop 
  ON_EXIT([kill `cat test-ipfix.pid`]) 
  AT_CHECK([ovstest test-ipfix --log-file --detach --no-chdir --pidfile 
0:127.0.0.1 > ipfix.log], [0], [], [ignore]) 
  AT_CAPTURE_FILE([ipfix.log]) 
  IPFIX_PORT=`parse_listening_port < test-ipfix.log` 

  ovs-vsctl \ 
 set Bridge br0 ipfix=@i -- \ 
 --id=@i create IPFIX targets=\"127.0.0.1:$IPFIX_PORT\" \ 
   sampling=1 obs_domain_id=123 obs_point_id=456 cache_active_timeout=1 
cache_max_flows=0 other_config:enable-tunnel-sampling=true 

   AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) 
   AT_CHECK([ovs-appctl dpif/show | tail -n +3], [0], [dnl 
    br0 65534/100: (dummy) 
    p1 1/1: (gre: remote_ip=1.1.1.1) 
    p2 2/1: (gre: local_ip=2.2.2.2, remote_ip=1.1.1.1) 
    p3 3/1: (gre: remote_ip=2.2.2.2) 
]) 


 dnl remote_ip 
   AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 
'tunnel(tun_id=0x0,src=1.1.1.1,dst=1.2.3.4,tos=0x0,ttl=64,flags()),in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
 -generate], [0], [stdout]) 
   AT_CHECK([tail -1 stdout], [0], 
    [Datapath actions: 
sample(sample=100.0%,actions(userspace(pid=0,ipfix(output_port=4294967295,set(tunnel(tun_id=0x0,src=0.0.0.0,dst=1.1.1.1,tos=0x0,ttl=64,flags(df))),sample(sample=100.0%,actions(userspace(pid=0,ipfix(output_port=1),tunnel_out_port=1))),1
   
 Bests, 
Wenyu 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/2] ofproto: Remove ofproto_refresh_rule().

2014-06-03 Thread Joe Stringer
The only user of this function was removed in the previous patch, so
remove it.

Signed-off-by: Joe Stringer 
---
 ofproto/ofproto-dpif.c |   12 
 ofproto/ofproto-provider.h |2 --
 ofproto/ofproto.c  |   41 -
 3 files changed, 55 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 06be234..22ba14e 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -364,18 +364,6 @@ ofproto_dpif_flow_mod(struct ofproto_dpif *ofproto,
 ofproto_flow_mod(&ofproto->up, fm);
 }
 
-/* Resets the modified time for 'rule' or an equivalent rule. If 'rule' is not
- * in the classifier, but an equivalent rule is, unref 'rule' and ref the new
- * rule. Otherwise if 'rule' is no longer installed in the classifier,
- * reinstall it.
- *
- * Returns the rule whose modified time has been reset. */
-struct rule_dpif *
-ofproto_dpif_refresh_rule(struct rule_dpif *rule)
-{
-return rule_dpif_cast(ofproto_refresh_rule(&rule->up));
-}
-
 /* Appends 'pin' to the queue of "packet ins" to be sent to the controller.
  * Takes ownership of 'pin' and pin->packet. */
 void
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index ff539b9..7741044 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -1716,8 +1716,6 @@ BUILD_ASSERT_DECL(OFPROTO_POSTPONE < OFPERR_OFS);
 
 int ofproto_flow_mod(struct ofproto *, struct ofputil_flow_mod *)
 OVS_EXCLUDED(ofproto_mutex);
-struct rule *ofproto_refresh_rule(struct rule *rule)
-OVS_EXCLUDED(ofproto_mutex);
 void ofproto_add_flow(struct ofproto *, const struct match *,
   unsigned int priority,
   const struct ofpact *ofpacts, size_t ofpacts_len)
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 24a709b..f34cfde 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1978,47 +1978,6 @@ ofproto_flow_mod(struct ofproto *ofproto, struct 
ofputil_flow_mod *fm)
 return handle_flow_mod__(ofproto, NULL, fm, NULL);
 }
 
-/* Resets the modified time for 'rule' or an equivalent rule. If 'rule' is not
- * in the classifier, but an equivalent rule is, unref 'rule' and ref the new
- * rule. Otherwise if 'rule' is no longer installed in the classifier,
- * reinstall it.
- *
- * Returns the rule whose modified time has been reset. */
-struct rule *
-ofproto_refresh_rule(struct rule *rule)
-{
-const struct oftable *table = &rule->ofproto->tables[rule->table_id];
-const struct cls_rule *cr = &rule->cr;
-struct rule *r;
-
-/* do_add_flow() requires that the rule is not installed. We lock the
- * ofproto_mutex here so that another thread cannot add the flow before
- * we get a chance to add it.*/
-ovs_mutex_lock(&ofproto_mutex);
-
-fat_rwlock_rdlock(&table->cls.rwlock);
-r = rule_from_cls_rule(classifier_find_rule_exactly(&table->cls, cr));
-if (r != rule) {
-ofproto_rule_ref(r);
-}
-fat_rwlock_unlock(&table->cls.rwlock);
-
-if (!r) {
-do_add_flow(rule->ofproto, NULL, NULL, 0, rule);
-} else if  (r != rule) {
-ofproto_rule_unref(rule);
-rule = r;
-}
-ovs_mutex_unlock(&ofproto_mutex);
-
-/* Refresh the modified time for the rule. */
-ovs_mutex_lock(&rule->mutex);
-rule->modified = MAX(rule->modified, time_msec());
-ovs_mutex_unlock(&rule->mutex);
-
-return rule;
-}
-
 /* Searches for a rule with matching criteria exactly equal to 'target' in
  * ofproto's table 0 and, if it finds one, deletes it.
  *
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
Caching the results of xlate_learn was previously dependent on the state
of the 'may_learn' flag. This meant that if the caller did not specify
that this flow may learn, then a learn entry would not be cached.
However, the xlate_cache tends to be used on a recurring basis, so
failing to cache the learn entry can provide unexpected behaviour later
on, particularly in corner cases.

Such a corner case occurred previously:-
* Revalidation was requested.
* A flow with a learn action was dumped.
* The flow had no packets.
* The flow's corresponding xcache was cleared, and the flow revalidated.
* The flow went on to receive packets after the xcache is re-created.

In this case, the xcache would be re-created, but would not refresh the
timeouts on the learnt flow until the next time it was cleared, even if
it received more traffic. This would cause flows to time out sooner than
expected. Symptoms of this bug may include unexpected forwarding
behaviour or extraneous statistics being attributed to the wrong flow.

This patch fixes the issue by caching the entire flow_mod, including
actions, upon translating an xlate_learn action. This is used to perform
a flow_mod from scratch with the original flow, rather than simply
refreshing the rule that was created during the creation of the xcache.

Bug #1252997.

Reported-by: Scott Hendricks 
Signed-off-by: Joe Stringer 
---
 ofproto/ofproto-dpif-xlate.c |   58 ++
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 068d54f..e8f3dbf 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -265,7 +265,9 @@ struct xc_entry {
 uint16_t vid;
 } bond;
 struct {
-struct rule_dpif *rule;
+struct ofproto_dpif *ofproto;
+struct ofputil_flow_mod *fm;
+struct ofpbuf *ofpacts;
 } learn;
 struct {
 struct ofproto_dpif *ofproto;
@@ -2977,34 +2979,38 @@ xlate_bundle_action(struct xlate_ctx *ctx,
 }
 
 static void
-xlate_learn_action(struct xlate_ctx *ctx,
-   const struct ofpact_learn *learn)
+xlate_learn_action__(struct xlate_ctx *ctx, const struct ofpact_learn *learn,
+ struct ofputil_flow_mod *fm, struct ofpbuf *ofpacts)
 {
-uint64_t ofpacts_stub[1024 / 8];
-struct ofputil_flow_mod fm;
-struct ofpbuf ofpacts;
+learn_execute(learn, &ctx->xin->flow, fm, ofpacts);
+if (ctx->xin->may_learn) {
+ofproto_dpif_flow_mod(ctx->xbridge->ofproto, fm);
+}
+}
 
+static void
+xlate_learn_action(struct xlate_ctx *ctx, const struct ofpact_learn *learn)
+{
 ctx->xout->has_learn = true;
-
 learn_mask(learn, &ctx->xout->wc);
 
-if (!ctx->xin->may_learn) {
-return;
-}
-
-ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
-learn_execute(learn, &ctx->xin->flow, &fm, &ofpacts);
-ofproto_dpif_flow_mod(ctx->xbridge->ofproto, &fm);
-ofpbuf_uninit(&ofpacts);
-
 if (ctx->xin->xcache) {
 struct xc_entry *entry;
 
 entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
-/* Lookup the learned rule, taking a reference on it.  The reference
- * is released when this cache entry is deleted. */
-rule_dpif_lookup(ctx->xbridge->ofproto, &ctx->xin->flow, NULL,
- &entry->u.learn.rule, true, NULL);
+entry->u.learn.ofproto = ctx->xbridge->ofproto;
+entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
+entry->u.learn.ofpacts = ofpbuf_new(1024);
+xlate_learn_action__(ctx, learn, entry->u.learn.fm,
+ entry->u.learn.ofpacts);
+} else if (ctx->xin->may_learn) {
+uint64_t ofpacts_stub[1024 / 8];
+struct ofputil_flow_mod fm;
+struct ofpbuf ofpacts;
+
+ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
+xlate_learn_action__(ctx, learn, &fm, &ofpacts);
+ofpbuf_uninit(&ofpacts);
 }
 }
 
@@ -3911,12 +3917,8 @@ xlate_push_stats(struct xlate_cache *xcache, bool 
may_learn,
 break;
 case XC_LEARN:
 if (may_learn) {
-struct rule_dpif *rule = entry->u.learn.rule;
-
-/* Reset the modified time for a rule that is equivalent to
- * the currently cached rule.  If the rule is not the exact
- * rule we have cached, update the reference that we have. */
-entry->u.learn.rule = ofproto_dpif_refresh_rule(rule);
+ofproto_dpif_flow_mod(entry->u.learn.ofproto,
+  entry->u.learn.fm);
 }
 break;
 case XC_NORMAL:
@@ -3989,8 +3991,8 @@ xlate_cache_clear(struct xlate_cache *xcache)
 mbridge_unref(entry->u.mirror.mbridge);
 break;
 case XC_LEARN:
-/* 'u.learn.rule' is th

Re: [ovs-dev] [PATCH 2/3] lib/ovs-rcu: Evaluate parameters before ovsrcu_set and ovsrcu_init.

2014-06-03 Thread Jarno Rajahalme
Thank you for the reviews. Series applied with an incremental shown below,

  Jarno

On Jun 2, 2014, at 9:24 PM, YAMAMOTO Takashi  wrote:

>> ovsrcu_set() and ovsrcu_init() look like functions, so users are
>> justified in expecting the arguments to be evalueated before any of
> 
> evaluated
> 

Thanks.

>> 
>> +#define ovsrcu_set__(VAR, VALUE, ORDER) \
>> +({  \
>> +typeof(VAR) var__ = (VAR);  \
>> +typeof(VALUE) value__ = (VALUE);\
>> +memory_order order__ = (ORDER); \
>> +\
>> +atomic_store_explicit(&var__->p, value__, order__); \
> 
> ovs-atomic-gcc4+ version of atomic_store_explicit also has
> a variable named order__.  is it safe?
> 

Good catch, applied with the following incremental:

diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h
index e27dfad..87651d8 100644
--- a/lib/ovs-rcu.h
+++ b/lib/ovs-rcu.h
@@ -136,14 +136,14 @@
  * any of the body of the atomic_store_explicit.  Since the type of
  * 'VAR' is not fixed, we cannot use an inline function to get
  * function semantics for this. */
-#define ovsrcu_set__(VAR, VALUE, ORDER) \
-({  \
-typeof(VAR) var__ = (VAR);  \
-typeof(VALUE) value__ = (VALUE);\
-memory_order order__ = (ORDER); \
-\
-atomic_store_explicit(&var__->p, value__, order__); \
-(void *) 0; \
+#define ovsrcu_set__(VAR, VALUE, ORDER) \
+({  \
+typeof(VAR) ovsrcu_var = (VAR); \
+typeof(VALUE) ovsrcu_value = (VALUE);   \
+memory_order ovsrcu_order = (ORDER);\
+\
+atomic_store_explicit(&ovsrcu_var->p, ovsrcu_value, ovsrcu_order); \
+(void *) 0; \
 })
 #else  /* not GNU C */
 struct ovsrcu_pointer { ATOMIC(void *) p; };

> otherwise,
> Acked-by: YAMAMOTO Takashi 
> 
> YAMAMOTO Takashi

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Jarno Rajahalme
Ryan,

See comments below.

  Jarno

On Jun 2, 2014, at 3:54 PM, Ryan Wilson  wrote:

> When a bridge of datatype type netdev receives a packet, it copies the
> packet from the NIC to a buffer in userspace. Currently, when making
> an upcall, the packet is again copied to the upcall's buffer. However,
> this extra copy is not necessary when the datapath exists in userspace
> as the upcall can directly access the packet data.
> 
> This patch eliminates this extra copy of the packet data in most cases.
> In cases where the packet may still be used later by callers of
> dp_netdev_execute_actions, making a copy of the packet data is still
> necessary.
> ---
> lib/dpif-netdev.c |   15 +--
> lib/ofpbuf.c  |   18 ++
> lib/ofpbuf.h  |2 ++
> 3 files changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 91c83d6..7cb0478 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
> *packet,
>miniflow_hash_5tuple(&key.flow, 0)
>% dp->n_handlers,
>DPIF_UC_MISS, &key.flow, NULL);
> -ofpbuf_delete(packet);
> }
> }
> 
> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
> ofpbuf *packet,
> if (userdata) {
> buf_size += NLA_ALIGN(userdata->nla_len);
> }
> -buf_size += ofpbuf_size(packet);
> ofpbuf_init(buf, buf_size);
> 
> /* Put ODP flow. */
> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, 
> struct ofpbuf *packet,
>   NLA_ALIGN(userdata->nla_len));
> }
> 
> -ofpbuf_set_data(&upcall->packet,
> -ofpbuf_put(buf, ofpbuf_data(packet), 
> ofpbuf_size(packet)));
> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
> +ofpbuf_set(&upcall->packet, packet);
> 

Did you consider the possibility to just assign the packet instead? We do this 
elsewhere, so it should work here too:

upcall->packet = *packet;

> seq_change(q->seq);
> 
> error = 0;
> } else {
> dp_netdev_count_packet(dp, DP_STAT_LOST);
> +ofpbuf_delete(packet);
> error = ENOBUFS;
> }
> ovs_mutex_unlock(&q->mutex);
> @@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
> break;
> 
> case OVS_ACTION_ATTR_USERSPACE: {
> +struct ofpbuf *userspace_packet;
> const struct nlattr *userdata;
> 
> userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
> +userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
> 
> -dp_netdev_output_userspace(aux->dp, packet,
> +dp_netdev_output_userspace(aux->dp, userspace_packet,
>miniflow_hash_5tuple(aux->key, 0)
>% aux->dp->n_handlers,
>DPIF_UC_ACTION, aux->key,
>userdata);
> -
> -if (may_steal) {
> -ofpbuf_delete(packet);
> -}
> break;
> }
> 
> diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
> index 1f4b61d..ade940b 100644
> --- a/lib/ofpbuf.c
> +++ b/lib/ofpbuf.c
> @@ -555,3 +555,21 @@ ofpbuf_resize_l2(struct ofpbuf *b, int increment)
> ofpbuf_adjust_layer_offset(&b->l2_5_ofs, increment);
> return b->frame;
> }
> +
> +/* Sets ofpbuf 'dst' equal to 'src'. Note that 'dst' and 'src' share the
> + * same buffer, so calling ofpbuf_delete('dst') will also delete 'src'. */
> +void
> +ofpbuf_set(struct ofpbuf *dst, struct ofpbuf *src)
> +{
> +ofpbuf_set_base(dst, ofpbuf_base(src));
> +ofpbuf_set_data(dst, ofpbuf_data(src));
> +ofpbuf_set_size(dst, ofpbuf_size(src));
> +
> +dst->allocated = src->allocated;
> +dst->frame = src->frame;
> +dst->l2_5_ofs = src->l2_5_ofs;
> +dst->l3_ofs = src->l3_ofs;
> +dst->l4_ofs = src->l4_ofs;
> +dst->source = src->source;
> +dst->list_node = src->list_node;
> +}

This would be unnecessary if the plain assignment works also for DPDK.

> diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h
> index 85be899..ec08c27 100644
> --- a/lib/ofpbuf.h
> +++ b/lib/ofpbuf.h
> @@ -101,6 +101,8 @@ static inline const void *ofpbuf_get_udp_payload(const 
> struct ofpbuf *);
> static inline const void *ofpbuf_get_sctp_payload(const struct ofpbuf *);
> static inline const void *ofpbuf_get_icmp_payload(const struct ofpbuf *);
> 
> +void ofpbuf_set(struct ofpbuf *, struct ofpbuf *);
> +
> void ofpbuf_use(struct ofpbuf *, void *, size_t);
> void ofpbuf_use_stack(struct ofpbuf *, void *, size_t);
> void ofpbuf_use_stub(struct ofpbuf *, void *, size_t);
> -- 
> 1.7.9.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/l

[ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson
When a bridge of datatype type netdev receives a packet, it copies the
packet from the NIC to a buffer in userspace. Currently, when making
an upcall, the packet is again copied to the upcall's buffer. However,
this extra copy is not necessary when the datapath exists in userspace
as the upcall can directly access the packet data.

This patch eliminates this extra copy of the packet data in most cases.
In cases where the packet may still be used later by callers of
dp_netdev_execute_actions, making a copy of the packet data is still
necessary.

Signed-off-by: Ryan Wilson 
---
v2: Addressed Jarno's comment to use direct pointer assignment for
upcall->packet instead of ofpbuf_set().
---
 lib/dpif-netdev.c |   15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 91c83d6..c89ae20 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
*packet,
miniflow_hash_5tuple(&key.flow, 0)
% dp->n_handlers,
DPIF_UC_MISS, &key.flow, NULL);
-ofpbuf_delete(packet);
 }
 }
 
@@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
 if (userdata) {
 buf_size += NLA_ALIGN(userdata->nla_len);
 }
-buf_size += ofpbuf_size(packet);
 ofpbuf_init(buf, buf_size);
 
 /* Put ODP flow. */
@@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
   NLA_ALIGN(userdata->nla_len));
 }
 
-ofpbuf_set_data(&upcall->packet,
-ofpbuf_put(buf, ofpbuf_data(packet), 
ofpbuf_size(packet)));
-ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
+upcall->packet = *packet;
 
 seq_change(q->seq);
 
 error = 0;
 } else {
 dp_netdev_count_packet(dp, DP_STAT_LOST);
+ofpbuf_delete(packet);
 error = ENOBUFS;
 }
 ovs_mutex_unlock(&q->mutex);
@@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
 break;
 
 case OVS_ACTION_ATTR_USERSPACE: {
+struct ofpbuf *userspace_packet;
 const struct nlattr *userdata;
 
 userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
+userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
 
-dp_netdev_output_userspace(aux->dp, packet,
+dp_netdev_output_userspace(aux->dp, userspace_packet,
miniflow_hash_5tuple(aux->key, 0)
% aux->dp->n_handlers,
DPIF_UC_ACTION, aux->key,
userdata);
-
-if (may_steal) {
-ofpbuf_delete(packet);
-}
 break;
 }
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Alex Wang
I confirm this patch fix the bug, but have a high level concern,


Assume another learnt flow with hard_timeout=10 and was never hit after
install, now, if there is frequent change to the flow table and
need_revalidate
is constantly 'true', this learnt flow will never timeout...  since we
always
redo the flow_mod during revalidation,

Let's discuss further on this scenario, I'm thinking about solutions,

Thanks,
Alex Wang,




On Tue, Jun 3, 2014 at 2:59 AM, Joe Stringer  wrote:

> Caching the results of xlate_learn was previously dependent on the state
> of the 'may_learn' flag. This meant that if the caller did not specify
> that this flow may learn, then a learn entry would not be cached.
> However, the xlate_cache tends to be used on a recurring basis, so
> failing to cache the learn entry can provide unexpected behaviour later
> on, particularly in corner cases.
>
> Such a corner case occurred previously:-
> * Revalidation was requested.
> * A flow with a learn action was dumped.
> * The flow had no packets.
> * The flow's corresponding xcache was cleared, and the flow revalidated.
> * The flow went on to receive packets after the xcache is re-created.
>
> In this case, the xcache would be re-created, but would not refresh the
> timeouts on the learnt flow until the next time it was cleared, even if
> it received more traffic. This would cause flows to time out sooner than
> expected. Symptoms of this bug may include unexpected forwarding
> behaviour or extraneous statistics being attributed to the wrong flow.
>
> This patch fixes the issue by caching the entire flow_mod, including
> actions, upon translating an xlate_learn action. This is used to perform
> a flow_mod from scratch with the original flow, rather than simply
> refreshing the rule that was created during the creation of the xcache.
>
> Bug #1252997.
>
> Reported-by: Scott Hendricks 
> Signed-off-by: Joe Stringer 
> ---
>  ofproto/ofproto-dpif-xlate.c |   58
> ++
>  1 file changed, 30 insertions(+), 28 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 068d54f..e8f3dbf 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -265,7 +265,9 @@ struct xc_entry {
>  uint16_t vid;
>  } bond;
>  struct {
> -struct rule_dpif *rule;
> +struct ofproto_dpif *ofproto;
> +struct ofputil_flow_mod *fm;
> +struct ofpbuf *ofpacts;
>  } learn;
>  struct {
>  struct ofproto_dpif *ofproto;
> @@ -2977,34 +2979,38 @@ xlate_bundle_action(struct xlate_ctx *ctx,
>  }
>
>  static void
> -xlate_learn_action(struct xlate_ctx *ctx,
> -   const struct ofpact_learn *learn)
> +xlate_learn_action__(struct xlate_ctx *ctx, const struct ofpact_learn
> *learn,
> + struct ofputil_flow_mod *fm, struct ofpbuf *ofpacts)
>  {
> -uint64_t ofpacts_stub[1024 / 8];
> -struct ofputil_flow_mod fm;
> -struct ofpbuf ofpacts;
> +learn_execute(learn, &ctx->xin->flow, fm, ofpacts);
> +if (ctx->xin->may_learn) {
> +ofproto_dpif_flow_mod(ctx->xbridge->ofproto, fm);
> +}
> +}
>
> +static void
> +xlate_learn_action(struct xlate_ctx *ctx, const struct ofpact_learn
> *learn)
> +{
>  ctx->xout->has_learn = true;
> -
>  learn_mask(learn, &ctx->xout->wc);
>
> -if (!ctx->xin->may_learn) {
> -return;
> -}
> -
> -ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
> -learn_execute(learn, &ctx->xin->flow, &fm, &ofpacts);
> -ofproto_dpif_flow_mod(ctx->xbridge->ofproto, &fm);
> -ofpbuf_uninit(&ofpacts);
> -
>  if (ctx->xin->xcache) {
>  struct xc_entry *entry;
>
>  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
> -/* Lookup the learned rule, taking a reference on it.  The
> reference
> - * is released when this cache entry is deleted. */
> -rule_dpif_lookup(ctx->xbridge->ofproto, &ctx->xin->flow, NULL,
> - &entry->u.learn.rule, true, NULL);
> +entry->u.learn.ofproto = ctx->xbridge->ofproto;
> +entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
> +entry->u.learn.ofpacts = ofpbuf_new(1024);
> +xlate_learn_action__(ctx, learn, entry->u.learn.fm,
> + entry->u.learn.ofpacts);
> +} else if (ctx->xin->may_learn) {
> +uint64_t ofpacts_stub[1024 / 8];
> +struct ofputil_flow_mod fm;
> +struct ofpbuf ofpacts;
> +
> +ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
> +xlate_learn_action__(ctx, learn, &fm, &ofpacts);
> +ofpbuf_uninit(&ofpacts);
>  }
>  }
>
> @@ -3911,12 +3917,8 @@ xlate_push_stats(struct xlate_cache *xcache, bool
> may_learn,
>  break;
>  case XC_LEARN:
>  if (may_learn) {
> -struct rule_dpif *rule = entry->u.le

Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson 76511
Hey Jarno,

I didn't consider that possibility, but that is a much better solution
which makes for a much cleaner patch. I just posted a new patch using this
direct pointer assignment. Looks like I'm a bit rusty on my C :P

Thanks for the review!

Ryan

On 6/3/14 10:10 AM, "Jarno Rajahalme"  wrote:

>Ryan,
>
>See comments below.
>
>  Jarno
>
>On Jun 2, 2014, at 3:54 PM, Ryan Wilson  wrote:
>
>> When a bridge of datatype type netdev receives a packet, it copies the
>> packet from the NIC to a buffer in userspace. Currently, when making
>> an upcall, the packet is again copied to the upcall's buffer. However,
>> this extra copy is not necessary when the datapath exists in userspace
>> as the upcall can directly access the packet data.
>> 
>> This patch eliminates this extra copy of the packet data in most cases.
>> In cases where the packet may still be used later by callers of
>> dp_netdev_execute_actions, making a copy of the packet data is still
>> necessary.
>> ---
>> lib/dpif-netdev.c |   15 +--
>> lib/ofpbuf.c  |   18 ++
>> lib/ofpbuf.h  |2 ++
>> 3 files changed, 25 insertions(+), 10 deletions(-)
>> 
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 91c83d6..7cb0478 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
>>ofpbuf *packet,
>>miniflow_hash_5tuple(&key.flow, 0)
>>% dp->n_handlers,
>>DPIF_UC_MISS, &key.flow, NULL);
>> -ofpbuf_delete(packet);
>> }
>> }
>> 
>> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp,
>>struct ofpbuf *packet,
>> if (userdata) {
>> buf_size += NLA_ALIGN(userdata->nla_len);
>> }
>> -buf_size += ofpbuf_size(packet);
>> ofpbuf_init(buf, buf_size);
>> 
>> /* Put ODP flow. */
>> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
>>*dp, struct ofpbuf *packet,
>>   NLA_ALIGN(userdata->nla_len));
>> }
>> 
>> -ofpbuf_set_data(&upcall->packet,
>> -ofpbuf_put(buf, ofpbuf_data(packet),
>>ofpbuf_size(packet)));
>> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>> +ofpbuf_set(&upcall->packet, packet);
>> 
>
>Did you consider the possibility to just assign the packet instead? We do
>this elsewhere, so it should work here too:
>
>upcall->packet = *packet;
>
>> seq_change(q->seq);
>> 
>> error = 0;
>> } else {
>> dp_netdev_count_packet(dp, DP_STAT_LOST);
>> +ofpbuf_delete(packet);
>> error = ENOBUFS;
>> }
>> ovs_mutex_unlock(&q->mutex);
>> @@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
>> break;
>> 
>> case OVS_ACTION_ATTR_USERSPACE: {
>> +struct ofpbuf *userspace_packet;
>> const struct nlattr *userdata;
>> 
>> userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
>> +userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
>> 
>> -dp_netdev_output_userspace(aux->dp, packet,
>> +dp_netdev_output_userspace(aux->dp, userspace_packet,
>>miniflow_hash_5tuple(aux->key, 0)
>>% aux->dp->n_handlers,
>>DPIF_UC_ACTION, aux->key,
>>userdata);
>> -
>> -if (may_steal) {
>> -ofpbuf_delete(packet);
>> -}
>> break;
>> }
>> 
>> diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
>> index 1f4b61d..ade940b 100644
>> --- a/lib/ofpbuf.c
>> +++ b/lib/ofpbuf.c
>> @@ -555,3 +555,21 @@ ofpbuf_resize_l2(struct ofpbuf *b, int increment)
>> ofpbuf_adjust_layer_offset(&b->l2_5_ofs, increment);
>> return b->frame;
>> }
>> +
>> +/* Sets ofpbuf 'dst' equal to 'src'. Note that 'dst' and 'src' share
>>the
>> + * same buffer, so calling ofpbuf_delete('dst') will also delete
>>'src'. */
>> +void
>> +ofpbuf_set(struct ofpbuf *dst, struct ofpbuf *src)
>> +{
>> +ofpbuf_set_base(dst, ofpbuf_base(src));
>> +ofpbuf_set_data(dst, ofpbuf_data(src));
>> +ofpbuf_set_size(dst, ofpbuf_size(src));
>> +
>> +dst->allocated = src->allocated;
>> +dst->frame = src->frame;
>> +dst->l2_5_ofs = src->l2_5_ofs;
>> +dst->l3_ofs = src->l3_ofs;
>> +dst->l4_ofs = src->l4_ofs;
>> +dst->source = src->source;
>> +dst->list_node = src->list_node;
>> +}
>
>This would be unnecessary if the plain assignment works also for DPDK.
>
>> diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h
>> index 85be899..ec08c27 100644
>> --- a/lib/ofpbuf.h
>> +++ b/lib/ofpbuf.h
>> @@ -101,6 +101,8 @@ static inline const void
>>*ofpbuf_get_udp_payload(const struct ofpbuf *);
>> static inline const void *ofpbuf_get_sctp_payload(const struct ofpbuf
>>*);
>> static inline cons

Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Pravin Shelar
On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
> When a bridge of datatype type netdev receives a packet, it copies the
> packet from the NIC to a buffer in userspace. Currently, when making
> an upcall, the packet is again copied to the upcall's buffer. However,
> this extra copy is not necessary when the datapath exists in userspace
> as the upcall can directly access the packet data.
>
> This patch eliminates this extra copy of the packet data in most cases.
> In cases where the packet may still be used later by callers of
> dp_netdev_execute_actions, making a copy of the packet data is still
> necessary.
>
> Signed-off-by: Ryan Wilson 
> ---
> v2: Addressed Jarno's comment to use direct pointer assignment for
> upcall->packet instead of ofpbuf_set().
> ---
>  lib/dpif-netdev.c |   15 +--
>  1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 91c83d6..c89ae20 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
> *packet,
> miniflow_hash_5tuple(&key.flow, 0)
> % dp->n_handlers,
> DPIF_UC_MISS, &key.flow, NULL);
> -ofpbuf_delete(packet);
>  }
>  }
>
> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
> ofpbuf *packet,
>  if (userdata) {
>  buf_size += NLA_ALIGN(userdata->nla_len);
>  }
> -buf_size += ofpbuf_size(packet);
>  ofpbuf_init(buf, buf_size);
>
>  /* Put ODP flow. */
> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, 
> struct ofpbuf *packet,
>NLA_ALIGN(userdata->nla_len));
>  }
>
> -ofpbuf_set_data(&upcall->packet,
> -ofpbuf_put(buf, ofpbuf_data(packet), 
> ofpbuf_size(packet)));
> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
> +upcall->packet = *packet;
>

This would not work with DPDK. ofpbuf from dpdk is special case where
ofpbuf and data are allocated from same memory object. Therefore
moving ofpbuf->data is nontrivial.

To make it work we need atleast following covered.
1. Define flag for source of ofpbuf header. So that we can track
header and data independently.
2. Fix dpdk ofpbuf destructor to free correct dpdk memory object.

Thanks,
Pravin.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Jarno Rajahalme
I think Pravin will need to take a look in case he has something in the 
pipeline that might conflict with this.

On my part:

Acked-by: Jarno Rajahalme 

  Jarno

On Jun 3, 2014, at 10:33 AM, Ryan Wilson  wrote:

> When a bridge of datatype type netdev receives a packet, it copies the
> packet from the NIC to a buffer in userspace. Currently, when making
> an upcall, the packet is again copied to the upcall's buffer. However,
> this extra copy is not necessary when the datapath exists in userspace
> as the upcall can directly access the packet data.
> 
> This patch eliminates this extra copy of the packet data in most cases.
> In cases where the packet may still be used later by callers of
> dp_netdev_execute_actions, making a copy of the packet data is still
> necessary.
> 
> Signed-off-by: Ryan Wilson 
> ---
> v2: Addressed Jarno's comment to use direct pointer assignment for
> upcall->packet instead of ofpbuf_set().
> ---
> lib/dpif-netdev.c |   15 +--
> 1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 91c83d6..c89ae20 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
> *packet,
>miniflow_hash_5tuple(&key.flow, 0)
>% dp->n_handlers,
>DPIF_UC_MISS, &key.flow, NULL);
> -ofpbuf_delete(packet);
> }
> }
> 
> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
> ofpbuf *packet,
> if (userdata) {
> buf_size += NLA_ALIGN(userdata->nla_len);
> }
> -buf_size += ofpbuf_size(packet);
> ofpbuf_init(buf, buf_size);
> 
> /* Put ODP flow. */
> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, 
> struct ofpbuf *packet,
>   NLA_ALIGN(userdata->nla_len));
> }
> 
> -ofpbuf_set_data(&upcall->packet,
> -ofpbuf_put(buf, ofpbuf_data(packet), 
> ofpbuf_size(packet)));
> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
> +upcall->packet = *packet;
> 
> seq_change(q->seq);
> 
> error = 0;
> } else {
> dp_netdev_count_packet(dp, DP_STAT_LOST);
> +ofpbuf_delete(packet);
> error = ENOBUFS;
> }
> ovs_mutex_unlock(&q->mutex);
> @@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
> break;
> 
> case OVS_ACTION_ATTR_USERSPACE: {
> +struct ofpbuf *userspace_packet;
> const struct nlattr *userdata;
> 
> userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
> +userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
> 
> -dp_netdev_output_userspace(aux->dp, packet,
> +dp_netdev_output_userspace(aux->dp, userspace_packet,
>miniflow_hash_5tuple(aux->key, 0)
>% aux->dp->n_handlers,
>DPIF_UC_ACTION, aux->key,
>userdata);
> -
> -if (may_steal) {
> -ofpbuf_delete(packet);
> -}
> break;
> }
> 
> -- 
> 1.7.9.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] acinclude.m4: dpdk: use AC_SEARCH_LIBS and -ldl

2014-06-03 Thread Daniele Di Proietto
Hi Ben,

thanks for the suggestion.
I’m about to send another patch that makes use of AC_LINK_IFELSE directly.

Daniele

On May 12, 2014, at 5:55 PM, Ben Pfaff  wrote:

> On Mon, May 12, 2014 at 12:47:47PM -0700, Daniele Di Proietto wrote:
>> On some systems libintel_dpdk.a fails to link with libopenvswitch
>> unless -ldl is used. This should address the issue
>> 
>> Signed-off-by: Daniele Di Proietto 
> 
> Hmm.
> 
> I have never noticed "unset" used in this way in Autoconf tests.  I
> suspect that it means that we are misusing AC_SEARCH_LIBS.  Probably,
> that means that we should instead use AC_LINK_IFELSE directly

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] acinclude.m4: dpdk: link with -ldl if necessary

2014-06-03 Thread Daniele Di Proietto
On some systems libintel_dpdk.a fails to link with libopenvswitch
unless -ldl is used. This should address the issue

Signed-off-by: Daniele Di Proietto 
---
 acinclude.m4 | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index bd71a82..69d65f0 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -170,10 +170,34 @@ AC_DEFUN([OVS_CHECK_DPDK], [
 
 DPDK_INCLUDE=$RTE_SDK/include
 DPDK_LIB_DIR=$RTE_SDK/lib
-DPDK_LIBS="$DPDK_LIB_DIR/libintel_dpdk.a"
 
-LIBS="$DPDK_LIBS $LIBS"
-CPPFLAGS="-I$DPDK_INCLUDE $CPPFLAGS"
+LDFLAGS="$LDFLAGS -L$DPDK_LIB_DIR"
+CFLAGS="$CFLAGS -I$DPDK_INCLUDE"
+
+# On some systems we have to add -ldl to link with dpdk
+#
+# This code, at first, tries to link without -ldl (""),
+# then adds it and tries again.
+# Before each attempt the search cache must be unset,
+# otherwise autoconf will stick with the old result
+
+found=false
+save_LIBS=$LIBS
+for extras in "" "-ldl"; do
+LIBS="-lintel_dpdk $extras $save_LIBS"
+AC_LINK_IFELSE(
+   [AC_LANG_PROGRAM([#include 
+ #include ],
+[int rte_argc; char ** rte_argv;
+ rte_eal_init(rte_argc, rte_argv);])],
+   [found=true])
+if $found; then
+break
+fi
+done
+if $found; then :; else
+AC_MSG_ERROR([cannot link with dpdk])
+fi
 
 AC_DEFINE([DPDK_NETDEV], [1], [System uses the DPDK module.])
   else
-- 
2.0.0.rc2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/1] netdev-dpdk-vhost / add dpdk vhost ports

2014-06-03 Thread Pravin Shelar
On Tue, Jun 3, 2014 at 3:20 AM, maryam.tahhan  wrote:
> This patch implements the vhost-net offload API.  It adds support for
> a new port type to userspace datapath called dpdkvhost. This allows KVM
> (QEMU) to offload the servicing of virtio-net devices to it's associated
> dpdkvhost port. Instructions for use are in INSTALL.DPDK.
>
> This has been tested on Intel multi-core platforms and with clients that
> have virtio-net interfaces.
>

Thanks for the patch.
I have not done with reviewing this patch, but have a high level
comment. I see some common code in netdev-dpdk and netdev-dpdk-vhost,
Is it possible to reuse that code?
something like Gerald did for dpdk rings patch.



> Signed-off-by: maryam.tahhan 
> ---
>  INSTALL.DPDK  |  190 +
>  Makefile.am   |   44 +-
>  configure.ac  |1 +
>  lib/automake.mk   |9 +-
>  lib/netdev-bsd.c  |1 -
>  lib/netdev-dpdk-vhost.c   | 1219 
> +
>  lib/netdev-dpdk-vhost.h   |   62 ++
>  lib/netdev-dpdk.c |   62 +-
>  lib/netdev-dpdk.h |   13 +
>  lib/netdev.c  |5 +-
>  lib/vhost-net-cdev.c  |  374 ++
>  lib/vhost-net-cdev.h  |   81 +++
>  lib/virtio-net.c  |  989 ++
>  lib/virtio-net.h  |  117 
>  utilities/automake.mk |3 +-
>  utilities/eventfd_link/Makefile.in|   86 +++
>  utilities/eventfd_link/eventfd_link.c |  189 +
>  utilities/eventfd_link/eventfd_link.h |   79 +++
>  utilities/qemu-wrap.py|  389 +++
>  19 files changed, 3858 insertions(+), 55 deletions(-)
>  create mode 100644 lib/netdev-dpdk-vhost.c
>  create mode 100644 lib/netdev-dpdk-vhost.h
>  create mode 100644 lib/vhost-net-cdev.c
>  create mode 100644 lib/vhost-net-cdev.h
>  create mode 100644 lib/virtio-net.c
>  create mode 100644 lib/virtio-net.h
>  create mode 100644 utilities/eventfd_link/Makefile.in
>  create mode 100644 utilities/eventfd_link/eventfd_link.c
>  create mode 100644 utilities/eventfd_link/eventfd_link.h
>  create mode 100755 utilities/qemu-wrap.py
>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Can VXLAN work with VIP (Keepalived)?

2014-06-03 Thread Jesse Gross
On Mon, Jun 2, 2014 at 6:37 PM, Changbin Liu  wrote:
> Hi Folks,
>
> Using Open vSwitch I am setting up VXLAN tunnels between hosts. I intend to
> make VXLAN tunnels "highly available" via Keepalived. Specifically, there
> are one master node and one backup node for each host, and they share a
> Virtual IP (VIP) via Keepalived, and the VXLAN tunnel will use the VIP as
> endpoint IP. This way when the master node fails, the backup node will
> automatically have the VIP shifted to it. Therefore, human efforts are not
> involved in failure recovery.
>
> I thought VXLAN can work with VIP this way, since VXLAN simply uses UDP
> packets, and as long as the endpoint VIP does not change, it should work.
> But it turns out that it didn't at all. I wonder whether this is intended
> behavior or I missed something.

It should be possible to use VXLAN with a VIP. The IP stack is
configured outside of OVS so you should make sure that it is working
in Linux first.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] cmap test failures

2014-06-03 Thread Ben Pfaff
[adding ovs-dev, hope that's OK]

On Tue, Jun 03, 2014 at 10:19:23AM -0700, Jarno Rajahalme wrote:
> 
> On Jun 3, 2014, at 9:56 AM, Gurucharan Shetty  wrote:
> 
> > Hello Jarno,
> > Occasionally the cmap tests fail on Windows platform with the following o/p.
> > 
> > #tests/ovstest test-cmap check 1
> > test-cmap: lib/ovs-atomic-locked.c:49: pthread_mutex_lock failed
> > (Resource deadlock avoided)
> > 
> 
> This is now fixed in master. However, I also noticed that cmap (or
> any heavy use of ovs-rcu and/or ovs-atomic) is horribly slow with
> pthreads. This is because pthreads doesn?t have memory barriers
> (other than the locks), so lockless atomics cannot be supported with
> pthreads alone.
> 
> We should look into having more native support for MSVC atomics in
> lib/ovs-atomic.h. Depending on the level of support for C11 atomics,
> any of the existing lib/ovs-atomic-*.h files could serve as a
> starting point. I?m pretty sure a thing like ova-atomic-gcc4+.h is
> possible, since MSVC has memory barrier support.

I forgot to mention that I looked at this earlier and found a couple
of obstacles.  The main one is that MSVC doesn't have the GNU C typeof
or ({...}) extensions.  I don't see a way to implement the C11 atomics
without those and without evaluating parameters to the atomic_*()
operations more than once.  So we might have to change the interface
to require clients to tolerate multiple evaluation.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 2/2] lib/classifier: Optimize megaflows for single rule case.

2014-06-03 Thread Jarno Rajahalme

On May 22, 2014, at 10:43 AM, Ben Pfaff  wrote:

> I find myself wondering whether we could end up with a funny
> paradoxical mask, e.g. mark some bit of a TCP port as looked at even
> though we didn't mark the IP protocol as looked at.  I am not sure
> whether this can happen or whether this is a problem.  Actually,
> looking further, I guess it can't happen because we always unwildcard
> all the fields up to the current stage when we jump to range_out.  In
> fact, I think that range_out wildcards the current stage, too: is that
> defeating the intended optimization here?  (I see that you updated the
> tests so presumably not?)

Coming back to an old comment here… So, what we could do instead is to fill in 
the mask bits as we successfully compare, and fill in one bit of the differing 
uint32_t. This would get rid of another scan through the map in both outcomes, 
so it would be faster and simpler, and would take care of the prerequisites 
independent of the chosen boundaries of the lookup ranges (as long as the 
difference and the prerequisite are not in the same uint32_t, which is not the 
case for any field in struct flow).

This would be the incremental, what do you think:

diff --git a/lib/classifier.c b/lib/classifier.c
index b5416c6..2d47310 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -1385,8 +1385,7 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], 
unsigned int n_tries,
 static inline bool
 miniflow_and_mask_matches_flow(const struct miniflow *flow,
const struct minimask *mask,
-   const struct flow *target,
-   struct flow_wildcards *wc)
+   const struct flow *target)
 {
 const uint32_t *flowp = miniflow_get_u32_values(flow);
 const uint32_t *maskp = miniflow_get_u32_values(&mask->masks);
@@ -1396,12 +1395,6 @@ miniflow_and_mask_matches_flow(const struct miniflow 
*flow,
 uint32_t diff = (*flowp++ ^ FLOW_U32_VALUE(target, idx)) & *maskp++;
 
 if (diff) {
-/* Only unwildcard if none of the differing bits is already
- * exact-matched. */
-if (wc && !(FLOW_U32_VALUE(&wc->masks, idx) & diff)) {
-/* Keep one bit of the difference. */
-FLOW_U32_VALUE(&wc->masks, idx) |= rightmost_1bit(diff);
-}
 return false;
 }
 }
@@ -1417,7 +1410,7 @@ find_match(const struct cls_subtable *subtable, const 
struct flow *flow,
 
 HMAP_FOR_EACH_WITH_HASH (rule, hmap_node, hash, &subtable->rules) {
 if (miniflow_and_mask_matches_flow(&rule->flow, &subtable->mask,
-   flow, NULL)) {
+   flow)) {
 return rule;
 }
 }
@@ -1425,6 +1418,42 @@ find_match(const struct cls_subtable *subtable, const 
struct flow *flow,
 return NULL;
 }
 
+/* Returns true if 'target' satisifies 'flow'/'mask', that is, if each bit
+ * for which 'flow', for which 'mask' has a bit set, specifies a particular
+ * value has the correct value in 'target'.
+ *
+ * This function is equivalent to miniflow_and_mask_matches_flow() but this
+ * version fills in the mask bits in 'wc'. */
+static inline bool
+miniflow_and_mask_matches_flow_wc(const struct miniflow *flow,
+  const struct minimask *mask,
+  const struct flow *target,
+  struct flow_wildcards *wc)
+{
+const uint32_t *flowp = miniflow_get_u32_values(flow);
+const uint32_t *maskp = miniflow_get_u32_values(&mask->masks);
+uint32_t idx;
+
+MAP_FOR_EACH_INDEX(idx, mask->masks.map) {
+uint32_t mask = *maskp++;
+uint32_t diff = (*flowp++ ^ FLOW_U32_VALUE(target, idx)) & mask;
+
+if (diff) {
+/* Only unwildcard if none of the differing bits is already
+ * exact-matched. */
+if (!(FLOW_U32_VALUE(&wc->masks, idx) & diff)) {
+/* Keep one bit of the difference. */
+FLOW_U32_VALUE(&wc->masks, idx) |= rightmost_1bit(diff);
+}
+return false;
+}
+/* Fill in the bits that were looked at. */
+FLOW_U32_VALUE(&wc->masks, idx) |= mask;
+}
+
+return true;
+}
+
 static struct cls_match *
 find_match_wc(const struct cls_subtable *subtable, const struct flow *flow,
   struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
@@ -1467,11 +1496,11 @@ find_match_wc(const struct cls_subtable *subtable, 
const struct flow *flow,
  * optimization. */
 if (!inode->s) {
 ASSIGN_CONTAINER(rule, inode - i, index_nodes);
-if (miniflow_and_mask_matches_flow(&rule->flow, &subtable->mask,
-   flow, wc)) {
-goto out;
+if (miniflow_and_mask_matches_flow_wc(&rule->flow, &subtable->m

[ovs-dev] MS HyperV-End Users database

2014-06-03 Thread June White
Hi,

Would you be interested to acquire a list of MS HyperV-End Users & Partners
across worldwide who can be potential clients for your company?

 

Data fields are name, title, email, phone, address and company details,
linkedIn Url or database can be filtered based on the criteria given by the
clients.

 

Let me know if you are focused on any particular application/technology
users or titles so that I can get back to you with all the relevant details.

 

Await your response.

 

Kind Regards,

June White

Business Development

 

Our Services:- Email List I Tracked Email campaign I Email/Data Appending I
Search Engine Optimization I Custom Built List I Tele Marketing I Multi
Channel Marketing I Lead Nurturing I Data Analysis I Customer profiling I
Web Analysis I Campaign Planning

If you are not interested to receive further communication in the future,
please send a reply with the email subject line as "Leave Out".

 

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Jarno Rajahalme
OVS is slow when compiled with pthreads atomics.  Add a generic note
in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
provides additional detail.

Signed-off-by: Jarno Rajahalme 
---
 INSTALL  |4 
 lib/ovs-atomic.h |3 +++
 2 files changed, 7 insertions(+)

diff --git a/INSTALL b/INSTALL
index 5c869b2..7b08172 100644
--- a/INSTALL
+++ b/INSTALL
@@ -28,6 +28,10 @@ you will need the following software:
   analysis and thread-safety checks.  For Ubuntu, there are
   nightly built packages available on clang's website.
 
+* While OVS may be compatible with other compilers, optimal
+  support for atomic operations may be missing, making OVS
+  very slow (see lib/ovs-atomic.h).
+
 - libssl, from OpenSSL, is optional but recommended if you plan to
   connect the Open vSwitch to an OpenFlow controller.  libssl is
   required to establish confidentiality and authenticity in the
diff --git a/lib/ovs-atomic.h b/lib/ovs-atomic.h
index c21b7ee..2452846 100644
--- a/lib/ovs-atomic.h
+++ b/lib/ovs-atomic.h
@@ -257,6 +257,9 @@
 #elif HAVE_GCC4_ATOMICS
 #include "ovs-atomic-gcc4+.h"
 #else
+/* ovs-atomic-pthreads implementation is provided for portability.
+ * It might be too slow for real use because Open vSwitch is
+ * optimized for platforms where real atomic ops are available. */
 #include "ovs-atomic-pthreads.h"
 #endif
 #undef IN_OVS_ATOMIC_H
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] cmap: Fix deadlock with ovs-atomic-pthreads

2014-06-03 Thread Jarno Rajahalme

On Jun 2, 2014, at 8:50 PM, YAMAMOTO Takashi  wrote:

>> 
>> On Jun 2, 2014, at 12:51 PM, Ben Pfaff  wrote:
>> 
>>> On Mon, Jun 02, 2014 at 12:50:45PM -0700, Jarno Rajahalme wrote:
 
 On Jun 2, 2014, at 11:45 AM, Ben Pfaff  wrote:
 
> I'd really hope that any serious OVS implementation would be able to
> use C11 or GCC or Clang or other compiler-specific techniques to get a
> "real" implementation of atomics.  Really the pthreads version is just
> to make porting easier.
 
 Are we already documenting this somewhere?
>>> 
>>> I don't think so.  It wasn't the attitude I started out with;
>>> originally I hoped that the pthreads implementation would be viable
>>> for real use, but I'm really not sure about that any longer.
>> 
>> I just ran \x93time ./ovstest test-cmap check 1\x94, with 
>> ovs-atomics-pthreads it takes 16 times longer than on master. So it is 
>> likely unusable in practice, so this would be the time to document it, I 
>> guess.
> 
> while i guess test-cmap is far from the real usage,
> documenting expectations is a good idea.
> 
> "ovs-atomic-pthreads implementation is provided for portability.
> It might be too slow for real use because Open vSwitch is
> optimized for platforms where real atomic ops are available.”

Thanks,

I just sent a patch incorporating this into a comment and adding a more generic 
note in INSTALL.

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Alex Wang
After discussing the issue with few people,

I think we should keep the old_xcache during the full revalidation.

And in xlate_learn_action(), if may_learn is false, we should check if the
learned rule is in the old_xcache, if it is, we should copy the XC_LEARN
entry
to the new_xcache list (but we do not refresh the learnt rule again, since
may_learn to false).

Thoughts?

Alex Wang,


On Tue, Jun 3, 2014 at 10:39 AM, Alex Wang  wrote:

> I confirm this patch fix the bug, but have a high level concern,
>
>
> Assume another learnt flow with hard_timeout=10 and was never hit after
> install, now, if there is frequent change to the flow table and
> need_revalidate
> is constantly 'true', this learnt flow will never timeout...  since we
> always
> redo the flow_mod during revalidation,
>
> Let's discuss further on this scenario, I'm thinking about solutions,
>
> Thanks,
> Alex Wang,
>
>
>
>
> On Tue, Jun 3, 2014 at 2:59 AM, Joe Stringer 
> wrote:
>
>> Caching the results of xlate_learn was previously dependent on the state
>> of the 'may_learn' flag. This meant that if the caller did not specify
>> that this flow may learn, then a learn entry would not be cached.
>> However, the xlate_cache tends to be used on a recurring basis, so
>> failing to cache the learn entry can provide unexpected behaviour later
>> on, particularly in corner cases.
>>
>> Such a corner case occurred previously:-
>> * Revalidation was requested.
>> * A flow with a learn action was dumped.
>> * The flow had no packets.
>> * The flow's corresponding xcache was cleared, and the flow revalidated.
>> * The flow went on to receive packets after the xcache is re-created.
>>
>> In this case, the xcache would be re-created, but would not refresh the
>> timeouts on the learnt flow until the next time it was cleared, even if
>> it received more traffic. This would cause flows to time out sooner than
>> expected. Symptoms of this bug may include unexpected forwarding
>> behaviour or extraneous statistics being attributed to the wrong flow.
>>
>> This patch fixes the issue by caching the entire flow_mod, including
>> actions, upon translating an xlate_learn action. This is used to perform
>> a flow_mod from scratch with the original flow, rather than simply
>> refreshing the rule that was created during the creation of the xcache.
>>
>> Bug #1252997.
>>
>> Reported-by: Scott Hendricks 
>> Signed-off-by: Joe Stringer 
>> ---
>>  ofproto/ofproto-dpif-xlate.c |   58
>> ++
>>  1 file changed, 30 insertions(+), 28 deletions(-)
>>
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 068d54f..e8f3dbf 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -265,7 +265,9 @@ struct xc_entry {
>>  uint16_t vid;
>>  } bond;
>>  struct {
>> -struct rule_dpif *rule;
>> +struct ofproto_dpif *ofproto;
>> +struct ofputil_flow_mod *fm;
>> +struct ofpbuf *ofpacts;
>>  } learn;
>>  struct {
>>  struct ofproto_dpif *ofproto;
>> @@ -2977,34 +2979,38 @@ xlate_bundle_action(struct xlate_ctx *ctx,
>>  }
>>
>>  static void
>> -xlate_learn_action(struct xlate_ctx *ctx,
>> -   const struct ofpact_learn *learn)
>> +xlate_learn_action__(struct xlate_ctx *ctx, const struct ofpact_learn
>> *learn,
>> + struct ofputil_flow_mod *fm, struct ofpbuf *ofpacts)
>>  {
>> -uint64_t ofpacts_stub[1024 / 8];
>> -struct ofputil_flow_mod fm;
>> -struct ofpbuf ofpacts;
>> +learn_execute(learn, &ctx->xin->flow, fm, ofpacts);
>> +if (ctx->xin->may_learn) {
>> +ofproto_dpif_flow_mod(ctx->xbridge->ofproto, fm);
>> +}
>> +}
>>
>> +static void
>> +xlate_learn_action(struct xlate_ctx *ctx, const struct ofpact_learn
>> *learn)
>> +{
>>  ctx->xout->has_learn = true;
>> -
>>  learn_mask(learn, &ctx->xout->wc);
>>
>> -if (!ctx->xin->may_learn) {
>> -return;
>> -}
>> -
>> -ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
>> -learn_execute(learn, &ctx->xin->flow, &fm, &ofpacts);
>> -ofproto_dpif_flow_mod(ctx->xbridge->ofproto, &fm);
>> -ofpbuf_uninit(&ofpacts);
>> -
>>  if (ctx->xin->xcache) {
>>  struct xc_entry *entry;
>>
>>  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
>> -/* Lookup the learned rule, taking a reference on it.  The
>> reference
>> - * is released when this cache entry is deleted. */
>> -rule_dpif_lookup(ctx->xbridge->ofproto, &ctx->xin->flow, NULL,
>> - &entry->u.learn.rule, true, NULL);
>> +entry->u.learn.ofproto = ctx->xbridge->ofproto;
>> +entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
>> +entry->u.learn.ofpacts = ofpbuf_new(1024);
>> +xlate_learn_action__(ctx, learn, entry->u.learn.fm,
>> + entry->u.learn.ofpacts

Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Ben Pfaff
On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
> OVS is slow when compiled with pthreads atomics.  Add a generic note
> in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
> provides additional detail.
> 
> Signed-off-by: Jarno Rajahalme 

I think this is only likely to affect Windows (everyone else uses GCC
or Clang) so should we put it in BUILD.Windows?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 2/2] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson
When a bridge of datatype type netdev receives a packet, it copies the
packet from the NIC to a buffer in userspace. Currently, when making
an upcall, the packet is again copied to the upcall's buffer. However,
this extra copy is not necessary when the datapath exists in userspace
as the upcall can directly access the packet data.

This patch eliminates this extra copy of the packet data in most cases.
In cases where the packet may still be used later by callers of
dp_netdev_execute_actions, making a copy of the packet data is still
necessary.

Signed-off-by: Ryan Wilson 
Acked-by: Jarno Rajahalme 
---
v2: Addressed Jarno's comment to use direct pointer assignment for
upcall->packet instead of ofpbuf_set().
v3: Addressed issues with DPDK mentioned by Pravin with previous
patch.
---
 lib/dpif-netdev.c |   15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 91c83d6..c89ae20 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
*packet,
miniflow_hash_5tuple(&key.flow, 0)
% dp->n_handlers,
DPIF_UC_MISS, &key.flow, NULL);
-ofpbuf_delete(packet);
 }
 }
 
@@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
 if (userdata) {
 buf_size += NLA_ALIGN(userdata->nla_len);
 }
-buf_size += ofpbuf_size(packet);
 ofpbuf_init(buf, buf_size);
 
 /* Put ODP flow. */
@@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
   NLA_ALIGN(userdata->nla_len));
 }
 
-ofpbuf_set_data(&upcall->packet,
-ofpbuf_put(buf, ofpbuf_data(packet), 
ofpbuf_size(packet)));
-ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
+upcall->packet = *packet;
 
 seq_change(q->seq);
 
 error = 0;
 } else {
 dp_netdev_count_packet(dp, DP_STAT_LOST);
+ofpbuf_delete(packet);
 error = ENOBUFS;
 }
 ovs_mutex_unlock(&q->mutex);
@@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
 break;
 
 case OVS_ACTION_ATTR_USERSPACE: {
+struct ofpbuf *userspace_packet;
 const struct nlattr *userdata;
 
 userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
+userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
 
-dp_netdev_output_userspace(aux->dp, packet,
+dp_netdev_output_userspace(aux->dp, userspace_packet,
miniflow_hash_5tuple(aux->key, 0)
% aux->dp->n_handlers,
DPIF_UC_ACTION, aux->key,
userdata);
-
-if (may_steal) {
-ofpbuf_delete(packet);
-}
 break;
 }
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Jarno Rajahalme

On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:

> On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
>> OVS is slow when compiled with pthreads atomics.  Add a generic note
>> in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
>> provides additional detail.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> I think this is only likely to affect Windows (everyone else uses GCC
> or Clang) so should we put it in BUILD.Windows?


So you think this should not be mentioned in INSTALL? If so, I think the text 
there should be modified to something like “* GCC 4.x or clang C compiler” 
instead of the current “A C compiler, such as…”.

How about adding the following to BUILD.Windows:

diff --git a/BUILD.Windows b/BUILD.Windows
index ca0d252..05956ba 100644
--- a/BUILD.Windows
+++ b/BUILD.Windows
@@ -39,6 +39,10 @@ project from
 ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
 directory (e.g.: C:/pthread).
 
+OVS currently has no native support for atomics on Windows.  Pthreads
+are used as a fallback, but some features, such as OVS-RCU are really
+slow without native atomics support.
+
 * Get the Open vSwitch sources from either cloning the repo using git
 or from a distribution tar ball.
 
  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Gurucharan Shetty
May be we should add a new TODO section in BUILD.Windows like:

diff --git a/BUILD.Windows b/BUILD.Windows
index ca0d252..41e6eab 100644
--- a/BUILD.Windows
+++ b/BUILD.Windows
@@ -85,3 +85,11 @@ For example,
   --with-openssl="C:/OpenSSL-Win32"

 * Run make for the ported executables.
+
+TODO:
+
+
+* OVS currently has no native support for atomics on Windows.  Pthreads
+are used as a fallback, but some features, such as OVS-RCU are really
+slow without native atomics support. Atomics support for Windows has to
+be brought in.

On Tue, Jun 3, 2014 at 2:16 PM, Jarno Rajahalme  wrote:
>
> On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:
>
>> On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
>>> OVS is slow when compiled with pthreads atomics.  Add a generic note
>>> in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
>>> provides additional detail.
>>>
>>> Signed-off-by: Jarno Rajahalme 
>>
>> I think this is only likely to affect Windows (everyone else uses GCC
>> or Clang) so should we put it in BUILD.Windows?
>
>
> So you think this should not be mentioned in INSTALL? If so, I think the text 
> there should be modified to something like “* GCC 4.x or clang C compiler” 
> instead of the current “A C compiler, such as…”.
>
> How about adding the following to BUILD.Windows:
>
> diff --git a/BUILD.Windows b/BUILD.Windows
> index ca0d252..05956ba 100644
> --- a/BUILD.Windows
> +++ b/BUILD.Windows
> @@ -39,6 +39,10 @@ project from
>  ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
>  directory (e.g.: C:/pthread).
>
> +OVS currently has no native support for atomics on Windows.  Pthreads
> +are used as a fallback, but some features, such as OVS-RCU are really
> +slow without native atomics support.
> +
>  * Get the Open vSwitch sources from either cloning the repo using git
>  or from a distribution tar ball.
>
>   Jarno
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson 76511
Hey Pravin,

Thanks for the catch here! Turns out the header is already tracked in DPDK
with rte_mbuf's buffer address - sizeof(ofpbuf). Thus, I submitted another
patch that, in free_dpdk_buf(), always gets this header and uses this to
free memory.

http://patchwork.openvswitch.org/patch/4375/


Let me know if this patch works.

Cheers,

Ryan

On 6/3/14 10:50 AM, "Pravin Shelar"  wrote:

>On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
>> When a bridge of datatype type netdev receives a packet, it copies the
>> packet from the NIC to a buffer in userspace. Currently, when making
>> an upcall, the packet is again copied to the upcall's buffer. However,
>> this extra copy is not necessary when the datapath exists in userspace
>> as the upcall can directly access the packet data.
>>
>> This patch eliminates this extra copy of the packet data in most cases.
>> In cases where the packet may still be used later by callers of
>> dp_netdev_execute_actions, making a copy of the packet data is still
>> necessary.
>>
>> Signed-off-by: Ryan Wilson 
>> ---
>> v2: Addressed Jarno's comment to use direct pointer assignment for
>> upcall->packet instead of ofpbuf_set().
>> ---
>>  lib/dpif-netdev.c |   15 +--
>>  1 file changed, 5 insertions(+), 10 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 91c83d6..c89ae20 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
>>ofpbuf *packet,
>> miniflow_hash_5tuple(&key.flow, 0)
>> % dp->n_handlers,
>> DPIF_UC_MISS, &key.flow, NULL);
>> -ofpbuf_delete(packet);
>>  }
>>  }
>>
>> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp,
>>struct ofpbuf *packet,
>>  if (userdata) {
>>  buf_size += NLA_ALIGN(userdata->nla_len);
>>  }
>> -buf_size += ofpbuf_size(packet);
>>  ofpbuf_init(buf, buf_size);
>>
>>  /* Put ODP flow. */
>> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
>>*dp, struct ofpbuf *packet,
>> 
>>NLA_ALIGN(userdata->nla_len));
>>  }
>>
>> -ofpbuf_set_data(&upcall->packet,
>> -ofpbuf_put(buf, ofpbuf_data(packet),
>>ofpbuf_size(packet)));
>> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>> +upcall->packet = *packet;
>>
>
>This would not work with DPDK. ofpbuf from dpdk is special case where
>ofpbuf and data are allocated from same memory object. Therefore
>moving ofpbuf->data is nontrivial.
>
>To make it work we need atleast following covered.
>1. Define flag for source of ofpbuf header. So that we can track
>header and data independently.
>2. Fix dpdk ofpbuf destructor to free correct dpdk memory object.
>
>Thanks,
>Pravin.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Jarno Rajahalme
This better,

Acked-by: Jarno Rajahalme 

On Jun 3, 2014, at 2:20 PM, Gurucharan Shetty  wrote:

> May be we should add a new TODO section in BUILD.Windows like:
> 
> diff --git a/BUILD.Windows b/BUILD.Windows
> index ca0d252..41e6eab 100644
> --- a/BUILD.Windows
> +++ b/BUILD.Windows
> @@ -85,3 +85,11 @@ For example,
>   --with-openssl="C:/OpenSSL-Win32"
> 
> * Run make for the ported executables.
> +
> +TODO:
> +
> +
> +* OVS currently has no native support for atomics on Windows.  Pthreads
> +are used as a fallback, but some features, such as OVS-RCU are really
> +slow without native atomics support. Atomics support for Windows has to
> +be brought in.
> 
> On Tue, Jun 3, 2014 at 2:16 PM, Jarno Rajahalme  wrote:
>> 
>> On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:
>> 
>>> On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
 OVS is slow when compiled with pthreads atomics.  Add a generic note
 in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
 provides additional detail.
 
 Signed-off-by: Jarno Rajahalme 
>>> 
>>> I think this is only likely to affect Windows (everyone else uses GCC
>>> or Clang) so should we put it in BUILD.Windows?
>> 
>> 
>> So you think this should not be mentioned in INSTALL? If so, I think the 
>> text there should be modified to something like “* GCC 4.x or clang C 
>> compiler” instead of the current “A C compiler, such as…”.
>> 
>> How about adding the following to BUILD.Windows:
>> 
>> diff --git a/BUILD.Windows b/BUILD.Windows
>> index ca0d252..05956ba 100644
>> --- a/BUILD.Windows
>> +++ b/BUILD.Windows
>> @@ -39,6 +39,10 @@ project from
>> ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
>> directory (e.g.: C:/pthread).
>> 
>> +OVS currently has no native support for atomics on Windows.  Pthreads
>> +are used as a fallback, but some features, such as OVS-RCU are really
>> +slow without native atomics support.
>> +
>> * Get the Open vSwitch sources from either cloning the repo using git
>> or from a distribution tar ball.
>> 
>>  Jarno
>> 
>> ___
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Ben Pfaff
On Tue, Jun 03, 2014 at 02:16:29PM -0700, Jarno Rajahalme wrote:
> 
> On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:
> 
> > On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
> >> OVS is slow when compiled with pthreads atomics.  Add a generic note
> >> in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
> >> provides additional detail.
> >> 
> >> Signed-off-by: Jarno Rajahalme 
> > 
> > I think this is only likely to affect Windows (everyone else uses GCC
> > or Clang) so should we put it in BUILD.Windows?
> 
> 
> So you think this should not be mentioned in INSTALL? If so, I think
> the text there should be modified to something like ?* GCC 4.x or
> clang C compiler? instead of the current ?A C compiler, such as??.

I phrased that badly.  I think we should mention it *additionally* in
BUILD.Windows, not *only* there.

I lean toward recommending GCC or Clang, except MSVC 2013 for Windows,
but not claim that those are the only possibilities.

> How about adding the following to BUILD.Windows:

That looks good, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 1/2] netdev-dpdk: Free DPDK buffer using mbuf's buffer address.

2014-06-03 Thread Ryan Wilson
Previously, DPDK buffers were freed using 'struct ofpbuf' pointers
passed to free_dpdk_buf(). However, if a new ofpbuf allocated on the
stack shares data with the old ofpbuf and the new ofpbuf is to be
freed with free_dpdk_buf(), this will cause an error.

This patch gets the beginning of the allocated buffer from 'struct
rte_mbuf'. This allows ofpbufs to share packet data when using DPDK.

Signed-off-by: Ryan Wilson 
---
 lib/netdev-dpdk.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ba41d2e..155cbf0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -206,8 +206,9 @@ void
 free_dpdk_buf(struct ofpbuf *b)
 {
 struct rte_mbuf *pkt = (struct rte_mbuf *) b;
+struct rte_mbuf *buf = (char *) pkt->buf_addr - sizeof(struct ofpbuf);
 
-rte_mempool_put(pkt->pool, pkt);
+rte_mempool_put(pkt->pool, buf);
 }
 
 static void
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread Ben Pfaff
Me too.

Acked-by: Ben Pfaff 


On Tue, Jun 03, 2014 at 02:22:48PM -0700, Jarno Rajahalme wrote:
> This better,
> 
> Acked-by: Jarno Rajahalme 
> 
> On Jun 3, 2014, at 2:20 PM, Gurucharan Shetty  wrote:
> 
> > May be we should add a new TODO section in BUILD.Windows like:
> > 
> > diff --git a/BUILD.Windows b/BUILD.Windows
> > index ca0d252..41e6eab 100644
> > --- a/BUILD.Windows
> > +++ b/BUILD.Windows
> > @@ -85,3 +85,11 @@ For example,
> >   --with-openssl="C:/OpenSSL-Win32"
> > 
> > * Run make for the ported executables.
> > +
> > +TODO:
> > +
> > +
> > +* OVS currently has no native support for atomics on Windows.  Pthreads
> > +are used as a fallback, but some features, such as OVS-RCU are really
> > +slow without native atomics support. Atomics support for Windows has to
> > +be brought in.
> > 
> > On Tue, Jun 3, 2014 at 2:16 PM, Jarno Rajahalme  
> > wrote:
> >> 
> >> On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:
> >> 
> >>> On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
>  OVS is slow when compiled with pthreads atomics.  Add a generic note
>  in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
>  provides additional detail.
>  
>  Signed-off-by: Jarno Rajahalme 
> >>> 
> >>> I think this is only likely to affect Windows (everyone else uses GCC
> >>> or Clang) so should we put it in BUILD.Windows?
> >> 
> >> 
> >> So you think this should not be mentioned in INSTALL? If so, I think the 
> >> text there should be modified to something like ?* GCC 4.x or clang C 
> >> compiler? instead of the current ?A C compiler, such as??.
> >> 
> >> How about adding the following to BUILD.Windows:
> >> 
> >> diff --git a/BUILD.Windows b/BUILD.Windows
> >> index ca0d252..05956ba 100644
> >> --- a/BUILD.Windows
> >> +++ b/BUILD.Windows
> >> @@ -39,6 +39,10 @@ project from
> >> ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
> >> directory (e.g.: C:/pthread).
> >> 
> >> +OVS currently has no native support for atomics on Windows.  Pthreads
> >> +are used as a fallback, but some features, such as OVS-RCU are really
> >> +slow without native atomics support.
> >> +
> >> * Get the Open vSwitch sources from either cloning the repo using git
> >> or from a distribution tar ball.
> >> 
> >>  Jarno
> >> 
> >> ___
> >> dev mailing list
> >> dev@openvswitch.org
> >> http://openvswitch.org/mailman/listinfo/dev
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [slow path HASH and RECIRC action v2] dpif: Fix slow action handling for DP_HASH and RECIRC

2014-06-03 Thread Andy Zhou
In case DP_HASH and RECIRC actions need to be executed in slow path,
current implementation simply don't handle them -- vswitchd simply
crashes. This patch fixes them by supply an implementation for them.

RECIRC will be handled by the datapath, same as the output action.

DP_HASH, on the other hand, is handled in the user space. Although the
resulting hash values may not match those computed by the datapath, it
is less expensive; current use case (bonding) does not require a strict
match to work properly.

Reported-by: YAMAMOTO Takashi 
Signed-off-by: Andy Zhou 

---
v1->v2:   Addresses Ben's review feedback
  fix the comment on why execute dp_hash in user space.
  Add assert for unknown hash algorithm specification.
---
 lib/dpif.c|  4 ++--
 lib/odp-execute.c | 23 ++-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/lib/dpif.c b/lib/dpif.c
index 84dba28..ac73be1 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -1069,6 +1069,7 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf *packet,
 switch ((enum ovs_action_attr)type) {
 case OVS_ACTION_ATTR_OUTPUT:
 case OVS_ACTION_ATTR_USERSPACE:
+case OVS_ACTION_ATTR_RECIRC:
 execute.actions = action;
 execute.actions_len = NLA_ALIGN(action->nla_len);
 execute.packet = packet;
@@ -1077,6 +1078,7 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf *packet,
 aux->error = aux->dpif->dpif_class->execute(aux->dpif, &execute);
 break;
 
+case OVS_ACTION_ATTR_HASH:
 case OVS_ACTION_ATTR_PUSH_VLAN:
 case OVS_ACTION_ATTR_POP_VLAN:
 case OVS_ACTION_ATTR_PUSH_MPLS:
@@ -1084,8 +1086,6 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf *packet,
 case OVS_ACTION_ATTR_SET:
 case OVS_ACTION_ATTR_SAMPLE:
 case OVS_ACTION_ATTR_UNSPEC:
-case OVS_ACTION_ATTR_RECIRC:
-case OVS_ACTION_ATTR_HASH:
 case __OVS_ACTION_ATTR_MAX:
 OVS_NOT_REACHED();
 }
diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index 12ad679..cc18536 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -26,6 +26,7 @@
 #include "ofpbuf.h"
 #include "odp-util.h"
 #include "packets.h"
+#include "flow.h"
 #include "unaligned.h"
 #include "util.h"
 
@@ -208,7 +209,6 @@ odp_execute_actions__(void *dp, struct ofpbuf *packet, bool 
steal,
 case OVS_ACTION_ATTR_OUTPUT:
 case OVS_ACTION_ATTR_USERSPACE:
 case OVS_ACTION_ATTR_RECIRC:
-case OVS_ACTION_ATTR_HASH:
 if (dp_execute_action) {
 /* Allow 'dp_execute_action' to steal the packet data if we do
  * not need it any more. */
@@ -219,6 +219,27 @@ odp_execute_actions__(void *dp, struct ofpbuf *packet, 
bool steal,
 }
 break;
 
+case OVS_ACTION_ATTR_HASH: {
+const struct ovs_action_hash *hash_act = nl_attr_get(a);
+
+/* Calculate a hash value directly.  This might not match the
+ * value computed by the datapath, but it is much less expensive,
+ * and the current use case (bonding) does not require a strict
+ * match to work properly. */
+if (hash_act->hash_alg == OVS_HASH_ALG_L4) {
+struct flow flow;
+uint32_t hash;
+
+flow_extract(packet, md, &flow);
+hash = flow_hash_5tuple(&flow, hash_act->hash_basis);
+md->dp_hash = hash ? hash : 1;
+} else {
+/* Assert on unknown hash algorithm.  */
+OVS_NOT_REACHED();
+}
+break;
+}
+
 case OVS_ACTION_ATTR_PUSH_VLAN: {
 const struct ovs_action_push_vlan *vlan = nl_attr_get(a);
 eth_push_vlan(packet, htons(ETH_TYPE_VLAN), vlan->vlan_tci);
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif: Fix slow action handling for DP_HASH and RECIRC

2014-06-03 Thread Andy Zhou
Ben, thanks for the review. Both are valid.

I will send out a V2 soon that should address both points.

On Thu, May 29, 2014 at 2:02 PM, Ben Pfaff  wrote:
> On Wed, May 28, 2014 at 05:00:48PM -0700, Andy Zhou wrote:
>> In case DP_HASH and RECIRC actions need to be executed in slow path,
>> current implementation simply don't handle them -- vswitchd simply
>> crashes. This patch fixes them by supply an implementation for them.
>>
>> RECIRC will be handled by the datapath, same as the output action.
>>
>> DP_HASH, on the other hand, is handled in the user space. Although the
>> resulting hash values may not match those computed by the datapath, it
>> is less expensive; current use case (bonding) does not require a strict
>> match to work properly.
>>
>> Reported-by: YAMAMOTO Takashi 
>> Signed-off-by: Andy Zhou 
>
> I would adjust the comment in odp_execute_actions__() to match the
> changelog better, maybe:
> /* Calculate a hash value directly.  This might not match the
>  * value computed by the datapath, but it is much less expensive,
>  * and the current use case (bonding) does not require a strict
>  * match to work properly. */
>
> Should we ovs_assert() that md->dp_hash is OVS_HASH_ALG_L4?
> Presumably we won't ever create an ODP action that uses a hash
> algorithm we don't implement, and if we do we ought to find out about
> it.
>
> Thanks,
>
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif: Fix slow action handling for DP_HASH and RECIRC

2014-06-03 Thread Ben Pfaff
Thanks!

On Tue, Jun 03, 2014 at 02:30:12PM -0700, Andy Zhou wrote:
> Ben, thanks for the review. Both are valid.
> 
> I will send out a V2 soon that should address both points.
> 
> On Thu, May 29, 2014 at 2:02 PM, Ben Pfaff  wrote:
> > On Wed, May 28, 2014 at 05:00:48PM -0700, Andy Zhou wrote:
> >> In case DP_HASH and RECIRC actions need to be executed in slow path,
> >> current implementation simply don't handle them -- vswitchd simply
> >> crashes. This patch fixes them by supply an implementation for them.
> >>
> >> RECIRC will be handled by the datapath, same as the output action.
> >>
> >> DP_HASH, on the other hand, is handled in the user space. Although the
> >> resulting hash values may not match those computed by the datapath, it
> >> is less expensive; current use case (bonding) does not require a strict
> >> match to work properly.
> >>
> >> Reported-by: YAMAMOTO Takashi 
> >> Signed-off-by: Andy Zhou 
> >
> > I would adjust the comment in odp_execute_actions__() to match the
> > changelog better, maybe:
> > /* Calculate a hash value directly.  This might not match the
> >  * value computed by the datapath, but it is much less 
> > expensive,
> >  * and the current use case (bonding) does not require a strict
> >  * match to work properly. */
> >
> > Should we ovs_assert() that md->dp_hash is OVS_HASH_ALG_L4?
> > Presumably we won't ever create an ODP action that uses a hash
> > algorithm we don't implement, and if we do we ought to find out about
> > it.
> >
> > Thanks,
> >
> > Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
On 4 June 2014 05:39, Alex Wang  wrote:

> I confirm this patch fix the bug, but have a high level concern,
>
>
> Assume another learnt flow with hard_timeout=10 and was never hit after
> install, now, if there is frequent change to the flow table and
> need_revalidate
> is constantly 'true', this learnt flow will never timeout...  since we
> always
> redo the flow_mod during revalidation,
>

(Correct me if I'm wrong, but this is how I see it:)
For this particular case, we will not redo the flow_mod during
revalidation. The "may_learn" flag is the same as before. Now, we will
create the flow_mod and cache it, but only do the flow_mod if may_learn is
true.

The "execute_learn" function is perhaps a bit misleading here: It only
assembles the flow_mod, doesn't actually execute it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Alex Wang
You are right, sorry for misreading your patch.

Acked-by: Alex Wang 



On Tue, Jun 3, 2014 at 2:51 PM, Joe Stringer  wrote:

> On 4 June 2014 05:39, Alex Wang  wrote:
>
>> I confirm this patch fix the bug, but have a high level concern,
>>
>>
>> Assume another learnt flow with hard_timeout=10 and was never hit after
>> install, now, if there is frequent change to the flow table and
>> need_revalidate
>> is constantly 'true', this learnt flow will never timeout...  since we
>> always
>> redo the flow_mod during revalidation,
>>
>
> (Correct me if I'm wrong, but this is how I see it:)
> For this particular case, we will not redo the flow_mod during
> revalidation. The "may_learn" flag is the same as before. Now, we will
> create the flow_mod and cache it, but only do the flow_mod if may_learn is
> true.
>
> The "execute_learn" function is perhaps a bit misleading here: It only
> assembles the flow_mod, doesn't actually execute it.
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] ofproto: Remove ofproto_refresh_rule().

2014-06-03 Thread Alex Wang
Acked-by: Alex Wang 


On Tue, Jun 3, 2014 at 2:59 AM, Joe Stringer  wrote:

> The only user of this function was removed in the previous patch, so
> remove it.
>
> Signed-off-by: Joe Stringer 
> ---
>  ofproto/ofproto-dpif.c |   12 
>  ofproto/ofproto-provider.h |2 --
>  ofproto/ofproto.c  |   41
> -
>  3 files changed, 55 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 06be234..22ba14e 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -364,18 +364,6 @@ ofproto_dpif_flow_mod(struct ofproto_dpif *ofproto,
>  ofproto_flow_mod(&ofproto->up, fm);
>  }
>
> -/* Resets the modified time for 'rule' or an equivalent rule. If 'rule'
> is not
> - * in the classifier, but an equivalent rule is, unref 'rule' and ref the
> new
> - * rule. Otherwise if 'rule' is no longer installed in the classifier,
> - * reinstall it.
> - *
> - * Returns the rule whose modified time has been reset. */
> -struct rule_dpif *
> -ofproto_dpif_refresh_rule(struct rule_dpif *rule)
> -{
> -return rule_dpif_cast(ofproto_refresh_rule(&rule->up));
> -}
> -
>  /* Appends 'pin' to the queue of "packet ins" to be sent to the
> controller.
>   * Takes ownership of 'pin' and pin->packet. */
>  void
> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
> index ff539b9..7741044 100644
> --- a/ofproto/ofproto-provider.h
> +++ b/ofproto/ofproto-provider.h
> @@ -1716,8 +1716,6 @@ BUILD_ASSERT_DECL(OFPROTO_POSTPONE < OFPERR_OFS);
>
>  int ofproto_flow_mod(struct ofproto *, struct ofputil_flow_mod *)
>  OVS_EXCLUDED(ofproto_mutex);
> -struct rule *ofproto_refresh_rule(struct rule *rule)
> -OVS_EXCLUDED(ofproto_mutex);
>  void ofproto_add_flow(struct ofproto *, const struct match *,
>unsigned int priority,
>const struct ofpact *ofpacts, size_t ofpacts_len)
> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> index 24a709b..f34cfde 100644
> --- a/ofproto/ofproto.c
> +++ b/ofproto/ofproto.c
> @@ -1978,47 +1978,6 @@ ofproto_flow_mod(struct ofproto *ofproto, struct
> ofputil_flow_mod *fm)
>  return handle_flow_mod__(ofproto, NULL, fm, NULL);
>  }
>
> -/* Resets the modified time for 'rule' or an equivalent rule. If 'rule'
> is not
> - * in the classifier, but an equivalent rule is, unref 'rule' and ref the
> new
> - * rule. Otherwise if 'rule' is no longer installed in the classifier,
> - * reinstall it.
> - *
> - * Returns the rule whose modified time has been reset. */
> -struct rule *
> -ofproto_refresh_rule(struct rule *rule)
> -{
> -const struct oftable *table = &rule->ofproto->tables[rule->table_id];
> -const struct cls_rule *cr = &rule->cr;
> -struct rule *r;
> -
> -/* do_add_flow() requires that the rule is not installed. We lock the
> - * ofproto_mutex here so that another thread cannot add the flow
> before
> - * we get a chance to add it.*/
> -ovs_mutex_lock(&ofproto_mutex);
> -
> -fat_rwlock_rdlock(&table->cls.rwlock);
> -r = rule_from_cls_rule(classifier_find_rule_exactly(&table->cls, cr));
> -if (r != rule) {
> -ofproto_rule_ref(r);
> -}
> -fat_rwlock_unlock(&table->cls.rwlock);
> -
> -if (!r) {
> -do_add_flow(rule->ofproto, NULL, NULL, 0, rule);
> -} else if  (r != rule) {
> -ofproto_rule_unref(rule);
> -rule = r;
> -}
> -ovs_mutex_unlock(&ofproto_mutex);
> -
> -/* Refresh the modified time for the rule. */
> -ovs_mutex_lock(&rule->mutex);
> -rule->modified = MAX(rule->modified, time_msec());
> -ovs_mutex_unlock(&rule->mutex);
> -
> -return rule;
> -}
> -
>  /* Searches for a rule with matching criteria exactly equal to 'target' in
>   * ofproto's table 0 and, if it finds one, deletes it.
>   *
> --
> 1.7.10.4
>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Ben Pfaff
On Wed, Jun 04, 2014 at 09:51:16AM +1200, Joe Stringer wrote:
> On 4 June 2014 05:39, Alex Wang  wrote:
> 
> > I confirm this patch fix the bug, but have a high level concern,
> >
> >
> > Assume another learnt flow with hard_timeout=10 and was never hit after
> > install, now, if there is frequent change to the flow table and
> > need_revalidate
> > is constantly 'true', this learnt flow will never timeout...  since we
> > always
> > redo the flow_mod during revalidation,
> >
> 
> (Correct me if I'm wrong, but this is how I see it:)
> For this particular case, we will not redo the flow_mod during
> revalidation. The "may_learn" flag is the same as before. Now, we will
> create the flow_mod and cache it, but only do the flow_mod if may_learn is
> true.
> 
> The "execute_learn" function is perhaps a bit misleading here: It only
> assembles the flow_mod, doesn't actually execute it.

I think this is logically correct.  Have you confirmed that it fixes
the original bug?

One concern I have is memory consumption.  struct ofputil_flow_mod is
452 bytes (which we use and initialize), plus we allocate another 1024
bytes for actions.  I don't see an easy way to get rid of the former,
but I think that the actions would normally be pretty small.  I think
we'd probably be better off with an initial stub for the latter that
we then xmemdup() to save.  Alternatively, if you changed the starting
size of the actions ofpbuf to just, say, 64 bytes, I think that would
save a lot of memory in the common case.

Assuming it fixes the bug,
Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Alex Wang
Yes, I have confirmed that this patch fix the bug.


On Tue, Jun 3, 2014 at 3:11 PM, Ben Pfaff  wrote:

> On Wed, Jun 04, 2014 at 09:51:16AM +1200, Joe Stringer wrote:
> > On 4 June 2014 05:39, Alex Wang  wrote:
> >
> > > I confirm this patch fix the bug, but have a high level concern,
> > >
> > >
> > > Assume another learnt flow with hard_timeout=10 and was never hit after
> > > install, now, if there is frequent change to the flow table and
> > > need_revalidate
> > > is constantly 'true', this learnt flow will never timeout...  since we
> > > always
> > > redo the flow_mod during revalidation,
> > >
> >
> > (Correct me if I'm wrong, but this is how I see it:)
> > For this particular case, we will not redo the flow_mod during
> > revalidation. The "may_learn" flag is the same as before. Now, we will
> > create the flow_mod and cache it, but only do the flow_mod if may_learn
> is
> > true.
> >
> > The "execute_learn" function is perhaps a bit misleading here: It only
> > assembles the flow_mod, doesn't actually execute it.
>
> I think this is logically correct.  Have you confirmed that it fixes
> the original bug?
>
> One concern I have is memory consumption.  struct ofputil_flow_mod is
> 452 bytes (which we use and initialize), plus we allocate another 1024
> bytes for actions.  I don't see an easy way to get rid of the former,
> but I think that the actions would normally be pretty small.  I think
> we'd probably be better off with an initial stub for the latter that
> we then xmemdup() to save.  Alternatively, if you changed the starting
> size of the actions ofpbuf to just, say, 64 bytes, I think that would
> save a lot of memory in the common case.
>
> Assuming it fixes the bug,
> Acked-by: Ben Pfaff 
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
I'm not against this idea; It seems correct to me, to keep the cache the
same even if we need to revalidate, because we will throw it away if
revalidation fails anyway.

We already had a discussion about using the xcache even when revalidation
is required:-
http://openvswitch.org/pipermail/dev/2014-May/040440.html

I wonder if we need to extend that, so that rather than clearing the xcache
in the revalidation case, we keep it (and make sure that it is not
overwritten by the xlate_actions__).

Regardless, I would feel more comfortable with this patch going in, as the
cases of creating xcache and learning are separate, and not handled
correctly in that function currently.

On 4 June 2014 08:26, Alex Wang  wrote:

> After discussing the issue with few people,
>
> I think we should keep the old_xcache during the full revalidation.
>
> And in xlate_learn_action(), if may_learn is false, we should check if the
> learned rule is in the old_xcache, if it is, we should copy the XC_LEARN
> entry
> to the new_xcache list (but we do not refresh the learnt rule again, since
> may_learn to false).
>
>  Thoughts?
>
> Alex Wang,
>
>
> On Tue, Jun 3, 2014 at 10:39 AM, Alex Wang  wrote:
>
>> I confirm this patch fix the bug, but have a high level concern,
>>
>>
>> Assume another learnt flow with hard_timeout=10 and was never hit after
>> install, now, if there is frequent change to the flow table and
>> need_revalidate
>> is constantly 'true', this learnt flow will never timeout...  since we
>> always
>> redo the flow_mod during revalidation,
>>
>> Let's discuss further on this scenario, I'm thinking about solutions,
>>
>> Thanks,
>> Alex Wang,
>>
>>
>>
>>
>> On Tue, Jun 3, 2014 at 2:59 AM, Joe Stringer 
>> wrote:
>>
>>> Caching the results of xlate_learn was previously dependent on the state
>>> of the 'may_learn' flag. This meant that if the caller did not specify
>>> that this flow may learn, then a learn entry would not be cached.
>>> However, the xlate_cache tends to be used on a recurring basis, so
>>> failing to cache the learn entry can provide unexpected behaviour later
>>> on, particularly in corner cases.
>>>
>>> Such a corner case occurred previously:-
>>> * Revalidation was requested.
>>> * A flow with a learn action was dumped.
>>> * The flow had no packets.
>>> * The flow's corresponding xcache was cleared, and the flow revalidated.
>>> * The flow went on to receive packets after the xcache is re-created.
>>>
>>> In this case, the xcache would be re-created, but would not refresh the
>>> timeouts on the learnt flow until the next time it was cleared, even if
>>> it received more traffic. This would cause flows to time out sooner than
>>> expected. Symptoms of this bug may include unexpected forwarding
>>> behaviour or extraneous statistics being attributed to the wrong flow.
>>>
>>> This patch fixes the issue by caching the entire flow_mod, including
>>> actions, upon translating an xlate_learn action. This is used to perform
>>> a flow_mod from scratch with the original flow, rather than simply
>>> refreshing the rule that was created during the creation of the xcache.
>>>
>>> Bug #1252997.
>>>
>>> Reported-by: Scott Hendricks 
>>> Signed-off-by: Joe Stringer 
>>> ---
>>>  ofproto/ofproto-dpif-xlate.c |   58
>>> ++
>>>  1 file changed, 30 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>>> index 068d54f..e8f3dbf 100644
>>> --- a/ofproto/ofproto-dpif-xlate.c
>>> +++ b/ofproto/ofproto-dpif-xlate.c
>>> @@ -265,7 +265,9 @@ struct xc_entry {
>>>  uint16_t vid;
>>>  } bond;
>>>  struct {
>>> -struct rule_dpif *rule;
>>> +struct ofproto_dpif *ofproto;
>>> +struct ofputil_flow_mod *fm;
>>> +struct ofpbuf *ofpacts;
>>>  } learn;
>>>  struct {
>>>  struct ofproto_dpif *ofproto;
>>> @@ -2977,34 +2979,38 @@ xlate_bundle_action(struct xlate_ctx *ctx,
>>>  }
>>>
>>>  static void
>>> -xlate_learn_action(struct xlate_ctx *ctx,
>>> -   const struct ofpact_learn *learn)
>>> +xlate_learn_action__(struct xlate_ctx *ctx, const struct ofpact_learn
>>> *learn,
>>> + struct ofputil_flow_mod *fm, struct ofpbuf
>>> *ofpacts)
>>>  {
>>> -uint64_t ofpacts_stub[1024 / 8];
>>> -struct ofputil_flow_mod fm;
>>> -struct ofpbuf ofpacts;
>>> +learn_execute(learn, &ctx->xin->flow, fm, ofpacts);
>>> +if (ctx->xin->may_learn) {
>>> +ofproto_dpif_flow_mod(ctx->xbridge->ofproto, fm);
>>> +}
>>> +}
>>>
>>> +static void
>>> +xlate_learn_action(struct xlate_ctx *ctx, const struct ofpact_learn
>>> *learn)
>>> +{
>>>  ctx->xout->has_learn = true;
>>> -
>>>  learn_mask(learn, &ctx->xout->wc);
>>>
>>> -if (!ctx->xin->may_learn) {
>>> -return;
>>> -}
>>> -
>>> -ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
>>> -learn_execute(learn, &ctx-

Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Ben Pfaff
OK.

My memory consumption comment is just quibbling.  I'd like to get in
the fix, then we can reduce the memory consumption if need be.

On Tue, Jun 03, 2014 at 03:14:01PM -0700, Alex Wang wrote:
> Yes, I have confirmed that this patch fix the bug.
> 
> 
> On Tue, Jun 3, 2014 at 3:11 PM, Ben Pfaff  wrote:
> 
> > On Wed, Jun 04, 2014 at 09:51:16AM +1200, Joe Stringer wrote:
> > > On 4 June 2014 05:39, Alex Wang  wrote:
> > >
> > > > I confirm this patch fix the bug, but have a high level concern,
> > > >
> > > >
> > > > Assume another learnt flow with hard_timeout=10 and was never hit after
> > > > install, now, if there is frequent change to the flow table and
> > > > need_revalidate
> > > > is constantly 'true', this learnt flow will never timeout...  since we
> > > > always
> > > > redo the flow_mod during revalidation,
> > > >
> > >
> > > (Correct me if I'm wrong, but this is how I see it:)
> > > For this particular case, we will not redo the flow_mod during
> > > revalidation. The "may_learn" flag is the same as before. Now, we will
> > > create the flow_mod and cache it, but only do the flow_mod if may_learn
> > is
> > > true.
> > >
> > > The "execute_learn" function is perhaps a bit misleading here: It only
> > > assembles the flow_mod, doesn't actually execute it.
> >
> > I think this is logically correct.  Have you confirmed that it fixes
> > the original bug?
> >
> > One concern I have is memory consumption.  struct ofputil_flow_mod is
> > 452 bytes (which we use and initialize), plus we allocate another 1024
> > bytes for actions.  I don't see an easy way to get rid of the former,
> > but I think that the actions would normally be pretty small.  I think
> > we'd probably be better off with an initial stub for the latter that
> > we then xmemdup() to save.  Alternatively, if you changed the starting
> > size of the actions ofpbuf to just, say, 64 bytes, I think that would
> > save a lot of memory in the common case.
> >
> > Assuming it fixes the bug,
> > Acked-by: Ben Pfaff 
> >
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
Would this incremental alleviate your concerns?

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index e8f3dbf..aa8cea1 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3000,11 +3000,11 @@ xlate_learn_action(struct xlate_ctx *ctx, const
struct ofpact_learn *learn)
 entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
 entry->u.learn.ofproto = ctx->xbridge->ofproto;
 entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
-entry->u.learn.ofpacts = ofpbuf_new(1024);
+entry->u.learn.ofpacts = ofpbuf_new(64);
 xlate_learn_action__(ctx, learn, entry->u.learn.fm,
  entry->u.learn.ofpacts);
 } else if (ctx->xin->may_learn) {
-uint64_t ofpacts_stub[1024 / 8];
+uint64_t ofpacts_stub[64 / 8];
 struct ofputil_flow_mod fm;
 struct ofpbuf ofpacts;


On 4 June 2014 10:11, Ben Pfaff  wrote:

> On Wed, Jun 04, 2014 at 09:51:16AM +1200, Joe Stringer wrote:
> > On 4 June 2014 05:39, Alex Wang  wrote:
> >
> > > I confirm this patch fix the bug, but have a high level concern,
> > >
> > >
> > > Assume another learnt flow with hard_timeout=10 and was never hit after
> > > install, now, if there is frequent change to the flow table and
> > > need_revalidate
> > > is constantly 'true', this learnt flow will never timeout...  since we
> > > always
> > > redo the flow_mod during revalidation,
> > >
> >
> > (Correct me if I'm wrong, but this is how I see it:)
> > For this particular case, we will not redo the flow_mod during
> > revalidation. The "may_learn" flag is the same as before. Now, we will
> > create the flow_mod and cache it, but only do the flow_mod if may_learn
> is
> > true.
> >
> > The "execute_learn" function is perhaps a bit misleading here: It only
> > assembles the flow_mod, doesn't actually execute it.
>
> I think this is logically correct.  Have you confirmed that it fixes
> the original bug?
>
> One concern I have is memory consumption.  struct ofputil_flow_mod is
> 452 bytes (which we use and initialize), plus we allocate another 1024
> bytes for actions.  I don't see an easy way to get rid of the former,
> but I think that the actions would normally be pretty small.  I think
> we'd probably be better off with an initial stub for the latter that
> we then xmemdup() to save.  Alternatively, if you changed the starting
> size of the actions ofpbuf to just, say, 64 bytes, I think that would
> save a lot of memory in the common case.
>
> Assuming it fixes the bug,
> Acked-by: Ben Pfaff 
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Ben Pfaff
On Wed, Jun 04, 2014 at 10:30:13AM +1200, Joe Stringer wrote:
> Would this incremental alleviate your concerns?
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index e8f3dbf..aa8cea1 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3000,11 +3000,11 @@ xlate_learn_action(struct xlate_ctx *ctx, const
> struct ofpact_learn *learn)
>  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
>  entry->u.learn.ofproto = ctx->xbridge->ofproto;
>  entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);

This is good:

> -entry->u.learn.ofpacts = ofpbuf_new(1024);
> +entry->u.learn.ofpacts = ofpbuf_new(64);
>  xlate_learn_action__(ctx, learn, entry->u.learn.fm,
>   entry->u.learn.ofpacts);
>  } else if (ctx->xin->may_learn) {

I'd leave this out, though, since 1024 bytes of stack space doesn't
really cost anything:

> -uint64_t ofpacts_stub[1024 / 8];
> +uint64_t ofpacts_stub[64 / 8];
>  struct ofputil_flow_mod fm;
>  struct ofpbuf ofpacts;

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
OK, thanks. I'll push this patchset to master and branch-2.3 soon.


On 4 June 2014 10:33, Ben Pfaff  wrote:

> On Wed, Jun 04, 2014 at 10:30:13AM +1200, Joe Stringer wrote:
> > Would this incremental alleviate your concerns?
> >
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index e8f3dbf..aa8cea1 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -3000,11 +3000,11 @@ xlate_learn_action(struct xlate_ctx *ctx, const
> > struct ofpact_learn *learn)
> >  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
> >  entry->u.learn.ofproto = ctx->xbridge->ofproto;
> >  entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
>
> This is good:
>
> > -entry->u.learn.ofpacts = ofpbuf_new(1024);
> > +entry->u.learn.ofpacts = ofpbuf_new(64);
> >  xlate_learn_action__(ctx, learn, entry->u.learn.fm,
> >   entry->u.learn.ofpacts);
> >  } else if (ctx->xin->may_learn) {
>
> I'd leave this out, though, since 1024 bytes of stack space doesn't
> really cost anything:
>
> > -uint64_t ofpacts_stub[1024 / 8];
> > +uint64_t ofpacts_stub[64 / 8];
> >  struct ofputil_flow_mod fm;
> >  struct ofpbuf ofpacts;
>
> Thanks,
>
> Ben.
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Ben Pfaff
Thank you!  Also many thanks to Alex.

On Wed, Jun 04, 2014 at 10:34:44AM +1200, Joe Stringer wrote:
> OK, thanks. I'll push this patchset to master and branch-2.3 soon.
> 
> 
> On 4 June 2014 10:33, Ben Pfaff  wrote:
> 
> > On Wed, Jun 04, 2014 at 10:30:13AM +1200, Joe Stringer wrote:
> > > Would this incremental alleviate your concerns?
> > >
> > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > > index e8f3dbf..aa8cea1 100644
> > > --- a/ofproto/ofproto-dpif-xlate.c
> > > +++ b/ofproto/ofproto-dpif-xlate.c
> > > @@ -3000,11 +3000,11 @@ xlate_learn_action(struct xlate_ctx *ctx, const
> > > struct ofpact_learn *learn)
> > >  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
> > >  entry->u.learn.ofproto = ctx->xbridge->ofproto;
> > >  entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
> >
> > This is good:
> >
> > > -entry->u.learn.ofpacts = ofpbuf_new(1024);
> > > +entry->u.learn.ofpacts = ofpbuf_new(64);
> > >  xlate_learn_action__(ctx, learn, entry->u.learn.fm,
> > >   entry->u.learn.ofpacts);
> > >  } else if (ctx->xin->may_learn) {
> >
> > I'd leave this out, though, since 1024 bytes of stack space doesn't
> > really cost anything:
> >
> > > -uint64_t ofpacts_stub[1024 / 8];
> > > +uint64_t ofpacts_stub[64 / 8];
> > >  struct ofputil_flow_mod fm;
> > >  struct ofpbuf ofpacts;
> >
> > Thanks,
> >
> > Ben.
> >
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2.58] datapath: Add basic MPLS support to kernel

2014-06-03 Thread Jesse Gross
On Mon, Jun 2, 2014 at 9:04 PM, Simon Horman  wrote:
> Hi Jesse,
>
> thanks for your feedback.
>
> On Mon, Jun 02, 2014 at 05:58:10PM -0700, Jesse Gross wrote:
>> On Sun, May 25, 2014 at 5:22 PM, Simon Horman  wrote:
>> > diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
>> > index 803a94c..8ce596c 100644
>> > --- a/datapath/flow_netlink.c
>> > +++ b/datapath/flow_netlink.c
>> > +   case OVS_ACTION_ATTR_POP_MPLS:
>> > +   if (!eth_p_mpls(eth_type))
>> > +   return -EINVAL;
>>
>> Should this also take into account the VLAN tag? It's really part of
>> the EtherType although it has been stripped out here. Actually, maybe
>> it's better to not track the vlan_tci separately at all during
>> validation but just fold it into the EtherType.
>>
>> The practical implication of this is that you wouldn't be able to pop
>> out from underneath a VLAN. This may be a good thing if we are trying
>> to avoid tag order issues - after all, you can't push underneath a
>> VLAN either. I'm not sure what effects this has on the need to track
>> mac_len, if any.
>
> My thinking is that the ordering problem only surfaces in relation
> to push MPLS actions where should it go in relation to VLAN tags.
> For pop actions it seems to me that the outermost tag should be removed
> regardless of its position in relation to other tags.
>
> So I think that the code above is safe. Though now you mention
> it I do notice that it only allows pop MPLS if there is at most
> one VLAN tag present.
>
> That said, I would not mind particularly disabling pop MPLS in the
> presence of VLAN tags. At the very least it is related to the
> painful issue of tag ordering.

I agree that it is safe but my thought was the it avoids a number of
potential corner cases such as:
 * Difference between push and pop underneath vlan tags.
 * Pop with multiple vlan tags
 * Differences with varying EtherTypes used for vlans

> I explored your idea of tracking only eth_type rather than both
> it and vlan_tci. I did this by adding the following logic to
> ovs_nla_copy_actions().
>
> if (key->eth.tci & htons(VLAN_TAG_PRESENT))
> eth_type = htons(ETH_P_8021Q);
> else
> eth_type = key->eth.type;
>
> I then updated the usage of eth_type in ovs_nla_copy_actions__() accordingly.
> One problem that I have run into is what to do about pop VLAN.
>
> I don't believe its possible to know what the new eth type is.
> This makes subsequent checks of the eth type for validate_set()
> a little awkward. And seems to indicate that an extra parameter would
> be needed.
>
> For this reason I am inclined to leave the eth_type and vlan_tci
> parameters in place. In this case there is no problem with pop VLAN
> as the ether type inside the VLAN tag should be the value of eth_type.

Can't we populate eth_type with the EtherType from the flow key in
pop_vlan? This doesn't provide us with the ability to look arbitrarily
deep into the packet but it should at least retain the functionality
that we have today.

>> Otherwise, I'm happy with this. I think that we need to conclude the
>> discussion on the other patch and update this appropriately first.
>
> Yes, lets get that sorted out.
>
> Assuming the other patch is accepted do you want me to increase the
> coverage of the compatibility code (in this patch) up to whichever version
> of the kernel the other patch is included in? It seems logical to me but I
> do not have strong feelings about it.

Yes, I think that probably makes sense.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Cache full flowmod for learning.

2014-06-03 Thread Joe Stringer
Yes,  Alex has helped a lot with this. Thanks Alex!


On 4 June 2014 10:36, Ben Pfaff  wrote:

> Thank you!  Also many thanks to Alex.
>
> On Wed, Jun 04, 2014 at 10:34:44AM +1200, Joe Stringer wrote:
> > OK, thanks. I'll push this patchset to master and branch-2.3 soon.
> >
> >
> > On 4 June 2014 10:33, Ben Pfaff  wrote:
> >
> > > On Wed, Jun 04, 2014 at 10:30:13AM +1200, Joe Stringer wrote:
> > > > Would this incremental alleviate your concerns?
> > > >
> > > > diff --git a/ofproto/ofproto-dpif-xlate.c
> b/ofproto/ofproto-dpif-xlate.c
> > > > index e8f3dbf..aa8cea1 100644
> > > > --- a/ofproto/ofproto-dpif-xlate.c
> > > > +++ b/ofproto/ofproto-dpif-xlate.c
> > > > @@ -3000,11 +3000,11 @@ xlate_learn_action(struct xlate_ctx *ctx,
> const
> > > > struct ofpact_learn *learn)
> > > >  entry = xlate_cache_add_entry(ctx->xin->xcache, XC_LEARN);
> > > >  entry->u.learn.ofproto = ctx->xbridge->ofproto;
> > > >  entry->u.learn.fm = xmalloc(sizeof *entry->u.learn.fm);
> > >
> > > This is good:
> > >
> > > > -entry->u.learn.ofpacts = ofpbuf_new(1024);
> > > > +entry->u.learn.ofpacts = ofpbuf_new(64);
> > > >  xlate_learn_action__(ctx, learn, entry->u.learn.fm,
> > > >   entry->u.learn.ofpacts);
> > > >  } else if (ctx->xin->may_learn) {
> > >
> > > I'd leave this out, though, since 1024 bytes of stack space doesn't
> > > really cost anything:
> > >
> > > > -uint64_t ofpacts_stub[1024 / 8];
> > > > +uint64_t ofpacts_stub[64 / 8];
> > > >  struct ofputil_flow_mod fm;
> > > >  struct ofpbuf ofpacts;
> > >
> > > Thanks,
> > >
> > > Ben.
> > >
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] ofproto: Remove ofproto_refresh_rule().

2014-06-03 Thread Joe Stringer
Thanks, I pushed this series to master and branch-2.3.


On 4 June 2014 10:06, Alex Wang  wrote:

> Acked-by: Alex Wang 
>
>
> On Tue, Jun 3, 2014 at 2:59 AM, Joe Stringer 
> wrote:
>
>> The only user of this function was removed in the previous patch, so
>> remove it.
>>
>> Signed-off-by: Joe Stringer 
>> ---
>>  ofproto/ofproto-dpif.c |   12 
>>  ofproto/ofproto-provider.h |2 --
>>  ofproto/ofproto.c  |   41
>> -
>>  3 files changed, 55 deletions(-)
>>
>> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
>> index 06be234..22ba14e 100644
>> --- a/ofproto/ofproto-dpif.c
>> +++ b/ofproto/ofproto-dpif.c
>> @@ -364,18 +364,6 @@ ofproto_dpif_flow_mod(struct ofproto_dpif *ofproto,
>>  ofproto_flow_mod(&ofproto->up, fm);
>>  }
>>
>> -/* Resets the modified time for 'rule' or an equivalent rule. If 'rule'
>> is not
>> - * in the classifier, but an equivalent rule is, unref 'rule' and ref
>> the new
>> - * rule. Otherwise if 'rule' is no longer installed in the classifier,
>> - * reinstall it.
>> - *
>> - * Returns the rule whose modified time has been reset. */
>> -struct rule_dpif *
>> -ofproto_dpif_refresh_rule(struct rule_dpif *rule)
>> -{
>> -return rule_dpif_cast(ofproto_refresh_rule(&rule->up));
>> -}
>> -
>>  /* Appends 'pin' to the queue of "packet ins" to be sent to the
>> controller.
>>   * Takes ownership of 'pin' and pin->packet. */
>>  void
>> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
>> index ff539b9..7741044 100644
>> --- a/ofproto/ofproto-provider.h
>> +++ b/ofproto/ofproto-provider.h
>> @@ -1716,8 +1716,6 @@ BUILD_ASSERT_DECL(OFPROTO_POSTPONE < OFPERR_OFS);
>>
>>  int ofproto_flow_mod(struct ofproto *, struct ofputil_flow_mod *)
>>  OVS_EXCLUDED(ofproto_mutex);
>> -struct rule *ofproto_refresh_rule(struct rule *rule)
>> -OVS_EXCLUDED(ofproto_mutex);
>>  void ofproto_add_flow(struct ofproto *, const struct match *,
>>unsigned int priority,
>>const struct ofpact *ofpacts, size_t ofpacts_len)
>> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
>> index 24a709b..f34cfde 100644
>> --- a/ofproto/ofproto.c
>> +++ b/ofproto/ofproto.c
>> @@ -1978,47 +1978,6 @@ ofproto_flow_mod(struct ofproto *ofproto, struct
>> ofputil_flow_mod *fm)
>>  return handle_flow_mod__(ofproto, NULL, fm, NULL);
>>  }
>>
>> -/* Resets the modified time for 'rule' or an equivalent rule. If 'rule'
>> is not
>> - * in the classifier, but an equivalent rule is, unref 'rule' and ref
>> the new
>> - * rule. Otherwise if 'rule' is no longer installed in the classifier,
>> - * reinstall it.
>> - *
>> - * Returns the rule whose modified time has been reset. */
>> -struct rule *
>> -ofproto_refresh_rule(struct rule *rule)
>> -{
>> -const struct oftable *table = &rule->ofproto->tables[rule->table_id];
>> -const struct cls_rule *cr = &rule->cr;
>> -struct rule *r;
>> -
>> -/* do_add_flow() requires that the rule is not installed. We lock the
>> - * ofproto_mutex here so that another thread cannot add the flow
>> before
>> - * we get a chance to add it.*/
>> -ovs_mutex_lock(&ofproto_mutex);
>> -
>> -fat_rwlock_rdlock(&table->cls.rwlock);
>> -r = rule_from_cls_rule(classifier_find_rule_exactly(&table->cls,
>> cr));
>> -if (r != rule) {
>> -ofproto_rule_ref(r);
>> -}
>> -fat_rwlock_unlock(&table->cls.rwlock);
>> -
>> -if (!r) {
>> -do_add_flow(rule->ofproto, NULL, NULL, 0, rule);
>> -} else if  (r != rule) {
>> -ofproto_rule_unref(rule);
>> -rule = r;
>> -}
>> -ovs_mutex_unlock(&ofproto_mutex);
>> -
>> -/* Refresh the modified time for the rule. */
>> -ovs_mutex_lock(&rule->mutex);
>> -rule->modified = MAX(rule->modified, time_msec());
>> -ovs_mutex_unlock(&rule->mutex);
>> -
>> -return rule;
>> -}
>> -
>>  /* Searches for a rule with matching criteria exactly equal to 'target'
>> in
>>   * ofproto's table 0 and, if it finds one, deletes it.
>>   *
>> --
>> 1.7.10.4
>>
>>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson 76511
I've ran into some unexpected issues while perf testing this, lets hold
off on looking at this. I'll submit another patch when I've had all the
kinks worked out.

Ryan

On 6/3/14 2:21 PM, "Ryan Wilson 76511"  wrote:

>Hey Pravin,
>
>Thanks for the catch here! Turns out the header is already tracked in DPDK
>with rte_mbuf's buffer address - sizeof(ofpbuf). Thus, I submitted another
>patch that, in free_dpdk_buf(), always gets this header and uses this to
>free memory.
>
>http://patchwork.openvswitch.org/patch/4375/
>
>
>Let me know if this patch works.
>
>Cheers,
>
>Ryan
>
>On 6/3/14 10:50 AM, "Pravin Shelar"  wrote:
>
>>On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
>>> When a bridge of datatype type netdev receives a packet, it copies the
>>> packet from the NIC to a buffer in userspace. Currently, when making
>>> an upcall, the packet is again copied to the upcall's buffer. However,
>>> this extra copy is not necessary when the datapath exists in userspace
>>> as the upcall can directly access the packet data.
>>>
>>> This patch eliminates this extra copy of the packet data in most cases.
>>> In cases where the packet may still be used later by callers of
>>> dp_netdev_execute_actions, making a copy of the packet data is still
>>> necessary.
>>>
>>> Signed-off-by: Ryan Wilson 
>>> ---
>>> v2: Addressed Jarno's comment to use direct pointer assignment for
>>> upcall->packet instead of ofpbuf_set().
>>> ---
>>>  lib/dpif-netdev.c |   15 +--
>>>  1 file changed, 5 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 91c83d6..c89ae20 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
>>>ofpbuf *packet,
>>> miniflow_hash_5tuple(&key.flow, 0)
>>> % dp->n_handlers,
>>> DPIF_UC_MISS, &key.flow, NULL);
>>> -ofpbuf_delete(packet);
>>>  }
>>>  }
>>>
>>> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp,
>>>struct ofpbuf *packet,
>>>  if (userdata) {
>>>  buf_size += NLA_ALIGN(userdata->nla_len);
>>>  }
>>> -buf_size += ofpbuf_size(packet);
>>>  ofpbuf_init(buf, buf_size);
>>>
>>>  /* Put ODP flow. */
>>> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
>>>*dp, struct ofpbuf *packet,
>>>
>>>NLA_ALIGN(userdata->nla_len));
>>>  }
>>>
>>> -ofpbuf_set_data(&upcall->packet,
>>> -ofpbuf_put(buf, ofpbuf_data(packet),
>>>ofpbuf_size(packet)));
>>> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>>> +upcall->packet = *packet;
>>>
>>
>>This would not work with DPDK. ofpbuf from dpdk is special case where
>>ofpbuf and data are allocated from same memory object. Therefore
>>moving ofpbuf->data is nontrivial.
>>
>>To make it work we need atleast following covered.
>>1. Define flag for source of ofpbuf header. So that we can track
>>header and data independently.
>>2. Fix dpdk ofpbuf destructor to free correct dpdk memory object.
>>
>>Thanks,
>>Pravin.
>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v5 net-next] MPLS: Use mpls_features to activate software MPLS GSO segmentation

2014-06-03 Thread Simon Horman
If an MPLS packet requires segmentation then use mpls_features
to determine if the software implementation should be used.

As no driver advertises MPLS GSO segmentation this will always be
the case.

I had not noticed that this was necessary before as software MPLS GSO
segmentation was already being used in my test environment. I believe that
the reason for that is the skbs in question always had fragments and the
driver I used does not advertise NETIF_F_FRAGLIST (which seems to be the
case for most drivers). Thus software segmentation was activated by
skb_gso_ok().

This introduces the overhead of an extra call to skb_network_protocol()
in the case where where CONFIG_NET_MPLS_GSO is set and
skb->ip_summed == CHECKSUM_NONE.

Thanks to Jesse Gross for prompting me to investigate this.

Signed-off-by: Simon Horman 
Acked-by: YAMAMOTO Takashi 

---
v5
* As suggested by Yamamoto-san
  - Remove unused tmp variable
  - Do not use unlikely() for MPLS ethertype check
* Added Ack from Yamamoto-san

v4.2
* As suggested by Eric Dumazet
  - Use htons() instead of cpu_to_be16()
  - Refactor code to pass type to net_mpls_features.

v4.1
* Following fedback from Thomas Graff and Jesse Gross
  - Use ethertype of packet to detect MPLS rather than
relying on mac_len indicating a gap between the end of L2
and the beginning of L3. That assumption seems to
be broken by the GRE GSO code.
  - Move mpls_features handling into harmonize_features()
This allows an existing call in there to skb_network_protocol()
to be leveraged.
  - Removed acks as the patch has now changed in a material way

v4
* As suggested by YAMAMOTO Takashi
  - Correct typos in comment
* Added Ack from YAMAMOTO Takashi
v3
* As requested by David Miller
  - Do not mark net_mpls_features as inline
  - Correct alignment of parameters

v2
* Added Ack from Jesse Gross
* Removed duplicate 'Thus' from changelog
---
 net/core/dev.c | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 0355ca5..f726bfb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2498,13 +2498,39 @@ static int dev_gso_segment(struct sk_buff *skb, 
netdev_features_t features)
return 0;
 }
 
+/* If MPLS offload request, verify we are testing hardware MPLS features
+ * instead of standard features for the netdev.
+ */
+#ifdef CONFIG_NET_MPLS_GSO
+static netdev_features_t net_mpls_features(struct sk_buff *skb,
+  netdev_features_t features,
+  __be16 type)
+{
+   if (type == htons(ETH_P_MPLS_UC) || type == htons(ETH_P_MPLS_MC))
+   features &= skb->dev->mpls_features;
+
+   return features;
+}
+#else
+static netdev_features_t net_mpls_features(struct sk_buff *skb,
+  netdev_features_t features,
+  __be16 type)
+{
+   return features;
+}
+#endif
+
 static netdev_features_t harmonize_features(struct sk_buff *skb,
netdev_features_t features)
 {
int tmp;
+   __be16 type;
+
+   type = skb_network_protocol(skb, &tmp);
+   features = net_mpls_features(skb, features, type);
 
if (skb->ip_summed != CHECKSUM_NONE &&
-   !can_checksum_protocol(features, skb_network_protocol(skb, &tmp))) {
+   !can_checksum_protocol(features, type)) {
features &= ~NETIF_F_ALL_CSUM;
} else if (illegal_highdma(skb->dev, skb)) {
features &= ~NETIF_F_SG;
-- 
2.0.0.rc2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/5] dpif-netdev: batch packet receiving

2014-06-03 Thread Daniele Di Proietto
Hi Pravin,
On May 30, 2014, at 4:53 PM, Pravin Shelar  wrote:
> We just need to reset mf->map, Can you add api for that ?
> infact miniflow_extract() does not read from map, is there need to
> initialize it on every packet?

You’re right. mini flow_extract() should be enough.

> This metadata is same for given batch, we can sent just one instance of it.
> 

This is true, and helps improving performances too, since we do not need to 
initialize a big amount of memory.

Thanks,

Daniele___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Is there any way to simulate the tunnel traffic in test cases to test IPFIX feature?

2014-06-03 Thread Jesse Gross
On Tue, Jun 3, 2014 at 2:51 AM, Wenyu Zhang  wrote:
>
> I am trying to simulate traffic in test cases to test IPFIX feature.
>
> For normal traffic, I can use "ovs-appctl netdev-dummy/receive " to simulate
> it, just refer to the netflow and sflow cases in tests/ofproto-dpif.at
> And I can capture the IPFIX packets on lo interface as expected.
>
> But for tunnel traffic, I haven't find a workable way to generate this kind
> of traffic.
> For example,  I try to use "ovs-appctl ofproto/trace ovs-dummy" command with
> "-generate" option.
> And we can see that the "sample" actions has been inserted as expected, but
> these actions are not executed, and no IPFIX packets can be captured on lo
> interface.
>
> Anyone knows about the usage of this command? Or other way to generate the
> tunnel traffic to trigger IPFIX sample action? Thanks a lot.

I don't believe that this is implemented. netdev-dummy/receive
constructs a packet from the flow  and the discards everything else.
Since there is no support for additional metadata beyond the input
port or tunneling implemented in the dummy datapath, it basically ends
up looking like a non-tunneled packet arriving on a tunnel port.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/5] netdev-dpdk: implement send_batch

2014-06-03 Thread Daniele Di Proietto

On May 30, 2014, at 4:54 PM, Pravin Shelar  wrote:

> On Fri, May 23, 2014 at 11:04 AM, Daniele Di Proietto
>  wrote:
> need to skip this packet.

Ops, added a continue.

>> 
>> +/* TODO: warn if any packet bigger than dev->max_packet_len */
> 
> I think we have to check this. Better to add it now, and optimize it later.

I added a check. I does not seem to slow down things too much.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 5/5] dpif-netdev: batch packet sending

2014-06-03 Thread Daniele Di Proietto

On May 30, 2014, at 4:55 PM, Pravin Shelar  wrote:

>> 
>> int i;
>> 
>> +memset(&batch, 0, sizeof batch);
>> +
> Is there need to reset batch? can batch_init() take care of it?

You’re right. It should be enough to set batch.flow to NULL.

> 
>> +for (i = 0; i < c; i++) {
>> +struct ofpbuf * packet = packets[i];
> extra space before packet.
> need blank line after declaration.

Fixed, sorry.

>> +if (q->head - q->tail < MAX_QUEUE_LEN) {
>> +struct dp_netdev_upcall *u = &q->upcalls[q->head++ & 
>> QUEUE_MASK];
>> +struct dpif_upcall *upcall = &u->upcall;
>> +struct ofpbuf *buf = &u->buf;
>> +size_t buf_size;
>> +struct flow flow;
>> +
>> +upcall->type = type;
>> +
>> +/* Allocate buffer big enough for everything. */
>> +buf_size = ODPUTIL_FLOW_KEY_BYTES;
>> +if (userdata) {
>> +buf_size += NLA_ALIGN(userdata->nla_len);
>> +}
>> +buf_size += ofpbuf_size(packet);
>> +ofpbuf_init(buf, buf_size);
>> +
>> +/* Put ODP flow. */
>> +miniflow_expand(key, &flow);
>> +odp_flow_key_from_flow(buf, &flow, NULL, flow.in_port.odp_port, 
>> true);
>> +upcall->key = ofpbuf_data(buf);
>> +upcall->key_len = ofpbuf_size(buf);
>> +
>> +/* Put userdata. */
>> +if (userdata) {
>> +upcall->userdata = ofpbuf_put(buf, userdata,
>> +  NLA_ALIGN(userdata->nla_len));
>> +}
>> 
>> -ofpbuf_set_data(&upcall->packet,
>> -ofpbuf_put(buf, ofpbuf_data(packet), 
>> ofpbuf_size(packet)));
>> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>> +ofpbuf_set_data(&upcall->packet,
>> +ofpbuf_put(buf, ofpbuf_data(packet),
>> +ofpbuf_size(packet)));
>> +ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>> 
>> -seq_change(q->seq);
>> +seq_change(q->seq);
>> 
>> -error = 0;
>> -} else {
>> -dp_netdev_count_packet(dp, DP_STAT_LOST, 1);
>> -error = ENOBUFS;
>> +error = 0;
>> +} else {
>> +dp_netdev_count_packet(dp, DP_STAT_LOST, 1);
>> +error = ENOBUFS;
>> +}
> Can you make separate function to send a packet for upcall processing?
> it is easier to read that way.

Sure, I did it. At some point, though, we might want to batch that too.

Thanks for the suggestions, I’m about to send an updated patch set.

Daniele

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/3] netdev: netdev_send accepts multiple packets

2014-06-03 Thread Daniele Di Proietto
The netdev_send function has been modified to accept multiple packets, to
allow netdev providers to amortize locking and queuing costs.
This is especially true for netdev-dpdk.

Later commits exploit the new API.

Signed-off-by: Daniele Di Proietto 
---
 lib/dpif-netdev.c |   2 +-
 lib/netdev-bsd.c  |  55 ++
 lib/netdev-dpdk.c | 158 --
 lib/netdev-dummy.c|  66 -
 lib/netdev-linux.c|  47 ++-
 lib/netdev-provider.h |  18 +++---
 lib/netdev.c  |  17 --
 lib/netdev.h  |   3 +-
 8 files changed, 226 insertions(+), 140 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 91c83d6..e0e5ad8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2115,7 +2115,7 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
 case OVS_ACTION_ATTR_OUTPUT:
 p = dp_netdev_lookup_port(aux->dp, u32_to_odp(nl_attr_get_u32(a)));
 if (p) {
-netdev_send(p->netdev, packet, may_steal);
+netdev_send(p->netdev, &packet, 1, may_steal);
 }
 break;
 
diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
index 8dc33df..4f4aa28 100644
--- a/lib/netdev-bsd.c
+++ b/lib/netdev-bsd.c
@@ -681,13 +681,13 @@ netdev_bsd_rxq_drain(struct netdev_rxq *rxq_)
  * system or a tap device.
  */
 static int
-netdev_bsd_send(struct netdev *netdev_, struct ofpbuf *pkt, bool may_steal)
+netdev_bsd_send(struct netdev *netdev_, struct ofpbuf **pkts, int cnt,
+bool may_steal)
 {
 struct netdev_bsd *dev = netdev_bsd_cast(netdev_);
 const char *name = netdev_get_name(netdev_);
-const void *data = ofpbuf_data(pkt);
-size_t size = ofpbuf_size(pkt);
 int error;
+int i;
 
 ovs_mutex_lock(&dev->mutex);
 if (dev->tap_fd < 0 && !dev->pcap) {
@@ -696,35 +696,42 @@ netdev_bsd_send(struct netdev *netdev_, struct ofpbuf 
*pkt, bool may_steal)
 error = 0;
 }
 
-while (!error) {
-ssize_t retval;
-if (dev->tap_fd >= 0) {
-retval = write(dev->tap_fd, data, size);
-} else {
-retval = pcap_inject(dev->pcap, data, size);
-}
-if (retval < 0) {
-if (errno == EINTR) {
-continue;
+for (i = 0; i < cnt; i++) {
+const void *data = ofpbuf_data(pkts[i]);
+size_t size = ofpbuf_size(pkts[i]);
+
+while (!error) {
+ssize_t retval;
+if (dev->tap_fd >= 0) {
+retval = write(dev->tap_fd, data, size);
 } else {
-error = errno;
-if (error != EAGAIN) {
-VLOG_WARN_RL(&rl, "error sending Ethernet packet on %s: "
- "%s", name, ovs_strerror(error));
+retval = pcap_inject(dev->pcap, data, size);
+}
+if (retval < 0) {
+if (errno == EINTR) {
+continue;
+} else {
+error = errno;
+if (error != EAGAIN) {
+VLOG_WARN_RL(&rl, "error sending Ethernet packet on 
%s: "
+ "%s", name, ovs_strerror(error));
+}
 }
+} else if (retval != size) {
+VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIuSIZE"d 
bytes of "
+ "%"PRIuSIZE") on %s", retval, size, name);
+error = EMSGSIZE;
+} else {
+break;
 }
-} else if (retval != size) {
-VLOG_WARN_RL(&rl, "sent partial Ethernet packet (%"PRIuSIZE"d 
bytes of "
- "%"PRIuSIZE") on %s", retval, size, name);
-error = EMSGSIZE;
-} else {
-break;
 }
 }
 
 ovs_mutex_unlock(&dev->mutex);
 if (may_steal) {
-ofpbuf_delete(pkt);
+for (i = 0; i < cnt; i++) {
+ofpbuf_delete(pkt);
+}
 }
 
 return error;
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7a78c34..b731985 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -608,103 +608,147 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct 
ofpbuf **packets, int *c)
 }
 
 inline static void
-dpdk_queue_pkt(struct netdev_dpdk *dev, int qid,
-   struct rte_mbuf *pkt)
+dpdk_queue_pkts(struct netdev_dpdk *dev, int qid,
+   struct rte_mbuf **pkts, int cnt)
 {
 struct dpdk_tx_queue *txq = &dev->tx_q[qid];
 uint64_t diff_tsc;
 uint64_t cur_tsc;
 uint32_t nb_tx;
 
+int i = 0;
+
 rte_spinlock_lock(&txq->tx_lock);
-txq->burst_pkts[txq->count++] = pkt;
-if (txq->count == MAX_TX_QUEUE_LEN) {
-goto flush;
-}
-cur_tsc = rte_get_timer_cycles();
-if (txq->count == 1) {
-txq->tsc = cur_tsc;
-}
-diff_tsc = cur_tsc - txq->tsc;
-if (diff_tsc >= DRAIN_TSC) {
-   

[ovs-dev] [PATCH 1/3] netdev-dpdk: receive up to NETDEV_MAX_RX_BATCH

2014-06-03 Thread Daniele Di Proietto
As per netdev-provider interface, netdev_dpdk_rxq_recv should receive at most
NETDEV_MAX_RX_BATCH.

Signed-off-by: Daniele Di Proietto 
---
 lib/netdev-dpdk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ba41d2e..7a78c34 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -595,7 +595,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct ofpbuf 
**packets, int *c)
 dpdk_queue_flush(dev, rxq_->queue_id);
 
 nb_rx = rte_eth_rx_burst(rx->port_id, rxq_->queue_id,
- (struct rte_mbuf **) packets, MAX_RX_QUEUE_LEN);
+ (struct rte_mbuf **) packets,
+ MIN((int)NETDEV_MAX_RX_BATCH,
+ (int)MAX_RX_QUEUE_LEN));
 if (!nb_rx) {
 return EAGAIN;
 }
-- 
2.0.0.rc2

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 3/3] dpif-netdev: batch packet processing

2014-06-03 Thread Daniele Di Proietto
This change in dpif-netdev allows faster packet processing for devices which
implement batching (netdev-dpdk currently).

Signed-off-by: Daniele Di Proietto 
---
 lib/dpif-netdev.c| 240 ++-
 lib/dpif.c   |   7 +-
 lib/odp-execute.c|  49 ++---
 lib/odp-execute.h|   4 +-
 ofproto/ofproto-dpif-xlate.c |   2 +-
 5 files changed, 211 insertions(+), 91 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e0e5ad8..fba1d65 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -331,18 +331,18 @@ static void dp_netdev_destroy_all_queues(struct dp_netdev 
*dp)
 OVS_REQ_WRLOCK(dp->queue_rwlock);
 static int dpif_netdev_open(const struct dpif_class *, const char *name,
 bool create, struct dpif **);
-static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf *,
-  int queue_no, int type,
+static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf **,
+  int cnt, int queue_no, int type,
   const struct miniflow *,
   const struct nlattr *userdata);
 static void dp_netdev_execute_actions(struct dp_netdev *dp,
   const struct miniflow *,
-  struct ofpbuf *, bool may_steal,
+  struct ofpbuf **, int cnt, bool 
may_steal,
   struct pkt_metadata *,
   const struct nlattr *actions,
   size_t actions_len);
-static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf *packet,
- struct pkt_metadata *);
+static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf **packets,
+ int cnt, odp_port_t port_no);
 
 static void dp_netdev_set_pmd_threads(struct dp_netdev *, int n);
 
@@ -1518,7 +1518,6 @@ static int
 dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
 {
 struct dp_netdev *dp = get_dp_netdev(dpif);
-struct pkt_metadata *md = &execute->md;
 struct {
 struct miniflow flow;
 uint32_t buf[FLOW_U32S];
@@ -1531,9 +1530,9 @@ dpif_netdev_execute(struct dpif *dpif, struct 
dpif_execute *execute)
 
 /* Extract flow key. */
 miniflow_initialize(&key.flow, key.buf);
-miniflow_extract(execute->packet, md, &key.flow);
+miniflow_extract(execute->packet, &execute->md, &key.flow);
 
-dp_netdev_execute_actions(dp, &key.flow, execute->packet, false, md,
+dp_netdev_execute_actions(dp, &key.flow, &execute->packet, 1, false, 
&execute->md,
   execute->actions, execute->actions_len);
 
 return 0;
@@ -1747,17 +1746,12 @@ dp_netdev_process_rxq_port(struct dp_netdev *dp,
   struct dp_netdev_port *port,
   struct netdev_rxq *rxq)
 {
-struct ofpbuf *packet[NETDEV_MAX_RX_BATCH];
-int error, c;
+struct ofpbuf *packets[NETDEV_MAX_RX_BATCH];
+int error, cnt;
 
-error = netdev_rxq_recv(rxq, packet, &c);
+error = netdev_rxq_recv(rxq, packets, &cnt);
 if (!error) {
-struct pkt_metadata md = PKT_METADATA_INITIALIZER(port->port_no);
-int i;
-
-for (i = 0; i < c; i++) {
-dp_netdev_port_input(dp, packet[i], &md);
-}
+dp_netdev_port_input(dp, packets, cnt, port->port_no);
 } else if (error != EAGAIN && error != EOPNOTSUPP) {
 static struct vlog_rate_limit rl
 = VLOG_RATE_LIMIT_INIT(1, 5);
@@ -1954,7 +1948,7 @@ dp_netdev_flow_stats_new_cb(void)
 
 static void
 dp_netdev_flow_used(struct dp_netdev_flow *netdev_flow,
-const struct ofpbuf *packet,
+int cnt, int size,
 const struct miniflow *key)
 {
 uint16_t tcp_flags = miniflow_get_tcp_flags(key);
@@ -1966,8 +1960,8 @@ dp_netdev_flow_used(struct dp_netdev_flow *netdev_flow,
 
 ovs_mutex_lock(&bucket->mutex);
 bucket->used = MAX(now, bucket->used);
-bucket->packet_count++;
-bucket->byte_count += ofpbuf_size(packet);
+bucket->packet_count += cnt;
+bucket->byte_count += size;
 bucket->tcp_flags |= tcp_flags;
 ovs_mutex_unlock(&bucket->mutex);
 }
@@ -1981,74 +1975,143 @@ dp_netdev_stats_new_cb(void)
 }
 
 static void
-dp_netdev_count_packet(struct dp_netdev *dp, enum dp_stat_type type)
+dp_netdev_count_packet(struct dp_netdev *dp, enum dp_stat_type type, int cnt)
 {
 struct dp_netdev_stats *bucket;
 
 bucket = ovsthread_stats_bucket_get(&dp->stats, dp_netdev_stats_new_cb);
 ovs_mutex_lock(&bucket->mutex);
-bucket->n[type]++;
+bucket->n[type] += cnt;
 ovs_mutex_unlock(&bucket->mutex);
 }
 
+struct batch_pkt_execute {
+unsigned int packet_co

Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Pravin Shelar
Can you combine it with second patch, otherwise it introduces a bug.

On Tue, Jun 3, 2014 at 4:22 PM, Ryan Wilson 76511  wrote:
> I've ran into some unexpected issues while perf testing this, lets hold
> off on looking at this. I'll submit another patch when I've had all the
> kinks worked out.
>
> Ryan
>
> On 6/3/14 2:21 PM, "Ryan Wilson 76511"  wrote:
>
>>Hey Pravin,
>>
>>Thanks for the catch here! Turns out the header is already tracked in DPDK
>>with rte_mbuf's buffer address - sizeof(ofpbuf). Thus, I submitted another
>>patch that, in free_dpdk_buf(), always gets this header and uses this to
>>free memory.
>>
>>http://patchwork.openvswitch.org/patch/4375/
>>
>>
>>Let me know if this patch works.
>>
>>Cheers,
>>
>>Ryan
>>
>>On 6/3/14 10:50 AM, "Pravin Shelar"  wrote:
>>
>>>On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
 When a bridge of datatype type netdev receives a packet, it copies the
 packet from the NIC to a buffer in userspace. Currently, when making
 an upcall, the packet is again copied to the upcall's buffer. However,
 this extra copy is not necessary when the datapath exists in userspace
 as the upcall can directly access the packet data.

 This patch eliminates this extra copy of the packet data in most cases.
 In cases where the packet may still be used later by callers of
 dp_netdev_execute_actions, making a copy of the packet data is still
 necessary.

 Signed-off-by: Ryan Wilson 
 ---
 v2: Addressed Jarno's comment to use direct pointer assignment for
 upcall->packet instead of ofpbuf_set().
 ---
  lib/dpif-netdev.c |   15 +--
  1 file changed, 5 insertions(+), 10 deletions(-)

 diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
 index 91c83d6..c89ae20 100644
 --- a/lib/dpif-netdev.c
 +++ b/lib/dpif-netdev.c
 @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
ofpbuf *packet,
 miniflow_hash_5tuple(&key.flow, 0)
 % dp->n_handlers,
 DPIF_UC_MISS, &key.flow, NULL);
 -ofpbuf_delete(packet);
  }
  }

 @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp,
struct ofpbuf *packet,
  if (userdata) {
  buf_size += NLA_ALIGN(userdata->nla_len);
  }
 -buf_size += ofpbuf_size(packet);
  ofpbuf_init(buf, buf_size);

  /* Put ODP flow. */
 @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
*dp, struct ofpbuf *packet,

NLA_ALIGN(userdata->nla_len));
  }

 -ofpbuf_set_data(&upcall->packet,
 -ofpbuf_put(buf, ofpbuf_data(packet),
ofpbuf_size(packet)));
 -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
 +upcall->packet = *packet;

>>>
>>>This would not work with DPDK. ofpbuf from dpdk is special case where
>>>ofpbuf and data are allocated from same memory object. Therefore
>>>moving ofpbuf->data is nontrivial.
>>>
>>>To make it work we need atleast following covered.
>>>1. Define flag for source of ofpbuf header. So that we can track
>>>header and data independently.
>>>2. Fix dpdk ofpbuf destructor to free correct dpdk memory object.
>>>
>>>Thanks,
>>>Pravin.
>>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] revalidator: Refactor ukey creation/lookup.

2014-06-03 Thread Joe Stringer
I agree, that would look a bit tidier. I'll send a patch.


On 3 June 2014 08:53, Ethan Jackson  wrote:

> I'm wondering if it'd be cleaner to have ukey_create() return a locked
> ukey, and to pull the try_lock() into the critical section for the
> ukeys map?  I.E.
>
> lock_hmap()
> ukey = ukey_lookup()
> if (!ukey) {
> //create and insert
> } else {
> // trylock
> }
> unlock_hmap()
>
> What do you think?
>
>
>
> On Thu, May 29, 2014 at 6:38 PM, Joe Stringer 
> wrote:
> > This patch refactors the code around ukey creation and lookup to
> > simplify the code for callers. A new function ukey_acquire() combines
> > these functions and attempts to acquire a lock on the ukey. Failure to
> > acquire a lock on the ukey is usually a sign that another thread is
> > handling the same flow concurrently, which means the flow does not need
> > to be handled anyway.
> >
> > Signed-off-by: Joe Stringer 
> > ---
> >  ofproto/ofproto-dpif-upcall.c |   86
> ++---
> >  1 file changed, 37 insertions(+), 49 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-upcall.c
> b/ofproto/ofproto-dpif-upcall.c
> > index 1c82b6b..36e9d41 100644
> > --- a/ofproto/ofproto-dpif-upcall.c
> > +++ b/ofproto/ofproto-dpif-upcall.c
> > @@ -216,6 +216,12 @@ static void upcall_unixctl_set_flow_limit(struct
> unixctl_conn *conn, int argc,
> >
> >  static struct udpif_key *ukey_create(const struct nlattr *key, size_t
> key_len,
> >   long long int used);
> > +static struct udpif_key *ukey_lookup(struct udpif *udpif,
> > + const struct nlattr *key, size_t
> key_len,
> > + uint32_t hash);
> > +static bool ukey_acquire(struct udpif *udpif, const struct nlattr *key,
> > + size_t key_len, long long int used,
> > + struct udpif_key **result);
> >  static void ukey_delete(struct revalidator *, struct udpif_key *);
> >
> >  static atomic_bool enable_megaflows = ATOMIC_VAR_INIT(true);
> > @@ -983,8 +989,9 @@ handle_upcalls(struct handler *handler, struct
> upcall *upcalls,
> >
> >  /* Must be called with udpif->ukeys[hash %
> udpif->n_revalidators].mutex. */
> >  static struct udpif_key *
> > -ukey_lookup__(struct udpif *udpif, const struct nlattr *key, size_t
> key_len,
> > -  uint32_t hash)
> > +ukey_lookup(struct udpif *udpif, const struct nlattr *key, size_t
> key_len,
> > +uint32_t hash)
> > +OVS_REQUIRES(udpif->ukeys->mutex)
> >  {
> >  struct udpif_key *ukey;
> >  struct hmap *hmap = &udpif->ukeys[hash %
> udpif->n_revalidators].hmap;
> > @@ -998,21 +1005,8 @@ ukey_lookup__(struct udpif *udpif, const struct
> nlattr *key, size_t key_len,
> >  }
> >
> >  static struct udpif_key *
> > -ukey_lookup(struct udpif *udpif, const struct nlattr *key, size_t
> key_len,
> > -uint32_t hash)
> > -{
> > -struct udpif_key *ukey;
> > -uint32_t idx = hash % udpif->n_revalidators;
> > -
> > -ovs_mutex_lock(&udpif->ukeys[idx].mutex);
> > -ukey = ukey_lookup__(udpif, key, key_len, hash);
> > -ovs_mutex_unlock(&udpif->ukeys[idx].mutex);
> > -
> > -return ukey;
> > -}
> > -
> > -static struct udpif_key *
> >  ukey_create(const struct nlattr *key, size_t key_len, long long int
> used)
> > +OVS_NO_THREAD_SAFETY_ANALYSIS
> >  {
> >  struct udpif_key *ukey = xmalloc(sizeof *ukey);
> >  ovs_mutex_init(&ukey->mutex);
> > @@ -1020,38 +1014,47 @@ ukey_create(const struct nlattr *key, size_t
> key_len, long long int used)
> >  ukey->key = (struct nlattr *) &ukey->key_buf;
> >  memcpy(&ukey->key_buf, key, key_len);
> >  ukey->key_len = key_len;
> > -
> > -ovs_mutex_lock(&ukey->mutex);
> >  ukey->mark = false;
> >  ukey->flow_exists = true;
> >  ukey->created = used ? used : time_msec();
> >  memset(&ukey->stats, 0, sizeof ukey->stats);
> >  ukey->xcache = NULL;
> > -ovs_mutex_unlock(&ukey->mutex);
> >
> >  return ukey;
> >  }
> >
> > -/* Checks for a ukey in 'udpif->ukeys' with the same 'ukey->key' and
> 'hash',
> > - * and inserts 'ukey' if it does not exist.
> > +/* Searches for a ukey in 'udpif->ukeys' that matches 'key' and
> 'key_len' and
> > + * attempts to lock the ukey. If the ukey does not exist, create it.
> >   *
> > - * Returns true if 'ukey' was inserted into 'udpif->ukeys', false
> otherwise. */
> > + * Returns true on success, setting *result to the matching ukey and
> returning
> > + * it in a locked state. Otherwise, returns false and clears *result. */
> >  static bool
> > -udpif_insert_ukey(struct udpif *udpif, struct udpif_key *ukey, uint32_t
> hash)
> > +ukey_acquire(struct udpif *udpif, const struct nlattr *key, size_t
> key_len,
> > + long long int used, struct udpif_key **result)
> > +OVS_TRY_LOCK(true, (*result)->mutex)
> >  {
> > -struct udpif_key *duplicate;
> > -uint32_t idx = hash % udpif->n_revalidators;
>

[ovs-dev] [PATCHv3 1/2] ovs-dev.py: Add option to specify which tests to run.

2014-06-03 Thread Joe Stringer
Signed-off-by: Joe Stringer 
---
v3: Don't require users to specify "-k" for searches.
v2: No change.
v1: First post.
---
 utilities/ovs-dev.py |   14 ++
 1 file changed, 14 insertions(+)

diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
index 814a9d0..2c3af19 100755
--- a/utilities/ovs-dev.py
+++ b/utilities/ovs-dev.py
@@ -144,6 +144,14 @@ commands.append(make)
 
 
 def check():
+flags = ""
+if options.tests:
+for arg in str.split(options.tests):
+if arg[0].isdigit():
+flags += "%s " % arg
+else:
+flags += "-k %s " % arg
+ENV["TESTSUITEFLAGS"] = flags
 make("check")
 commands.append(check)
 
@@ -355,6 +363,12 @@ def main():
  const=i, help="compile with -O%d" % i)
 parser.add_option_group(group)
 
+group = optparse.OptionGroup(parser, "check")
+group.add_option("--tests", dest="tests", metavar="FILTER",
+ help="""run specific tests and/or a test category
+  eg, --tests=\"1-10 megaflow\)
+parser.add_option_group(group)
+
 group = optparse.OptionGroup(parser, "run")
 group.add_option("-g", "--gdb", dest="gdb", action="store_true",
  help="run ovs-vswitchd under gdb")
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCHv3 2/2] ovs-dev.py: Add option to run tests in parallel.

2014-06-03 Thread Joe Stringer
Signed-off-by: Joe Stringer 
---
v3: Set the default to -j8.
v2: Make it configurable.
v1: First post.
---
 utilities/ovs-dev.py |6 ++
 1 file changed, 6 insertions(+)

diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
index 2c3af19..fa74d57 100755
--- a/utilities/ovs-dev.py
+++ b/utilities/ovs-dev.py
@@ -145,6 +145,10 @@ commands.append(make)
 
 def check():
 flags = ""
+if options.parallel:
+flags += "-j%d " % options.parallel
+else:
+flags += "-j8 "
 if options.tests:
 for arg in str.split(options.tests):
 if arg[0].isdigit():
@@ -364,6 +368,8 @@ def main():
 parser.add_option_group(group)
 
 group = optparse.OptionGroup(parser, "check")
+group.add_option("-p", dest="parallel", metavar="N_TESTS", type="int",
+ help="Run N_TESTS in parallel")
 group.add_option("--tests", dest="tests", metavar="FILTER",
  help="""run specific tests and/or a test category
   eg, --tests=\"1-10 megaflow\)
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCHv3 1/2] ovs-dev.py: Add option to specify which tests to run.

2014-06-03 Thread Ethan Jackson
Acked-by: Ethan Jackson 


On Tue, Jun 3, 2014 at 5:41 PM, Joe Stringer  wrote:
> Signed-off-by: Joe Stringer 
> ---
> v3: Don't require users to specify "-k" for searches.
> v2: No change.
> v1: First post.
> ---
>  utilities/ovs-dev.py |   14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
> index 814a9d0..2c3af19 100755
> --- a/utilities/ovs-dev.py
> +++ b/utilities/ovs-dev.py
> @@ -144,6 +144,14 @@ commands.append(make)
>
>
>  def check():
> +flags = ""
> +if options.tests:
> +for arg in str.split(options.tests):
> +if arg[0].isdigit():
> +flags += "%s " % arg
> +else:
> +flags += "-k %s " % arg
> +ENV["TESTSUITEFLAGS"] = flags
>  make("check")
>  commands.append(check)
>
> @@ -355,6 +363,12 @@ def main():
>   const=i, help="compile with -O%d" % i)
>  parser.add_option_group(group)
>
> +group = optparse.OptionGroup(parser, "check")
> +group.add_option("--tests", dest="tests", metavar="FILTER",
> + help="""run specific tests and/or a test category
> +  eg, --tests=\"1-10 megaflow\)
> +parser.add_option_group(group)
> +
>  group = optparse.OptionGroup(parser, "run")
>  group.add_option("-g", "--gdb", dest="gdb", action="store_true",
>   help="run ovs-vswitchd under gdb")
> --
> 1.7.10.4
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCHv3 2/2] ovs-dev.py: Add option to run tests in parallel.

2014-06-03 Thread Ethan Jackson
Sorry to nit pick, but could we change the option name to "--jobs" and
the abbreviated option to "-j".  This is consistent with how make does
it.

Acked-by: Ethan Jackson 


On Tue, Jun 3, 2014 at 5:41 PM, Joe Stringer  wrote:
> Signed-off-by: Joe Stringer 
> ---
> v3: Set the default to -j8.
> v2: Make it configurable.
> v1: First post.
> ---
>  utilities/ovs-dev.py |6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
> index 2c3af19..fa74d57 100755
> --- a/utilities/ovs-dev.py
> +++ b/utilities/ovs-dev.py
> @@ -145,6 +145,10 @@ commands.append(make)
>
>  def check():
>  flags = ""
> +if options.parallel:
> +flags += "-j%d " % options.parallel
> +else:
> +flags += "-j8 "
>  if options.tests:
>  for arg in str.split(options.tests):
>  if arg[0].isdigit():
> @@ -364,6 +368,8 @@ def main():
>  parser.add_option_group(group)
>
>  group = optparse.OptionGroup(parser, "check")
> +group.add_option("-p", dest="parallel", metavar="N_TESTS", type="int",
> + help="Run N_TESTS in parallel")
>  group.add_option("--tests", dest="tests", metavar="FILTER",
>   help="""run specific tests and/or a test category
>eg, --tests=\"1-10 megaflow\)
> --
> 1.7.10.4
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2.58] datapath: Add basic MPLS support to kernel

2014-06-03 Thread Simon Horman
On Tue, Jun 03, 2014 at 03:40:27PM -0700, Jesse Gross wrote:
> On Mon, Jun 2, 2014 at 9:04 PM, Simon Horman  wrote:
> > Hi Jesse,
> >
> > thanks for your feedback.
> >
> > On Mon, Jun 02, 2014 at 05:58:10PM -0700, Jesse Gross wrote:
> >> On Sun, May 25, 2014 at 5:22 PM, Simon Horman  wrote:
> >> > diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
> >> > index 803a94c..8ce596c 100644
> >> > --- a/datapath/flow_netlink.c
> >> > +++ b/datapath/flow_netlink.c
> >> > +   case OVS_ACTION_ATTR_POP_MPLS:
> >> > +   if (!eth_p_mpls(eth_type))
> >> > +   return -EINVAL;
> >>
> >> Should this also take into account the VLAN tag? It's really part of
> >> the EtherType although it has been stripped out here. Actually, maybe
> >> it's better to not track the vlan_tci separately at all during
> >> validation but just fold it into the EtherType.
> >>
> >> The practical implication of this is that you wouldn't be able to pop
> >> out from underneath a VLAN. This may be a good thing if we are trying
> >> to avoid tag order issues - after all, you can't push underneath a
> >> VLAN either. I'm not sure what effects this has on the need to track
> >> mac_len, if any.
> >
> > My thinking is that the ordering problem only surfaces in relation
> > to push MPLS actions where should it go in relation to VLAN tags.
> > For pop actions it seems to me that the outermost tag should be removed
> > regardless of its position in relation to other tags.
> >
> > So I think that the code above is safe. Though now you mention
> > it I do notice that it only allows pop MPLS if there is at most
> > one VLAN tag present.
> >
> > That said, I would not mind particularly disabling pop MPLS in the
> > presence of VLAN tags. At the very least it is related to the
> > painful issue of tag ordering.
> 
> I agree that it is safe but my thought was the it avoids a number of
> potential corner cases such as:
>  * Difference between push and pop underneath vlan tags.
>  * Pop with multiple vlan tags
>  * Differences with varying EtherTypes used for vlans
> 
> > I explored your idea of tracking only eth_type rather than both
> > it and vlan_tci. I did this by adding the following logic to
> > ovs_nla_copy_actions().
> >
> > if (key->eth.tci & htons(VLAN_TAG_PRESENT))
> > eth_type = htons(ETH_P_8021Q);
> > else
> > eth_type = key->eth.type;
> >
> > I then updated the usage of eth_type in ovs_nla_copy_actions__() 
> > accordingly.
> > One problem that I have run into is what to do about pop VLAN.
> >
> > I don't believe its possible to know what the new eth type is.
> > This makes subsequent checks of the eth type for validate_set()
> > a little awkward. And seems to indicate that an extra parameter would
> > be needed.
> >
> > For this reason I am inclined to leave the eth_type and vlan_tci
> > parameters in place. In this case there is no problem with pop VLAN
> > as the ether type inside the VLAN tag should be the value of eth_type.
> 
> Can't we populate eth_type with the EtherType from the flow key in
> pop_vlan? This doesn't provide us with the ability to look arbitrarily
> deep into the packet but it should at least retain the functionality
> that we have today.

I think that things are still a bit tricky for at least two reasons.

1. Tracking eth_type if there are more push VLAN actions than pop VLAN actions

   e.g.: A packet with no VLAN tags with the following
 actions applied: push VLAN, push VLAN, pop VLAN

   I believe that with the change your propose eth_type would end up being
   that of the original packet than of the packet with one VLAN tag removed
   after two are added.

   I think would allow a push MPLS action but it should be rejected.

   I think this could be resolved by setting eth_type to a bogus value
   (e.g. 0) but that would may cause subsequent calls to validate_set()
   to fail when they should pass.

2. The use of eth_type in validate_set().

   Suppose we have an IPv6 packet with one VLAN.

   validate_set() allows an IPv6 set operation if the eth_type is IPv6.
   Which would be true if the eth_type is that of the inner rather
   than the outer packet. However, I believe with the scheme
   you are proposing this becomes a bit tricky as the eth_type will
   be that of the outer packet.

   I suppose this particular problem could be resolved using something like
   this in validate_set():

if (eth_type == htons(ETH_P_8021Q))
eth_type = flow_key.eth_type;

   But that may get a bit tricky. For example suppose there is
   an IPv6 packet with no VLAN tags and the following actions are applied.

   push MPLS, push VLAN, set IPv6

   validate_set() would be passed:
   eth_type: htons(ETH_P_8021Q)
   flow_key.eth_type: htons(ETH_P_IPV6)

   But validate_set should be checking against an MPLS ethtype.

   I think this particular problem could be resolved with a new
   restriction,

Re: [ovs-dev] [PATCH 3/3] dpif-netdev: batch packet processing

2014-06-03 Thread Ethan Jackson
Couple of questions.

First, this patch adds a bit of code which extends beyond 79 characters.

dp_netdev_input() does this rather interesting stuff with two keys
that mfk switches between.  This is definitely not obvious when first
looking at it.  Minimally it needs a comment explaining why.  That
said I don't totally understand it, why can't we ruse the same key
after each batch is executed?  If we can't, why is using only two keys
sufficient?  This seems a bit odd.

Actually on that note, do we need miniflowkey at all?  Could it just
be embedded at the end of batch_pkt_execute?

After this patch, both the hash and recirc and set actions have the
wrong behavior.  I'd rather they abort() the switch than silently
misbehave.  Looking at this now, I think this is a rather deep issue
that we need to think through.  Let's discuss it tomorrow.

Ethan

On Tue, Jun 3, 2014 at 5:10 PM, Daniele Di Proietto
 wrote:
> This change in dpif-netdev allows faster packet processing for devices which
> implement batching (netdev-dpdk currently).
>
> Signed-off-by: Daniele Di Proietto 
> ---
>  lib/dpif-netdev.c| 240 
> ++-
>  lib/dpif.c   |   7 +-
>  lib/odp-execute.c|  49 ++---
>  lib/odp-execute.h|   4 +-
>  ofproto/ofproto-dpif-xlate.c |   2 +-
>  5 files changed, 211 insertions(+), 91 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e0e5ad8..fba1d65 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -331,18 +331,18 @@ static void dp_netdev_destroy_all_queues(struct 
> dp_netdev *dp)
>  OVS_REQ_WRLOCK(dp->queue_rwlock);
>  static int dpif_netdev_open(const struct dpif_class *, const char *name,
>  bool create, struct dpif **);
> -static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf *,
> -  int queue_no, int type,
> +static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf **,
> +  int cnt, int queue_no, int type,
>const struct miniflow *,
>const struct nlattr *userdata);
>  static void dp_netdev_execute_actions(struct dp_netdev *dp,
>const struct miniflow *,
> -  struct ofpbuf *, bool may_steal,
> +  struct ofpbuf **, int cnt, bool 
> may_steal,
>struct pkt_metadata *,
>const struct nlattr *actions,
>size_t actions_len);
> -static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf *packet,
> - struct pkt_metadata *);
> +static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf 
> **packets,
> + int cnt, odp_port_t port_no);
>
>  static void dp_netdev_set_pmd_threads(struct dp_netdev *, int n);
>
> @@ -1518,7 +1518,6 @@ static int
>  dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
>  {
>  struct dp_netdev *dp = get_dp_netdev(dpif);
> -struct pkt_metadata *md = &execute->md;
>  struct {
>  struct miniflow flow;
>  uint32_t buf[FLOW_U32S];
> @@ -1531,9 +1530,9 @@ dpif_netdev_execute(struct dpif *dpif, struct 
> dpif_execute *execute)
>
>  /* Extract flow key. */
>  miniflow_initialize(&key.flow, key.buf);
> -miniflow_extract(execute->packet, md, &key.flow);
> +miniflow_extract(execute->packet, &execute->md, &key.flow);
>
> -dp_netdev_execute_actions(dp, &key.flow, execute->packet, false, md,
> +dp_netdev_execute_actions(dp, &key.flow, &execute->packet, 1, false, 
> &execute->md,
>execute->actions, execute->actions_len);
>
>  return 0;
> @@ -1747,17 +1746,12 @@ dp_netdev_process_rxq_port(struct dp_netdev *dp,
>struct dp_netdev_port *port,
>struct netdev_rxq *rxq)
>  {
> -struct ofpbuf *packet[NETDEV_MAX_RX_BATCH];
> -int error, c;
> +struct ofpbuf *packets[NETDEV_MAX_RX_BATCH];
> +int error, cnt;
>
> -error = netdev_rxq_recv(rxq, packet, &c);
> +error = netdev_rxq_recv(rxq, packets, &cnt);
>  if (!error) {
> -struct pkt_metadata md = PKT_METADATA_INITIALIZER(port->port_no);
> -int i;
> -
> -for (i = 0; i < c; i++) {
> -dp_netdev_port_input(dp, packet[i], &md);
> -}
> +dp_netdev_port_input(dp, packets, cnt, port->port_no);
>  } else if (error != EAGAIN && error != EOPNOTSUPP) {
>  static struct vlog_rate_limit rl
>  = VLOG_RATE_LIMIT_INIT(1, 5);
> @@ -1954,7 +1948,7 @@ dp_netdev_flow_stats_new_cb(void)
>
>  static void
>  dp_netdev_flow_used(struct dp_netdev_flow *netdev_flow,
> -   

[ovs-dev] [PATCH v4] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson
When a bridge of datatype type netdev receives a packet, it copies the
packet from the NIC to a buffer in userspace. Currently, when making
an upcall, the packet is again copied to the upcall's buffer. However,
this extra copy is not necessary when the datapath exists in userspace
as the upcall can directly access the packet data.

This patch eliminates this extra copy of the packet data in most cases.
In cases where the packet may still be used later by callers of
dp_netdev_execute_actions, making a copy of the packet data is still
necessary.

This patch also adds a header_ field to 'struct ofpbuf' when using
DPDK. This field holds a pointer to the allocated buffer in the
rte_mempool. Thus, an upcall packet ofpbuf allocated on the stack can
now share data and free memory of a rte_mempool allocated ofpbuf.

Signed-off-by: Ryan Wilson 
Acked-by: Jarno Rajahalme 
---
v2: Addressed Jarno's comment to use direct pointer assignment for
upcall->packet instead of ofpbuf_set().
v3: Addressed issues with DPDK mentioned by Pravin.
v4: Fixed various bugs found in perf test
---
 lib/dpif-netdev.c |   15 +--
 lib/netdev-dpdk.c |2 +-
 lib/ofpbuf.c  |6 +-
 lib/ofpbuf.h  |   20 ++--
 4 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 91c83d6..c89ae20 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
*packet,
miniflow_hash_5tuple(&key.flow, 0)
% dp->n_handlers,
DPIF_UC_MISS, &key.flow, NULL);
-ofpbuf_delete(packet);
 }
 }
 
@@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
 if (userdata) {
 buf_size += NLA_ALIGN(userdata->nla_len);
 }
-buf_size += ofpbuf_size(packet);
 ofpbuf_init(buf, buf_size);
 
 /* Put ODP flow. */
@@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
ofpbuf *packet,
   NLA_ALIGN(userdata->nla_len));
 }
 
-ofpbuf_set_data(&upcall->packet,
-ofpbuf_put(buf, ofpbuf_data(packet), 
ofpbuf_size(packet)));
-ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
+upcall->packet = *packet;
 
 seq_change(q->seq);
 
 error = 0;
 } else {
 dp_netdev_count_packet(dp, DP_STAT_LOST);
+ofpbuf_delete(packet);
 error = ENOBUFS;
 }
 ovs_mutex_unlock(&q->mutex);
@@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
 break;
 
 case OVS_ACTION_ATTR_USERSPACE: {
+struct ofpbuf *userspace_packet;
 const struct nlattr *userdata;
 
 userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
+userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
 
-dp_netdev_output_userspace(aux->dp, packet,
+dp_netdev_output_userspace(aux->dp, userspace_packet,
miniflow_hash_5tuple(aux->key, 0)
% aux->dp->n_handlers,
DPIF_UC_ACTION, aux->key,
userdata);
-
-if (may_steal) {
-ofpbuf_delete(packet);
-}
 break;
 }
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ba41d2e..e48bfde 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -205,7 +205,7 @@ dpdk_rte_mzalloc(size_t sz)
 void
 free_dpdk_buf(struct ofpbuf *b)
 {
-struct rte_mbuf *pkt = (struct rte_mbuf *) b;
+struct rte_mbuf *pkt = (struct rte_mbuf *) ofpbuf_header(b);
 
 rte_mempool_put(pkt->pool, pkt);
 }
diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
index 1f4b61d..9490a1e 100644
--- a/lib/ofpbuf.c
+++ b/lib/ofpbuf.c
@@ -117,6 +117,9 @@ void
 ofpbuf_init_dpdk(struct ofpbuf *b, size_t allocated)
 {
 ofpbuf_init__(b, allocated, OFPBUF_DPDK);
+#ifdef DPDK_NETDEV
+ofpbuf_set_header(b, b);
+#endif
 }
 
 /* Initializes 'b' as an empty ofpbuf with an initial capacity of 'size'
@@ -134,8 +137,9 @@ ofpbuf_uninit(struct ofpbuf *b)
 if (b) {
 if (b->source == OFPBUF_MALLOC) {
 free(ofpbuf_base(b));
+} else if (b->source == OFPBUF_DPDK) {
+free_dpdk_buf(b);
 }
-ovs_assert(b->source != OFPBUF_DPDK);
 }
 }
 
diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h
index 85be899..54113d0 100644
--- a/lib/ofpbuf.h
+++ b/lib/ofpbuf.h
@@ -59,6 +59,7 @@ enum OVS_PACKED_ENUM ofpbuf_source {
 struct ofpbuf {
 #ifdef DPDK_NETDEV
 struct rte_mbuf mbuf;   /* DPDK mbuf */
+struct ofpbuf *header_; /* First byte of allocated buffer header. */
 #else
 void *base_; /* First byte of allocated space. */
 void *data_; /* First byte actually in use. */

Re: [ovs-dev] [PATCH 3/3] dpif-netdev: batch packet processing

2014-06-03 Thread Ethan Jackson
Oh, also I'm getting these errors when compiling with clang's thread
saftey analysis.

../lib/dpif-netdev.c:2115:12: error: reading variable 'head' requires
locking any mutex [-Werror,-Wthread-safety-analysis]
if (q->head - q->tail < MAX_QUEUE_LEN) {
   ^
../lib/dpif-netdev.c:2115:22: error: reading variable 'tail' requires
locking any mutex [-Werror,-Wthread-safety-analysis]
if (q->head - q->tail < MAX_QUEUE_LEN) {
 ^
../lib/dpif-netdev.c:2116:53: error: writing variable 'head' requires
locking any mutex exclusively [-Werror,-Wthread-safety-analysis]
struct dp_netdev_upcall *u = &q->upcalls[q->head++ & QUEUE_MASK];

On Tue, Jun 3, 2014 at 6:05 PM, Ethan Jackson  wrote:
> Couple of questions.
>
> First, this patch adds a bit of code which extends beyond 79 characters.
>
> dp_netdev_input() does this rather interesting stuff with two keys
> that mfk switches between.  This is definitely not obvious when first
> looking at it.  Minimally it needs a comment explaining why.  That
> said I don't totally understand it, why can't we ruse the same key
> after each batch is executed?  If we can't, why is using only two keys
> sufficient?  This seems a bit odd.
>
> Actually on that note, do we need miniflowkey at all?  Could it just
> be embedded at the end of batch_pkt_execute?
>
> After this patch, both the hash and recirc and set actions have the
> wrong behavior.  I'd rather they abort() the switch than silently
> misbehave.  Looking at this now, I think this is a rather deep issue
> that we need to think through.  Let's discuss it tomorrow.
>
> Ethan
>
> On Tue, Jun 3, 2014 at 5:10 PM, Daniele Di Proietto
>  wrote:
>> This change in dpif-netdev allows faster packet processing for devices which
>> implement batching (netdev-dpdk currently).
>>
>> Signed-off-by: Daniele Di Proietto 
>> ---
>>  lib/dpif-netdev.c| 240 
>> ++-
>>  lib/dpif.c   |   7 +-
>>  lib/odp-execute.c|  49 ++---
>>  lib/odp-execute.h|   4 +-
>>  ofproto/ofproto-dpif-xlate.c |   2 +-
>>  5 files changed, 211 insertions(+), 91 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index e0e5ad8..fba1d65 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -331,18 +331,18 @@ static void dp_netdev_destroy_all_queues(struct 
>> dp_netdev *dp)
>>  OVS_REQ_WRLOCK(dp->queue_rwlock);
>>  static int dpif_netdev_open(const struct dpif_class *, const char *name,
>>  bool create, struct dpif **);
>> -static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf *,
>> -  int queue_no, int type,
>> +static int dp_netdev_output_userspace(struct dp_netdev *dp, struct ofpbuf 
>> **,
>> +  int cnt, int queue_no, int type,
>>const struct miniflow *,
>>const struct nlattr *userdata);
>>  static void dp_netdev_execute_actions(struct dp_netdev *dp,
>>const struct miniflow *,
>> -  struct ofpbuf *, bool may_steal,
>> +  struct ofpbuf **, int cnt, bool 
>> may_steal,
>>struct pkt_metadata *,
>>const struct nlattr *actions,
>>size_t actions_len);
>> -static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf 
>> *packet,
>> - struct pkt_metadata *);
>> +static void dp_netdev_port_input(struct dp_netdev *dp, struct ofpbuf 
>> **packets,
>> + int cnt, odp_port_t port_no);
>>
>>  static void dp_netdev_set_pmd_threads(struct dp_netdev *, int n);
>>
>> @@ -1518,7 +1518,6 @@ static int
>>  dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
>>  {
>>  struct dp_netdev *dp = get_dp_netdev(dpif);
>> -struct pkt_metadata *md = &execute->md;
>>  struct {
>>  struct miniflow flow;
>>  uint32_t buf[FLOW_U32S];
>> @@ -1531,9 +1530,9 @@ dpif_netdev_execute(struct dpif *dpif, struct 
>> dpif_execute *execute)
>>
>>  /* Extract flow key. */
>>  miniflow_initialize(&key.flow, key.buf);
>> -miniflow_extract(execute->packet, md, &key.flow);
>> +miniflow_extract(execute->packet, &execute->md, &key.flow);
>>
>> -dp_netdev_execute_actions(dp, &key.flow, execute->packet, false, md,
>> +dp_netdev_execute_actions(dp, &key.flow, &execute->packet, 1, false, 
>> &execute->md,
>>execute->actions, execute->actions_len);
>>
>>  return 0;
>> @@ -1747,17 +1746,12 @@ dp_netdev_process_rxq_port(struct dp_netdev *dp,
>>struct dp_netdev_port *port,
>>struct netdev_rxq *rxq)
>>  {
>> -   

Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Ryan Wilson 76511
Sure, I combined them and sent out a new patch fixing all the bugs I
encountered in the DPDK perf test.

http://patchwork.openvswitch.org/patch/4388/


One issue is that since ofproto-dpif-upcall calls ofpbuf_uninit(), I had
to free the DPDK buffer in that function as opposed to ofpbuf_delete().
I'm not sure I entirely like that since we're freeing the header too,
which goes directly against the philosophy of ofpbuf_uninit().

One idea I had was to only allow ofpbuf_uninit to be called with
OFPBUF_DPDK only if struct ofpbuf *b != ofpbuf_header(b) (meaning b is a
copy of the header and thus we can free the original header). I figured
I'd let you check out the patch and see if you had any other ideas.

Cheers,

Ryan

On 6/3/14 5:22 PM, "Pravin Shelar"  wrote:

>Can you combine it with second patch, otherwise it introduces a bug.
>
>On Tue, Jun 3, 2014 at 4:22 PM, Ryan Wilson 76511 
>wrote:
>> I've ran into some unexpected issues while perf testing this, lets hold
>> off on looking at this. I'll submit another patch when I've had all the
>> kinks worked out.
>>
>> Ryan
>>
>> On 6/3/14 2:21 PM, "Ryan Wilson 76511"  wrote:
>>
>>>Hey Pravin,
>>>
>>>Thanks for the catch here! Turns out the header is already tracked in
>>>DPDK
>>>with rte_mbuf's buffer address - sizeof(ofpbuf). Thus, I submitted
>>>another
>>>patch that, in free_dpdk_buf(), always gets this header and uses this to
>>>free memory.
>>>
>>>https://urldefense.proofpoint.com/v1/url?u=http://patchwork.openvswitch.
>>>org/patch/4375/&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=TfBS78Vw3dzttvXidh
>>>bffg%3D%3D%0A&m=gZZtvgxp3GIUGlkz85Olmu3cqs62R0tf%2F4IxSAQLQ9Y%3D%0A&s=20
>>>9e52cdfc433e5fa4deb7ef48a561f78207981f0dabf4dd5ecb3257bde613e2
>>>
>>>
>>>Let me know if this patch works.
>>>
>>>Cheers,
>>>
>>>Ryan
>>>
>>>On 6/3/14 10:50 AM, "Pravin Shelar"  wrote:
>>>
On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
> When a bridge of datatype type netdev receives a packet, it copies
>the
> packet from the NIC to a buffer in userspace. Currently, when making
> an upcall, the packet is again copied to the upcall's buffer.
>However,
> this extra copy is not necessary when the datapath exists in
>userspace
> as the upcall can directly access the packet data.
>
> This patch eliminates this extra copy of the packet data in most
>cases.
> In cases where the packet may still be used later by callers of
> dp_netdev_execute_actions, making a copy of the packet data is still
> necessary.
>
> Signed-off-by: Ryan Wilson 
> ---
> v2: Addressed Jarno's comment to use direct pointer assignment for
> upcall->packet instead of ofpbuf_set().
> ---
>  lib/dpif-netdev.c |   15 +--
>  1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 91c83d6..c89ae20 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
>ofpbuf *packet,
> miniflow_hash_5tuple(&key.flow,
>0)
> % dp->n_handlers,
> DPIF_UC_MISS, &key.flow, NULL);
> -ofpbuf_delete(packet);
>  }
>  }
>
> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev
>*dp,
>struct ofpbuf *packet,
>  if (userdata) {
>  buf_size += NLA_ALIGN(userdata->nla_len);
>  }
> -buf_size += ofpbuf_size(packet);
>  ofpbuf_init(buf, buf_size);
>
>  /* Put ODP flow. */
> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
>*dp, struct ofpbuf *packet,
>
>NLA_ALIGN(userdata->nla_len));
>  }
>
> -ofpbuf_set_data(&upcall->packet,
> -ofpbuf_put(buf, ofpbuf_data(packet),
>ofpbuf_size(packet)));
> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
> +upcall->packet = *packet;
>

This would not work with DPDK. ofpbuf from dpdk is special case where
ofpbuf and data are allocated from same memory object. Therefore
moving ofpbuf->data is nontrivial.

To make it work we need atleast following covered.
1. Define flag for source of ofpbuf header. So that we can track
header and data independently.
2. Fix dpdk ofpbuf destructor to free correct dpdk memory object.

Thanks,
Pravin.
>>>
>>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Jarno Rajahalme

> On Jun 3, 2014, at 6:11 PM, Ryan Wilson  wrote:
> 
> When a bridge of datatype type netdev receives a packet, it copies the
> packet from the NIC to a buffer in userspace. Currently, when making
> an upcall, the packet is again copied to the upcall's buffer. However,
> this extra copy is not necessary when the datapath exists in userspace
> as the upcall can directly access the packet data.
> 
> This patch eliminates this extra copy of the packet data in most cases.
> In cases where the packet may still be used later by callers of
> dp_netdev_execute_actions, making a copy of the packet data is still
> necessary.
> 
> This patch also adds a header_ field to 'struct ofpbuf' when using
> DPDK. This field holds a pointer to the allocated buffer in the
> rte_mempool. Thus, an upcall packet ofpbuf allocated on the stack can
> now share data and free memory of a rte_mempool allocated ofpbuf.
> 
> Signed-off-by: Ryan Wilson 
> Acked-by: Jarno Rajahalme 
> ---
> v2: Addressed Jarno's comment to use direct pointer assignment for
> upcall->packet instead of ofpbuf_set().
> v3: Addressed issues with DPDK mentioned by Pravin.
> v4: Fixed various bugs found in perf test
> ---
> lib/dpif-netdev.c |   15 +--
> lib/netdev-dpdk.c |2 +-
> lib/ofpbuf.c  |6 +-
> lib/ofpbuf.h  |   20 ++--
> 4 files changed, 25 insertions(+), 18 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 91c83d6..c89ae20 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct ofpbuf 
> *packet,
>miniflow_hash_5tuple(&key.flow, 0)
>% dp->n_handlers,
>DPIF_UC_MISS, &key.flow, NULL);
> -ofpbuf_delete(packet);
> }
> }
> 
> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev *dp, struct 
> ofpbuf *packet,
> if (userdata) {
> buf_size += NLA_ALIGN(userdata->nla_len);
> }
> -buf_size += ofpbuf_size(packet);
> ofpbuf_init(buf, buf_size);
> 
> /* Put ODP flow. */
> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev *dp, 
> struct ofpbuf *packet,
>   NLA_ALIGN(userdata->nla_len));
> }
> 
> -ofpbuf_set_data(&upcall->packet,
> -ofpbuf_put(buf, ofpbuf_data(packet), 
> ofpbuf_size(packet)));
> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
> +upcall->packet = *packet;
> 
> seq_change(q->seq);
> 
> error = 0;
> } else {
> dp_netdev_count_packet(dp, DP_STAT_LOST);
> +ofpbuf_delete(packet);
> error = ENOBUFS;
> }
> ovs_mutex_unlock(&q->mutex);
> @@ -2120,19 +2117,17 @@ dp_execute_cb(void *aux_, struct ofpbuf *packet,
> break;
> 
> case OVS_ACTION_ATTR_USERSPACE: {
> +struct ofpbuf *userspace_packet;
> const struct nlattr *userdata;
> 
> userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
> +userspace_packet = may_steal ? packet : ofpbuf_clone(packet);
> 
> -dp_netdev_output_userspace(aux->dp, packet,
> +dp_netdev_output_userspace(aux->dp, userspace_packet,
>miniflow_hash_5tuple(aux->key, 0)
>% aux->dp->n_handlers,
>DPIF_UC_ACTION, aux->key,
>userdata);
> -
> -if (may_steal) {
> -ofpbuf_delete(packet);
> -}
> break;
> }
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ba41d2e..e48bfde 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -205,7 +205,7 @@ dpdk_rte_mzalloc(size_t sz)
> void
> free_dpdk_buf(struct ofpbuf *b)
> {
> -struct rte_mbuf *pkt = (struct rte_mbuf *) b;
> +struct rte_mbuf *pkt = (struct rte_mbuf *) ofpbuf_header(b);
> 

Could use the address of the 'mbuf' member instead of a cast here?

> rte_mempool_put(pkt->pool, pkt);
> }
> diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
> index 1f4b61d..9490a1e 100644
> --- a/lib/ofpbuf.c
> +++ b/lib/ofpbuf.c
> @@ -117,6 +117,9 @@ void
> ofpbuf_init_dpdk(struct ofpbuf *b, size_t allocated)
> {
> ofpbuf_init__(b, allocated, OFPBUF_DPDK);
> +#ifdef DPDK_NETDEV
> +ofpbuf_set_header(b, b);
> +#endif
> }
> 
> /* Initializes 'b' as an empty ofpbuf with an initial capacity of 'size'
> @@ -134,8 +137,9 @@ ofpbuf_uninit(struct ofpbuf *b)
> if (b) {
> if (b->source == OFPBUF_MALLOC) {
> free(ofpbuf_base(b));
> +} else if (b->source == OFPBUF_DPDK) {
> +free_dpdk_buf(b);
> }
> -ovs_assert(b->source != OFPBUF_DPDK);
> }
> }
> 
> diff --git a/lib/ofpbuf.h b/lib/ofpbuf.h
> index 85be899..54113d0 100644
> --- a/lib/ofpbuf.h
> +++ b/lib/ofpbuf.h
> @@ -59,6 +59,7 @@ enum OVS_

Re: [ovs-dev] [PATCH] dpif-netdev: Remove an extra memcpy of packet data from datapath-upcall interface for bridges with datapath_type=netdev.

2014-06-03 Thread Jarno Rajahalme

> On Jun 3, 2014, at 6:22 PM, Ryan Wilson 76511  wrote:
> 
> Sure, I combined them and sent out a new patch fixing all the bugs I
> encountered in the DPDK perf test.
> 
> http://patchwork.openvswitch.org/patch/4388/
> 
> 
> One issue is that since ofproto-dpif-upcall calls ofpbuf_uninit(), I had
> to free the DPDK buffer in that function as opposed to ofpbuf_delete().

ofpbuf_delete() is supposed to delete the ofpbuf itself, while ofpbuf_ininit() 
deletes any associated storage for an ofpbuf whose own memory is managed 
elsewhere (stack, some struct, etc).

Upcall handling will call uninit on a copy of the dpdk ofpbuf, not on the dpdk 
allocated buf itself, so the ofpbuf pointer given as a parameter to uninit 
remains valid even after the call.

> I'm not sure I entirely like that since we're freeing the header too,
> which goes directly against the philosophy of ofpbuf_uninit().
> 

As said above, you should be fine in this regard.

> One idea I had was to only allow ofpbuf_uninit to be called with
> OFPBUF_DPDK only if struct ofpbuf *b != ofpbuf_header(b) (meaning b is a
> copy of the header and thus we can free the original header). I figured
> I'd let you check out the patch and see if you had any other ideas.
> 

An assert to this effect would help detect/avoid bugs.

> Cheers,
> 
> Ryan
> 
>> On 6/3/14 5:22 PM, "Pravin Shelar"  wrote:
>> 
>> Can you combine it with second patch, otherwise it introduces a bug.
>> 
>> On Tue, Jun 3, 2014 at 4:22 PM, Ryan Wilson 76511 
>> wrote:
>>> I've ran into some unexpected issues while perf testing this, lets hold
>>> off on looking at this. I'll submit another patch when I've had all the
>>> kinks worked out.
>>> 
>>> Ryan
>>> 
 On 6/3/14 2:21 PM, "Ryan Wilson 76511"  wrote:
 
 Hey Pravin,
 
 Thanks for the catch here! Turns out the header is already tracked in
 DPDK
 with rte_mbuf's buffer address - sizeof(ofpbuf). Thus, I submitted
 another
 patch that, in free_dpdk_buf(), always gets this header and uses this to
 free memory.
 
 https://urldefense.proofpoint.com/v1/url?u=http://patchwork.openvswitch.
 org/patch/4375/&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=TfBS78Vw3dzttvXidh
 bffg%3D%3D%0A&m=gZZtvgxp3GIUGlkz85Olmu3cqs62R0tf%2F4IxSAQLQ9Y%3D%0A&s=20
 9e52cdfc433e5fa4deb7ef48a561f78207981f0dabf4dd5ecb3257bde613e2
 
 
 Let me know if this patch works.
 
 Cheers,
 
 Ryan
 
> On 6/3/14 10:50 AM, "Pravin Shelar"  wrote:
> 
>> On Tue, Jun 3, 2014 at 10:33 AM, Ryan Wilson  wrote:
>> When a bridge of datatype type netdev receives a packet, it copies
>> the
>> packet from the NIC to a buffer in userspace. Currently, when making
>> an upcall, the packet is again copied to the upcall's buffer.
>> However,
>> this extra copy is not necessary when the datapath exists in
>> userspace
>> as the upcall can directly access the packet data.
>> 
>> This patch eliminates this extra copy of the packet data in most
>> cases.
>> In cases where the packet may still be used later by callers of
>> dp_netdev_execute_actions, making a copy of the packet data is still
>> necessary.
>> 
>> Signed-off-by: Ryan Wilson 
>> ---
>> v2: Addressed Jarno's comment to use direct pointer assignment for
>> upcall->packet instead of ofpbuf_set().
>> ---
>> lib/dpif-netdev.c |   15 +--
>> 1 file changed, 5 insertions(+), 10 deletions(-)
>> 
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 91c83d6..c89ae20 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -2024,7 +2024,6 @@ dp_netdev_input(struct dp_netdev *dp, struct
>> ofpbuf *packet,
>>miniflow_hash_5tuple(&key.flow,
>> 0)
>>% dp->n_handlers,
>>DPIF_UC_MISS, &key.flow, NULL);
>> -ofpbuf_delete(packet);
>> }
>> }
>> 
>> @@ -2063,7 +2062,6 @@ dp_netdev_output_userspace(struct dp_netdev
>> *dp,
>> struct ofpbuf *packet,
>> if (userdata) {
>> buf_size += NLA_ALIGN(userdata->nla_len);
>> }
>> -buf_size += ofpbuf_size(packet);
>> ofpbuf_init(buf, buf_size);
>> 
>> /* Put ODP flow. */
>> @@ -2078,15 +2076,14 @@ dp_netdev_output_userspace(struct dp_netdev
>> *dp, struct ofpbuf *packet,
>> 
>> NLA_ALIGN(userdata->nla_len));
>> }
>> 
>> -ofpbuf_set_data(&upcall->packet,
>> -ofpbuf_put(buf, ofpbuf_data(packet),
>> ofpbuf_size(packet)));
>> -ofpbuf_set_size(&upcall->packet, ofpbuf_size(packet));
>> +upcall->packet = *packet;
> 
> This would not work with DPDK. ofpbuf from dpdk is special case where
> ofpbuf and data are allocated from same memory object. Th

[ovs-dev] [PATCHv2 3/3] revalidator: Use xcache when revalidation is required.

2014-06-03 Thread Joe Stringer
One of the reasons that xlate_cache was introduced was to ensure that
statistics were attributed to the correct rules and interfaces according
to the flow that was installed into the datapath, rather than according
to the current state of the flow table.

This patch makes the revalidators use the xlate_cache to attribute stats
when full revalidation is required. Furthermore, rather than deleting
the xcache when revalidating, keep it around.

Signed-off-by: Joe Stringer 
---
v2: Don't delete the old xcache when revalidating.
Removed extraneous "ukey->xcache = NULL".
---
 ofproto/ofproto-dpif-upcall.c |   23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 90e18e3..b4aa48b 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1148,10 +1148,16 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 }
 
 may_learn = push.n_packets > 0;
-if (ukey->xcache && !udpif->need_revalidate) {
+if (ukey->xcache) {
 xlate_push_stats(ukey->xcache, may_learn, &push);
-ok = true;
-goto exit;
+if (udpif->need_revalidate) {
+push.n_packets = 0;
+push.n_bytes = 0;
+may_learn = false;
+} else {
+ok = true;
+goto exit;
+}
 }
 
 error = xlate_receive(udpif->backer, NULL, ukey->key, ukey->key_len, &flow,
@@ -1160,16 +1166,11 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 goto exit;
 }
 
-if (udpif->need_revalidate) {
-xlate_cache_clear(ukey->xcache);
-}
-if (!ukey->xcache) {
-ukey->xcache = xlate_cache_new();
-}
-
 xlate_in_init(&xin, ofproto, &flow, NULL, push.tcp_flags, NULL);
 xin.resubmit_stats = push.n_packets ? &push : NULL;
-xin.xcache = ukey->xcache;
+if (!ukey->xcache) {
+ukey->xcache = xin.xcache = xlate_cache_new();
+}
 xin.may_learn = may_learn;
 xin.skip_wildcards = !udpif->need_revalidate;
 xlate_actions(&xin, &xout);
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCHv2 1/3] revalidator: Refactor ukey creation/lookup.

2014-06-03 Thread Joe Stringer
This patch refactors the code around ukey creation and lookup to
simplify the code for callers. A new function ukey_acquire() combines
these functions and attempts to acquire a lock on the ukey. Failure to
acquire a lock on the ukey is usually a sign that another thread is
handling the same flow concurrently, which means the flow does not need
to be handled anyway.

Signed-off-by: Joe Stringer 
---
v2: Make ukey_create() return the ukey locked.
Rearrange the logic in ukey_acquire().
---
 ofproto/ofproto-dpif-upcall.c |   95 +++--
 1 file changed, 43 insertions(+), 52 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 9d48e98..3a75690 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -216,6 +216,12 @@ static void upcall_unixctl_set_flow_limit(struct 
unixctl_conn *conn, int argc,
 
 static struct udpif_key *ukey_create(const struct nlattr *key, size_t key_len,
  long long int used);
+static struct udpif_key *ukey_lookup(struct udpif *udpif,
+ const struct nlattr *key, size_t key_len,
+ uint32_t hash);
+static bool ukey_acquire(struct udpif *udpif, const struct nlattr *key,
+ size_t key_len, long long int used,
+ struct udpif_key **result);
 static void ukey_delete(struct revalidator *, struct udpif_key *);
 
 static atomic_bool enable_megaflows = ATOMIC_VAR_INIT(true);
@@ -983,8 +989,9 @@ handle_upcalls(struct handler *handler, struct upcall 
*upcalls,
 
 /* Must be called with udpif->ukeys[hash % udpif->n_revalidators].mutex. */
 static struct udpif_key *
-ukey_lookup__(struct udpif *udpif, const struct nlattr *key, size_t key_len,
-  uint32_t hash)
+ukey_lookup(struct udpif *udpif, const struct nlattr *key, size_t key_len,
+uint32_t hash)
+OVS_REQUIRES(udpif->ukeys->mutex)
 {
 struct udpif_key *ukey;
 struct hmap *hmap = &udpif->ukeys[hash % udpif->n_revalidators].hmap;
@@ -997,26 +1004,15 @@ ukey_lookup__(struct udpif *udpif, const struct nlattr 
*key, size_t key_len,
 return NULL;
 }
 
-static struct udpif_key *
-ukey_lookup(struct udpif *udpif, const struct nlattr *key, size_t key_len,
-uint32_t hash)
-{
-struct udpif_key *ukey;
-uint32_t idx = hash % udpif->n_revalidators;
-
-ovs_mutex_lock(&udpif->ukeys[idx].mutex);
-ukey = ukey_lookup__(udpif, key, key_len, hash);
-ovs_mutex_unlock(&udpif->ukeys[idx].mutex);
-
-return ukey;
-}
-
+/* Creates a ukey for 'key' and 'key_len', returning it with ukey->mutex in
+ * a locked state. */
 static struct udpif_key *
 ukey_create(const struct nlattr *key, size_t key_len, long long int used)
+OVS_NO_THREAD_SAFETY_ANALYSIS
 {
 struct udpif_key *ukey = xmalloc(sizeof *ukey);
-ovs_mutex_init(&ukey->mutex);
 
+ovs_mutex_init(&ukey->mutex);
 ukey->key = (struct nlattr *) &ukey->key_buf;
 memcpy(&ukey->key_buf, key, key_len);
 ukey->key_len = key_len;
@@ -1027,33 +1023,44 @@ ukey_create(const struct nlattr *key, size_t key_len, 
long long int used)
 ukey->created = used ? used : time_msec();
 memset(&ukey->stats, 0, sizeof ukey->stats);
 ukey->xcache = NULL;
-ovs_mutex_unlock(&ukey->mutex);
 
 return ukey;
 }
 
-/* Checks for a ukey in 'udpif->ukeys' with the same 'ukey->key' and 'hash',
- * and inserts 'ukey' if it does not exist.
+/* Searches for a ukey in 'udpif->ukeys' that matches 'key' and 'key_len' and
+ * attempts to lock the ukey. If the ukey does not exist, create it.
  *
- * Returns true if 'ukey' was inserted into 'udpif->ukeys', false otherwise. */
+ * Returns true on success, setting *result to the matching ukey and returning
+ * it in a locked state. Otherwise, returns false and clears *result. */
 static bool
-udpif_insert_ukey(struct udpif *udpif, struct udpif_key *ukey, uint32_t hash)
+ukey_acquire(struct udpif *udpif, const struct nlattr *key, size_t key_len,
+ long long int used, struct udpif_key **result)
+OVS_TRY_LOCK(true, (*result)->mutex)
 {
-struct udpif_key *duplicate;
-uint32_t idx = hash % udpif->n_revalidators;
-bool ok;
+struct udpif_key *ukey;
+uint32_t hash, idx;
+bool locked = false;
+
+hash = hash_bytes(key, key_len, udpif->secret);
+idx = hash % udpif->n_revalidators;
 
 ovs_mutex_lock(&udpif->ukeys[idx].mutex);
-duplicate = ukey_lookup__(udpif, ukey->key, ukey->key_len, hash);
-if (duplicate) {
-ok = false;
-} else {
+ukey = ukey_lookup(udpif, key, key_len, hash);
+if (!ukey) {
+ukey = ukey_create(key, key_len, used);
 hmap_insert(&udpif->ukeys[idx].hmap, &ukey->hmap_node, hash);
-ok = true;
+locked = true;
+} else if (!ovs_mutex_trylock(&ukey->mutex)) {
+locked = true;
 }
 ovs_mutex_unlock(&udpif->ukeys[idx].mutex)

[ovs-dev] [PATCHv2 2/3] revalidator: Replace ukey->mark with dump_seq.

2014-06-03 Thread Joe Stringer
Rather than setting and resetting the 'mark' field in the ukey, this
patch introduces a seq to track whether a flow has been seen during the
most recent dump. This tidies the code and simplifies the logic for
detecting when flows are duplicated from the datapath.

Signed-off-by: Joe Stringer 
---
v2: Rebase
---
 ofproto/ofproto-dpif-upcall.c |   33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 3a75690..90e18e3 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -179,8 +179,7 @@ struct udpif_key {
 struct ovs_mutex mutex;   /* Guards the following. */
 struct dpif_flow_stats stats OVS_GUARDED; /* Last known stats.*/
 long long int created OVS_GUARDED;/* Estimate of creation time. */
-bool mark OVS_GUARDED;/* For mark and sweep garbage
- collection. */
+uint64_t dump_seq OVS_GUARDED;/* Tracks udpif->dump_seq. */
 bool flow_exists OVS_GUARDED; /* Ensures flows are only deleted
  once. */
 
@@ -1018,7 +1017,7 @@ ukey_create(const struct nlattr *key, size_t key_len, 
long long int used)
 ukey->key_len = key_len;
 
 ovs_mutex_lock(&ukey->mutex);
-ukey->mark = false;
+ukey->dump_seq = 0;
 ukey->flow_exists = true;
 ukey->created = used ? used : time_msec();
 memset(&ukey->stats, 0, sizeof ukey->stats);
@@ -1327,8 +1326,10 @@ revalidate(struct revalidator *revalidator)
 {
 struct udpif *udpif = revalidator->udpif;
 struct dpif_flow_dump_thread *dump_thread;
+uint64_t dump_seq;
 unsigned int flow_limit;
 
+dump_seq = seq_read(udpif->dump_seq);
 atomic_read(&udpif->flow_limit, &flow_limit);
 dump_thread = dpif_flow_dump_thread_create(udpif->dump);
 for (;;) {
@@ -1370,7 +1371,7 @@ revalidate(struct revalidator *revalidator)
 for (f = flows; f < &flows[n_dumped]; f++) {
 long long int used = f->stats.used;
 struct udpif_key *ukey;
-bool already_dumped, mark;
+bool already_dumped, keep;
 
 if (!ukey_acquire(udpif, f->key, f->key_len, used, &ukey)) {
 /* We couldn't acquire the ukey. This means that
@@ -1380,7 +1381,7 @@ revalidate(struct revalidator *revalidator)
 continue;
 }
 
-already_dumped = ukey->mark || !ukey->flow_exists;
+already_dumped = ukey->dump_seq == dump_seq;
 if (already_dumped) {
 /* The flow has already been dumped and handled by another
  * revalidator during this flow dump operation. Skip it. */
@@ -1393,13 +1394,14 @@ revalidate(struct revalidator *revalidator)
 used = ukey->created;
 }
 if (kill_them_all || (used && used < now - max_idle)) {
-mark = false;
+keep = false;
 } else {
-mark = revalidate_ukey(udpif, ukey, f);
+keep = revalidate_ukey(udpif, ukey, f);
 }
-ukey->mark = ukey->flow_exists = mark;
+ukey->dump_seq = dump_seq;
+ukey->flow_exists = keep;
 
-if (!mark) {
+if (!keep) {
 dump_op_init(&ops[n_ops++], f->key, f->key_len, ukey);
 }
 ovs_mutex_unlock(&ukey->mutex);
@@ -1419,22 +1421,21 @@ revalidator_sweep__(struct revalidator *revalidator, 
bool purge)
 struct dump_op ops[REVALIDATE_MAX_BATCH];
 struct udpif_key *ukey, *next;
 size_t n_ops;
+uint64_t dump_seq;
 
 n_ops = 0;
+dump_seq = seq_read(revalidator->udpif->dump_seq);
 
 /* During garbage collection, this revalidator completely owns its ukeys
  * map, and therefore doesn't need to do any locking. */
 HMAP_FOR_EACH_SAFE (ukey, next, hmap_node, revalidator->ukeys) {
-if (!purge && ukey->mark) {
-ukey->mark = false;
-} else if (!ukey->flow_exists) {
+if (!ukey->flow_exists) {
 ukey_delete(revalidator, ukey);
-} else {
+} else if (purge || ukey->dump_seq != dump_seq) {
 struct dump_op *op = &ops[n_ops++];
 
-/* If we have previously seen a flow in the datapath, but didn't
- * see it during the most recent dump, delete it. This allows us
- * to clean up the ukey and keep the statistics consistent. */
+/* If we have previously seen a flow in the datapath, but it
+ * hasn't been seen in the current dump, delete it. */
 dump_op_init(op, ukey->key, ukey->key_len, ukey);
 if (n_ops == REVALIDATE_MAX_BATCH) {
 push_dump_ops(revalidator, ops, n_ops);
-- 
1.7.10.4

___
dev mailing list
dev@openvs

Re: [ovs-dev] [PATCHv3 2/2] ovs-dev.py: Add option to run tests in parallel.

2014-06-03 Thread Joe Stringer
Thanks, I applied this with the following incremental:

diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
index fa74d57..efc65c8 100755
--- a/utilities/ovs-dev.py
+++ b/utilities/ovs-dev.py
@@ -145,8 +145,8 @@ commands.append(make)

 def check():
 flags = ""
-if options.parallel:
-flags += "-j%d " % options.parallel
+if options.jobs:
+flags += "-j%d " % options.jobs
 else:
 flags += "-j8 "
 if options.tests:
@@ -368,7 +368,7 @@ def main():
 parser.add_option_group(group)

 group = optparse.OptionGroup(parser, "check")
-group.add_option("-p", dest="parallel", metavar="N_TESTS", type="int",
- help="Run N_TESTS in parallel")
+group.add_option("-j", "--jobs", dest="jobs", metavar="N", type="int",
+ help="Run N tests in parallel")
 group.add_option("--tests", dest="tests", metavar="FILTER",
  help="""run specific tests and/or a test category


On 4 June 2014 12:45, Ethan Jackson  wrote:

> Sorry to nit pick, but could we change the option name to "--jobs" and
> the abbreviated option to "-j".  This is consistent with how make does
> it.
>
> Acked-by: Ethan Jackson 
>
>
> On Tue, Jun 3, 2014 at 5:41 PM, Joe Stringer 
> wrote:
> > Signed-off-by: Joe Stringer 
> > ---
> > v3: Set the default to -j8.
> > v2: Make it configurable.
> > v1: First post.
> > ---
> >  utilities/ovs-dev.py |6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
> > index 2c3af19..fa74d57 100755
> > --- a/utilities/ovs-dev.py
> > +++ b/utilities/ovs-dev.py
> > @@ -145,6 +145,10 @@ commands.append(make)
> >
> >  def check():
> >  flags = ""
> > +if options.parallel:
> > +flags += "-j%d " % options.parallel
> > +else:
> > +flags += "-j8 "
> >  if options.tests:
> >  for arg in str.split(options.tests):
> >  if arg[0].isdigit():
> > @@ -364,6 +368,8 @@ def main():
> >  parser.add_option_group(group)
> >
> >  group = optparse.OptionGroup(parser, "check")
> > +group.add_option("-p", dest="parallel", metavar="N_TESTS",
> type="int",
> > + help="Run N_TESTS in parallel")
> >  group.add_option("--tests", dest="tests", metavar="FILTER",
> >   help="""run specific tests and/or a test category
> >eg, --tests=\"1-10 megaflow\)
> > --
> > 1.7.10.4
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] datapath: Clean up files on distclean.

2014-06-03 Thread Joe Stringer
Thanks Pravin, I applied this to master.


On 31 May 2014 07:33, Pravin Shelar  wrote:

> On Mon, May 26, 2014 at 6:47 PM, Joe Stringer 
> wrote:
> > This was causing failures in 'make distcleancheck'.
> >
> > Signed-off-by: Joe Stringer 
>
> LGTM
> Acked-by: Pravin B Shelar 
>
> > ---
> >  datapath/linux/Makefile.main.in |3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/datapath/linux/Makefile.main.in b/datapath/linux/
> Makefile.main.in
> > index 88f144c..7d18253 100644
> > --- a/datapath/linux/Makefile.main.in
> > +++ b/datapath/linux/Makefile.main.in
> > @@ -29,7 +29,8 @@ check: all
> >  installcheck:
> >  mostlyclean:
> >  clean:
> > -   rm -f *.o *.ko *.mod.* Module.symvers *.cmd kcompat.h.new
> > +   rm -f *.o *.ko *.mod.* Module.symvers .*.cmd kcompat.h.new \
> > +   modules.order .tmp_versions/openvswitch.mod
> > for d in $(build_links); do if test -h $$d; then rm $$d; fi; done
> >  distclean: clean
> > rm -f kcompat.h
> > --
> > 1.7.10.4
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCHv3 1/2] ovs-dev.py: Add option to specify which tests to run.

2014-06-03 Thread Joe Stringer
Thanks, pushed to master.


On 4 June 2014 12:44, Ethan Jackson  wrote:

> Acked-by: Ethan Jackson 
>
>
> On Tue, Jun 3, 2014 at 5:41 PM, Joe Stringer 
> wrote:
> > Signed-off-by: Joe Stringer 
> > ---
> > v3: Don't require users to specify "-k" for searches.
> > v2: No change.
> > v1: First post.
> > ---
> >  utilities/ovs-dev.py |   14 ++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py
> > index 814a9d0..2c3af19 100755
> > --- a/utilities/ovs-dev.py
> > +++ b/utilities/ovs-dev.py
> > @@ -144,6 +144,14 @@ commands.append(make)
> >
> >
> >  def check():
> > +flags = ""
> > +if options.tests:
> > +for arg in str.split(options.tests):
> > +if arg[0].isdigit():
> > +flags += "%s " % arg
> > +else:
> > +flags += "-k %s " % arg
> > +ENV["TESTSUITEFLAGS"] = flags
> >  make("check")
> >  commands.append(check)
> >
> > @@ -355,6 +363,12 @@ def main():
> >   const=i, help="compile with -O%d" % i)
> >  parser.add_option_group(group)
> >
> > +group = optparse.OptionGroup(parser, "check")
> > +group.add_option("--tests", dest="tests", metavar="FILTER",
> > + help="""run specific tests and/or a test category
> > +  eg, --tests=\"1-10 megaflow\)
> > +parser.add_option_group(group)
> > +
> >  group = optparse.OptionGroup(parser, "run")
> >  group.add_option("-g", "--gdb", dest="gdb", action="store_true",
> >   help="run ovs-vswitchd under gdb")
> > --
> > 1.7.10.4
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] timeval: Add monotonic time functionality for NetBSD and FreeBSD.

2014-06-03 Thread YAMAMOTO Takashi
> This patch also checks the system platform as clock_gettime
> could exist on different platforms but with different values of
> CLOCK_MONOTONIC and different definitions of 'struct timespec'.
> In this case, the system call would be expected to catch the
> error, which is dangerous.
> 
> This patch ensures Linux, NetBSD and FreeBSD platforms use
> clock_gettime with their corresponding correct values and
> definitions. All other platforms use time.time().
> 
> Signed-off-by: Ryan Wilson 

Acked-by: YAMAMOTO Takashi 

> 
> ---
> v2: Added Yamamoto's diff to fix NetBSD case
> ---
>  python/ovs/timeval.py |   34 +++---
>  1 file changed, 27 insertions(+), 7 deletions(-)
> 
> diff --git a/python/ovs/timeval.py b/python/ovs/timeval.py
> index 4463e87..a16bf60 100644
> --- a/python/ovs/timeval.py
> +++ b/python/ovs/timeval.py
> @@ -12,22 +12,41 @@
>  # See the License for the specific language governing permissions and
>  # limitations under the License.
>  
> +import sys
>  import time
>  
> -LIBRT = 'librt.so.1'
> -CLOCK_MONOTONIC = 1
> -
>  try:
>  import ctypes
>  
> +LIBRT = 'librt.so.1'
> +clock_gettime_name = 'clock_gettime'
> +
> +if sys.platform.startswith("linux"):
> +CLOCK_MONOTONIC = 1
> +time_t = ctypes.c_long
> +elif sys.platform.startswith("netbsd"):
> +# NetBSD uses function renaming for ABI versioning.  While the proper
> +# way to get the appropriate version is of course "#include 
> ",
> +# it is difficult with ctypes.  The following is appropriate for
> +# recent versions of NetBSD, including NetBSD-6.
> +LIBRT = 'libc.so.12'
> +clock_gettime_name = '__clock_gettime50'
> +CLOCK_MONOTONIC = 3
> +time_t = ctypes.c_int64
> +elif sys.platform.startswith("freebsd"):
> +CLOCK_MONOTONIC = 4
> +time_t = ctypes.c_int64
> +else:
> +raise Exception
> +
>  class timespec(ctypes.Structure):
>  _fields_ = [
> -('tv_sec', ctypes.c_long),
> +('tv_sec', time_t),
>  ('tv_nsec', ctypes.c_long),
>  ]
>  
>  librt = ctypes.CDLL(LIBRT)
> -clock_gettime = librt.clock_gettime
> +clock_gettime = getattr(librt, clock_gettime_name)
>  clock_gettime.argtypes = [ctypes.c_int, ctypes.POINTER(timespec)]
>  except:
>  # Librt shared library could not be loaded
> @@ -48,8 +67,9 @@ if not hasattr(time, 'monotonic'):
>  time.monotonic = monotonic
>  
>  def msec():
> -"""Returns the current time, as the amount of time since the epoch, in
> -milliseconds, as a float."""
> +""" Returns the system's monotonic time if possible, otherwise returns 
> the
> +current time as the amount of time since the epoch, in milliseconds, as a
> +float."""
>  return time.monotonic() * 1000.0
>  
>  
> -- 
> 1.7.9.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] timeval: Add monotonic time functionality for NetBSD and FreeBSD.

2014-06-03 Thread YAMAMOTO Takashi
> Thanks for the fix! Sorry, I don't know too much about the intricacies of
> NetBSD. My first version seemed to work on my NetBSD instance though, so I
> wrongly assumed it would work for most versions of NetBSD.

it picked the old (pre-64bit-time_t) compat version of the function.
yes, it should work.  but better to use the latest one which all newly
compiled C programs would use.

YAMAMOTO Takashi

> 
> I tested this on FreeBSD and it works, but I don't know the intricacies of
> FreeBSD either.
> 
> I applied your incremental in my next patch.
> 
> http://patchwork.openvswitch.org/patch/4367/
> 
> 
> Again thanks!
> 
> Ryan
> 
> On 6/2/14 10:38 PM, "YAMAMOTO Takashi"  wrote:
> 
>>> This patch also checks the system platform as clock_gettime
>>> could exist on different platforms but with different values of
>>> CLOCK_MONOTONIC and different definitions of 'struct timespec'.
>>> In this case, the system call would be expected to catch the
>>> error, which is dangerous.
>>> 
>>> This patch ensures Linux, NetBSD and FreeBSD platforms use
>>> clock_gettime with their corresponding correct values and
>>> definitions. All other platforms use time.time().
>>
>>thanks.
>>
>>the following is incremental diff to fix NetBSD case.
>>please feel free to fold it into yours.
>>
>>i haven't checked FreeBSD part.
>>
>>YAMAMOTO Takashi
>>
>>From 71891a0fc4ba892e26fec8e704167ab8ccde408a Mon Sep 17 00:00:00 2001
>>From: YAMAMOTO Takashi 
>>Date: Tue, 3 Jun 2014 14:17:23 +0900
>>Subject: [PATCH] timeval: Fix NetBSD case
>>
>>Signed-off-by: YAMAMOTO Takashi 
>>---
>> python/ovs/timeval.py | 9 -
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>>diff --git a/python/ovs/timeval.py b/python/ovs/timeval.py
>>index 6f3cbb3..a34ef2b 100644
>>--- a/python/ovs/timeval.py
>>+++ b/python/ovs/timeval.py
>>@@ -16,6 +16,7 @@ import sys
>> import time
>> 
>> LIBRT = 'librt.so.1'
>>+clock_gettime_name = 'clock_gettime'
>> 
>> try:
>> import ctypes
>>@@ -24,6 +25,12 @@ try:
>> CLOCK_MONOTONIC = 1
>> time_t = ctypes.c_long
>> elif sys.platform.startswith("netbsd"):
>>+# NetBSD uses function renaming for ABI versioning.  While the
>>proper
>>+# way to get the appropriate version is of course "#include
>>",
>>+# it is difficult with ctypes.  The following is appropriate for
>>+# recent versions of NetBSD, including NetBSD-6.
>>+LIBRT = 'libc.so.12'
>>+clock_gettime_name = '__clock_gettime50'
>> CLOCK_MONOTONIC = 3
>> time_t = ctypes.c_int64
>> elif sys.platform.startswith("freebsd"):
>>@@ -39,7 +46,7 @@ try:
>> ]
>> 
>> librt = ctypes.CDLL(LIBRT)
>>-clock_gettime = librt.clock_gettime
>>+clock_gettime = getattr(librt, clock_gettime_name)
>> clock_gettime.argtypes = [ctypes.c_int, ctypes.POINTER(timespec)]
>> except:
>> # Librt shared library could not be loaded
>>-- 
>>1.9.0
>>___
>>dev mailing list
>>dev@openvswitch.org
>>https://urldefense.proofpoint.com/v1/url?u=http://openvswitch.org/mailman/
>>listinfo/dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=TfBS78Vw3dzttvXidhbffg%
>>3D%3D%0A&m=EIzWMNYZlk7pLtwllkmlJqm%2FDcGWDbOB7Icp95%2Fr1tA%3D%0A&s=0e9550c
>>56b07bdd67032cee1451ade1acb6b88ebcdc237cc36e1cb1f61e8855b
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] tests: Rename "ofproto-if"->"ofproto-dpif"

2014-06-03 Thread Joe Stringer
It's unclear why these tests are named "ofproto-if..." unlike all of the
other ofproto-dpif tests. Rename them to be more consistent.

Signed-off-by: Joe Stringer 
---
 tests/ofproto-dpif.at |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 2c03ea9..c7d5fa1 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -4630,7 +4630,7 @@ OVS_VSWITCHD_STOP(["/resubmits yielded over 64 kB of 
stack/d"])
 AT_CLEANUP
 
 
-AT_SETUP([ofproto-if packet-out controller])
+AT_SETUP([ofproto-dpif packet-out controller])
 OVS_VSWITCHD_START
 ADD_OF_PORTS([br0], 1, 2)
 
@@ -4673,7 +4673,7 @@ OVS_VSWITCHD_STOP
 AT_CLEANUP
 
 
-AT_SETUP([ofproto-if packet-out controller (patch port)])
+AT_SETUP([ofproto-dpif packet-out controller (patch port)])
 OVS_VSWITCHD_START(
   [-- \
add-port br0 p1 -- \
@@ -4725,7 +4725,7 @@ OVS_VSWITCHD_STOP
 AT_CLEANUP
 
 
-AT_SETUP([ofproto-if packet-out goto_table])
+AT_SETUP([ofproto-dpif packet-out goto_table])
 OVS_VSWITCHD_START
 ADD_OF_PORTS([br0], 1, 2)
 
@@ -4777,7 +4777,7 @@ OVS_VSWITCHD_STOP
 AT_CLEANUP
 
 
-AT_SETUP([ofproto-if packet-out table-miss (continue)])
+AT_SETUP([ofproto-dpif packet-out table-miss (continue)])
 OVS_VSWITCHD_START
 ADD_OF_PORTS([br0], 1, 2)
 
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Is there any way to simulate the tunnel traffic in test cases to test IPFIX feature?

2014-06-03 Thread Wenyu Zhang
Thanks for your help.

And how about "ovs-appctl ofproto/trace ovs-dummy" command? The manual about 
ovs-vswitchd says that it can generate a packet with "-generate" option.
Can it fit my requirement?

Bests,
Wenyu
- Original Message -
From: "Jesse Gross" 
To: "Wenyu Zhang" 
Cc: dev@openvswitch.org
Sent: Wednesday, June 4, 2014 8:11:30 AM
Subject: Re: [ovs-dev] Is there any way to simulate the tunnel traffic in test 
cases to test IPFIX feature?

On Tue, Jun 3, 2014 at 2:51 AM, Wenyu Zhang  wrote:
>
> I am trying to simulate traffic in test cases to test IPFIX feature.
>
> For normal traffic, I can use "ovs-appctl netdev-dummy/receive " to simulate
> it, just refer to the netflow and sflow cases in tests/ofproto-dpif.at
> And I can capture the IPFIX packets on lo interface as expected.
>
> But for tunnel traffic, I haven't find a workable way to generate this kind
> of traffic.
> For example,  I try to use "ovs-appctl ofproto/trace ovs-dummy" command with
> "-generate" option.
> And we can see that the "sample" actions has been inserted as expected, but
> these actions are not executed, and no IPFIX packets can be captured on lo
> interface.
>
> Anyone knows about the usage of this command? Or other way to generate the
> tunnel traffic to trigger IPFIX sample action? Thanks a lot.

I don't believe that this is implemented. netdev-dummy/receive
constructs a packet from the flow  and the discards everything else.
Since there is no support for additional metadata beyond the input
port or tunneling implemented in the dummy datapath, it basically ends
up looking like a non-tunneled packet arriving on a tunnel port
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] tests: Rename "ofproto-if"->"ofproto-dpif"

2014-06-03 Thread Simon Horman
On Wed, Jun 04, 2014 at 02:54:27PM +1200, Joe Stringer wrote:
> It's unclear why these tests are named "ofproto-if..." unlike all of the
> other ofproto-dpif tests. Rename them to be more consistent.
> 
> Signed-off-by: Joe Stringer 

Acked-by: Simon Horman 

> ---
>  tests/ofproto-dpif.at |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index 2c03ea9..c7d5fa1 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -4630,7 +4630,7 @@ OVS_VSWITCHD_STOP(["/resubmits yielded over 64 kB of 
> stack/d"])
>  AT_CLEANUP
>  
>  
> -AT_SETUP([ofproto-if packet-out controller])
> +AT_SETUP([ofproto-dpif packet-out controller])
>  OVS_VSWITCHD_START
>  ADD_OF_PORTS([br0], 1, 2)
>  
> @@ -4673,7 +4673,7 @@ OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
>  
> -AT_SETUP([ofproto-if packet-out controller (patch port)])
> +AT_SETUP([ofproto-dpif packet-out controller (patch port)])
>  OVS_VSWITCHD_START(
>[-- \
> add-port br0 p1 -- \
> @@ -4725,7 +4725,7 @@ OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
>  
> -AT_SETUP([ofproto-if packet-out goto_table])
> +AT_SETUP([ofproto-dpif packet-out goto_table])
>  OVS_VSWITCHD_START
>  ADD_OF_PORTS([br0], 1, 2)
>  
> @@ -4777,7 +4777,7 @@ OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
>  
> -AT_SETUP([ofproto-if packet-out table-miss (continue)])
> +AT_SETUP([ofproto-dpif packet-out table-miss (continue)])
>  OVS_VSWITCHD_START
>  ADD_OF_PORTS([br0], 1, 2)
>  
> -- 
> 1.7.10.4
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Can VXLAN work with VIP (Keepalived)?

2014-06-03 Thread Changbin Liu
Thanks, Jesse.

I figured out the issue. For those who are interested in using VIP as VXLAN
endpoint IP: after the backup node takes over VIP, its "source address"
also needs to be updated.  By default (I tested with Ubuntu 14.04 OS), the
bakcup node's source address keeps unchanged. That explains why my VXLAN
tunnels did not work previously.


Thanks

Changbin


On Tue, Jun 3, 2014 at 3:39 PM, Jesse Gross  wrote:

> On Mon, Jun 2, 2014 at 6:37 PM, Changbin Liu 
> wrote:
> > Hi Folks,
> >
> > Using Open vSwitch I am setting up VXLAN tunnels between hosts. I intend
> to
> > make VXLAN tunnels "highly available" via Keepalived. Specifically, there
> > are one master node and one backup node for each host, and they share a
> > Virtual IP (VIP) via Keepalived, and the VXLAN tunnel will use the VIP as
> > endpoint IP. This way when the master node fails, the backup node will
> > automatically have the VIP shifted to it. Therefore, human efforts are
> not
> > involved in failure recovery.
> >
> > I thought VXLAN can work with VIP this way, since VXLAN simply uses UDP
> > packets, and as long as the endpoint VIP does not change, it should work.
> > But it turns out that it didn't at all. I wonder whether this is intended
> > behavior or I missed something.
>
> It should be possible to use VXLAN with a VIP. The IP stack is
> configured outside of OVS so you should make sure that it is working
> in Linux first.
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Can VXLAN work with VIP (Keepalived)?

2014-06-03 Thread Vasiliy Tolstov
2014-06-04 8:12 GMT+04:00 Changbin Liu :
> I figured out the issue. For those who are interested in using VIP as VXLAN
> endpoint IP: after the backup node takes over VIP, its "source address" also
> needs to be updated.  By default (I tested with Ubuntu 14.04 OS), the bakcup
> node's source address keeps unchanged. That explains why my VXLAN tunnels
> did not work previously.


And how you solve source address change?

-- 
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Can VXLAN work with VIP (Keepalived)?

2014-06-03 Thread Changbin Liu
It appears that we can change source address via updating the routing table.

http://www.symantec.com/business/support/index?page=content&id=HOWTO58872

But somehow it doesn't work on Ubuntu 14.04. Maybe I missed something.

So right now, manually I first delete the default IP on the interface, and
then update the routing table. I don't have an automated solution yet.

Any ideas?


Thanks

Changbin


On Wed, Jun 4, 2014 at 12:15 AM, Vasiliy Tolstov 
wrote:

> 2014-06-04 8:12 GMT+04:00 Changbin Liu :
> > I figured out the issue. For those who are interested in using VIP as
> VXLAN
> > endpoint IP: after the backup node takes over VIP, its "source address"
> also
> > needs to be updated.  By default (I tested with Ubuntu 14.04 OS), the
> bakcup
> > node's source address keeps unchanged. That explains why my VXLAN tunnels
> > did not work previously.
>
>
> And how you solve source address change?
>
> --
> Vasiliy Tolstov,
> e-mail: v.tols...@selfip.ru
> jabber: v...@selfip.ru
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovs-sandbox: Use the correct make

2014-06-03 Thread YAMAMOTO Takashi
On some platforms including NetBSD,
GNU make is usually installed as "gmake".

Signed-off-by: YAMAMOTO Takashi 
---
 tutorial/automake.mk | 2 +-
 tutorial/ovs-sandbox | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tutorial/automake.mk b/tutorial/automake.mk
index d88d5b4..82ad66d 100644
--- a/tutorial/automake.mk
+++ b/tutorial/automake.mk
@@ -9,4 +9,4 @@ EXTRA_DIST += \
tutorial/t-stage4
 
 sandbox: all
-   cd $(srcdir)/tutorial && ./ovs-sandbox -b $(abs_builddir)
+   cd $(srcdir)/tutorial && MAKE=$(MAKE) ./ovs-sandbox -b $(abs_builddir)
diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
index db980d2..21066d1 100755
--- a/tutorial/ovs-sandbox
+++ b/tutorial/ovs-sandbox
@@ -190,7 +190,7 @@ OVS_SYSCONFDIR=$sandbox; export OVS_SYSCONFDIR
 
 if $built; then
 # Easy access to OVS manpages.
-(cd "$builddir" && make install-man mandir="$sandbox"/man)
+(cd "$builddir" && ${MAKE} install-man mandir="$sandbox"/man)
 MANPATH=$sandbox/man:; export MANPATH
 fi
 
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [slow path HASH and RECIRC action v2] dpif: Fix slow action handling for DP_HASH and RECIRC

2014-06-03 Thread YAMAMOTO Takashi
> In case DP_HASH and RECIRC actions need to be executed in slow path,
> current implementation simply don't handle them -- vswitchd simply
> crashes. This patch fixes them by supply an implementation for them.
> 
> RECIRC will be handled by the datapath, same as the output action.
> 
> DP_HASH, on the other hand, is handled in the user space. Although the
> resulting hash values may not match those computed by the datapath, it
> is less expensive; current use case (bonding) does not require a strict
> match to work properly.
> 
> Reported-by: YAMAMOTO Takashi 
> Signed-off-by: Andy Zhou 

Acked-by: YAMAMOTO Takashi 

> 
> ---
> v1->v2:   Addresses Ben's review feedback
>   fix the comment on why execute dp_hash in user space.
> Add assert for unknown hash algorithm specification.
> ---
>  lib/dpif.c|  4 ++--
>  lib/odp-execute.c | 23 ++-
>  2 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 84dba28..ac73be1 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1069,6 +1069,7 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf 
> *packet,
>  switch ((enum ovs_action_attr)type) {
>  case OVS_ACTION_ATTR_OUTPUT:
>  case OVS_ACTION_ATTR_USERSPACE:
> +case OVS_ACTION_ATTR_RECIRC:
>  execute.actions = action;
>  execute.actions_len = NLA_ALIGN(action->nla_len);
>  execute.packet = packet;
> @@ -1077,6 +1078,7 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf 
> *packet,
>  aux->error = aux->dpif->dpif_class->execute(aux->dpif, &execute);
>  break;
>  
> +case OVS_ACTION_ATTR_HASH:
>  case OVS_ACTION_ATTR_PUSH_VLAN:
>  case OVS_ACTION_ATTR_POP_VLAN:
>  case OVS_ACTION_ATTR_PUSH_MPLS:
> @@ -1084,8 +1086,6 @@ dpif_execute_helper_cb(void *aux_, struct ofpbuf 
> *packet,
>  case OVS_ACTION_ATTR_SET:
>  case OVS_ACTION_ATTR_SAMPLE:
>  case OVS_ACTION_ATTR_UNSPEC:
> -case OVS_ACTION_ATTR_RECIRC:
> -case OVS_ACTION_ATTR_HASH:
>  case __OVS_ACTION_ATTR_MAX:
>  OVS_NOT_REACHED();
>  }
> diff --git a/lib/odp-execute.c b/lib/odp-execute.c
> index 12ad679..cc18536 100644
> --- a/lib/odp-execute.c
> +++ b/lib/odp-execute.c
> @@ -26,6 +26,7 @@
>  #include "ofpbuf.h"
>  #include "odp-util.h"
>  #include "packets.h"
> +#include "flow.h"
>  #include "unaligned.h"
>  #include "util.h"
>  
> @@ -208,7 +209,6 @@ odp_execute_actions__(void *dp, struct ofpbuf *packet, 
> bool steal,
>  case OVS_ACTION_ATTR_OUTPUT:
>  case OVS_ACTION_ATTR_USERSPACE:
>  case OVS_ACTION_ATTR_RECIRC:
> -case OVS_ACTION_ATTR_HASH:
>  if (dp_execute_action) {
>  /* Allow 'dp_execute_action' to steal the packet data if we 
> do
>   * not need it any more. */
> @@ -219,6 +219,27 @@ odp_execute_actions__(void *dp, struct ofpbuf *packet, 
> bool steal,
>  }
>  break;
>  
> +case OVS_ACTION_ATTR_HASH: {
> +const struct ovs_action_hash *hash_act = nl_attr_get(a);
> +
> +/* Calculate a hash value directly.  This might not match the
> + * value computed by the datapath, but it is much less expensive,
> + * and the current use case (bonding) does not require a strict
> + * match to work properly. */
> +if (hash_act->hash_alg == OVS_HASH_ALG_L4) {
> +struct flow flow;
> +uint32_t hash;
> +
> +flow_extract(packet, md, &flow);
> +hash = flow_hash_5tuple(&flow, hash_act->hash_basis);
> +md->dp_hash = hash ? hash : 1;
> +} else {
> +/* Assert on unknown hash algorithm.  */
> +OVS_NOT_REACHED();
> +}
> +break;
> +}
> +
>  case OVS_ACTION_ATTR_PUSH_VLAN: {
>  const struct ovs_action_push_vlan *vlan = nl_attr_get(a);
>  eth_push_vlan(packet, htons(ETH_TYPE_VLAN), vlan->vlan_tci);
> -- 
> 1.9.1
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread YAMAMOTO Takashi
Acked-by: YAMAMOTO Takashi 

> Me too.
> 
> Acked-by: Ben Pfaff 
> 
> 
> On Tue, Jun 03, 2014 at 02:22:48PM -0700, Jarno Rajahalme wrote:
>> This better,
>> 
>> Acked-by: Jarno Rajahalme 
>> 
>> On Jun 3, 2014, at 2:20 PM, Gurucharan Shetty  wrote:
>> 
>> > May be we should add a new TODO section in BUILD.Windows like:
>> > 
>> > diff --git a/BUILD.Windows b/BUILD.Windows
>> > index ca0d252..41e6eab 100644
>> > --- a/BUILD.Windows
>> > +++ b/BUILD.Windows
>> > @@ -85,3 +85,11 @@ For example,
>> >   --with-openssl="C:/OpenSSL-Win32"
>> > 
>> > * Run make for the ported executables.
>> > +
>> > +TODO:
>> > +
>> > +
>> > +* OVS currently has no native support for atomics on Windows.  Pthreads
>> > +are used as a fallback, but some features, such as OVS-RCU are really
>> > +slow without native atomics support. Atomics support for Windows has to
>> > +be brought in.
>> > 
>> > On Tue, Jun 3, 2014 at 2:16 PM, Jarno Rajahalme  
>> > wrote:
>> >> 
>> >> On Jun 3, 2014, at 2:03 PM, Ben Pfaff  wrote:
>> >> 
>> >>> On Tue, Jun 03, 2014 at 01:09:11PM -0700, Jarno Rajahalme wrote:
>>  OVS is slow when compiled with pthreads atomics.  Add a generic note
>>  in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
>>  provides additional detail.
>>  
>>  Signed-off-by: Jarno Rajahalme 
>> >>> 
>> >>> I think this is only likely to affect Windows (everyone else uses GCC
>> >>> or Clang) so should we put it in BUILD.Windows?
>> >> 
>> >> 
>> >> So you think this should not be mentioned in INSTALL? If so, I think the 
>> >> text there should be modified to something like ?* GCC 4.x or clang C 
>> >> compiler? instead of the current ?A C compiler, such as??.
>> >> 
>> >> How about adding the following to BUILD.Windows:
>> >> 
>> >> diff --git a/BUILD.Windows b/BUILD.Windows
>> >> index ca0d252..05956ba 100644
>> >> --- a/BUILD.Windows
>> >> +++ b/BUILD.Windows
>> >> @@ -39,6 +39,10 @@ project from
>> >> ftp://sourceware.org/pub/pthreads-win32/prebuilt-dll-2-9-1-release to a
>> >> directory (e.g.: C:/pthread).
>> >> 
>> >> +OVS currently has no native support for atomics on Windows.  Pthreads
>> >> +are used as a fallback, but some features, such as OVS-RCU are really
>> >> +slow without native atomics support.
>> >> +
>> >> * Get the Open vSwitch sources from either cloning the repo using git
>> >> or from a distribution tar ball.
>> >> 
>> >>  Jarno
>> >> 
>> >> ___
>> >> dev mailing list
>> >> dev@openvswitch.org
>> >> http://openvswitch.org/mailman/listinfo/dev
>> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] INSTALL: Note about compiler atomics support.

2014-06-03 Thread YAMAMOTO Takashi
> OVS is slow when compiled with pthreads atomics.  Add a generic note
> in INSTALL, with a reference to lib/ovs-atomic.h, where a new comment
> provides additional detail.
> 
> Signed-off-by: Jarno Rajahalme 

Acked-by: YAMAMOTO Takashi 

> ---
>  INSTALL  |4 
>  lib/ovs-atomic.h |3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/INSTALL b/INSTALL
> index 5c869b2..7b08172 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -28,6 +28,10 @@ you will need the following software:
>analysis and thread-safety checks.  For Ubuntu, there are
>nightly built packages available on clang's website.
>  
> +* While OVS may be compatible with other compilers, optimal
> +  support for atomic operations may be missing, making OVS
> +  very slow (see lib/ovs-atomic.h).
> +
>  - libssl, from OpenSSL, is optional but recommended if you plan to
>connect the Open vSwitch to an OpenFlow controller.  libssl is
>required to establish confidentiality and authenticity in the
> diff --git a/lib/ovs-atomic.h b/lib/ovs-atomic.h
> index c21b7ee..2452846 100644
> --- a/lib/ovs-atomic.h
> +++ b/lib/ovs-atomic.h
> @@ -257,6 +257,9 @@
>  #elif HAVE_GCC4_ATOMICS
>  #include "ovs-atomic-gcc4+.h"
>  #else
> +/* ovs-atomic-pthreads implementation is provided for portability.
> + * It might be too slow for real use because Open vSwitch is
> + * optimized for platforms where real atomic ops are available. */
>  #include "ovs-atomic-pthreads.h"
>  #endif
>  #undef IN_OVS_ATOMIC_H
> -- 
> 1.7.10.4
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev