Re: [ovs-dev] [PATCH 01/14] socket-util: Move sock_errno() to socket-util.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 1:31 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:12PM -0800, Gurucharan Shetty wrote:
>> And add more users.
>>
>> Signed-off-by: Gurucharan Shetty 
>
> Bitwise operators don't make sense to me here:
>> +if (error == EINPROGRESS
>> +#ifdef _WIN32
>> +| WSAEALREADY | WSAEWOULDBLOCK
>> +#endif
Yikes. I will fix it up and push.
>
> Otherwise,
> Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 05/14] socket-util: getaddrinfo return values for Windows.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 1:41 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:16PM -0800, Gurucharan Shetty wrote:
>> Couple of return values need changes.
>> * EAI_NODATA is the same as EAI_NONAME. So we prevent duplicate cases.
>> * Windows does not have a EAI_SYSTEM.
>>
>> Signed-off-by: Gurucharan Shetty 
>
> To make this a little less Win32 specific, can we use "#if
> defined(EAI_NODATA) && EAI_NODATA != EAI_NONAME" instead of adding the
> #ifndef _WIN32?
Yes. I will make the change and push.
>
> Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] openvswitch-2.0.0 : ovs-vswitchd segfault in netdev_get_etheraddr

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 11:06:07PM -0800, Sridhar Samudrala wrote:
> I am seeing a segfault with ovs-vswitchd when a VM attached to a OVS
> bridge is shutdown.
> I am running openvswitch-2.0.0 on linux kernel:3.10.30
> 
> Is this a known issue with any fix in the git repository?

Can you reproduce the problem?  Can you try this fix?

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 328b215..35a9556 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -1550,6 +1550,9 @@ port_destruct(struct ofport *port_)
 bundle_remove(port_);
 set_cfm(port_, NULL);
 set_bfd(port_, NULL);
+if (ofport->stp_port) {
+stp_port_disable(ofport->stp_port);
+}
 if (ofproto->sflow) {
 dpif_sflow_del_port(ofproto->sflow, port->odp_port);
 }

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ofproto: Send port status message for port-mods, right away.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 10:07:14AM +0800, Kmindg G wrote:
> On Fri, Feb 21, 2014 at 5:19 AM, Ben Pfaff  wrote:
> > On Thu, Feb 20, 2014 at 12:45:49PM +0800, Kmindg G wrote:
> >> On Thu, Feb 20, 2014 at 3:20 AM, Ben Pfaff  wrote:
> >> > Until now, when it processes OFPT_PORT_MOD message, Open vSwitch has 
> >> > waited
> >> > for ports to notify it that their status has changed before it sends a
> >> > port status update to controllers.
> >> >
> >> > Also, Open vSwitch never sent port config updates at all for port
> >> > modifications other than OFPPC_PORT_DOWN.  I guess that this is a relic
> >> > from the era when there was only one controller, since presumably the
> >> > controller already knew that it changed the port configuration.  But in 
> >> > the
> >> > multi-controller world, it makes sense to send such port status updates,
> >> > and I couldn't quickly find anything in OF1.3 or OF1.4 that said they
> >> > shouldn't be sent.
> >> >
> >> > Signed-off-by: Ben Pfaff 
> >> > Reported-by: Kmindg G 
> >> > ---
> >> >  AUTHORS   |1 +
> >> >  ofproto/ofproto.c |   25 +
> >> >  2 files changed, 14 insertions(+), 12 deletions(-)
> >> >
> >> > diff --git a/AUTHORS b/AUTHORS
> >> > index c557303..2fda8d7 100644
> >> > --- a/AUTHORS
> >> > +++ b/AUTHORS
> >> > @@ -197,6 +197,7 @@ John Hurley john.hur...@netronome.com
> >> >  Kevin Mancuso   kevin.manc...@rackspace.com
> >> >  Kiran Shanbhog  ki...@vmware.com
> >> >  Kirill Kabardin
> >> > +Kmindg Gkmi...@gmail.com
> >> >  Koichi Yagishitayagishita.koi...@jrc.co.jp
> >> >  Konstantin Khorenko khore...@openvz.org
> >> >  Kris zhang  zhang.k...@gmail.com
> >> > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> >> > index 62c97a2..48e10ca 100644
> >> > --- a/ofproto/ofproto.c
> >> > +++ b/ofproto/ofproto.c
> >> > @@ -2986,22 +2986,23 @@ update_port_config(struct ofport *port,
> >> > enum ofputil_port_config config,
> >> > enum ofputil_port_config mask)
> >> >  {
> >> > -enum ofputil_port_config old_config = port->pp.config;
> >> > -enum ofputil_port_config toggle;
> >> > -
> >> > -toggle = (config ^ port->pp.config) & mask;
> >> > -if (toggle & OFPUTIL_PC_PORT_DOWN) {
> >> > -if (config & OFPUTIL_PC_PORT_DOWN) {
> >> > -netdev_turn_flags_off(port->netdev, NETDEV_UP, NULL);
> >> > -} else {
> >> > -netdev_turn_flags_on(port->netdev, NETDEV_UP, NULL);
> >> > -}
> >> > +enum ofputil_port_config toggle = (config ^ port->pp.config) & mask;
> >> > +
> >> > +if (toggle & OFPUTIL_PC_PORT_DOWN
> >> > +&& (config & OFPUTIL_PC_PORT_DOWN
> >> > +? netdev_turn_flags_off(port->netdev, NETDEV_UP, NULL)
> >> > +: netdev_turn_flags_on(port->netdev, NETDEV_UP, NULL))) {
> >> > +/* We tried to bring the port up or down, but it failed, so 
> >> > don't
> >> > + * update the "down" bit. */
> >> >  toggle &= ~OFPUTIL_PC_PORT_DOWN;
> >> >  }
> >> >
> >> > -port->pp.config ^= toggle;
> >> > -if (port->pp.config != old_config) {
> >> > +if (toggle) {
> >> > +enum ofputil_port_config old_config = port->pp.config;
> >> > +port->pp.config ^= toggle;
> >> >  port->ofproto->ofproto_class->port_reconfigured(port, 
> >> > old_config);
> >> > +connmgr_send_port_status(port->ofproto->connmgr, &port->pp,
> >> > + OFPPR_MODIFY);
> >> >  }
> >> >  }
> >> >
> >> > --
> >> > 1.7.10.4
> >> >
> >>
> >> looks good to me.
> >
> > Thanks.  What do you think of this version?  I think that it retains
> > better compatibility with OpenFlow 1.0 in particular.
> >
> > --8<--cut here-->8--
> >
> > From: Ben Pfaff 
> > Date: Thu, 20 Feb 2014 13:18:51 -0800
> > Subject: [PATCH] ofproto: Send port status message for port-mods, right 
> > away.
> >
> > Until now, when it processes OFPT_PORT_MOD message, Open vSwitch has waited
> > for ports to notify it that their status has changed before it sends a
> > port status update to controllers.
> >
> > Also, Open vSwitch never sent port config updates at all for port
> > modifications other than OFPPC_PORT_DOWN.  I guess that this is a relic
> > from the era when there was only one controller, since presumably the
> > controller already knew that it changed the port configuration.  But in the
> > multi-controller world, it makes sense to send such port status updates,
> > and I couldn't quickly find anything in OF1.3 or OF1.4 that said they
> > shouldn't be sent.
> >
> > Signed-off-by: Ben Pfaff 
> > Reported-by: Kmindg G 
> > ---
> >  ofproto/connmgr.c | 30 --
> >  ofproto/connmgr.h |  4 ++--
> >  ofproto/ofproto.c | 38 --
> >  3 files changed, 50 insertions(+), 22 deletions(-)
> >
> > diff --git a/ofproto/c

Re: [ovs-dev] [PATCH] FAQ: Describe how to add new OpenFlow messages.

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 05:10:55PM -0800, Jarno Rajahalme wrote:
> 
> > On Jan 21, 2014, at 9:34 AM, Ben Pfaff  wrote:
> > 
> > Signed-off-by: Ben Pfaff 
> > ---
> > FAQ |   17 +
> > 1 file changed, 17 insertions(+)
> > 
> > diff --git a/FAQ b/FAQ
> > index 75d9e6b..dac8fe1 100644
> > --- a/FAQ
> > +++ b/FAQ
> > @@ -1454,6 +1454,23 @@ A: These flows drop the ARP packets that IP hosts 
> > use to establish IP
> >   
> > priority=5,in_port=1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,actions=2
> >   
> > priority=5,in_port=2,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,actions=1
> > 
> > +
> > +Development
> > +---
> > +
> > +Q: How do I implement a new OpenFlow message?
> > +
> > +A: Add your new message to "enum ofpraw" and "enum ofptype" in
> > +   lib/ofp-msgs.h, following the existing pattern, then recompile and
> > +   fix all of the new warnings.
> 
> I would add here something like this:
> 
> ", implementing new functionality for the new message as needed."
> 
> Acked-by: Jarno Rajahalme 

Thanks, I made that change and I'll push this in a minute.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] openvswitch-2.0.0 : ovs-vswitchd segfault in netdev_get_etheraddr

2014-02-21 Thread Sridhar Samudrala

On 2/21/2014 8:38 AM, Ben Pfaff wrote:

On Thu, Feb 20, 2014 at 11:06:07PM -0800, Sridhar Samudrala wrote:

I am seeing a segfault with ovs-vswitchd when a VM attached to a OVS
bridge is shutdown.
I am running openvswitch-2.0.0 on linux kernel:3.10.30

Is this a known issue with any fix in the git repository?

Can you reproduce the problem?
Yes.  It happens everytime the VM is shutdown and STP is enabled on the 
bridge.

  Can you try this fix?

I had to change ofport->stp_port to port->stp_port to fix the compile error.
With that change, the following patch fixes the segfault.

Thanks
Sridhar


diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 328b215..35a9556 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -1550,6 +1550,9 @@ port_destruct(struct ofport *port_)
  bundle_remove(port_);
  set_cfm(port_, NULL);
  set_bfd(port_, NULL);
+if (ofport->stp_port) {
+stp_port_disable(ofport->stp_port);
+}
  if (ofproto->sflow) {
  dpif_sflow_del_port(ofproto->sflow, port->odp_port);
  }

Thanks,

Ben.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [rwlock 3/5] ofproto-dpif-xlate: Avoid recursively taking read side of ofgroup rwlock.

2014-02-21 Thread Ben Pfaff
On Wed, Feb 19, 2014 at 02:53:11PM -0800, Joe Stringer wrote:
> This looks good to me, although aren't we meant to report something at the
> OpenFlow layer?
> 
> Is this equivalent to the OpenFlow "chaining" description? So if we don't
> support chaining groups together, we should return an error message with
> reason OFPGMFC_CHAINING_UNSUPPORTED, rather than accepting the rules then
> misbehaving?

You're right.  Here's a version that does that, and adds a test.

--8<--cut here-->8--

From: Ben Pfaff 
Date: Fri, 21 Feb 2014 10:45:00 -0800
Subject: [PATCH] ofproto-dpif-xlate: Avoid recursively taking read side of
 ofgroup rwlock.

With glibc, rwlocks by default allow recursive read-locking even if a
thread is blocked waiting for the write-lock.  POSIX allows such attempts
to deadlock, and it appears that the libc used in NetBSD, at least, does
deadlock.  ofproto-dpif-xlate could take the ofgroup rwlock recursively if
an ofgroup's actions caused the ofgroup to be executed again.  This commit
avoids that issue by preventing recursive translation of groups (the same
group or another group).  This is not the most user friendly solution,
but OpenFlow allows this restriction, and we can always remove the
restriction later (probably requiring more complicated code) if it
proves to be a real problem to real users.

Found by inspection.

Signed-off-by: Ben Pfaff 
---
 ofproto/ofproto-dpif-xlate.c |   32 +++-
 ofproto/ofproto-dpif.c   |   14 ++
 tests/ofproto-dpif.at|   11 +++
 3 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index ccf0b75..89d92af 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -178,6 +178,7 @@ struct xlate_ctx {
 /* Resubmit statistics, via xlate_table_action(). */
 int recurse;/* Current resubmit nesting depth. */
 int resubmits;  /* Total number of resubmits. */
+bool in_group;  /* Currently translating ofgroup, if true. */
 
 uint32_t orig_skb_priority; /* Priority when packet arrived. */
 uint8_t table_id;   /* OpenFlow table ID where flow was found. */
@@ -1980,6 +1981,8 @@ xlate_select_group(struct xlate_ctx *ctx, struct 
group_dpif *group)
 static void
 xlate_group_action__(struct xlate_ctx *ctx, struct group_dpif *group)
 {
+ctx->in_group = true;
+
 switch (group_dpif_get_type(group)) {
 case OFPGT11_ALL:
 case OFPGT11_INDIRECT:
@@ -1995,12 +1998,38 @@ xlate_group_action__(struct xlate_ctx *ctx, struct 
group_dpif *group)
 OVS_NOT_REACHED();
 }
 group_dpif_release(group);
+
+ctx->in_group = false;
+}
+
+static bool
+xlate_group_resource_check(struct xlate_ctx *ctx)
+{
+if (!xlate_resubmit_resource_check(ctx)) {
+return false;
+} else if (ctx->in_group) {
+/* Prevent nested translation of OpenFlow groups.
+ *
+ * OpenFlow allows this restriction.  We enforce this restriction only
+ * because, with the current architecture, we would otherwise have to
+ * take a possibly recursive read lock on the ofgroup rwlock, which is
+ * unsafe given that POSIX allows taking a read lock to block if there
+ * is a thread blocked on taking the write lock.  Other solutions
+ * without this restriction are also possible, but seem unwarranted
+ * given the current limited use of groups. */
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+
+VLOG_ERR_RL(&rl, "cannot recursively translate OpenFlow group");
+return false;
+} else {
+return true;
+}
 }
 
 static bool
 xlate_group_action(struct xlate_ctx *ctx, uint32_t group_id)
 {
-if (xlate_resubmit_resource_check(ctx)) {
+if (xlate_group_resource_check(ctx)) {
 struct group_dpif *group;
 bool got_group;
 
@@ -2974,6 +3003,7 @@ xlate_actions__(struct xlate_in *xin, struct xlate_out 
*xout)
 
 ctx.recurse = 0;
 ctx.resubmits = 0;
+ctx.in_group = false;
 ctx.orig_skb_priority = flow->skb_priority;
 ctx.table_id = 0;
 ctx.exit = false;
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 328b215..7c5b941 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3291,6 +3291,20 @@ static enum ofperr
 group_construct(struct ofgroup *group_)
 {
 struct group_dpif *group = group_dpif_cast(group_);
+const struct ofputil_bucket *bucket;
+
+/* Prevent group chaining because our locking structure makes it hard to
+ * implement deadlock-free.  (See xlate_group_resource_check().) */
+LIST_FOR_EACH (bucket, list_node, &group->up.buckets) {
+const struct ofpact *a;
+
+OFPACT_FOR_EACH (a, bucket->ofpacts, bucket->ofpacts_len) {
+if (a->type == OFPACT_GROUP) {
+return OFPERR_OFPGMFC_CHAIN

Re: [ovs-dev] [rwlock 4/5] ovs-thread: Get rid of obsolete sparse wrappers.

2014-02-21 Thread Ben Pfaff
Thanks, applied.

On Wed, Feb 19, 2014 at 02:55:46PM -0800, Joe Stringer wrote:
> Acked-by: Joe Stringer 
> 
> 
> On 16 January 2014 10:07, Ben Pfaff  wrote:
> 
> > These were useful back when we were trying to use the sparse lock balance
> > annotations, but we removed those in commit 47b52c71232c0 (sparse: Remove
> > support for thread-safety annotations.) and so they serve no purpose any
> > longer.
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> >  lib/ovs-thread.h |   11 ---
> >  1 file changed, 11 deletions(-)
> >
> > diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
> > index 8cf2ecc..f27553e 100644
> > --- a/lib/ovs-thread.h
> > +++ b/lib/ovs-thread.h
> > @@ -115,17 +115,6 @@ void xpthread_cond_destroy(pthread_cond_t *);
> >  void xpthread_cond_signal(pthread_cond_t *);
> >  void xpthread_cond_broadcast(pthread_cond_t *);
> >
> > -#ifdef __CHECKER__
> > -/* Replace these functions by the macros already defined in the
> > 
> > - * annotations, because the macro definitions have correct semantics for
> > the
> > - * conditional acquisition that can't be captured in a function
> > annotation.
> > - * The difference in semantics from pthread_*() to xpthread_*() does not
> > matter
> > - * because sparse is not a compiler. */
> > -#define xpthread_mutex_trylock pthread_mutex_trylock
> > -#define xpthread_rwlock_tryrdlock pthread_rwlock_tryrdlock
> > -#define xpthread_rwlock_trywrlock pthread_rwlock_trywrlock
> > -#endif
> > -
> >  void xpthread_key_create(pthread_key_t *, void (*destructor)(void *));
> >  void xpthread_key_delete(pthread_key_t);
> >  void xpthread_setspecific(pthread_key_t, const void *);
> > --
> > 1.7.10.4
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> >
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [rwlock 5/5] ovs-thread: Use fair (but nonrecursive) rwlocks on glibc.

2014-02-21 Thread Ben Pfaff
On Wed, Feb 19, 2014 at 03:10:45PM -0800, Joe Stringer wrote:
> On 16 January 2014 10:07, Ben Pfaff  wrote:
> >
> > + * An ovs_rwlock does not support recursive readers, because POSIX allows
> > + * taking the reader lock recursively to deadlock when a thread is
> > waiting on
> > + * the write-lock.  (NetBSD does deadlock.)  glibc does not deadlock in
> > this
> > + * situation by default, but ovs_rwlock_init() initializes rwlocks to do
> > so
> > + * anyway for two reasons:
> 
> 
> The last sentence seems a bit odd. Perhaps something closer to "glibc does
> not deadlock in this situation by default, but ovs_rwlock_init()
> initializes rwlocks as non-recursive anyway for two reasons:"?
> 
> Otherwise,
> Acked-by: Joe Stringer 

You're right, the wording is odd.  I fixed it up.  I'll apply this
after patch 3/5 is re-reviewed.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 03/14] socket-util: Maximum file descriptors for windows.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 1:34 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:14PM -0800, Gurucharan Shetty wrote:
>> Windows does not have a getrlimit() function. As such,
>> there is no limit to number of file descriptors that
>> can be opened. So, set an aritificial limit of 1024.
>> This may be increased in the future if the limit is too
>> low.
>>
>> Co-authored-by: Linda Sun 
>> Signed-off-by: Linda Sun 
>> Signed-off-by: Gurucharan Shetty 
>
> get_max_fds() is only used by process_start() in process.c and is only
> relevant for the Unix "fork" model of creating processes, not for
> Windows.  I'd rather move the function to process.c and #ifdef it out
> entirely for Windows.  How does that sound?
Okay. I will have it in v2.

>
> Thanks,
>
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 05/10] datapath: Clarify locking.

2014-02-21 Thread Jarno Rajahalme
Remove unnecessary locking from functions that are always called with
appropriate locking.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 8a2c0af..59b7f3f 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -175,6 +175,7 @@ static struct hlist_head *vport_hash_bucket(const struct 
datapath *dp,
return &dp->ports[port_no & (DP_VPORT_HASH_BUCKETS - 1)];
 }
 
+/* Called with ovs_mutex or RCU read lock. */
 struct vport *ovs_lookup_vport(const struct datapath *dp, u16 port_no)
 {
struct vport *vport;
@@ -653,7 +654,7 @@ static size_t ovs_flow_cmd_msg_size(const struct 
sw_flow_actions *acts)
+ nla_total_size(acts->actions_len); /* OVS_FLOW_ATTR_ACTIONS */
 }
 
-/* Called with ovs_mutex. */
+/* Called with ovs_mutex or RCU read lock. */
 static int ovs_flow_cmd_fill_info(struct sw_flow *flow, struct datapath *dp,
  struct sk_buff *skb, u32 portid,
  u32 seq, u32 flags, u8 cmd)
@@ -744,6 +745,7 @@ error:
return err;
 }
 
+/* Must be called with ovs_mutex. */
 static struct sk_buff *ovs_flow_cmd_alloc_info(struct sw_flow *flow,
   struct genl_info *info)
 {
@@ -754,6 +756,7 @@ static struct sk_buff *ovs_flow_cmd_alloc_info(struct 
sw_flow *flow,
return genlmsg_new_unicast(len, info, GFP_KERNEL);
 }
 
+/* Must be called with ovs_mutex. */
 static struct sk_buff *ovs_flow_cmd_build_info(struct sw_flow *flow,
   struct datapath *dp,
   struct genl_info *info,
@@ -1100,6 +1103,7 @@ static size_t ovs_dp_cmd_msg_size(void)
return msgsize;
 }
 
+/* Called with ovs_mutex or RCU read lock. */
 static int ovs_dp_cmd_fill_info(struct datapath *dp, struct sk_buff *skb,
u32 portid, u32 seq, u32 flags, u8 cmd)
 {
@@ -1115,9 +1119,7 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, 
struct sk_buff *skb,
 
ovs_header->dp_ifindex = get_dpifindex(dp);
 
-   rcu_read_lock();
err = nla_put_string(skb, OVS_DP_ATTR_NAME, ovs_dp_name(dp));
-   rcu_read_unlock();
if (err)
goto nla_put_failure;
 
@@ -1142,6 +1144,7 @@ error:
return -EMSGSIZE;
 }
 
+/* Must be called with ovs_mutex. */
 static struct sk_buff *ovs_dp_cmd_build_info(struct datapath *dp,
 struct genl_info *info, u8 cmd)
 {
@@ -1160,7 +1163,7 @@ static struct sk_buff *ovs_dp_cmd_build_info(struct 
datapath *dp,
return skb;
 }
 
-/* Called with ovs_mutex. */
+/* Called with rcu_read_lock or ovs_mutex. */
 static struct datapath *lookup_datapath(struct net *net,
struct ovs_header *ovs_header,
struct nlattr *a[OVS_DP_ATTR_MAX + 1])
@@ -1172,10 +1175,8 @@ static struct datapath *lookup_datapath(struct net *net,
else {
struct vport *vport;
 
-   rcu_read_lock();
vport = ovs_vport_locate(net, nla_data(a[OVS_DP_ATTR_NAME]));
dp = vport && vport->port_no == OVSP_LOCAL ? vport->dp : NULL;
-   rcu_read_unlock();
}
return dp ? dp : ERR_PTR(-ENODEV);
 }
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 08/10] datapath: Split ovs_flow_cmd_new_or_set().

2014-02-21 Thread Jarno Rajahalme
Following patch will be easier to reason about with separate
ovs_flow_cmd_new() and ovs_flow_cmd_set() functions.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |  173 +++
 1 file changed, 121 insertions(+), 52 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index a6e8d63..3d4037d 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -784,16 +784,16 @@ static struct sk_buff *ovs_flow_cmd_build_info(struct 
sw_flow *flow,
return skb;
 }
 
-static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info)
+static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
struct nlattr **a = info->attrs;
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key, masked_key;
-   struct sw_flow *flow = NULL;
+   struct sw_flow *flow;
struct sw_flow_mask mask;
struct sk_buff *reply = NULL;
struct datapath *dp;
-   struct sw_flow_actions *acts = NULL;
+   struct sw_flow_actions *acts;
struct sw_flow_match match;
int error;
 
@@ -809,29 +809,21 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
goto error;
 
/* Validate actions. */
-   if (a[OVS_FLOW_ATTR_ACTIONS]) {
-   acts = 
ovs_nla_alloc_flow_actions(nla_len(a[OVS_FLOW_ATTR_ACTIONS]));
-   error = PTR_ERR(acts);
-   if (IS_ERR(acts))
-   goto error;
+   error = -EINVAL;
+   if (!a[OVS_FLOW_ATTR_ACTIONS])
+   goto error;
 
-   ovs_flow_mask_key(&masked_key, &key, &mask);
-   error = ovs_nla_copy_actions(a[OVS_FLOW_ATTR_ACTIONS],
-&masked_key, 0, &acts);
-   if (error) {
-   OVS_NLERR("Flow actions may not be safe on all matching 
packets.\n");
-   goto err_kfree;
-   }
-   } else if (info->genlhdr->cmd == OVS_FLOW_CMD_SET) {
-   /* Need empty actions. */
-   acts = ovs_nla_alloc_flow_actions(0);
-   error = PTR_ERR(acts);
-   if (IS_ERR(acts))
-   goto error;
-   } else {
-   /* OVS_FLOW_CMD_NEW must have actions. */
-   error = -EINVAL;
+   acts = ovs_nla_alloc_flow_actions(nla_len(a[OVS_FLOW_ATTR_ACTIONS]));
+   error = PTR_ERR(acts);
+   if (IS_ERR(acts))
goto error;
+
+   ovs_flow_mask_key(&masked_key, &key, &mask);
+   error = ovs_nla_copy_actions(a[OVS_FLOW_ATTR_ACTIONS],
+&masked_key, 0, &acts);
+   if (error) {
+   OVS_NLERR("Flow actions may not be safe on all matching 
packets.\n");
+   goto err_kfree;
}
 
ovs_lock();
@@ -843,11 +835,6 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
/* Check if this is a duplicate flow */
flow = ovs_flow_tbl_lookup(&dp->table, &key);
if (!flow) {
-   /* Bail out if we're not allowed to create a new flow. */
-   error = -ENOENT;
-   if (info->genlhdr->cmd == OVS_FLOW_CMD_SET)
-   goto err_unlock_ovs;
-
/* Allocate flow. */
flow = ovs_flow_alloc();
if (IS_ERR(flow)) {
@@ -858,19 +845,12 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
flow->key = masked_key;
flow->unmasked_key = key;
rcu_assign_pointer(flow->sf_acts, acts);
+   acts = NULL;
 
/* Put flow in bucket. */
error = ovs_flow_tbl_insert(&dp->table, flow, &mask);
-   if (error) {
-   acts = NULL;
+   if (error)
goto err_flow_free;
-   }
-
-   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
-   reply = ovs_flow_cmd_build_info(flow,
-   ovs_header->dp_ifindex,
-   info,
-   OVS_FLOW_CMD_NEW);
} else {
/* We found a matching flow. */
struct sw_flow_actions *old_acts;
@@ -882,8 +862,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
 * gets fixed.
 */
error = -EEXIST;
-   if (info->genlhdr->cmd == OVS_FLOW_CMD_NEW &&
-   info->nlhdr->nlmsg_flags & (NLM_F_CREATE | NLM_F_EXCL))
+   if (info->nlhdr->nlmsg_flags & (NLM_F_CREATE | NLM_F_EXCL))
goto err_unlock_ovs;
 
/* The unmasked key has to be the same for flow updates. */

[ovs-dev] [PATCH v3 02/10] datapath: Fix output of SCTP mask.

2014-02-21 Thread Jarno Rajahalme
The 'output' argument of the ovs_nla_put_flow() is the one from which
the bits are written to the netlink attributes.  For SCTP we
accidentally used the bits from the 'swkey' instead.  This caused the
mask attributes to include the bits from the actual flow key instead
of the mask.

Signed-off-by: Jarno Rajahalme 
---
 datapath/flow_netlink.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
index 40751cb..9d13b7a 100644
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1059,11 +1059,11 @@ int ovs_nla_put_flow(const struct sw_flow_key *swkey,
goto nla_put_failure;
sctp_key = nla_data(nla);
if (swkey->eth.type == htons(ETH_P_IP)) {
-   sctp_key->sctp_src = swkey->ipv4.tp.src;
-   sctp_key->sctp_dst = swkey->ipv4.tp.dst;
+   sctp_key->sctp_src = output->ipv4.tp.src;
+   sctp_key->sctp_dst = output->ipv4.tp.dst;
} else if (swkey->eth.type == htons(ETH_P_IPV6)) {
-   sctp_key->sctp_src = swkey->ipv6.tp.src;
-   sctp_key->sctp_dst = swkey->ipv6.tp.dst;
+   sctp_key->sctp_src = output->ipv6.tp.src;
+   sctp_key->sctp_dst = output->ipv6.tp.dst;
}
} else if (swkey->eth.type == htons(ETH_P_IP) &&
   swkey->ip.proto == IPPROTO_ICMP) {
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 10/10] datapath: Minimize dp and vport critical sections.

2014-02-21 Thread Jarno Rajahalme
Move most memory allocations away from the ovs_mutex critical
sections.  vport allocations still happen while the lock is taken, as
changing that would require major refactoring. Also, vports are
created very rarely so it should not matter.

Change ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather
than ovs_lock(), as nothing need to be changed.  This was done by
ovs_vport_cmd_get() already.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |  223 ++-
 1 file changed, 112 insertions(+), 111 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index ba00824..cb3838a 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -1252,23 +1252,9 @@ error:
return -EMSGSIZE;
 }
 
-/* Must be called with ovs_mutex. */
-static struct sk_buff *ovs_dp_cmd_build_info(struct datapath *dp,
-struct genl_info *info, u8 cmd)
+static struct sk_buff *ovs_dp_cmd_alloc_info(struct genl_info *info)
 {
-   struct sk_buff *skb;
-   int retval;
-
-   skb = genlmsg_new_unicast(ovs_dp_cmd_msg_size(), info, GFP_KERNEL);
-   if (!skb)
-   return ERR_PTR(-ENOMEM);
-
-   retval = ovs_dp_cmd_fill_info(dp, skb, info->snd_portid, info->snd_seq, 
0, cmd);
-   if (retval < 0) {
-   kfree_skb(skb);
-   return ERR_PTR(retval);
-   }
-   return skb;
+   return genlmsg_new_unicast(ovs_dp_cmd_msg_size(), info, GFP_KERNEL);
 }
 
 /* Called with rcu_read_lock or ovs_mutex. */
@@ -1321,12 +1307,14 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
goto err;
 
-   ovs_lock();
+   reply = ovs_dp_cmd_alloc_info(info);
+   if (!reply)
+   return -ENOMEM;
 
err = -ENOMEM;
dp = kzalloc(sizeof(*dp), GFP_KERNEL);
if (dp == NULL)
-   goto err_unlock_ovs;
+   goto err_free_reply;
 
ovs_dp_set_net(dp, hold_net(sock_net(skb->sk)));
 
@@ -1361,6 +1349,9 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
 
ovs_dp_change(dp, a);
 
+   /* So far only local changes have been made, now need the lock. */
+   ovs_lock();
+
vport = new_vport(&parms);
if (IS_ERR(vport)) {
err = PTR_ERR(vport);
@@ -1379,10 +1370,9 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
goto err_destroy_ports_array;
}
 
-   reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_NEW);
-   err = PTR_ERR(reply);
-   if (IS_ERR(reply))
-   goto err_destroy_local_port;
+   err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+  info->snd_seq, 0, OVS_DP_CMD_NEW);
+   BUG_ON(err < 0);
 
ovs_net = net_generic(ovs_dp_get_net(dp), ovs_net_id);
list_add_tail_rcu(&dp->list_node, &ovs_net->dps);
@@ -1392,9 +1382,8 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
ovs_notify(reply, info, &ovs_dp_datapath_multicast_group);
return 0;
 
-err_destroy_local_port:
-   ovs_dp_detach_port(ovs_vport_ovsl(dp, OVSP_LOCAL));
 err_destroy_ports_array:
+   ovs_unlock();
kfree(dp->ports);
 err_destroy_percpu:
free_percpu(dp->stats_percpu);
@@ -1403,8 +1392,8 @@ err_destroy_table:
 err_free_dp:
release_net(ovs_dp_get_net(dp));
kfree(dp);
-err_unlock_ovs:
-   ovs_unlock();
+err_free_reply:
+   kfree_skb(reply);
 err:
return err;
 }
@@ -1442,25 +1431,29 @@ static int ovs_dp_cmd_del(struct sk_buff *skb, struct 
genl_info *info)
struct datapath *dp;
int err;
 
+   reply = ovs_dp_cmd_alloc_info(info);
+   if (!reply)
+   return -ENOMEM;
+
ovs_lock();
dp = lookup_datapath(sock_net(skb->sk), info->userhdr, info->attrs);
err = PTR_ERR(dp);
if (IS_ERR(dp))
-   goto unlock;
+   goto err_unlock_free;
 
-   reply = ovs_dp_cmd_build_info(dp, info, OVS_DP_CMD_DEL);
-   err = PTR_ERR(reply);
-   if (IS_ERR(reply))
-   goto unlock;
+   err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
+  info->snd_seq, 0, OVS_DP_CMD_DEL);
+   BUG_ON(err < 0);
 
__dp_destroy(dp);
-   ovs_unlock();
 
+   ovs_unlock();
ovs_notify(reply, info, &ovs_dp_datapath_multicast_group);
-
return 0;
-unlock:
+
+err_unlock_free:
ovs_unlock();
+   kfree_skb(reply);
return err;
 }
 
@@ -1470,29 +1463,29 @@ static int ovs_dp_cmd_set(struct sk_buff *skb, struct 
genl_info *info)
struct datapath *dp;
int err;
 
+   reply = ovs_dp_cmd_alloc_info(info);
+   if (!reply)
+   return -ENOMEM;
+
ovs_lock();
dp = lookup_datapath(sock_net(skb->sk), i

[ovs-dev] [PATCH v3 04/10] datapath: Compact sw_flow_key.

2014-02-21 Thread Jarno Rajahalme
Minimize padding in sw_flow_key and move 'tp' top the main struct.
These changes simplify code when accessing the transport port numbers
and the tcp flags, and makes the sw_flow_key 8 bytes smaller on 64-bit
systems (128->120 bytes).  These changes also make the keys for IPv4
packets to fit in one cache line.

There is a valid concern for safety of packing the struct
ovs_key_ipv4_tunnel, as it would be possible to take the address of
the tun_id member as a __be64 * which could result in unaligned access
in some systems. However:

- sw_flow_key itself is 64-bit aligned, so the tun_id within is always
  64-bit aligned.
- We never make arrays of ovs_key_ipv4_tunnel (which would force every
  second tun_key to be misaligned).
- We never take the address of the tun_id in to a __be64 *.
- Whereever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key,
  it is in stack (on tunnel input functions), where compiler has full
  control of the alignment.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |6 +++
 datapath/flow.c |   44 ---
 datapath/flow.h |   29 +---
 datapath/flow_netlink.c |  112 ++-
 4 files changed, 68 insertions(+), 123 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 130300f..8a2c0af 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -1929,6 +1929,12 @@ static int __init dp_init(void)
pr_info("Open vSwitch switching datapath %s, built "__DATE__" 
"__TIME__"\n",
VERSION);
 
+   pr_info("Datapath sw_flow_key size: %ld bytes. ip.frag at %ld, tp.flags 
at %ld, ipv4.addr at %ld\n",
+   sizeof(struct sw_flow_key),
+   offsetof(struct sw_flow_key, ip.frag),
+   offsetof(struct sw_flow_key, tp.flags),
+   offsetof(struct sw_flow_key, ipv4.addr));
+
err = ovs_flow_init();
if (err)
goto error;
diff --git a/datapath/flow.c b/datapath/flow.c
index 3cc4cdf..4e37e9b 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -65,17 +65,11 @@ u64 ovs_flow_used_time(unsigned long flow_jiffies)
 void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
 {
struct flow_stats *stats;
-   __be16 tcp_flags = 0;
+   __be16 tcp_flags = flow->key.tp.flags;
int node = numa_node_id();
 
stats = rcu_dereference(flow->stats[node]);
 
-   if (likely(flow->key.ip.proto == IPPROTO_TCP)) {
-   if (likely(flow->key.eth.type == htons(ETH_P_IP)))
-   tcp_flags = flow->key.ipv4.tp.flags;
-   else if (likely(flow->key.eth.type == htons(ETH_P_IPV6)))
-   tcp_flags = flow->key.ipv6.tp.flags;
-   }
/* Check if already have node-specific stats. */
if (likely(stats)) {
spin_lock(&stats->lock);
@@ -358,8 +352,8 @@ static int parse_icmpv6(struct sk_buff *skb, struct 
sw_flow_key *key,
/* The ICMPv6 type and code fields use the 16-bit transport port
 * fields, so we need to store them in 16-bit network byte order.
 */
-   key->ipv6.tp.src = htons(icmp->icmp6_type);
-   key->ipv6.tp.dst = htons(icmp->icmp6_code);
+   key->tp.src = htons(icmp->icmp6_type);
+   key->tp.dst = htons(icmp->icmp6_code);
 
if (icmp->icmp6_code == 0 &&
(icmp->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
@@ -520,21 +514,21 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, 
struct sw_flow_key *key)
if (key->ip.proto == IPPROTO_TCP) {
if (tcphdr_ok(skb)) {
struct tcphdr *tcp = tcp_hdr(skb);
-   key->ipv4.tp.src = tcp->source;
-   key->ipv4.tp.dst = tcp->dest;
-   key->ipv4.tp.flags = TCP_FLAGS_BE16(tcp);
+   key->tp.src = tcp->source;
+   key->tp.dst = tcp->dest;
+   key->tp.flags = TCP_FLAGS_BE16(tcp);
}
} else if (key->ip.proto == IPPROTO_UDP) {
if (udphdr_ok(skb)) {
struct udphdr *udp = udp_hdr(skb);
-   key->ipv4.tp.src = udp->source;
-   key->ipv4.tp.dst = udp->dest;
+   key->tp.src = udp->source;
+   key->tp.dst = udp->dest;
}
} else if (key->ip.proto == IPPROTO_SCTP) {
if (sctphdr_ok(skb)) {
struct sctphdr *sctp = sctp_hdr(skb);
-   key->ipv4.tp.src = sctp->source;
-   key->ipv4.tp.dst = sctp->dest;
+   key->tp.src = sctp->source;
+   key->tp.dst = sctp->dest;

[ovs-dev] [PATCH v3 07/10] datapath: Minimize ovs_flow_cmd_del critical section.

2014-02-21 Thread Jarno Rajahalme
ovs_flow_cmd_del() now allocates reply (if needed) after the flow has
already been removed from the flow table.  If the reply allocation
fails, a netlink error is signaled with netlink_set_err(), as is
already done in ovs_flow_cmd_new_or_set() in the similar situation.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |   70 +--
 1 file changed, 34 insertions(+), 36 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 60c94ac..a6e8d63 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -664,7 +664,7 @@ static size_t ovs_flow_cmd_msg_size(const struct 
sw_flow_actions *acts)
 }
 
 /* Called with ovs_mutex or RCU read lock. */
-static int ovs_flow_cmd_fill_info(struct sw_flow *flow, struct datapath *dp,
+static int ovs_flow_cmd_fill_info(struct sw_flow *flow, int dp_ifindex,
  struct sk_buff *skb, u32 portid,
  u32 seq, u32 flags, u8 cmd)
 {
@@ -681,7 +681,7 @@ static int ovs_flow_cmd_fill_info(struct sw_flow *flow, 
struct datapath *dp,
if (!ovs_header)
return -EMSGSIZE;
 
-   ovs_header->dp_ifindex = get_dpifindex(dp);
+   ovs_header->dp_ifindex = dp_ifindex;
 
/* Fill flow key. */
nla = nla_nest_start(skb, OVS_FLOW_ATTR_KEY);
@@ -767,9 +767,8 @@ static struct sk_buff *ovs_flow_cmd_alloc_info(struct 
sw_flow *flow,
 
 /* Must be called with ovs_mutex. */
 static struct sk_buff *ovs_flow_cmd_build_info(struct sw_flow *flow,
-  struct datapath *dp,
-  struct genl_info *info,
-  u8 cmd)
+  int dp_ifindex,
+  struct genl_info *info, u8 cmd)
 {
struct sk_buff *skb;
int retval;
@@ -778,8 +777,9 @@ static struct sk_buff *ovs_flow_cmd_build_info(struct 
sw_flow *flow,
if (!skb)
return ERR_PTR(-ENOMEM);
 
-   retval = ovs_flow_cmd_fill_info(flow, dp, skb, info->snd_portid,
-   info->snd_seq, 0, cmd);
+   retval = ovs_flow_cmd_fill_info(flow, dp_ifindex, skb,
+   info->snd_portid, info->snd_seq, 0,
+   cmd);
BUG_ON(retval < 0);
return skb;
 }
@@ -867,7 +867,9 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
}
 
if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
-   reply = ovs_flow_cmd_build_info(flow, dp, info,
+   reply = ovs_flow_cmd_build_info(flow,
+   ovs_header->dp_ifindex,
+   info,
OVS_FLOW_CMD_NEW);
} else {
/* We found a matching flow. */
@@ -894,7 +896,9 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
ovs_nla_free_flow_actions(old_acts);
 
if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
-   reply = ovs_flow_cmd_build_info(flow, dp, info,
+   reply = ovs_flow_cmd_build_info(flow,
+   ovs_header->dp_ifindex,
+   info,
OVS_FLOW_CMD_NEW);
 
/* Clear stats. */
@@ -957,7 +961,8 @@ static int ovs_flow_cmd_get(struct sk_buff *skb, struct 
genl_info *info)
goto unlock;
}
 
-   reply = ovs_flow_cmd_build_info(flow, dp, info, OVS_FLOW_CMD_NEW);
+   reply = ovs_flow_cmd_build_info(flow, ovs_header->dp_ifindex, info,
+   OVS_FLOW_CMD_NEW);
if (IS_ERR(reply)) {
err = PTR_ERR(reply);
goto unlock;
@@ -975,57 +980,50 @@ static int ovs_flow_cmd_del(struct sk_buff *skb, struct 
genl_info *info)
struct nlattr **a = info->attrs;
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key;
-   struct sk_buff *reply = NULL;
struct sw_flow *flow;
struct datapath *dp;
struct sw_flow_match match;
int err;
 
+   if (a[OVS_FLOW_ATTR_KEY]) {
+   ovs_match_init(&match, &key, NULL);
+   err = ovs_nla_get_match(&match, a[OVS_FLOW_ATTR_KEY], NULL);
+   if (err)
+   return err;
+   }
+
ovs_lock();
dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
if (!dp) {
err = -ENODEV;
goto unlock;
}
-
if (!a[OVS_FLOW_ATTR_KEY]) {
err = ovs_flow_tbl_flu

[ovs-dev] [PATCH v3 06/10] datapath: Build flow cmd netlink reply only if needed.

2014-02-21 Thread Jarno Rajahalme
Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a
reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or
OVS_FLOW_CMD_DEL.

Note: This may need compat support for older kernels.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |   56 +++
 1 file changed, 39 insertions(+), 17 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 59b7f3f..60c94ac 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -64,6 +64,15 @@
 
 int ovs_net_id __read_mostly;
 
+/* Check if need to build a reply message.
+ * OVS userspace sets the NLM_F_ECHO flag if it needs the reply. */
+static bool ovs_must_build_reply(struct genl_info *info,
+const struct genl_multicast_group *grp)
+{
+   return info->nlhdr->nlmsg_flags & NLM_F_ECHO ||
+   netlink_has_listeners(genl_info_net(info)->genl_sock, grp->id);
+}
+
 static void ovs_notify(struct sk_buff *skb, struct genl_info *info,
   struct genl_multicast_group *grp)
 {
@@ -782,7 +791,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
struct sw_flow_key key, masked_key;
struct sw_flow *flow = NULL;
struct sw_flow_mask mask;
-   struct sk_buff *reply;
+   struct sk_buff *reply = NULL;
struct datapath *dp;
struct sw_flow_actions *acts = NULL;
struct sw_flow_match match;
@@ -857,7 +866,9 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
goto err_flow_free;
}
 
-   reply = ovs_flow_cmd_build_info(flow, dp, info, 
OVS_FLOW_CMD_NEW);
+   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
+   reply = ovs_flow_cmd_build_info(flow, dp, info,
+   OVS_FLOW_CMD_NEW);
} else {
/* We found a matching flow. */
struct sw_flow_actions *old_acts;
@@ -882,7 +893,9 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
rcu_assign_pointer(flow->sf_acts, acts);
ovs_nla_free_flow_actions(old_acts);
 
-   reply = ovs_flow_cmd_build_info(flow, dp, info, 
OVS_FLOW_CMD_NEW);
+   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
+   reply = ovs_flow_cmd_build_info(flow, dp, info,
+   OVS_FLOW_CMD_NEW);
 
/* Clear stats. */
if (a[OVS_FLOW_ATTR_CLEAR])
@@ -890,11 +903,14 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
}
ovs_unlock();
 
-   if (!IS_ERR(reply))
-   ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
-   else
-   netlink_set_err(sock_net(skb->sk)->genl_sock, 0,
-   ovs_dp_flow_multicast_group.id, PTR_ERR(reply));
+   if (reply) {
+   if (!IS_ERR(reply))
+   ovs_notify(reply, info, &ovs_dp_flow_multicast_group);
+   else
+   netlink_set_err(sock_net(skb->sk)->genl_sock, 0,
+   ovs_dp_flow_multicast_group.id,
+   PTR_ERR(reply));
+   }
return 0;
 
 err_flow_free:
@@ -959,7 +975,7 @@ static int ovs_flow_cmd_del(struct sk_buff *skb, struct 
genl_info *info)
struct nlattr **a = info->attrs;
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key;
-   struct sk_buff *reply;
+   struct sk_buff *reply = NULL;
struct sw_flow *flow;
struct datapath *dp;
struct sw_flow_match match;
@@ -988,22 +1004,28 @@ static int ovs_flow_cmd_del(struct sk_buff *skb, struct 
genl_info *info)
goto unlock;
}
 
-   reply = ovs_flow_cmd_alloc_info(flow, info);
-   if (!reply) {
-   err = -ENOMEM;
-   goto unlock;
+   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group)) {
+   reply = ovs_flow_cmd_alloc_info(flow, info);
+   if (!reply) {
+   err = -ENOMEM;
+   goto unlock;
+   }
}
 
ovs_flow_tbl_remove(&dp->table, flow);
 
-   err = ovs_flow_cmd_fill_info(flow, dp, reply, info->snd_portid,
-info->snd_seq, 0, OVS_FLOW_CMD_DEL);
-   BUG_ON(err < 0);
+   if (reply) {
+   err = ovs_flow_cmd_fill_info(flow, dp, reply, info->snd_portid,
+info->snd_seq, 0,
+OVS_FLOW_CMD_DEL);
+   BUG_ON(err < 0);
+   }
 
ovs_flow_free(flow, true);
ovs_unlock();
 
-   ovs_notify(reply, info, &ovs_dp_flow_multi

[ovs-dev] [PATCH v3 03/10] datapath: Use TCP flags in the flow key for stats.

2014-02-21 Thread Jarno Rajahalme
We already extract the TCP flags for the key, might as well use that
for stats.

Signed-off-by: Jarno Rajahalme 
---
 datapath/flow.c |   12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index c3e3fcb..3cc4cdf 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -70,14 +70,12 @@ void ovs_flow_stats_update(struct sw_flow *flow, struct 
sk_buff *skb)
 
stats = rcu_dereference(flow->stats[node]);
 
-   if ((flow->key.eth.type == htons(ETH_P_IP) ||
-flow->key.eth.type == htons(ETH_P_IPV6)) &&
-   flow->key.ip.frag != OVS_FRAG_TYPE_LATER &&
-   flow->key.ip.proto == IPPROTO_TCP &&
-   likely(skb->len >= skb_transport_offset(skb) + sizeof(struct 
tcphdr))) {
-   tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));
+   if (likely(flow->key.ip.proto == IPPROTO_TCP)) {
+   if (likely(flow->key.eth.type == htons(ETH_P_IP)))
+   tcp_flags = flow->key.ipv4.tp.flags;
+   else if (likely(flow->key.eth.type == htons(ETH_P_IPV6)))
+   tcp_flags = flow->key.ipv6.tp.flags;
}
-
/* Check if already have node-specific stats. */
if (likely(stats)) {
spin_lock(&stats->lock);
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 01/10] datapath: Avoid assigning a NULL pointer to flow actions.

2014-02-21 Thread Jarno Rajahalme
Flow SET can set an empty set of actions by not passing any actions.
Previously, we assigned the flow's actions pointer to NULL in this
case, but we never check for the NULL pointer later on.  This patch
modifies this case to allocate a valid but empty set of actions
instead.

Note: If might be prudent to allow missing actions also in
OVS_FLOW_CMD_NEW.

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index f7c3391..130300f 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -810,7 +810,14 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, 
struct genl_info *info)
OVS_NLERR("Flow actions may not be safe on all matching 
packets.\n");
goto err_kfree;
}
-   } else if (info->genlhdr->cmd == OVS_FLOW_CMD_NEW) {
+   } else if (info->genlhdr->cmd == OVS_FLOW_CMD_SET) {
+   /* Need empty actions. */
+   acts = ovs_nla_alloc_flow_actions(0);
+   error = PTR_ERR(acts);
+   if (IS_ERR(acts))
+   goto error;
+   } else {
+   /* OVS_FLOW_CMD_NEW must have actions. */
error = -EINVAL;
goto error;
}
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 09/10] datapath: Minimize ovs_flow_cmd_new_or_set critical section.

2014-02-21 Thread Jarno Rajahalme
Note that this has only a little performance benefit when responses
are not created (which is normal when there are no netlink multicast
listeners around).

Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c |  123 +--
 1 file changed, 71 insertions(+), 52 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 3d4037d..ba00824 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -754,15 +754,11 @@ error:
return err;
 }
 
-/* Must be called with ovs_mutex. */
-static struct sk_buff *ovs_flow_cmd_alloc_info(struct sw_flow *flow,
+static struct sk_buff *ovs_flow_cmd_alloc_info(const struct sw_flow_actions 
*acts,
   struct genl_info *info)
 {
-   size_t len;
-
-   len = ovs_flow_cmd_msg_size(ovsl_dereference(flow->sf_acts));
-
-   return genlmsg_new_unicast(len, info, GFP_KERNEL);
+   return genlmsg_new_unicast(ovs_flow_cmd_msg_size(acts), info,
+  GFP_KERNEL);
 }
 
 /* Must be called with ovs_mutex. */
@@ -773,7 +769,7 @@ static struct sk_buff *ovs_flow_cmd_build_info(struct 
sw_flow *flow,
struct sk_buff *skb;
int retval;
 
-   skb = ovs_flow_cmd_alloc_info(flow, info);
+   skb = ovs_flow_cmd_alloc_info(ovsl_dereference(flow->sf_acts), info);
if (!skb)
return ERR_PTR(-ENOMEM);
 
@@ -789,11 +785,11 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
struct nlattr **a = info->attrs;
struct ovs_header *ovs_header = info->userhdr;
struct sw_flow_key key, masked_key;
-   struct sw_flow *flow;
+   struct sw_flow *flow, *new_flow = NULL;
struct sw_flow_mask mask;
struct sk_buff *reply = NULL;
struct datapath *dp;
-   struct sw_flow_actions *acts;
+   struct sw_flow_actions *acts, *old_acts = NULL;
struct sw_flow_match match;
int error;
 
@@ -823,9 +819,28 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
 &masked_key, 0, &acts);
if (error) {
OVS_NLERR("Flow actions may not be safe on all matching 
packets.\n");
-   goto err_kfree;
+   goto err_kfree_acts;
+   }
+
+   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group)) {
+   reply = ovs_flow_cmd_alloc_info(acts, info);
+   if (!reply) {
+   error = -ENOMEM;
+   goto err_kfree_acts;
+   }
}
 
+   /* Most of the time we need to allocate a new flow, do it before
+* locking.
+*/
+   new_flow = ovs_flow_alloc();
+   if (IS_ERR(new_flow)) {
+   error = PTR_ERR(new_flow);
+   goto err_kfree_reply;
+   }
+   new_flow->key = masked_key;
+   new_flow->unmasked_key = key;
+
ovs_lock();
dp = get_dp(sock_net(skb->sk), ovs_header->dp_ifindex);
error = -ENODEV;
@@ -835,25 +850,17 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
/* Check if this is a duplicate flow */
flow = ovs_flow_tbl_lookup(&dp->table, &key);
if (!flow) {
-   /* Allocate flow. */
-   flow = ovs_flow_alloc();
-   if (IS_ERR(flow)) {
-   error = PTR_ERR(flow);
-   goto err_unlock_ovs;
-   }
-
-   flow->key = masked_key;
-   flow->unmasked_key = key;
+   flow = new_flow;
rcu_assign_pointer(flow->sf_acts, acts);
acts = NULL;
 
/* Put flow in bucket. */
error = ovs_flow_tbl_insert(&dp->table, flow, &mask);
if (error)
-   goto err_flow_free;
+   goto err_unlock_ovs;
+   new_flow = NULL;
} else {
/* We found a matching flow. */
-   struct sw_flow_actions *old_acts;
 
/* Bail out if we're not allowed to modify an existing flow.
 * We accept NLM_F_CREATE in place of the intended NLM_F_EXCL
@@ -872,32 +879,36 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
/* Update actions. */
old_acts = ovsl_dereference(flow->sf_acts);
rcu_assign_pointer(flow->sf_acts, acts);
-   ovs_nla_free_flow_actions(old_acts);
}
 
-   if (ovs_must_build_reply(info, &ovs_dp_flow_multicast_group))
-   reply = ovs_flow_cmd_build_info(flow, ovs_header->dp_ifindex,
-   info, OVS_FLOW_CMD_NEW);
+   if (reply) {
+   error = ovs_flow_cmd_fill_info(flow, ovs_header->dp_ifindex,
+  reply, info->snd_portid,
+  

Re: [ovs-dev] [PATCH 06/14] socket-util: poll() for Windows.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 5:14 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:17PM -0800, Gurucharan Shetty wrote:
>> Windows send() does not have a MSG_DONTWAIT.
>> So, use the get_socket_error() function.
>>
>> Co-authored-by: Linda Sun 
>> Signed-off-by: Linda Sun 
>> Signed-off-by: Gurucharan Shetty 
>
> The use of MSG_DONTWAIT here is just caution, since it only makes sense
> to use check_connection_completion() on a nonblocking socket anyhow.  So
> I would prefer to just change MSG_DONTWAIT to 0 here.
Okay. Though I do not understand why we are ignoring POLLHUP for pfd.revents.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 08/14] socket-util: Unix socket related calls for non-windows platform.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 5:21 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:19PM -0800, Gurucharan Shetty wrote:
>> Don't try to compile Unix socket related functions for Windows.
>>
>> Signed-off-by: Gurucharan Shetty 
>
> Can we check for AF_UNIX instead of for _WIN32?
Unfortunately, AF_UNIX is defined in one of the windows header (winsock2.h)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 10/14] socket-util: Windows does not have /dev/null.

2014-02-21 Thread Gurucharan Shetty
On Thu, Feb 20, 2014 at 5:25 PM, Ben Pfaff  wrote:
> On Wed, Feb 19, 2014 at 03:36:21PM -0800, Gurucharan Shetty wrote:
>> Signed-off-by: Gurucharan Shetty 
>
> This is only used in daemon.c, which isn't compiled on Windows.  Can we
> just move it into there (as a static function)?
>
Okay. I will do this for v2.
> Alternatively, I'm pretty sure that Windows does have /dev/null under
> another name:
> http://en.wikipedia.org/wiki/%5CDevice%5CNull
Thanks, I guess, I did not search enough.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3 00/10] datapath: Reduce lock contention.

2014-02-21 Thread Jarno Rajahalme
Here are some indicative netperf TCP_CRR performance numbers for these
patches.

Jarno Rajahalme (10):
44.7k  (master)
   datapath: Avoid assigning a NULL pointer to flow actions.
   datapath: Fix output of SCTP mask.
   datapath: Use TCP flags in the flow key for stats.
45.0k  datapath: Compact sw_flow_key.
   datapath: Clarify locking.
49.4k  datapath: Build flow cmd netlink reply only if needed.
52.6k  datapath: Minimize ovs_flow_cmd_del critical section.
   datapath: Split ovs_flow_cmd_new_or_set().
52.8k  datapath: Minimize ovs_flow_cmd_new_or_set critical section.
53.0k  datapath: Minimize dp and vport critical sections.

 datapath/datapath.c |  537 +--
 datapath/flow.c |   46 ++--
 datapath/flow.h |   29 +--
 datapath/flow_netlink.c |  112 +++---
 4 files changed, 392 insertions(+), 332 deletions(-)

-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 06/14] socket-util: poll() for Windows.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 11:38:19AM -0800, Gurucharan Shetty wrote:
> On Thu, Feb 20, 2014 at 5:14 PM, Ben Pfaff  wrote:
> > On Wed, Feb 19, 2014 at 03:36:17PM -0800, Gurucharan Shetty wrote:
> >> Windows send() does not have a MSG_DONTWAIT.
> >> So, use the get_socket_error() function.
> >>
> >> Co-authored-by: Linda Sun 
> >> Signed-off-by: Linda Sun 
> >> Signed-off-by: Gurucharan Shetty 
> >
> > The use of MSG_DONTWAIT here is just caution, since it only makes sense
> > to use check_connection_completion() on a nonblocking socket anyhow.  So
> > I would prefer to just change MSG_DONTWAIT to 0 here.
> Okay. Though I do not understand why we are ignoring POLLHUP for pfd.revents.

I think POLLHUP would indicate that the connection completed and was
then closed.  I think it's OK for the caller to deal with that, since
it has to deal with any other connection that closes anyway.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 08/14] socket-util: Unix socket related calls for non-windows platform.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 11:39:24AM -0800, Gurucharan Shetty wrote:
> On Thu, Feb 20, 2014 at 5:21 PM, Ben Pfaff  wrote:
> > On Wed, Feb 19, 2014 at 03:36:19PM -0800, Gurucharan Shetty wrote:
> >> Don't try to compile Unix socket related functions for Windows.
> >>
> >> Signed-off-by: Gurucharan Shetty 
> >
> > Can we check for AF_UNIX instead of for _WIN32?
> Unfortunately, AF_UNIX is defined in one of the windows header (winsock2.h)

OK, I guess we'll stick with _WIN32 then, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2 1/4] socket-util: Move get_max_fds() to process.c.

2014-02-21 Thread Gurucharan Shetty
get_max_fds() is used only from process.c. Move it there
along with rlim_is_finite(). Since process_start() can only
be called before any additional threads are created, we
no longer need the thread safety checks in get_max_fds().

Signed-off-by: Gurucharan Shetty 
---
 lib/process.c |   43 +++
 lib/socket-util.c |   43 ---
 lib/socket-util.h |2 --
 3 files changed, 43 insertions(+), 45 deletions(-)

diff --git a/lib/process.c b/lib/process.c
index d0e1882..b479d00 100644
--- a/lib/process.c
+++ b/lib/process.c
@@ -165,6 +165,49 @@ process_register(const char *name, pid_t pid)
 return p;
 }
 
+#ifndef _WIN32
+static bool
+rlim_is_finite(rlim_t limit)
+{
+if (limit == RLIM_INFINITY) {
+return false;
+}
+
+#ifdef RLIM_SAVED_CUR   /* FreeBSD 8.0 lacks RLIM_SAVED_CUR. */
+if (limit == RLIM_SAVED_CUR) {
+return false;
+}
+#endif
+
+#ifdef RLIM_SAVED_MAX   /* FreeBSD 8.0 lacks RLIM_SAVED_MAX. */
+if (limit == RLIM_SAVED_MAX) {
+return false;
+}
+#endif
+
+return true;
+}
+
+/* Returns the maximum valid FD value, plus 1. */
+static int
+get_max_fds(void)
+{
+static int max_fds;
+
+if (!max_fds) {
+struct rlimit r;
+if (!getrlimit(RLIMIT_NOFILE, &r) && rlim_is_finite(r.rlim_cur)) {
+max_fds = r.rlim_cur;
+} else {
+VLOG_WARN("failed to obtain fd limit, defaulting to 1024");
+max_fds = 1024;
+}
+}
+
+return max_fds;
+}
+#endif /* _WIN32 */
+
 /* Starts a subprocess with the arguments in the null-terminated argv[] array.
  * argv[0] is used as the name of the process.  Searches the PATH environment
  * variable to find the program to execute.
diff --git a/lib/socket-util.c b/lib/socket-util.c
index 20caddd..7f385b8 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -122,49 +122,6 @@ set_dscp(int fd, uint8_t dscp)
 return 0;
 }
 
-static bool
-rlim_is_finite(rlim_t limit)
-{
-if (limit == RLIM_INFINITY) {
-return false;
-}
-
-#ifdef RLIM_SAVED_CUR   /* FreeBSD 8.0 lacks RLIM_SAVED_CUR. */
-if (limit == RLIM_SAVED_CUR) {
-return false;
-}
-#endif
-
-#ifdef RLIM_SAVED_MAX   /* FreeBSD 8.0 lacks RLIM_SAVED_MAX. */
-if (limit == RLIM_SAVED_MAX) {
-return false;
-}
-#endif
-
-return true;
-}
-
-/* Returns the maximum valid FD value, plus 1. */
-int
-get_max_fds(void)
-{
-static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
-static int max_fds;
-
-if (ovsthread_once_start(&once)) {
-struct rlimit r;
-if (!getrlimit(RLIMIT_NOFILE, &r) && rlim_is_finite(r.rlim_cur)) {
-max_fds = r.rlim_cur;
-} else {
-VLOG_WARN("failed to obtain fd limit, defaulting to 1024");
-max_fds = 1024;
-}
-ovsthread_once_done(&once);
-}
-
-return max_fds;
-}
-
 /* Translates 'host_name', which must be a string representation of an IP
  * address, into a numeric IP address in '*addr'.  Returns 0 if successful,
  * otherwise a positive errno value. */
diff --git a/lib/socket-util.h b/lib/socket-util.h
index 1695985..5566fcd 100644
--- a/lib/socket-util.h
+++ b/lib/socket-util.h
@@ -31,8 +31,6 @@ int set_nonblocking(int fd);
 void xset_nonblocking(int fd);
 int set_dscp(int fd, uint8_t dscp);
 
-int get_max_fds(void);
-
 int lookup_ip(const char *host_name, struct in_addr *address);
 int lookup_ipv6(const char *host_name, struct in6_addr *address);
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2 4/4] socket-util: Move get_null_fd() to daemon.c.

2014-02-21 Thread Gurucharan Shetty
get_null_fd() is only called from daemon.c.
It does not need thread safety features anymore as
it is called either through daemonize_start() or
indirectly through daemonize_complete() once.

Signed-off-by: Gurucharan Shetty 
---
 lib/daemon.c  |   20 
 lib/socket-util.c |   22 --
 lib/socket-util.h |1 -
 3 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/lib/daemon.c b/lib/daemon.c
index f9290ef..9d96cba 100644
--- a/lib/daemon.c
+++ b/lib/daemon.c
@@ -441,6 +441,26 @@ monitor_daemon(pid_t daemon_pid)
 set_subprogram_name("");
 }
 
+/* Returns a readable and writable fd for /dev/null, if successful, otherwise
+ * a negative errno value.  The caller must not close the returned fd (because
+ * the same fd will be handed out to subsequent callers). */
+static int
+get_null_fd(void)
+{
+static int null_fd;
+
+if (!null_fd) {
+null_fd = open("/dev/null", O_RDWR);
+if (null_fd < 0) {
+int error = errno;
+VLOG_ERR("could not open /dev/null: %s", ovs_strerror(error));
+null_fd = -error;
+}
+}
+
+return null_fd;
+}
+
 /* Close standard file descriptors (except any that the client has requested we
  * leave open by calling daemon_save_fd()).  If we're started from e.g. an SSH
  * session, then this keeps us from holding that session open artificially. */
diff --git a/lib/socket-util.c b/lib/socket-util.c
index b59ce65..b6cc35a 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -927,28 +927,6 @@ error:
 return -error;
 }
 
-/* Returns a readable and writable fd for /dev/null, if successful, otherwise
- * a negative errno value.  The caller must not close the returned fd (because
- * the same fd will be handed out to subsequent callers). */
-int
-get_null_fd(void)
-{
-static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
-static int null_fd;
-
-if (ovsthread_once_start(&once)) {
-null_fd = open("/dev/null", O_RDWR);
-if (null_fd < 0) {
-int error = errno;
-VLOG_ERR("could not open /dev/null: %s", ovs_strerror(error));
-null_fd = -error;
-}
-ovsthread_once_done(&once);
-}
-
-return null_fd;
-}
-
 int
 read_fully(int fd, void *p_, size_t size, size_t *bytes_read)
 {
diff --git a/lib/socket-util.h b/lib/socket-util.h
index 5566fcd..1dcc828 100644
--- a/lib/socket-util.h
+++ b/lib/socket-util.h
@@ -44,7 +44,6 @@ int make_unix_socket(int style, bool nonblock,
  const char *bind_path, const char *connect_path);
 int get_unix_name_len(socklen_t sun_len);
 ovs_be32 guess_netmask(ovs_be32 ip);
-int get_null_fd(void);
 
 bool inet_parse_active(const char *target, uint16_t default_port,
struct sockaddr_storage *ssp);
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2 3/4] socket-util: poll() for Windows.

2014-02-21 Thread Gurucharan Shetty
Also, Windows does not have a MSG_DONTWAIT. Get rid of it
as we always use non-blocking sockets.

Co-authored-by: Linda Sun 
Signed-off-by: Linda Sun 
Signed-off-by: Gurucharan Shetty 
---
 lib/socket-util.c |   18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/lib/socket-util.c b/lib/socket-util.c
index 3654951..b59ce65 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -223,15 +223,24 @@ check_connection_completion(int fd)
 int retval;
 
 pfd.fd = fd;
+#ifndef _WIN32
 pfd.events = POLLOUT;
+#else
+pfd.events = POLLWRNORM;
+#endif
+
+#ifndef _WIN32
 do {
 retval = poll(&pfd, 1, 0);
 } while (retval < 0 && errno == EINTR);
+#else
+retval = WSAPoll(&pfd, 1, 0);
+#endif
 if (retval == 1) {
 if (pfd.revents & POLLERR) {
-ssize_t n = send(fd, "", 1, MSG_DONTWAIT);
+ssize_t n = send(fd, "", 1, 0);
 if (n < 0) {
-return errno;
+return sock_errno();
 } else {
 VLOG_ERR_RL(&rl, "poll return POLLERR but send succeeded");
 return EPROTO;
@@ -239,7 +248,7 @@ check_connection_completion(int fd)
 }
 return 0;
 } else if (retval < 0) {
-VLOG_ERR_RL(&rl, "poll: %s", ovs_strerror(errno));
+VLOG_ERR_RL(&rl, "poll: %s", sock_strerror(sock_errno()));
 return errno;
 } else {
 return EAGAIN;
@@ -271,8 +280,7 @@ drain_rcvbuf(int fd)
  * On other Unix-like OSes, MSG_TRUNC has no effect in the flags
  * argument. */
 char buffer[LINUX_DATAPATH ? 1 : 2048];
-ssize_t n_bytes = recv(fd, buffer, sizeof buffer,
-   MSG_TRUNC | MSG_DONTWAIT);
+ssize_t n_bytes = recv(fd, buffer, sizeof buffer, MSG_TRUNC);
 if (n_bytes <= 0 || n_bytes >= rcvbuf) {
 break;
 }
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2 2/4] Replace inet_aton() with inet_pton().

2014-02-21 Thread Gurucharan Shetty
Windows does not have inet_aton(), but does have a inet_pton().
inet_aton() is not defined in POSIX. But inet_pton() is.

Signed-off-by: Gurucharan Shetty 
---
 lib/bfd.c |2 +-
 lib/socket-util.c |4 ++--
 vswitchd/bridge.c |7 ---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index a8c2294..5413105 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -878,7 +878,7 @@ bfd_forwarding__(struct bfd *bfd) OVS_REQUIRES(mutex)
 static bool
 bfd_lookup_ip(const char *host_name, struct in_addr *addr)
 {
-if (!inet_aton(host_name, addr)) {
+if (!inet_pton(AF_INET, host_name, addr)) {
 VLOG_ERR_RL(&rl, "\"%s\" is not a valid IP address", host_name);
 return false;
 }
diff --git a/lib/socket-util.c b/lib/socket-util.c
index 7f385b8..3654951 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -128,7 +128,7 @@ set_dscp(int fd, uint8_t dscp)
 int
 lookup_ip(const char *host_name, struct in_addr *addr)
 {
-if (!inet_aton(host_name, addr)) {
+if (!inet_pton(AF_INET, host_name, addr)) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 VLOG_ERR_RL(&rl, "\"%s\" is not a valid IP address", host_name);
 return ENOENT;
@@ -165,7 +165,7 @@ lookup_hostname(const char *host_name, struct in_addr *addr)
 struct addrinfo *result;
 struct addrinfo hints;
 
-if (inet_aton(host_name, addr)) {
+if (inet_pton(AF_INET, host_name, addr)) {
 return 0;
 }
 
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index e7dd597..aa4ab31 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -2823,7 +2823,8 @@ bridge_configure_local_iface_netdev(struct bridge *br,
 
 /* If there's no local interface or no IP address, give up. */
 local_iface = iface_from_ofp_port(br, OFPP_LOCAL);
-if (!local_iface || !c->local_ip || !inet_aton(c->local_ip, &ip)) {
+if (!local_iface || !c->local_ip
+|| !inet_pton(AF_INET, c->local_ip, &ip)) {
 return;
 }
 
@@ -2833,7 +2834,7 @@ bridge_configure_local_iface_netdev(struct bridge *br,
 
 /* Configure the IP address and netmask. */
 if (!c->local_netmask
-|| !inet_aton(c->local_netmask, &mask)
+|| !inet_pton(AF_INET, c->local_netmask, &mask)
 || !mask.s_addr) {
 mask.s_addr = guess_netmask(ip.s_addr);
 }
@@ -2844,7 +2845,7 @@ bridge_configure_local_iface_netdev(struct bridge *br,
 
 /* Configure the default gateway. */
 if (c->local_gateway
-&& inet_aton(c->local_gateway, &gateway)
+&& inet_pton(AF_INET, c->local_gateway, &gateway)
 && gateway.s_addr) {
 if (!netdev_add_router(netdev, gateway)) {
 VLOG_INFO("bridge %s: configured gateway "IP_FMT,
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] openvswitch-2.0.0 : ovs-vswitchd segfault in netdev_get_etheraddr

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 09:37:10AM -0800, Sridhar Samudrala wrote:
> On 2/21/2014 8:38 AM, Ben Pfaff wrote:
> >On Thu, Feb 20, 2014 at 11:06:07PM -0800, Sridhar Samudrala wrote:
> >>I am seeing a segfault with ovs-vswitchd when a VM attached to a OVS
> >>bridge is shutdown.
> >>I am running openvswitch-2.0.0 on linux kernel:3.10.30
> >>
> >>Is this a known issue with any fix in the git repository?
> >Can you reproduce the problem?
> Yes.  It happens everytime the VM is shutdown and STP is enabled on
> the bridge.
> >  Can you try this fix?
> I had to change ofport->stp_port to port->stp_port to fix the compile error.
> With that change, the following patch fixes the segfault.

Thanks, I applied this fix to master and branches for 2.1, 2.0, 1.11,
and 1.9.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 1/4] socket-util: Move get_max_fds() to process.c.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 01:14:33PM -0800, Gurucharan Shetty wrote:
> get_max_fds() is used only from process.c. Move it there
> along with rlim_is_finite(). Since process_start() can only
> be called before any additional threads are created, we
> no longer need the thread safety checks in get_max_fds().
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 2/4] Replace inet_aton() with inet_pton().

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 01:14:34PM -0800, Gurucharan Shetty wrote:
> Windows does not have inet_aton(), but does have a inet_pton().
> inet_aton() is not defined in POSIX. But inet_pton() is.
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 3/4] socket-util: poll() for Windows.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 01:14:35PM -0800, Gurucharan Shetty wrote:
> +#ifndef _WIN32
>  pfd.events = POLLOUT;
> +#else
> +pfd.events = POLLWRNORM;
> +#endif

Does this mean that Windows has POLLWRNORM but not POLLOUT?  POSIX
says they're equivalent, so we could either use POLLWRNORM everywhere
or just #define POLLOUT POLLWRNORM in poll-loop.h.  I lean toward the
latter, since POLLOUT is more "idiomatic" in my experience.

Otherwise,
Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 4/4] socket-util: Move get_null_fd() to daemon.c.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 01:14:36PM -0800, Gurucharan Shetty wrote:
> get_null_fd() is only called from daemon.c.
> It does not need thread safety features anymore as
> it is called either through daemonize_start() or
> indirectly through daemonize_complete() once.
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Call ovs_refcount_destroy() before free().

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 10:04:53PM -0800, Alex Wang wrote:
> This commit makes dp_netdev_flow_unref() and dp_netdev_actions_unref()
> invoke the ovs_refcount_destroy() before freeing the corresponding
> pointer.
> 
> Signed-off-by: Alex Wang 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] dpif-netdev: Fix memory leak.

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 10:04:54PM -0800, Alex Wang wrote:
> In dpif_netdev_flow_del() and dp_netdev_port_input(), the
> referenced 'netdev_flow' is not un-referenced.  This causes
> the leak of the struct's memory.
> 
> This commit fixes the above issue by calling dp_netdev_flow_unref()
> after using the reference.
> 
> Signed-off-by: Alex Wang 

Please backport these fixes, if needed.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ovs-vswitchd: Don't register for SIGHUP.

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 03:17:15PM -0800, Gurucharan Shetty wrote:
> We are registering an interest in SIGHUP to reopen
> log files. But there is an 'ovs-appctl vlog/reopen'
> command that does the same and is used in the
> logrotate config for the distributions.
> 
> So remove the redundant functionality.
> 
> Signed-off-by: Gurucharan Shetty 

I'm a little worried that some third-party script might be using
SIGHUP for this purpose, so please add an item to NEWS that mentions
this change.

Acked-by: Ben Pfaff 

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] signals: Remove unused functions.

2014-02-21 Thread Ben Pfaff
On Thu, Feb 20, 2014 at 03:17:16PM -0800, Gurucharan Shetty wrote:
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 3/4] socket-util: poll() for Windows.

2014-02-21 Thread Gurucharan Shetty
On Fri, Feb 21, 2014 at 1:39 PM, Ben Pfaff  wrote:
> On Fri, Feb 21, 2014 at 01:14:35PM -0800, Gurucharan Shetty wrote:
>> +#ifndef _WIN32
>>  pfd.events = POLLOUT;
>> +#else
>> +pfd.events = POLLWRNORM;
>> +#endif
>
> Does this mean that Windows has POLLWRNORM but not POLLOUT?  POSIX
> says they're equivalent, so we could either use POLLWRNORM everywhere
> or just #define POLLOUT POLLWRNORM in poll-loop.h.  I lean toward the
> latter, since POLLOUT is more "idiomatic" in my experience.
I see that winsock2.h has (The documentation did not mention it):
#define POLLOUT (POLLWRNORM)

So, I will get rid of the #ifdef

>
> Otherwise,
> Acked-by: Ben Pfaff 
Thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] dpif-netdev: Fix memory leak.

2014-02-21 Thread Alex Wang
>
> > This commit fixes the above issue by calling dp_netdev_flow_unref()
> > after using the reference.
> >
> > Signed-off-by: Alex Wang 
>
> Please backport these fixes, if needed.
>
> Acked-by: Ben Pfaff 
>

Thanks for the review, Ben,

Pushed both patches.  Checked the previous branches, no need to backport.

Alex Wang,
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ovs-vswitchd: Don't register for SIGHUP.

2014-02-21 Thread Gurucharan Shetty
On Fri, Feb 21, 2014 at 1:44 PM, Ben Pfaff  wrote:
> On Thu, Feb 20, 2014 at 03:17:15PM -0800, Gurucharan Shetty wrote:
>> We are registering an interest in SIGHUP to reopen
>> log files. But there is an 'ovs-appctl vlog/reopen'
>> command that does the same and is used in the
>> logrotate config for the distributions.
>>
>> So remove the redundant functionality.
>>
>> Signed-off-by: Gurucharan Shetty 
>
> I'm a little worried that some third-party script might be using
> SIGHUP for this purpose, so please add an item to NEWS that mentions
> this change.
How about the following incremental?

diff --git a/NEWS b/NEWS
index 34dde6b..26d731f 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,9 @@ Post-v2.1.0
  poorly maintained and not widely used.
- New "check-ryu" Makefile target for running Ryu tests for OpenFlow
  controllers against Open vSwitch.  See INSTALL for details.
+   - Upon the receipt of a SIGHUP signal, ovs-vswitchd no longer reopens its
+ log file (it will terminate instead). Please use 'ovs-appctl vlog/reopen'
+ instead.

 v2.1.0 - xx xxx 
 -


>
> Acked-by: Ben Pfaff 
>
> Thanks,
>
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] ovs-vswitchd: Don't register for SIGHUP.

2014-02-21 Thread Ben Pfaff
On Fri, Feb 21, 2014 at 02:56:00PM -0800, Gurucharan Shetty wrote:
> On Fri, Feb 21, 2014 at 1:44 PM, Ben Pfaff  wrote:
> > On Thu, Feb 20, 2014 at 03:17:15PM -0800, Gurucharan Shetty wrote:
> >> We are registering an interest in SIGHUP to reopen
> >> log files. But there is an 'ovs-appctl vlog/reopen'
> >> command that does the same and is used in the
> >> logrotate config for the distributions.
> >>
> >> So remove the redundant functionality.
> >>
> >> Signed-off-by: Gurucharan Shetty 
> >
> > I'm a little worried that some third-party script might be using
> > SIGHUP for this purpose, so please add an item to NEWS that mentions
> > this change.
> How about the following incremental?

That's good, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [rwlock 3/5] ofproto-dpif-xlate: Avoid recursively taking read side of ofgroup rwlock.

2014-02-21 Thread Joe Stringer
Looks good to me.

Acked-by: Joe Stringer 


On 21 February 2014 10:46, Ben Pfaff  wrote:

> On Wed, Feb 19, 2014 at 02:53:11PM -0800, Joe Stringer wrote:
> > This looks good to me, although aren't we meant to report something at
> the
> > OpenFlow layer?
> >
> > Is this equivalent to the OpenFlow "chaining" description? So if we don't
> > support chaining groups together, we should return an error message with
> > reason OFPGMFC_CHAINING_UNSUPPORTED, rather than accepting the rules then
> > misbehaving?
>
> You're right.  Here's a version that does that, and adds a test.
>
> --8<--cut here-->8--
>
> From: Ben Pfaff 
> Date: Fri, 21 Feb 2014 10:45:00 -0800
> Subject: [PATCH] ofproto-dpif-xlate: Avoid recursively taking read side of
>  ofgroup rwlock.
>
> With glibc, rwlocks by default allow recursive read-locking even if a
> thread is blocked waiting for the write-lock.  POSIX allows such attempts
> to deadlock, and it appears that the libc used in NetBSD, at least, does
> deadlock.  ofproto-dpif-xlate could take the ofgroup rwlock recursively if
> an ofgroup's actions caused the ofgroup to be executed again.  This commit
> avoids that issue by preventing recursive translation of groups (the same
> group or another group).  This is not the most user friendly solution,
> but OpenFlow allows this restriction, and we can always remove the
> restriction later (probably requiring more complicated code) if it
> proves to be a real problem to real users.
>
> Found by inspection.
>
> Signed-off-by: Ben Pfaff 
> ---
>  ofproto/ofproto-dpif-xlate.c |   32 +++-
>  ofproto/ofproto-dpif.c   |   14 ++
>  tests/ofproto-dpif.at|   11 +++
>  3 files changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index ccf0b75..89d92af 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -178,6 +178,7 @@ struct xlate_ctx {
>  /* Resubmit statistics, via xlate_table_action(). */
>  int recurse;/* Current resubmit nesting depth. */
>  int resubmits;  /* Total number of resubmits. */
> +bool in_group;  /* Currently translating ofgroup, if
> true. */
>
>  uint32_t orig_skb_priority; /* Priority when packet arrived. */
>  uint8_t table_id;   /* OpenFlow table ID where flow was
> found. */
> @@ -1980,6 +1981,8 @@ xlate_select_group(struct xlate_ctx *ctx, struct
> group_dpif *group)
>  static void
>  xlate_group_action__(struct xlate_ctx *ctx, struct group_dpif *group)
>  {
> +ctx->in_group = true;
> +
>  switch (group_dpif_get_type(group)) {
>  case OFPGT11_ALL:
>  case OFPGT11_INDIRECT:
> @@ -1995,12 +1998,38 @@ xlate_group_action__(struct xlate_ctx *ctx, struct
> group_dpif *group)
>  OVS_NOT_REACHED();
>  }
>  group_dpif_release(group);
> +
> +ctx->in_group = false;
> +}
> +
> +static bool
> +xlate_group_resource_check(struct xlate_ctx *ctx)
> +{
> +if (!xlate_resubmit_resource_check(ctx)) {
> +return false;
> +} else if (ctx->in_group) {
> +/* Prevent nested translation of OpenFlow groups.
> + *
> + * OpenFlow allows this restriction.  We enforce this restriction
> only
> + * because, with the current architecture, we would otherwise
> have to
> + * take a possibly recursive read lock on the ofgroup rwlock,
> which is
> + * unsafe given that POSIX allows taking a read lock to block if
> there
> + * is a thread blocked on taking the write lock.  Other solutions
> + * without this restriction are also possible, but seem
> unwarranted
> + * given the current limited use of groups. */
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> +
> +VLOG_ERR_RL(&rl, "cannot recursively translate OpenFlow group");
> +return false;
> +} else {
> +return true;
> +}
>  }
>
>  static bool
>  xlate_group_action(struct xlate_ctx *ctx, uint32_t group_id)
>  {
> -if (xlate_resubmit_resource_check(ctx)) {
> +if (xlate_group_resource_check(ctx)) {
>  struct group_dpif *group;
>  bool got_group;
>
> @@ -2974,6 +3003,7 @@ xlate_actions__(struct xlate_in *xin, struct
> xlate_out *xout)
>
>  ctx.recurse = 0;
>  ctx.resubmits = 0;
> +ctx.in_group = false;
>  ctx.orig_skb_priority = flow->skb_priority;
>  ctx.table_id = 0;
>  ctx.exit = false;
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 328b215..7c5b941 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -3291,6 +3291,20 @@ static enum ofperr
>  group_construct(struct ofgroup *group_)
>  {
>  struct group_dpif *group = group_dpif_cast(group_);
> +const struct ofputil_bucket *bucket;
> +
> +/* Prevent group chaining because our locking structure makes it hard

Re: [ovs-dev] [rwlock 3/5] ofproto-dpif-xlate: Avoid recursively taking read side of ofgroup rwlock.

2014-02-21 Thread Ben Pfaff
Thanks.  I applied this and patch 5 (the only other remaining patch).

On Fri, Feb 21, 2014 at 03:16:28PM -0800, Joe Stringer wrote:
> Looks good to me.
> 
> Acked-by: Joe Stringer 
> 
> 
> On 21 February 2014 10:46, Ben Pfaff  wrote:
> 
> > On Wed, Feb 19, 2014 at 02:53:11PM -0800, Joe Stringer wrote:
> > > This looks good to me, although aren't we meant to report something at
> > the
> > > OpenFlow layer?
> > >
> > > Is this equivalent to the OpenFlow "chaining" description? So if we don't
> > > support chaining groups together, we should return an error message with
> > > reason OFPGMFC_CHAINING_UNSUPPORTED, rather than accepting the rules then
> > > misbehaving?
> >
> > You're right.  Here's a version that does that, and adds a test.
> >
> > --8<--cut here-->8--
> >
> > From: Ben Pfaff 
> > Date: Fri, 21 Feb 2014 10:45:00 -0800
> > Subject: [PATCH] ofproto-dpif-xlate: Avoid recursively taking read side of
> >  ofgroup rwlock.
> >
> > With glibc, rwlocks by default allow recursive read-locking even if a
> > thread is blocked waiting for the write-lock.  POSIX allows such attempts
> > to deadlock, and it appears that the libc used in NetBSD, at least, does
> > deadlock.  ofproto-dpif-xlate could take the ofgroup rwlock recursively if
> > an ofgroup's actions caused the ofgroup to be executed again.  This commit
> > avoids that issue by preventing recursive translation of groups (the same
> > group or another group).  This is not the most user friendly solution,
> > but OpenFlow allows this restriction, and we can always remove the
> > restriction later (probably requiring more complicated code) if it
> > proves to be a real problem to real users.
> >
> > Found by inspection.
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> >  ofproto/ofproto-dpif-xlate.c |   32 +++-
> >  ofproto/ofproto-dpif.c   |   14 ++
> >  tests/ofproto-dpif.at|   11 +++
> >  3 files changed, 56 insertions(+), 1 deletion(-)
> >
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index ccf0b75..89d92af 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -178,6 +178,7 @@ struct xlate_ctx {
> >  /* Resubmit statistics, via xlate_table_action(). */
> >  int recurse;/* Current resubmit nesting depth. */
> >  int resubmits;  /* Total number of resubmits. */
> > +bool in_group;  /* Currently translating ofgroup, if
> > true. */
> >
> >  uint32_t orig_skb_priority; /* Priority when packet arrived. */
> >  uint8_t table_id;   /* OpenFlow table ID where flow was
> > found. */
> > @@ -1980,6 +1981,8 @@ xlate_select_group(struct xlate_ctx *ctx, struct
> > group_dpif *group)
> >  static void
> >  xlate_group_action__(struct xlate_ctx *ctx, struct group_dpif *group)
> >  {
> > +ctx->in_group = true;
> > +
> >  switch (group_dpif_get_type(group)) {
> >  case OFPGT11_ALL:
> >  case OFPGT11_INDIRECT:
> > @@ -1995,12 +1998,38 @@ xlate_group_action__(struct xlate_ctx *ctx, struct
> > group_dpif *group)
> >  OVS_NOT_REACHED();
> >  }
> >  group_dpif_release(group);
> > +
> > +ctx->in_group = false;
> > +}
> > +
> > +static bool
> > +xlate_group_resource_check(struct xlate_ctx *ctx)
> > +{
> > +if (!xlate_resubmit_resource_check(ctx)) {
> > +return false;
> > +} else if (ctx->in_group) {
> > +/* Prevent nested translation of OpenFlow groups.
> > + *
> > + * OpenFlow allows this restriction.  We enforce this restriction
> > only
> > + * because, with the current architecture, we would otherwise
> > have to
> > + * take a possibly recursive read lock on the ofgroup rwlock,
> > which is
> > + * unsafe given that POSIX allows taking a read lock to block if
> > there
> > + * is a thread blocked on taking the write lock.  Other solutions
> > + * without this restriction are also possible, but seem
> > unwarranted
> > + * given the current limited use of groups. */
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> > +
> > +VLOG_ERR_RL(&rl, "cannot recursively translate OpenFlow group");
> > +return false;
> > +} else {
> > +return true;
> > +}
> >  }
> >
> >  static bool
> >  xlate_group_action(struct xlate_ctx *ctx, uint32_t group_id)
> >  {
> > -if (xlate_resubmit_resource_check(ctx)) {
> > +if (xlate_group_resource_check(ctx)) {
> >  struct group_dpif *group;
> >  bool got_group;
> >
> > @@ -2974,6 +3003,7 @@ xlate_actions__(struct xlate_in *xin, struct
> > xlate_out *xout)
> >
> >  ctx.recurse = 0;
> >  ctx.resubmits = 0;
> > +ctx.in_group = false;
> >  ctx.orig_skb_priority = flow->skb_priority;
> >  ctx.table_id = 0;
> >  ctx.exit = false;
> > diff --git a/ofproto/ofproto-dpif.c b/ofproto

[ovs-dev] [PATCH] lib/process, socket-util: Update necessary headers

2014-02-21 Thread YAMAMOTO Takashi
Fix the regression introduced by commit 4f57ad10.
("socket-util: Move get_max_fds() to process.c")

Signed-off-by: YAMAMOTO Takashi 
---
 lib/process.c | 1 +
 lib/socket-util.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/process.c b/lib/process.c
index b479d00..313f11f 100644
--- a/lib/process.c
+++ b/lib/process.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/lib/socket-util.c b/lib/socket-util.c
index 6bc5d2c..eec2713 100644
--- a/lib/socket-util.c
+++ b/lib/socket-util.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] lib/process, socket-util: Update necessary headers

2014-02-21 Thread Gurucharan Shetty
On Fri, Feb 21, 2014 at 5:33 PM, YAMAMOTO Takashi
 wrote:
> Fix the regression introduced by commit 4f57ad10.
> ("socket-util: Move get_max_fds() to process.c")
>
> Signed-off-by: YAMAMOTO Takashi 
I suppose this did not fail the build? Otherwise looks like the
correct thing to do.

> ---
>  lib/process.c | 1 +
>  lib/socket-util.c | 1 -
>  2 files changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/process.c b/lib/process.c
> index b479d00..313f11f 100644
> --- a/lib/process.c
> +++ b/lib/process.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/lib/socket-util.c b/lib/socket-util.c
> index 6bc5d2c..eec2713 100644
> --- a/lib/socket-util.c
> +++ b/lib/socket-util.c
> @@ -27,7 +27,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> --
> 1.8.3.1
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ofproto: Send port status message for port-mods, right away.

2014-02-21 Thread Kmindg G
On Sat, Feb 22, 2014 at 12:42 AM, Ben Pfaff  wrote:
> On Fri, Feb 21, 2014 at 10:07:14AM +0800, Kmindg G wrote:
>> On Fri, Feb 21, 2014 at 5:19 AM, Ben Pfaff  wrote:
>> > On Thu, Feb 20, 2014 at 12:45:49PM +0800, Kmindg G wrote:
>> >> On Thu, Feb 20, 2014 at 3:20 AM, Ben Pfaff  wrote:
>> >> > Until now, when it processes OFPT_PORT_MOD message, Open vSwitch has 
>> >> > waited
>> >> > for ports to notify it that their status has changed before it sends a
>> >> > port status update to controllers.
>> >> >
>> >> > Also, Open vSwitch never sent port config updates at all for port
>> >> > modifications other than OFPPC_PORT_DOWN.  I guess that this is a relic
>> >> > from the era when there was only one controller, since presumably the
>> >> > controller already knew that it changed the port configuration.  But in 
>> >> > the
>> >> > multi-controller world, it makes sense to send such port status updates,
>> >> > and I couldn't quickly find anything in OF1.3 or OF1.4 that said they
>> >> > shouldn't be sent.
>> >> >
>> >> > Signed-off-by: Ben Pfaff 
>> >> > Reported-by: Kmindg G 
>> >> > ---
>> >> >  AUTHORS   |1 +
>> >> >  ofproto/ofproto.c |   25 +
>> >> >  2 files changed, 14 insertions(+), 12 deletions(-)
>> >> >
>> >> > diff --git a/AUTHORS b/AUTHORS
>> >> > index c557303..2fda8d7 100644
>> >> > --- a/AUTHORS
>> >> > +++ b/AUTHORS
>> >> > @@ -197,6 +197,7 @@ John Hurley john.hur...@netronome.com
>> >> >  Kevin Mancuso   kevin.manc...@rackspace.com
>> >> >  Kiran Shanbhog  ki...@vmware.com
>> >> >  Kirill Kabardin
>> >> > +Kmindg Gkmi...@gmail.com
>> >> >  Koichi Yagishitayagishita.koi...@jrc.co.jp
>> >> >  Konstantin Khorenko khore...@openvz.org
>> >> >  Kris zhang  zhang.k...@gmail.com
>> >> > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
>> >> > index 62c97a2..48e10ca 100644
>> >> > --- a/ofproto/ofproto.c
>> >> > +++ b/ofproto/ofproto.c
>> >> > @@ -2986,22 +2986,23 @@ update_port_config(struct ofport *port,
>> >> > enum ofputil_port_config config,
>> >> > enum ofputil_port_config mask)
>> >> >  {
>> >> > -enum ofputil_port_config old_config = port->pp.config;
>> >> > -enum ofputil_port_config toggle;
>> >> > -
>> >> > -toggle = (config ^ port->pp.config) & mask;
>> >> > -if (toggle & OFPUTIL_PC_PORT_DOWN) {
>> >> > -if (config & OFPUTIL_PC_PORT_DOWN) {
>> >> > -netdev_turn_flags_off(port->netdev, NETDEV_UP, NULL);
>> >> > -} else {
>> >> > -netdev_turn_flags_on(port->netdev, NETDEV_UP, NULL);
>> >> > -}
>> >> > +enum ofputil_port_config toggle = (config ^ port->pp.config) & 
>> >> > mask;
>> >> > +
>> >> > +if (toggle & OFPUTIL_PC_PORT_DOWN
>> >> > +&& (config & OFPUTIL_PC_PORT_DOWN
>> >> > +? netdev_turn_flags_off(port->netdev, NETDEV_UP, NULL)
>> >> > +: netdev_turn_flags_on(port->netdev, NETDEV_UP, NULL))) {
>> >> > +/* We tried to bring the port up or down, but it failed, so 
>> >> > don't
>> >> > + * update the "down" bit. */
>> >> >  toggle &= ~OFPUTIL_PC_PORT_DOWN;
>> >> >  }
>> >> >
>> >> > -port->pp.config ^= toggle;
>> >> > -if (port->pp.config != old_config) {
>> >> > +if (toggle) {
>> >> > +enum ofputil_port_config old_config = port->pp.config;
>> >> > +port->pp.config ^= toggle;
>> >> >  port->ofproto->ofproto_class->port_reconfigured(port, 
>> >> > old_config);
>> >> > +connmgr_send_port_status(port->ofproto->connmgr, &port->pp,
>> >> > + OFPPR_MODIFY);
>> >> >  }
>> >> >  }
>> >> >
>> >> > --
>> >> > 1.7.10.4
>> >> >
>> >>
>> >> looks good to me.
>> >
>> > Thanks.  What do you think of this version?  I think that it retains
>> > better compatibility with OpenFlow 1.0 in particular.
>> >
>> > --8<--cut here-->8--
>> >
>> > From: Ben Pfaff 
>> > Date: Thu, 20 Feb 2014 13:18:51 -0800
>> > Subject: [PATCH] ofproto: Send port status message for port-mods, right 
>> > away.
>> >
>> > Until now, when it processes OFPT_PORT_MOD message, Open vSwitch has waited
>> > for ports to notify it that their status has changed before it sends a
>> > port status update to controllers.
>> >
>> > Also, Open vSwitch never sent port config updates at all for port
>> > modifications other than OFPPC_PORT_DOWN.  I guess that this is a relic
>> > from the era when there was only one controller, since presumably the
>> > controller already knew that it changed the port configuration.  But in the
>> > multi-controller world, it makes sense to send such port status updates,
>> > and I couldn't quickly find anything in OF1.3 or OF1.4 that said they
>> > shouldn't be sent.
>> >
>> > Signed-off-by: Ben Pfaff 
>> > Reported-by: Kmindg G 
>> > ---
>> >  ofproto/connmgr.c | 30 --
>> >  ofproto