[ovs-dev] Returned mail: see transcript for details

2015-08-21 Thread Mail Delivery Subsystem
This message was not delivered due to the following reason(s):

Your message could not be delivered because the destination computer was
unreachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local configura-
tion parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message was not delivered within 4 days:
Host 12.240.189.187 is not responding.

The following recipients could not receive this message:


Please reply to postmas...@openvswitch.org
if you feel this message to be in error.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Try it now to show her who's her daddy!

2015-08-21 Thread Carissa




Make sure that you made your woman happy tonight.

style2
{   text-align: center; color: #D90003; font-family: 
Cambria, 'Hoefler Text', 'Liberation Serif', Times, 'Times New Roman', serif; 
font-size: 20px; font-weight: bold;
}
style4
{border: 1px solid #707070;
}
style5
{text-align: center; color: #D90003; font-family: Cambria, 'Hoefler Text', 
'Liberation Serif', Times, 'Times New Roman', serif; font-size: 18px; 
font-weight: bold;
}
h1{color:#606060 !important;
display:block;
font-family:Helvetica;
font-size:20px;
font-style:normal;
font-weight:bold;
line-height:180%;
letter-spacing:0px;
margin:0;
text-align:center;
}





  

   
   If you decided to take medications to treat your erectile dysfunction, 
choose the best one! 
  
  
   


   
  
  
If your marriage needs a new breath, breathe together in your bedroom

  

  - Free pills only for You!

  - Free shipping

  
  
  
Only this week! 
Special discount - SAVE 95%
  

  




   


   
  



It will help, we have got it!   
   


  
   


   
  Billing is done under a discreet name for your privacy
   


   
  





unsubscribe from this list
   

  



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 1/9] classifier: Fix comment.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 07, 2015 at 04:57:34PM -0700, Jarno Rajahalme wrote:
> Signed-off-by: Jarno Rajahalme 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 3/9] test-classifier: Add benchmark.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 07, 2015 at 04:57:36PM -0700, Jarno Rajahalme wrote:
> Add a benchmark command for classifier lookup performance testing.
>
> Usage:

This usage note is good, but putting it just in the commit log will mean
that it gets lost.  It should be in a --help message, or failing that in
a source code comment.

I'm not sure I believe in the realism of random priorities.  Random
priorities are are worst case for the optimization that skips subtables
based on priorities.  Our NSDI paper showed that subtables tend to have
a small number of priorities (often just one) in practice.

If n_rules < n_subtables, or if the random numbers come out just right,
then I think that the classifier will have fewer than the requested
number of subtables.  Also, if the same rule is generated more than
once, the classifier will have fewer than the requested number of rules.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 3/9] test-classifier: Add benchmark.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 07:37:51AM -0700, Ben Pfaff wrote:
> On Fri, Aug 07, 2015 at 04:57:36PM -0700, Jarno Rajahalme wrote:
> > Add a benchmark command for classifier lookup performance testing.
> >
> > Usage:
> 
> This usage note is good, but putting it just in the commit log will mean
> that it gets lost.  It should be in a --help message, or failing that in
> a source code comment.
> 
> I'm not sure I believe in the realism of random priorities.  Random
> priorities are are worst case for the optimization that skips subtables
> based on priorities.  Our NSDI paper showed that subtables tend to have
> a small number of priorities (often just one) in practice.
> 
> If n_rules < n_subtables, or if the random numbers come out just right,
> then I think that the classifier will have fewer than the requested
> number of subtables.  Also, if the same rule is generated more than
> once, the classifier will have fewer than the requested number of rules.
> 
> Acked-by: Ben Pfaff 

Also a minor spelling fix:

diff --git a/tests/test-classifier.c b/tests/test-classifier.c
index 54b595f..b2d4afd 100644
--- a/tests/test-classifier.c
+++ b/tests/test-classifier.c
@@ -1302,7 +1302,7 @@ run_benchmarks(struct ovs_cmdl_context *ctx)
 n_lookups = strtol(ctx->argv[5], NULL, 10);
 
 printf("\nBenchmarking with:\n"
-   "%d rules with %d prioritites in %d tables, "
+   "%d rules with %d priorities in %d tables, "
"%d threads doing %d lookups each\n",
n_rules, n_priorities, n_tables, n_threads, n_lookups);
 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] Prepare for 2.4.1.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 20, 2015 at 05:28:26PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

*Finally*.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/2] Set release dates for 2.4.0.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 20, 2015 at 05:28:25PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

Hurray.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] classifier: Replace MINIFLOW_IN_MAP macro by function.

2015-08-21 Thread Ben Pfaff
Functions are nicer.

Signed-off-by: Ben Pfaff 
---
 lib/classifier.c |  2 +-
 lib/flow.h   | 16 ++--
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/classifier.c b/lib/classifier.c
index e6227b9..a91d936 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -1680,7 +1680,7 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], 
unsigned int n_tries,
 uint8_t be64ofs = be32ofs / 2;
 
 /* Is the trie field within the current range of fields? */
-if (MINIFLOW_IN_MAP(range_map, be64ofs)) {
+if (miniflow_in_map(range_map, be64ofs)) {
 /* On-demand trie lookup. */
 if (!ctx->lookup_done) {
 memset(&ctx->match_plens, 0, sizeof ctx->match_plens);
diff --git a/lib/flow.h b/lib/flow.h
index d0a354c..da09c90 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -599,15 +599,19 @@ miniflow_get__(const struct miniflow *mf, size_t u64_idx)
 : miniflow_values_get__(miniflow_get_values(mf), mf->tnl_map, u64_idx);
 }
 
-#define MINIFLOW_IN_MAP(MF, U64_IDX)\
-(OVS_LIKELY(U64_IDX >= FLOW_TNL_U64S)   \
- ? (MF)->pkt_map & (UINT64_C(1) << ((U64_IDX) - FLOW_TNL_U64S)) \
- : (MF)->tnl_map & (UINT64_C(1) << (U64_IDX)))
+static inline bool
+miniflow_in_map(const struct miniflow *mf, size_t u64_idx)
+{
+return (OVS_LIKELY(u64_idx >= FLOW_TNL_U64S)
+? mf->pkt_map & (UINT64_C(1) << (u64_idx - FLOW_TNL_U64S))
+: mf->tnl_map & (UINT64_C(1) << u64_idx)) != 0;
+
+}
 
 /* Get the value of 'FIELD' of an up to 8 byte wide integer type 'TYPE' of
  * a miniflow. */
 #define MINIFLOW_GET_TYPE(MF, TYPE, OFS)\
-(MINIFLOW_IN_MAP(MF, (OFS) / sizeof(uint64_t))  \
+(miniflow_in_map(MF, (OFS) / sizeof(uint64_t))  \
  ? ((OVS_FORCE const TYPE *)miniflow_get__(MF, (OFS) / sizeof(uint64_t))) \
  [(OFS) % sizeof(uint64_t) / sizeof(TYPE)]  \
  : 0)
@@ -696,7 +700,7 @@ minimask_is_catchall(const struct minimask *mask)
 static inline uint64_t miniflow_get(const struct miniflow *flow,
 unsigned int u64_ofs)
 {
-return MINIFLOW_IN_MAP(flow, u64_ofs)
+return miniflow_in_map(flow, u64_ofs)
 ? *miniflow_get__(flow, u64_ofs) : 0;
 }
 
-- 
2.1.3

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 4/9] classifier: Pre-compute stage masks.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 07, 2015 at 04:57:37PM -0700, Jarno Rajahalme wrote:
> This makes stage mask computation happen only when a subtable is
> inserted and allows simplification of the main lookup function.
> 
> Classifier benchmark shows that this speeds up the classification
> (with wildcards) about 5%.
> 
> Signed-off-by: Jarno Rajahalme 

I don't think that flow_hash_in_minimask_range() actually does what its
name implies anymore (where's the range?).  It at least needs a comment
update since the comment still talks about parameters it doesn't have.
Ditto for minimatch_hash_range().

In a construction like this:
if (map->tnl_map) {
MAP_FOR_EACH_INDEX(idx, map->tnl_map) {
dst_u64[idx] |= *p++;
}
}
it appears to me that the 'if' is redundant.  In one case I see there's
an OVS_UNLIKELY on the 'if' but I wonder whether that's sufficient
benefit.

If we believe that tnl_map is usually 0, then it would be profitable to
test pkt_map first below to allow short-circuiting to bail out earlier:

static inline bool
miniflow_equal_maps(const struct miniflow *a, const struct miniflow *b)
{
return a->tnl_map == b->tnl_map && a->pkt_map == b->pkt_map;
}


While reading code, I came up with some additional comments that are not
directly related to the changes here.

First, classifier.c has a lot of "static inline" functions.  Does the
"inline" actually produce measurable performance improvement?  If not,
then it's better to avoid "inline" in .c files since it suppresses
otherwise useful "unused function" warnings.

Second, the amount of duplicated code because of tnl_map versus pkt_map
is starting to bug me.  If these were just a 2-element array, for
example, then miniflow_and_mask_matches_flow_wc() could be in my opinion
tons cleaner as something like this:

{
const uint64_t *flowp = miniflow_get_values(flow);
const uint64_t *maskp = miniflow_get_values(&mask->masks);
const uint64_t *target_u64 = (const uint64_t *)target;
uint64_t *wc_u64 = (uint64_t *)&wc->masks;

for (int i = 0; i < 2; i++) {
size_t idx;

MAP_FOR_EACH_INDEX(idx, mask->masks.maps[i]) {
uint64_t msk = *maskp++;

uint64_t diff = (*flowp++ ^ target_u64[idx]) & msk;
if (diff) {
/* Only unwildcard if none of the differing bits is already
 * exact-matched. */
if (!(wc_u64[idx] & diff)) {
/* Keep one bit of the difference.  The selected bit may be
 * different in big-endian v.s. little-endian systems. */
wc_u64[idx] |= rightmost_1bit(diff);
}
return false;
}

/* Fill in the bits that were looked at. */
wc_u64[idx] |= msk;
}
target_u64 += FLOW_TNL_U64S;
wc_u64 += FLOW_TNL_U64S;
}

return true;
}

I noticed that MINIFLOW_IN_MAP could be a function, so I send out a
patch (which applies following this one):
http://openvswitch.org/pipermail/dev/2015-August/059029.html
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 8/9] flow: Add struct flowmap.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 07, 2015 at 04:57:41PM -0700, Jarno Rajahalme wrote:
> Struct miniflow is now sometimes used just as a map.  Define a new
> struct flowmap for that purpose.  The flowmap is defined as an array of
> maps, and it is automatically sized according to the size of struct
> flow, so it will be easier to maintain in the future.
> 
> It would have been tempting to use the existing struct bitmap for this
> purpose. The main reason this is not feasible at the moment is that
> some flowmap algorithms are simpler when it can be assumed that no
> struct flow member requires more bits than can fit to a single map
> unit. The tunnel member already requires more than 32 bits, so the map
> unit needs to be 64 bits wide.
> 
> Performance critical algorithms enumerate the flowmap array units
> explicitly, as it is easier for the compiler to optimize, compared to
> the normal iterator.  Without this optimization a classifier lookup
> without wildcard masks would be about 25% slower.
> 
> With this more general (and maintainable) algorithm the classifier
> lookups are about 5% slower, when the struct flow actually becomes big
> enough to require a second map.  This negates the performance gained
> in the "Pre-compute stage masks" patch earlier in the series.
> 
> Requested-by: Ben Pfaff 
> Signed-off-by: Jarno Rajahalme 

Having some trouble applying this, would you mind reposting what remains
of the series?

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/8] lib/dynamic-string: coding style fix

2015-08-21 Thread Ben Pfaff
On Tue, Aug 11, 2015 at 05:55:07PM -0700, Andy Zhou wrote:
> Remove tabs per coding style
> 
> Signed-off-by: Andy Zhou 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/8] lib/ofpbuf: refactor ofpbuf_use__() API

2015-08-21 Thread Ben Pfaff
On Tue, Aug 11, 2015 at 05:55:08PM -0700, Andy Zhou wrote:
> Add the size to its parameter list.
> 
> Signed-off-by: Andy Zhou 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] jsonrpc: use json_to_ds to speed up jsonrpc_send

2015-08-21 Thread Ben Pfaff
On Tue, Aug 11, 2015 at 05:55:10PM -0700, Andy Zhou wrote:
> This change reuses the string length that available from 'ds', saving
> a strlen() call.
> 
> Signed-off-by: Andy Zhou 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] classifier: Replace MINIFLOW_IN_MAP macro by function.

2015-08-21 Thread Justin Pettit
Don't let Keith see that comment.

Acked-by: Justin Pettit 

--Justin


> On Aug 21, 2015, at 8:26 AM, Ben Pfaff  wrote:
> 
> Functions are nicer.
> 
> Signed-off-by: Ben Pfaff 
> ---
> lib/classifier.c |  2 +-
> lib/flow.h   | 16 ++--
> 2 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/classifier.c b/lib/classifier.c
> index e6227b9..a91d936 100644
> --- a/lib/classifier.c
> +++ b/lib/classifier.c
> @@ -1680,7 +1680,7 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], 
> unsigned int n_tries,
> uint8_t be64ofs = be32ofs / 2;
> 
> /* Is the trie field within the current range of fields? */
> -if (MINIFLOW_IN_MAP(range_map, be64ofs)) {
> +if (miniflow_in_map(range_map, be64ofs)) {
> /* On-demand trie lookup. */
> if (!ctx->lookup_done) {
> memset(&ctx->match_plens, 0, sizeof ctx->match_plens);
> diff --git a/lib/flow.h b/lib/flow.h
> index d0a354c..da09c90 100644
> --- a/lib/flow.h
> +++ b/lib/flow.h
> @@ -599,15 +599,19 @@ miniflow_get__(const struct miniflow *mf, size_t 
> u64_idx)
> : miniflow_values_get__(miniflow_get_values(mf), mf->tnl_map, 
> u64_idx);
> }
> 
> -#define MINIFLOW_IN_MAP(MF, U64_IDX)\
> -(OVS_LIKELY(U64_IDX >= FLOW_TNL_U64S)   \
> - ? (MF)->pkt_map & (UINT64_C(1) << ((U64_IDX) - FLOW_TNL_U64S)) \
> - : (MF)->tnl_map & (UINT64_C(1) << (U64_IDX)))
> +static inline bool
> +miniflow_in_map(const struct miniflow *mf, size_t u64_idx)
> +{
> +return (OVS_LIKELY(u64_idx >= FLOW_TNL_U64S)
> +? mf->pkt_map & (UINT64_C(1) << (u64_idx - FLOW_TNL_U64S))
> +: mf->tnl_map & (UINT64_C(1) << u64_idx)) != 0;
> +
> +}
> 
> /* Get the value of 'FIELD' of an up to 8 byte wide integer type 'TYPE' of
>  * a miniflow. */
> #define MINIFLOW_GET_TYPE(MF, TYPE, OFS)\
> -(MINIFLOW_IN_MAP(MF, (OFS) / sizeof(uint64_t))  \
> +(miniflow_in_map(MF, (OFS) / sizeof(uint64_t))  \
>  ? ((OVS_FORCE const TYPE *)miniflow_get__(MF, (OFS) / sizeof(uint64_t))) 
> \
>  [(OFS) % sizeof(uint64_t) / sizeof(TYPE)]  \
>  : 0)
> @@ -696,7 +700,7 @@ minimask_is_catchall(const struct minimask *mask)
> static inline uint64_t miniflow_get(const struct miniflow *flow,
> unsigned int u64_ofs)
> {
> -return MINIFLOW_IN_MAP(flow, u64_ofs)
> +return miniflow_in_map(flow, u64_ofs)
> ? *miniflow_get__(flow, u64_ofs) : 0;
> }
> 
> -- 
> 2.1.3
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/2] Prepare for 2.4.1.

2015-08-21 Thread Justin Pettit

> On Aug 21, 2015, at 7:38 AM, Ben Pfaff  wrote:
> 
> On Thu, Aug 20, 2015 at 05:28:26PM -0700, Justin Pettit wrote:
>> Signed-off-by: Justin Pettit 
> 
> *Finally*.
> 
> Acked-by: Ben Pfaff 

Amen.  I pushed the series and uploaded the binaries.  Can you tag the release 
to the previous commit?  I'll send out the announcement soon.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH/RFC] ovs-ctl: do not attempt to restore flows when called with --delete-bridges

2015-08-21 Thread Gurucharan Shetty
On Thu, Aug 20, 2015 at 4:45 PM, Simon Horman
 wrote:
> When called with --delete-bridges saved flows cannot be restored as the
> bridges to which they belong no longer exist. This results in the following
> error messages on restart.
>
> ovs-ofctl: br0 is not a bridge or a socket
> Restoring saved flows ... failed!
>
> Although there is no effect of this error other than the message
> it seems worth avoiding. This patch does so by skipping saving of flows
> when --delete-bridges is in effect.
>
> As flows are no longer saved when --delete-bridges is in effect
> a side-effect of this change is that restart may be faster when
> there are many flows.
>
> Signed-off-by: Simon Horman 
Acked-by: Gurucharan Shetty 
 (I will apply this in a few hours, if no one else has any comments)

> ---
>  utilities/ovs-ctl.in | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
> index 99d789c68b0a..918be2c9c188 100755
> --- a/utilities/ovs-ctl.in
> +++ b/utilities/ovs-ctl.in
> @@ -322,6 +322,12 @@ save_ofports_if_required () {
>  esac
>  }
>
> +save_flows_if_required () {
> +if test X"$DELETE_BRIDGES" != Xyes; then
> +action "Saving flows" ovs_save save-flows "${script_flows}"
> +fi
> +}
> +
>  save_interfaces () {
>  "$datadir/scripts/ovs-save" save-interfaces ${ifaces} \
>  > "${script_interfaces}"
> @@ -372,9 +378,7 @@ force_reload_kmod () {
>  action "Detected internal interfaces: $ifaces" true
>
>  init_restore_scripts
> -
> -action "Saving flows" ovs_save save-flows "${script_flows}"
> -
> +save_flows_if_required
>  save_ofports_if_required
>
>  # Restart the database first, since a large database may take a
> @@ -450,7 +454,7 @@ restart () {
>  if daemon_is_running ovsdb-server && daemon_is_running ovs-vswitchd; then
>  init_restore_scripts
>  save_interfaces_if_required
> -action "Saving flows" ovs_save save-flows "${script_flows}"
> +save_flows_if_required
>  save_ofports_if_required
>  fi
>
> --
> 2.1.4
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovsdb-idl: Add additional support for change tracking.

2015-08-21 Thread Ansari, Shad
Ovsdb-idl notifies a client that something changed; it does not track
which table, row changed in what way (insert, modify or delete).
As a result, a client has to scan or reconfigure the entire idl after
ovsdb_idl_run(). This is presumably fine for typical ovs schemas where
tables are relatively small. In use-cases where ovsdb is used with schemas
that can have very large tables, the current ovsdb-idl notification
mechanism does not appear to scale - clients need to do a lot of
processing to determine the exact change delta.

This change adds support for:
- Table and row based change sequence numbers to record the
   most recent IDL change sequence numbers associated with insert,
   modify or delete update on that table or row.
- Allow change tracking of specific columns. This ensures that changed
   rows (inserted, modified, deleted) that have tracked columns, are
   tracked by IDL. The client can directly access the changed rows
   with get_first, get_next operations without the need to scan the
   entire table.
   The tracking functionality is not enabled by default and needs to
   be turned on per-column by the client after ovsdb_idl_create()
   and before ovsdb_idl_run().

Signed-off-by: Shad Ansari shad.ans...@hp.com
---
lib/ovsdb-idl-provider.h |   5 ++
lib/ovsdb-idl.c  | 175 ---
lib/ovsdb-idl.h  |  29 
ovsdb/ovsdb-idlc.in  |  32 +
4 files changed, 230 insertions(+), 11 deletions(-)

diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 2ed78a7..3dddf69 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -36,6 +36,9 @@ struct ovsdb_idl_row {
 unsigned long int *prereqs; /* Bitmap of columns to verify in "old". */
 unsigned long int *written; /* Bitmap of columns from "new" to write. */
 struct hmap_node txn_node;  /* Node in ovsdb_idl_txn's list. */
+
+unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct ovs_list track_node;
};

 struct ovsdb_idl_column {
@@ -62,6 +65,8 @@ struct ovsdb_idl_table {
 struct shash columns;/* Contains "const struct ovsdb_idl_column *"s. */
 struct hmap rows;/* Contains "struct ovsdb_idl_row"s. */
 struct ovsdb_idl *idl;   /* Containing idl. */
+unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
+struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
};

 struct ovsdb_idl_class {
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 00b900d..e35de11 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -156,7 +156,7 @@ static struct ovsdb_idl_row *ovsdb_idl_row_create__(
 const struct ovsdb_idl_table_class *);
static struct ovsdb_idl_row *ovsdb_idl_row_create(struct ovsdb_idl_table *,
   const struct uuid *);
-static void ovsdb_idl_row_destroy(struct ovsdb_idl_row *);
+static void ovsdb_idl_row_destroy(struct ovsdb_idl_row *, bool track);

 static void ovsdb_idl_row_parse(struct ovsdb_idl_row *);
static void ovsdb_idl_row_unparse(struct ovsdb_idl_row *);
@@ -174,6 +174,10 @@ static void ovsdb_idl_parse_lock_reply(struct ovsdb_idl *,
static void ovsdb_idl_parse_lock_notify(struct ovsdb_idl *,
 const struct json *params,
 bool new_has_lock);
+static struct ovsdb_idl_table *
+ovsdb_idl_table_from_class(const struct ovsdb_idl *,
+   const struct ovsdb_idl_table_class *);
+static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);

 /* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
@@ -227,6 +231,10 @@ ovsdb_idl_create(const char *remote, const struct 
ovsdb_idl_class *class,
 shash_add_assert(&table->columns, column->name, column);
 }
 hmap_init(&table->rows);
+list_init(&table->track_list);
+table->change_seqno[OVSDB_IDL_CHANGE_INSERT]
+= table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
+= table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
 table->idl = idl;
 }

@@ -292,10 +300,16 @@ ovsdb_idl_clear(struct ovsdb_idl *idl)
 /* No need to do anything with dst_arcs: some node has those arcs
  * as forward arcs and will destroy them itself. */

-ovsdb_idl_row_destroy(row);
+if (!list_is_empty(&row->track_node)) {
+list_remove(&row->track_node);
+}
+
+ovsdb_idl_row_destroy(row, false);
 }
 }

+ovsdb_idl_track_clear(idl);
+
 if (changed) {
 idl->change_seqno++;
 }
@@ -591,6 +605,124 @@ ovsdb_idl_omit(struct ovsdb_idl *idl, const struct 
ovsdb_idl_column *column)
{
 *ovsdb_idl_get_mode(idl, column) = 0;
}
+
+/* Returns the most recent IDL change sequence number that caused a
+ * insert, modify or delete 

Re: [ovs-dev] [PATCH 2/2] Prepare for 2.4.1.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 10:23:46AM -0700, Justin Pettit wrote:
> 
> > On Aug 21, 2015, at 7:38 AM, Ben Pfaff  wrote:
> > 
> > On Thu, Aug 20, 2015 at 05:28:26PM -0700, Justin Pettit wrote:
> >> Signed-off-by: Justin Pettit 
> > 
> > *Finally*.
> > 
> > Acked-by: Ben Pfaff 
> 
> Amen.  I pushed the series and uploaded the binaries.  Can you tag the
> release to the previous commit?  I'll send out the announcement soon.

Tagged.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovs-sandbox: Add support for ovn-controller-vtep.

2015-08-21 Thread Russell Bryant
When ovs-sandbox is run with ovn enabled, create the vtep database and
run ovn-controller-vtep.  This lets you do some basic testing with
ovn-controller-vtep.  For example:

$ make sandbox SANDBOXFLAGS="--ovn"
$ vtep-ctl add-ps ps0

After those commands, you can see that ovn-controller-vtep added a
Chassis row to OVN_Southbound for the physical switch.

Signed-off-by: Russell Bryant 
---
 tutorial/ovs-sandbox | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
index db469cc..c8fc32f 100755
--- a/tutorial/ovs-sandbox
+++ b/tutorial/ovs-sandbox
@@ -59,6 +59,8 @@ gdb_ovn_northd=false
 gdb_ovn_northd_ex=false
 gdb_ovn_controller=false
 gdb_ovn_controller_ex=false
+gdb_ovn_controller_vtep=false
+gdb_ovn_controller_vtep_ex=false
 builddir=
 srcdir=
 schema=
@@ -109,6 +111,7 @@ These options force ovs-sandbox to use an installed Open 
vSwitch:
   -d, --gdb-ovsdb  run ovsdb-server under gdb
   --gdb-ovn-northd run ovn-northd under gdb
   --gdb-ovn-controller run ovn-controller under gdb
+  --gdb-ovn-controller-vtep run ovn-controller-vtep under gdb
   -R, --gdb-runautomatically start running the daemon in gdb
for any daemon set to run under gdb
   -S, --schema=FILEuse FILE as vswitch.ovsschema
@@ -169,6 +172,9 @@ EOF
 --gdb-ovn-controller)
 gdb_ovn_controller=true
 ;;
+--gdb-ovn-controller-vtep)
+gdb_ovn_controller_vtep=true
+;;
 -o|--ovn)
 ovn=true
 ;;
@@ -177,6 +183,7 @@ EOF
 gdb_ovsdb_ex=true
 gdb_ovn_northd_ex=true
 gdb_ovn_controller_ex=true
+gdb_ovn_controller_vtep_ex=true
 ;;
 -*)
 echo "unrecognized option $option (use --help for help)" >&2
@@ -239,6 +246,11 @@ if $built; then
 echo >&2 'source directory not found, please use --srcdir'
 exit 1
 fi
+vtep_schema=$srcdir/vtep/vtep.ovsschema
+if test ! -e "$vtep_schema"; then
+echo >&2 'source directory not found, please use --srcdir'
+exit 1
+fi
 fi
 
 # Put built tools early in $PATH.
@@ -248,7 +260,7 @@ if $built; then
 fi
 PATH=$builddir/ovsdb:$builddir/vswitchd:$builddir/utilities:$PATH
 if $ovn; then
-
PATH=$builddir/ovn:$builddir/ovn/controller:$builddir/ovn/northd:$builddir/ovn/utilities:$PATH
+
PATH=$builddir/ovn:$builddir/ovn/controller:$builddir/ovn/controller-vtep:$builddir/ovn/northd:$builddir/ovn/utilities:$PATH
 fi
 export PATH
 else
@@ -305,7 +317,8 @@ if $ovn; then
 touch "$sandbox"/.ovnnb.db.~lock~
 run ovsdb-tool create ovnsb.db "$ovnsb_schema"
 run ovsdb-tool create ovnnb.db "$ovnnb_schema"
-ovsdb_server_args="ovnsb.db ovnnb.db conf.db"
+run ovsdb-tool create vtep.db "$vtep_schema"
+ovsdb_server_args="ovnsb.db ovnnb.db vtep.db conf.db"
 fi
 rungdb $gdb_ovsdb $gdb_ovsdb_ex ovsdb-server --detach --no-chdir --pidfile 
-vconsole:off --log-file \
 --remote=punix:"$sandbox"/db.sock $ovsdb_server_args
@@ -339,6 +352,7 @@ if $ovn; then
 
 rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach --no-chdir 
--pidfile -vconsole:off --log-file
 rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller --detach 
--no-chdir --pidfile -vconsole:off --log-file
+rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex 
ovn-controller-vtep --detach --no-chdir --pidfile -vconsole:off --log-file
 fi
 
 cat 

Re: [ovs-dev] [PATCH 8/8] json: increase Json string initial buffer size

2015-08-21 Thread Ben Pfaff
On Tue, Aug 11, 2015 at 05:55:14PM -0700, Andy Zhou wrote:
> Json string are usually created and freed immediately. Thus, there
> should not be any downside in creating a larger buffer initially to
> avoid the cost of moving strings around in memory due to realloc()
> call.
> 
> The following script is used as benchmark to measure number of
> bytes reallocated:
> 
> ovs-vsctl add-br br0
> for i in `seq 0 100`; do
>ovs-vsctl add-port br0 p$i
> done
> ovs-vsctl del-br br0
> 
> 'ds_bytes_relocated' coverage counter shows that the bytes relocated
> are 2.5Mbytes/sec and 17Kbytes/sec, before and after this patch applied,
> respectively.

Is there a noticeable performance difference?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH/RFC] ovs-ctl: do not attempt to restore flows when called with --delete-bridges

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 10:25:22AM -0700, Gurucharan Shetty wrote:
> On Thu, Aug 20, 2015 at 4:45 PM, Simon Horman
>  wrote:
> > When called with --delete-bridges saved flows cannot be restored as the
> > bridges to which they belong no longer exist. This results in the following
> > error messages on restart.
> >
> > ovs-ofctl: br0 is not a bridge or a socket
> > Restoring saved flows ... failed!
> >
> > Although there is no effect of this error other than the message
> > it seems worth avoiding. This patch does so by skipping saving of flows
> > when --delete-bridges is in effect.
> >
> > As flows are no longer saved when --delete-bridges is in effect
> > a side-effect of this change is that restart may be faster when
> > there are many flows.
> >
> > Signed-off-by: Simon Horman 
> Acked-by: Gurucharan Shetty 
>  (I will apply this in a few hours, if no one else has any comments)

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 8/9] flow: Add struct flowmap.

2015-08-21 Thread Jarno Rajahalme

> On Aug 21, 2015, at 8:33 AM, Ben Pfaff  wrote:
> 
> On Fri, Aug 07, 2015 at 04:57:41PM -0700, Jarno Rajahalme wrote:
>> Struct miniflow is now sometimes used just as a map.  Define a new
>> struct flowmap for that purpose.  The flowmap is defined as an array of
>> maps, and it is automatically sized according to the size of struct
>> flow, so it will be easier to maintain in the future.
>> 
>> It would have been tempting to use the existing struct bitmap for this
>> purpose. The main reason this is not feasible at the moment is that
>> some flowmap algorithms are simpler when it can be assumed that no
>> struct flow member requires more bits than can fit to a single map
>> unit. The tunnel member already requires more than 32 bits, so the map
>> unit needs to be 64 bits wide.
>> 
>> Performance critical algorithms enumerate the flowmap array units
>> explicitly, as it is easier for the compiler to optimize, compared to
>> the normal iterator.  Without this optimization a classifier lookup
>> without wildcard masks would be about 25% slower.
>> 
>> With this more general (and maintainable) algorithm the classifier
>> lookups are about 5% slower, when the struct flow actually becomes big
>> enough to require a second map.  This negates the performance gained
>> in the "Pre-compute stage masks" patch earlier in the series.
>> 
>> Requested-by: Ben Pfaff 
>> Signed-off-by: Jarno Rajahalme 
> 
> Having some trouble applying this, would you mind reposting what remains
> of the series?
> 

Will do.

  Jarno

> Thanks,
> 
> Ben.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovsdb-idl: Add additional support for change tracking.

2015-08-21 Thread Russell Bryant
On 08/21/2015 01:30 PM, Ansari, Shad wrote:
> Ovsdb-idl notifies a client that something changed; it does not track
> which table, row changed in what way (insert, modify or delete).
> As a result, a client has to scan or reconfigure the entire idl after
> ovsdb_idl_run(). This is presumably fine for typical ovs schemas where
> tables are relatively small. In use-cases where ovsdb is used with schemas
> that can have very large tables, the current ovsdb-idl notification
> mechanism does not appear to scale - clients need to do a lot of
> processing to determine the exact change delta.
> 
> This change adds support for:
> - Table and row based change sequence numbers to record the
>most recent IDL change sequence numbers associated with insert,
>modify or delete update on that table or row.
> - Allow change tracking of specific columns. This ensures that changed
>rows (inserted, modified, deleted) that have tracked columns, are
>tracked by IDL. The client can directly access the changed rows
>with get_first, get_next operations without the need to scan the
>entire table.
>The tracking functionality is not enabled by default and needs to
>be turned on per-column by the client after ovsdb_idl_create()
>and before ovsdb_idl_run().
> 
> Signed-off-by: Shad Ansari shad.ans...@hp.com

It looks like something messed up your signed-off-by header.

I haven't reviewed the patch yet but wanted to say thanks for looking
into this!  This sounds really useful for OVN.

-- 
Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] OVS-DPDK performance problem on ixgbe vector PMD

2015-08-21 Thread Zoltan Kiss

Hi,

I've set up a simple packet forwarding perf test on a dual-port 10G 
82599ES: one port receives 64 byte UDP packets, the other sends it out, 
one core used. I've used latest OVS with DPDK 2.1, and the first result 
was only 13.2 Mpps, which was a bit far from the 13.9 I've seen last 
year with the same test. The first thing I've changed was to revert back 
to the old behaviour about this issue:


http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/22731

So instead of the new default I've passed 2048 + RTE_PKTMBUF_HEADROOM. 
That increased the performance to 13.5, but to figure out what's wrong 
started to play with the receive functions. First I've disabled vector 
PMD, but ixgbe_recv_pkts_bulk_alloc() was even worse, only 12.5 Mpps. So 
then I've enabled scattered RX, and with 
ixgbe_recv_pkts_lro_bulk_alloc() I could manage to get 13.98 Mpps, which 
is I guess as close as possible to the 14.2 line rate (on my HW at 
least, with one core)
Does anyone has a good explanation about why the vector PMD performs so 
significantly worse? I would expect that on a 3.2 GHz i5-4570 one core 
should be able to reach ~14 Mpps, SG and vector PMD shouldn't make a 
difference.
I've tried to look into it with oprofile, but the results were quite 
strange: 35% of the samples were from miniflow_extract, the part where 
parse_vlan calls data_pull to jump after the MAC addresses. The oprofile 
snippet (1M samples):


  511454 190.0037  flow.c:511
  511458 149   0.0292  dp-packet.h:266
  51145f 4264  0.8357  dp-packet.h:267
  511466 180.0035  dp-packet.h:268
  51146d 430.0084  dp-packet.h:269
  511474 172   0.0337  flow.c:511
  51147a 4320  0.8467  string3.h:51
  51147e 358763   70.3176  flow.c:99
  511482 23.9e-04  string3.h:51
  511485 3060  0.5998  string3.h:51
  511488 1693  0.3318  string3.h:51
  51148c 2933  0.5749  flow.c:326
  511491 470.0092  flow.c:326

And the corresponding disassembled code:

  511454:   49 83 f9 0d cmpr9,0xd
  511458:   c6 83 81 00 00 00 00movBYTE PTR [rbx+0x81],0x0
  51145f:   66 89 83 82 00 00 00movWORD PTR [rbx+0x82],ax
  511466:   66 89 93 84 00 00 00movWORD PTR [rbx+0x84],dx
  51146d:   66 89 8b 86 00 00 00movWORD PTR [rbx+0x86],cx
  511474:   0f 86 af 01 00 00   jbe511629 


  51147a:   48 8b 45 00 movrax,QWORD PTR [rbp+0x0]
  51147e:   4c 8d 5d 0c lear11,[rbp+0xc]
  511482:   49 89 00movQWORD PTR [r8],rax
  511485:   8b 45 08moveax,DWORD PTR [rbp+0x8]
  511488:   41 89 40 08 movDWORD PTR [r8+0x8],eax
  51148c:   44 0f b7 55 0c  movzx  r10d,WORD PTR [rbp+0xc]
  511491:   66 41 81 fa 81 00   cmpr10w,0x81

My only explanation to this so far is that I misunderstand something 
about the oprofile results.


Regards,

Zoltan
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] rhel: define rundir for older distros

2015-08-21 Thread Ben Pfaff
On Sun, Aug 16, 2015 at 07:05:28PM -0400, Russell Bryant wrote:
> On 08/13/2015 02:39 PM, Flavio Leitner wrote:
> > Some older distros might not define _rundir yet so in this
> > case the RPM build breaks.  This patch defines it to the
> > Fedora's default.
> 
> minor commit message typo ...  need to change "the Fedora's" to either
> "the Fedora" or "Fedora's".

I fixed that up.

> > Signed-off-by: Flavio Leitner 
> 
> Acked-by: Russell Bryant 

Thanks Flavio and Russell, I applied this to master and branch-2.4.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] coverage: Add coverage_try_clear() for performance-critical threads.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 13, 2015 at 11:48:49AM -0700, Alex Wang wrote:
> For performance-critical threads like pmd threads, we currently make them
> never call coverage_clear() to avoid contention over the global mutex
> 'coverage_mutex'.  So, even though pmd thread still keeps updating their
> thread-local coverage count, the count is never attributed to the global
> total.  But it is useful to have them available.
> 
> This commit makes this happen by implementing a non-contending version
> of the clear function, coverage_try_clear().  The function will use
> the ovs_mutex_trylock() and return immediately if the mutex cannot
> be acquired.  Since threads like pmd thread are always busy-looping,
> the lock will eventually be acquired.
> 
> Requested-by: Ilya Maximets 
> Signed-off-by: Alex Wang 

This seems like an improvement.  I can imagine better data structures
but I don't know whether they're worthwhile.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller-vtep.at: Skip test when using Windows setup.

2015-08-21 Thread Ben Pfaff
On Mon, Aug 17, 2015 at 03:09:46PM -0700, Alex Wang wrote:
> The 'ovs-vtep' simulator is not ported to Windows.  So, for now,
> just skip all tests in ovn-controller-vtep.at when running in
> Windows.
> 
> Signed-off-by: Alex Wang 

...

> -  [OVS_RUNDIR=`pwd`; export OVS_RUNDIR
> +  [
> +   # this will cause skip when 'make check' using Windows setup.
> +   AT_SKIP_IF([test $HAVE_PYTHON = no])

Is it a Python or a Windows incompatibility?  If it's really the latter,
then I'd recommend using "$IS_WIN32" = yes" instead.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller-vtep.at: Skip test when using Windows setup.

2015-08-21 Thread Gurucharan Shetty
On Fri, Aug 21, 2015 at 12:59 PM, Ben Pfaff  wrote:
> On Mon, Aug 17, 2015 at 03:09:46PM -0700, Alex Wang wrote:
>> The 'ovs-vtep' simulator is not ported to Windows.  So, for now,
>> just skip all tests in ovn-controller-vtep.at when running in
>> Windows.
>>
>> Signed-off-by: Alex Wang 
>
> ...
>
>> -  [OVS_RUNDIR=`pwd`; export OVS_RUNDIR
>> +  [
>> +   # this will cause skip when 'make check' using Windows setup.
>> +   AT_SKIP_IF([test $HAVE_PYTHON = no])
>
> Is it a Python or a Windows incompatibility?  If it's really the latter,
> then I'd recommend using "$IS_WIN32" = yes" instead.

My thinking was along the lines that all tests that use python code
need to have HAVE_PYTHON test. That has been the strategy for all
other python tests being skipped in Windows to prevent many individual
tests from having IS_WIN32 check in addition to HAVE_PYTHON test. I am
fine either way.

>
> Acked-by: Ben Pfaff 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] lib/automake.mk: Re-instate dependency generation

2015-08-21 Thread Ben Pfaff
On Tue, Aug 18, 2015 at 04:43:07PM +1200, Mark Wutzke wrote:
> Commit 5b3c194f (lib/automake.mk: Fix dependency typos)
> introduced dependencies that do not guarantee that
> lib/ofp-msgs.inc is generated before lib/ofp-msgs.c
> is compiled. This results in compilation issues when
> using parallel builds, or where the script is updated.
> 
> The original dependencies were correct, so re-instate
> them as they were.
> 
> Signed-off-by: Mark Wutzke 

I don't understand.  Even when I apply the following, I can't provoke
such a failure:

diff --git a/lib/automake.mk b/lib/automake.mk
index 15a9373..7cde717 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -494,7 +494,7 @@ CLEANFILES += lib/ofp-errors.inc
 EXTRA_DIST += build-aux/extract-ofp-errors
 
 lib/ofp-msgs.inc: lib/ofp-msgs.h $(srcdir)/build-aux/extract-ofp-msgs
-   $(AM_V_GEN)$(run_python) $(srcdir)/build-aux/extract-ofp-msgs \
+   sleep 1; $(AM_V_GEN)$(run_python) $(srcdir)/build-aux/extract-ofp-msgs \
$(srcdir)/lib/ofp-msgs.h $@ > $@.tmp && mv $@.tmp $@
 lib/ofp-msgs.lo: lib/ofp-msgs.inc
 CLEANFILES += lib/ofp-msgs.inc
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 3/9] test-classifier: Add benchmark.

2015-08-21 Thread Jarno Rajahalme

> On Aug 21, 2015, at 7:37 AM, Ben Pfaff  wrote:
> 
> On Fri, Aug 07, 2015 at 04:57:36PM -0700, Jarno Rajahalme wrote:
>> Add a benchmark command for classifier lookup performance testing.
>> 
>> Usage:
> 
> This usage note is good, but putting it just in the commit log will mean
> that it gets lost.  It should be in a --help message, or failing that in
> a source code comment.
> 

I the usage (without the explanation after the itemized list and example) to 
“ovstest test-classifier benchmark". The usage is shown if “—help” is given as 
an argument, or if there are less than the required five arguments.  I also 
added a ‘—help’ command to the test-classifier itself.

> I'm not sure I believe in the realism of random priorities.  Random
> priorities are are worst case for the optimization that skips subtables
> based on priorities.  Our NSDI paper showed that subtables tend to have
> a small number of priorities (often just one) in practice.
> 

Right, thats why I made the number of priorities an option, in a lack of a 
better model of a “typical classifier table”. I changed the example in the 
commit message to use only one priority.

More significantly, this is not a pipeline test, but a test on a single 
classifier instance. Typical pipelines would likely have only a small number of 
subtables in a single classifier, and would also amortize the cost of clearing 
‘wc’ across multiple lookups/resubmits.

> If n_rules < n_subtables, or if the random numbers come out just right,
> then I think that the classifier will have fewer than the requested
> number of subtables.  Also, if the same rule is generated more than
> once, the classifier will have fewer than the requested number of rules.
> 

The value in the benchmark is in being able to repeat the same test (same 
arguments -> same test) across different classifier code variations, and maybe 
across different C compilers and compiler options. It clearly shows the benefit 
of compiling with “-march=native”, for example (with modern CPUs). The exact 
number of rules or subtables is less important, IMO, but obviously the code 
could be improved in these regards.

> Acked-by: Ben Pfaff 

Thanks for the review, pushed to master,

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 3/9] test-classifier: Add benchmark.

2015-08-21 Thread Jarno Rajahalme
I applied this before pushing to master, thanks!

  Jarno

> On Aug 21, 2015, at 7:39 AM, Ben Pfaff  wrote:
> 
> On Fri, Aug 21, 2015 at 07:37:51AM -0700, Ben Pfaff wrote:
>> On Fri, Aug 07, 2015 at 04:57:36PM -0700, Jarno Rajahalme wrote:
>>> Add a benchmark command for classifier lookup performance testing.
>>> 
>>> Usage:
>> 
>> This usage note is good, but putting it just in the commit log will mean
>> that it gets lost.  It should be in a --help message, or failing that in
>> a source code comment.
>> 
>> I'm not sure I believe in the realism of random priorities.  Random
>> priorities are are worst case for the optimization that skips subtables
>> based on priorities.  Our NSDI paper showed that subtables tend to have
>> a small number of priorities (often just one) in practice.
>> 
>> If n_rules < n_subtables, or if the random numbers come out just right,
>> then I think that the classifier will have fewer than the requested
>> number of subtables.  Also, if the same rule is generated more than
>> once, the classifier will have fewer than the requested number of rules.
>> 
>> Acked-by: Ben Pfaff 
> 
> Also a minor spelling fix:
> 
> diff --git a/tests/test-classifier.c b/tests/test-classifier.c
> index 54b595f..b2d4afd 100644
> --- a/tests/test-classifier.c
> +++ b/tests/test-classifier.c
> @@ -1302,7 +1302,7 @@ run_benchmarks(struct ovs_cmdl_context *ctx)
> n_lookups = strtol(ctx->argv[5], NULL, 10);
> 
> printf("\nBenchmarking with:\n"
> -   "%d rules with %d prioritites in %d tables, "
> +   "%d rules with %d priorities in %d tables, "
>"%d threads doing %d lookups each\n",
>n_rules, n_priorities, n_tables, n_threads, n_lookups);
> 

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller-vtep.at: Skip test when using Windows setup.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 01:15:14PM -0700, Gurucharan Shetty wrote:
> On Fri, Aug 21, 2015 at 12:59 PM, Ben Pfaff  wrote:
> > On Mon, Aug 17, 2015 at 03:09:46PM -0700, Alex Wang wrote:
> >> The 'ovs-vtep' simulator is not ported to Windows.  So, for now,
> >> just skip all tests in ovn-controller-vtep.at when running in
> >> Windows.
> >>
> >> Signed-off-by: Alex Wang 
> >
> > ...
> >
> >> -  [OVS_RUNDIR=`pwd`; export OVS_RUNDIR
> >> +  [
> >> +   # this will cause skip when 'make check' using Windows setup.
> >> +   AT_SKIP_IF([test $HAVE_PYTHON = no])
> >
> > Is it a Python or a Windows incompatibility?  If it's really the latter,
> > then I'd recommend using "$IS_WIN32" = yes" instead.
> 
> My thinking was along the lines that all tests that use python code
> need to have HAVE_PYTHON test. That has been the strategy for all
> other python tests being skipped in Windows to prevent many individual
> tests from having IS_WIN32 check in addition to HAVE_PYTHON test. I am
> fine either way.

It's already committed (I didn't realize that when I wrote my text
above), so let's leave it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [dpdk-dev] OVS-DPDK performance problem on ixgbe vector PMD

2015-08-21 Thread Stephen Hemminger
Use perf top it gives much better data than oprofile

On Fri, Aug 21, 2015 at 11:05 AM, Zoltan Kiss 
wrote:

> Hi,
>
> I've set up a simple packet forwarding perf test on a dual-port 10G
> 82599ES: one port receives 64 byte UDP packets, the other sends it out, one
> core used. I've used latest OVS with DPDK 2.1, and the first result was
> only 13.2 Mpps, which was a bit far from the 13.9 I've seen last year with
> the same test. The first thing I've changed was to revert back to the old
> behaviour about this issue:
>
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/22731
>
> So instead of the new default I've passed 2048 + RTE_PKTMBUF_HEADROOM.
> That increased the performance to 13.5, but to figure out what's wrong
> started to play with the receive functions. First I've disabled vector PMD,
> but ixgbe_recv_pkts_bulk_alloc() was even worse, only 12.5 Mpps. So then
> I've enabled scattered RX, and with ixgbe_recv_pkts_lro_bulk_alloc() I
> could manage to get 13.98 Mpps, which is I guess as close as possible to
> the 14.2 line rate (on my HW at least, with one core)
> Does anyone has a good explanation about why the vector PMD performs so
> significantly worse? I would expect that on a 3.2 GHz i5-4570 one core
> should be able to reach ~14 Mpps, SG and vector PMD shouldn't make a
> difference.
> I've tried to look into it with oprofile, but the results were quite
> strange: 35% of the samples were from miniflow_extract, the part where
> parse_vlan calls data_pull to jump after the MAC addresses. The oprofile
> snippet (1M samples):
>
>   511454 190.0037  flow.c:511
>   511458 149   0.0292  dp-packet.h:266
>   51145f 4264  0.8357  dp-packet.h:267
>   511466 180.0035  dp-packet.h:268
>   51146d 430.0084  dp-packet.h:269
>   511474 172   0.0337  flow.c:511
>   51147a 4320  0.8467  string3.h:51
>   51147e 358763   70.3176  flow.c:99
>   511482 23.9e-04  string3.h:51
>   511485 3060  0.5998  string3.h:51
>   511488 1693  0.3318  string3.h:51
>   51148c 2933  0.5749  flow.c:326
>   511491 470.0092  flow.c:326
>
> And the corresponding disassembled code:
>
>   511454:   49 83 f9 0d cmpr9,0xd
>   511458:   c6 83 81 00 00 00 00movBYTE PTR [rbx+0x81],0x0
>   51145f:   66 89 83 82 00 00 00movWORD PTR [rbx+0x82],ax
>   511466:   66 89 93 84 00 00 00movWORD PTR [rbx+0x84],dx
>   51146d:   66 89 8b 86 00 00 00movWORD PTR [rbx+0x86],cx
>   511474:   0f 86 af 01 00 00   jbe511629
> 
>   51147a:   48 8b 45 00 movrax,QWORD PTR [rbp+0x0]
>   51147e:   4c 8d 5d 0c lear11,[rbp+0xc]
>   511482:   49 89 00movQWORD PTR [r8],rax
>   511485:   8b 45 08moveax,DWORD PTR [rbp+0x8]
>   511488:   41 89 40 08 movDWORD PTR [r8+0x8],eax
>   51148c:   44 0f b7 55 0c  movzx  r10d,WORD PTR [rbp+0xc]
>   511491:   66 41 81 fa 81 00   cmpr10w,0x81
>
> My only explanation to this so far is that I misunderstand something about
> the oprofile results.
>
> Regards,
>
> Zoltan
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] tun-metadata: Always set option present when copying data.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 13, 2015 at 07:59:29PM -0700, Jesse Gross wrote:
> Whenever we write into a tunnel option field, we also need to mark
> it as significant. If we don't, then the data will later be ignored.
> 
> We currently do this in every case except for flow metadata. This causes
> us to not correctly serialize the tunnel metadata for Packet Ins to the
> controller.
> 
> Rather than separately writing the data and marking the options as present,
> it is better to combine the two steps to ensure that one can never be
> done without the other.
> 
> Signed-off-by: Jesse Gross 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] hmap: Ensure iterator is NULL after iteration.

2015-08-21 Thread Ben Pfaff
On Tue, Aug 18, 2015 at 11:43:18AM -0700, Russell Bryant wrote:
> The HMAP_FOR_EACH_()* macros had a usability issue where the iterator
> was only NULL at the completion of iteration if the hmap_node was the
> first struct member.  This change ensures that the iterator is set to
> NULL when iteration ends normally without a 'break'.
> 
> Signed-off-by: Russell Bryant 

I applied this, thanks.

Let's try to keep in mind that this could cause nightmarishly subtle
bugs for backporting: if a bug fix takes advantage of this new behavior,
a backport of that bug fix needs to avoid it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/3] ovs-vsctl: Add the missing ssl bootstrapping option parsing.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 20, 2015 at 10:03:38AM -0700, Gurucharan Shetty wrote:
> 'man ovs-vsctl' mentions that ovs-vsctl can bootstrap itself
> by getting the certificate from the server. But the option
> was never parsed in the code.
> 
> Signed-off-by: Gurucharan Shetty 

I guess this is a bug fix so it should get backported.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] tun-metadata: Always set option present when copying data.

2015-08-21 Thread Jesse Gross
On Fri, Aug 21, 2015 at 1:28 PM, Ben Pfaff  wrote:
> On Thu, Aug 13, 2015 at 07:59:29PM -0700, Jesse Gross wrote:
>> Whenever we write into a tunnel option field, we also need to mark
>> it as significant. If we don't, then the data will later be ignored.
>>
>> We currently do this in every case except for flow metadata. This causes
>> us to not correctly serialize the tunnel metadata for Packet Ins to the
>> controller.
>>
>> Rather than separately writing the data and marking the options as present,
>> it is better to combine the two steps to ensure that one can never be
>> done without the other.
>>
>> Signed-off-by: Jesse Gross 
>
> Acked-by: Ben Pfaff 

Thanks, applied to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 4/9] classifier: Pre-compute stage masks.

2015-08-21 Thread Jarno Rajahalme

> On Aug 21, 2015, at 8:30 AM, Ben Pfaff  wrote:
> 
> On Fri, Aug 07, 2015 at 04:57:37PM -0700, Jarno Rajahalme wrote:
>> This makes stage mask computation happen only when a subtable is
>> inserted and allows simplification of the main lookup function.
>> 
>> Classifier benchmark shows that this speeds up the classification
>> (with wildcards) about 5%.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> I don't think that flow_hash_in_minimask_range() actually does what its
> name implies anymore (where's the range?).  It at least needs a comment
> update since the comment still talks about parameters it doesn't have.
> Ditto for minimatch_hash_range().
> 

The ‘maps’ argument really needs to be a successive/continuous subset of the 
bits in the masks map. I renamed it as “range” and fixed and elaborated the 
comments.

> In a construction like this:
>if (map->tnl_map) {
>MAP_FOR_EACH_INDEX(idx, map->tnl_map) {
>dst_u64[idx] |= *p++;
>}
>}
> it appears to me that the 'if' is redundant.  In one case I see there's
> an OVS_UNLIKELY on the 'if' but I wonder whether that's sufficient
> benefit.
> 
> If we believe that tnl_map is usually 0, then it would be profitable to
> test pkt_map first below to allow short-circuiting to bail out earlier:
> 
>static inline bool
>miniflow_equal_maps(const struct miniflow *a, const struct miniflow *b)
>{
>return a->tnl_map == b->tnl_map && a->pkt_map == b->pkt_map;
>}
> 

Right, it has likely only very marginal benefit, so I removed the extra if’s.

> 
> While reading code, I came up with some additional comments that are not
> directly related to the changes here.
> 
> First, classifier.c has a lot of "static inline" functions.  Does the
> "inline" actually produce measurable performance improvement?  If not,
> then it's better to avoid "inline" in .c files since it suppresses
> otherwise useful "unused function" warnings.
> 

Currently, the compiler inlines all these functions as intended. Earlier I 
noticed from the disassembly that inlining was not happening without the 
explicit “inline” keywords. Now that the classifier benchmark is in, we can 
test if it makes a difference in performance. I’ll do this sometime later.

> Second, the amount of duplicated code because of tnl_map versus pkt_map
> is starting to bug me.  If these were just a 2-element array, for
> example, then miniflow_and_mask_matches_flow_wc() could be in my opinion
> tons cleaner as something like this:
> 

A later patch of the series does this in a general way (sizing the array 
automatically).

> {
>const uint64_t *flowp = miniflow_get_values(flow);
>const uint64_t *maskp = miniflow_get_values(&mask->masks);
>const uint64_t *target_u64 = (const uint64_t *)target;
>uint64_t *wc_u64 = (uint64_t *)&wc->masks;
> 
>for (int i = 0; i < 2; i++) {
>size_t idx;
> 
>MAP_FOR_EACH_INDEX(idx, mask->masks.maps[i]) {
>uint64_t msk = *maskp++;
> 
>uint64_t diff = (*flowp++ ^ target_u64[idx]) & msk;
>if (diff) {
>/* Only unwildcard if none of the differing bits is already
> * exact-matched. */
>if (!(wc_u64[idx] & diff)) {
>/* Keep one bit of the difference.  The selected bit may be
> * different in big-endian v.s. little-endian systems. */
>wc_u64[idx] |= rightmost_1bit(diff);
>}
>return false;
>}
> 
>/* Fill in the bits that were looked at. */
>wc_u64[idx] |= msk;
>}
>target_u64 += FLOW_TNL_U64S;
>wc_u64 += FLOW_TNL_U64S;
>}
> 
>return true;
> }
> 
> I noticed that MINIFLOW_IN_MAP could be a function, so I send out a
> patch (which applies following this one):
>http://openvswitch.org/pipermail/dev/2015-August/059029.html 
> 


There might be other macros that could be functions as well in the series, I’ll 
check for that.

  Jarno


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/3] ovsdb-server: Add the ability to push peer-cert.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 20, 2015 at 10:03:39AM -0700, Gurucharan Shetty wrote:
> In OVN, ovsdb-server is the daemon that manages the databases
> and can be called as the central controller. So it would be
> nice for ovsdb-server to be able to push its self-signed
> certificate to all the other nodes where ovn-controller runs.
> 
> Signed-off-by: Gurucharan Shetty 

This is a good idea, especially the test.

The test passes a plain --log-file option and a --log-file option with a
full path.  The first one fails:

2015-08-21T21:34:19Z|1|vlog|WARN|failed to open
/var/log/openvswitch/ovsdb-server.log for logging: No such file or
directory

so I'd replace it by the one with the full path.

The log message from ovs-vsctl on the disconnection is confusing:

ovs-vsctl: ssl:127.0.0.1:34766: database connection failed ()

It looks like this improves it, at least to "(Protocol error)":

diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c
index ae51b42..1e312a2 100644
--- a/lib/jsonrpc.c
+++ b/lib/jsonrpc.c
@@ -948,6 +948,7 @@ jsonrpc_session_run(struct jsonrpc_session *s)
 reconnect_connect_failed(s->reconnect, time_msec(), error);
 stream_close(s->stream);
 s->stream = NULL;
+s->last_error = error;
 }
 }
 

The test runs over 10x faster on my system with 1024-bit keys:

diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index 6a2189a..caaa497 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -1315,9 +1315,9 @@ AT_KEYWORDS([ovs-vsctl ssl])
 AT_SKIP_IF([test "$HAVE_OPENSSL" = no])
 PKIDIR=`pwd`
 OVS_PKI="sh $abs_top_srcdir/utilities/ovs-pki.in --dir=$PKIDIR/pki 
--log=$PKIDIR/ovs-pki.log"
-$OVS_PKI init && \
-$OVS_PKI req+sign vsctl switch && \
-$OVS_PKI req ovsdbserver && $OVS_PKI self-sign ovsdbserver
+$OVS_PKI -B 1024 init && \
+$OVS_PKI -B 1024 req+sign vsctl switch && \
+$OVS_PKI -B 1024 req ovsdbserver && $OVS_PKI self-sign ovsdbserver
 
 dnl Create database.
 touch .conf.db.~lock~

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/3] ovn: Add bootstrap options for OVN controllers.

2015-08-21 Thread Ben Pfaff
On Thu, Aug 20, 2015 at 10:03:40AM -0700, Gurucharan Shetty wrote:
> This lets the central controller to push
> its certificate to the OVN controllers.
> 
> Signed-off-by: Gurucharan Shetty 

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 8/8] json: increase Json string initial buffer size

2015-08-21 Thread Andy Zhou
On Fri, Aug 21, 2015 at 10:48 AM, Ben Pfaff  wrote:
> On Tue, Aug 11, 2015 at 05:55:14PM -0700, Andy Zhou wrote:
>> Json string are usually created and freed immediately. Thus, there
>> should not be any downside in creating a larger buffer initially to
>> avoid the cost of moving strings around in memory due to realloc()
>> call.
>>
>> The following script is used as benchmark to measure number of
>> bytes reallocated:
>>
>> ovs-vsctl add-br br0
>> for i in `seq 0 100`; do
>>ovs-vsctl add-port br0 p$i
>> done
>> ovs-vsctl del-br br0
>>
>> 'ds_bytes_relocated' coverage counter shows that the bytes relocated
>> are 2.5Mbytes/sec and 17Kbytes/sec, before and after this patch applied,
>> respectively.
>
> Is there a noticeable performance difference?

Unfortunately, No. Using the script mentioned in the commit message,
but increase the number
of ports to 500, I noticed on average, there are about 3% drop  (0.1ms
drop over 3.3s) in user space time.
It is not very noticeable. It is also within the variants of the tests.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] OVN: Broken pipe race

2015-08-21 Thread Ben Pfaff
On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> Hey,
> 
> Want to open a thread to discuss the following race I encountered while
> unit testing ovn.
> 
> The most simple case is when I run ovn-nbctl to add a lport in unit test:
> 1. ovn-nbctl first creates/commits the logical_port entry in ovn-nb
> database.  the new entry's "up" column is empty,
> 2. then assume ovn-nbctl execution got suspended after
> ovsdb_idl_txn_commit_block(),
> 3. next, ovn-northd will update the ovn-sb database and finds that the
> new logical port is not bound.  so it goes ahead update the "up"
> column of the entry to "false"...
> 4. since ovn-nbctl is still running and is set to monitor everything, the
> ovsdb-server will try sending the "update" to ovn-nbctl...
> 5. now consider this race:  if ovn-nbctl execution resumes and exits right
> before ovsdb-server sending the update,...  the send will fail with
> (Broken Pipe) error, resulting in a WARN log in ovsdb-server.log.
> 
> Even if we set the "up" column to "false" at creation, we can still run into
> similar race if the ovn-controller quickly binds the lport to chassis and
> ovn-northd now updates "up" column to "true".
> 
> I also found similar race for other command combinations...  e.g.
> deleting vtep switch physical port and deleting ovs port while running
> ovs-vtep simulator...
> 
> I'm thinking instead of trying to fix every case (which may not be even
> possible), we can try removing all monitor request right after
> ovsdb_idl_txn_commit_block() and try waiting until receiving the
> monitor request ack from ovsdb-server.  After that ovsdb-server will
> never try sending anything to "*-*ctl" commands,
> 
> Would like to hear what you think?~

I think the warning is harmless (since we know the cause) so I'd be
inclined to just ignore it in the testsuite.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/3] ovs-vsctl: Add the missing ssl bootstrapping option parsing.

2015-08-21 Thread Gurucharan Shetty
> I guess this is a bug fix so it should get backported.
>
> Acked-by: Ben Pfaff 
I think this has been forever. How long back do we generally backport?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 1/3] Implement Openflow 1.4 Vacancy Events for OFPT_TABLE_MOD.

2015-08-21 Thread Ben Pfaff
On Wed, Aug 19, 2015 at 12:53:16PM +0530, saloni.jai...@gmail.com wrote:
> From: Saloni Jain 
> 
> OpenFlow 1.4 introduces the ability to turn on vacancy events with an
> OFPT_TABLE_MOD message specifying OFPTC_VACANCY_EVENTS. This commit adds
> support for the new feature in ovs-ofctl mod-table.
> As per the openflow specification-1.4, vacancy event adds a mechanism
> enabling the controller to get an early warning based on capacity
> threshold chosen by the controller.
> 
> With this commit, vacancy events can be configured as:
> ovs-ofctl -O OpenFlow14 mod-table   vacancy-
> The syntax of  as .
>  specify vacancy threshold values, vacancy down and vacancy up.
> 
> To disable vacancy events, following command should be given:
> ovs-ofctl -O OpenFlow14 mod-table   novacancy
> 
> Signed-off-by: Saloni Jain 
> Signed-off-by: Shashwat Srivastava 
> Signed-off-by: Sandeep Kumar 

Please explain the signoff chain.  Presumably, you are the author.  As
the person providing the patch, that means you need to be the final
signoff.  Who are Shashwat and Sandeep?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 2/3] Implement Vacancy Events for OFPMP_TABLE_DESC.

2015-08-21 Thread Ben Pfaff
On Wed, Aug 19, 2015 at 12:54:00PM +0530, saloni.jai...@gmail.com wrote:
> From: Saloni Jain 
> 
> This patch adds support for vacancy events in table-desc.
> 
> ovs-ofctl -O OpenFlow14 dump-tables-desc 
> -This command is enhanced to display the Vacancy Event configuration
>  of the tables on a , which is set using the mod-table command.
> 
> Signed-off-by: Saloni Jain 
> Signed-off-by: Hiteshi Kalra 

Presumably you are the author.  Since you're providing the patch, you
should provide the final signoff.  Who is Hiteshi?

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 3/3] Implement OFPT_TABLE_STATUS Message.

2015-08-21 Thread Ben Pfaff
On Wed, Aug 19, 2015 at 12:54:37PM +0530, saloni.jai...@gmail.com wrote:
> From: Saloni Jain 
> 
> On change in a table state, the controller needs to be informed with
> the OFPT_TABLE_STATUS message. The message is sent with reason
> OFPTR_VACANCY_DOWN or OFPTR_VACANCY_UP in case of change in remaining
> space eventually crossing any one of the threshold.
> 
> Signed-off-by: Saloni Jain 
> Signed-off-by: Rishi Bamba 

Presumably you are the author.  Since you are providing the patch, that
means that you should have the final signoff.  Who is Rishi?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/3] ovs-vsctl: Add the missing ssl bootstrapping option parsing.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 03:06:08PM -0700, Gurucharan Shetty wrote:
> > I guess this is a bug fix so it should get backported.
> >
> > Acked-by: Ben Pfaff 
> I think this has been forever. How long back do we generally backport?

In this case I would backport to 2.3 and 2.4 and not bother otherwise.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] OVN: Broken pipe race

2015-08-21 Thread Alex Wang
On Fri, Aug 21, 2015 at 3:02 PM, Ben Pfaff  wrote:

> On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> > Hey,
> >
> > Want to open a thread to discuss the following race I encountered while
> > unit testing ovn.
> >
> > The most simple case is when I run ovn-nbctl to add a lport in unit test:
> > 1. ovn-nbctl first creates/commits the logical_port entry in ovn-nb
> > database.  the new entry's "up" column is empty,
> > 2. then assume ovn-nbctl execution got suspended after
> > ovsdb_idl_txn_commit_block(),
> > 3. next, ovn-northd will update the ovn-sb database and finds that the
> > new logical port is not bound.  so it goes ahead update the "up"
> > column of the entry to "false"...
> > 4. since ovn-nbctl is still running and is set to monitor everything, the
> > ovsdb-server will try sending the "update" to ovn-nbctl...
> > 5. now consider this race:  if ovn-nbctl execution resumes and exits
> right
> > before ovsdb-server sending the update,...  the send will fail with
> > (Broken Pipe) error, resulting in a WARN log in ovsdb-server.log.
> >
> > Even if we set the "up" column to "false" at creation, we can still run
> into
> > similar race if the ovn-controller quickly binds the lport to chassis and
> > ovn-northd now updates "up" column to "true".
> >
> > I also found similar race for other command combinations...  e.g.
> > deleting vtep switch physical port and deleting ovs port while running
> > ovs-vtep simulator...
> >
> > I'm thinking instead of trying to fix every case (which may not be even
> > possible), we can try removing all monitor request right after
> > ovsdb_idl_txn_commit_block() and try waiting until receiving the
> > monitor request ack from ovsdb-server.  After that ovsdb-server will
> > never try sending anything to "*-*ctl" commands,
> >
> > Would like to hear what you think?~
>
> I think the warning is harmless (since we know the cause) so I'd be
> inclined to just ignore it in the testsuite.
>

That's that alternative I received from talking with Andy~

Makes sense, I'll ignore the warning,

Thanks,
Alex Wang,
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovs-sandbox: Add support for ovn-controller-vtep.

2015-08-21 Thread Ben Pfaff
On Fri, Aug 21, 2015 at 01:42:18PM -0400, Russell Bryant wrote:
> When ovs-sandbox is run with ovn enabled, create the vtep database and
> run ovn-controller-vtep.  This lets you do some basic testing with
> ovn-controller-vtep.  For example:
> 
> $ make sandbox SANDBOXFLAGS="--ovn"
> $ vtep-ctl add-ps ps0
> 
> After those commands, you can see that ovn-controller-vtep added a
> Chassis row to OVN_Southbound for the physical switch.
> 
> Signed-off-by: Russell Bryant 

Applied, thanks!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v5 2/2] ofproto: Implement OF1.4 Group & Meter change notification messages

2015-08-21 Thread Ben Pfaff
On Wed, Aug 19, 2015 at 12:17:33PM +0530, niti1...@gmail.com wrote:
> From: Niti Rohilla 
> 
> This patch adds support for Openflow1.4 Group & meter change notification
> messages. In a multi controller environment, when a controller modifies the
> state of group and meter table, the request that successfully modifies this
> state is forwarded to other controllers. Other controllers are informed with
> the OFPT_REQUESTFORWARD message. Request forwarding is enabled on a per
> controller channel basis using the Set Asynchronous Configuration Message.
> 
> Signed-off-by: Niti Rohilla 

I don't see patch 1/2 anywhere.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 0/4] datapath-windows: Update netlink attribute parsing

2015-08-21 Thread Ben Pfaff
On Mon, Jul 13, 2015 at 03:26:52PM +, Alin Serdean wrote:
> Update netlink attribute parsing to use the array size of the
> appropriate policy.

Do these patches still need review?  I don't see any.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v5 2/6] classifier: Pre-compute stage masks.

2015-08-21 Thread Jarno Rajahalme
This makes stage mask computation happen only when a subtable is
inserted and allows simplification of the main lookup function.

Classifier benchmark shows that this speeds up the classification
(with wildcards) about 5%.

Signed-off-by: Jarno Rajahalme 
---
 lib/classifier-private.h | 127 +-
 lib/classifier.c | 172 ---
 lib/flow.c   |   2 +-
 lib/flow.h   |   8 +++
 4 files changed, 157 insertions(+), 152 deletions(-)

diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index 3b7ea3a..a3c0f8c 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -42,7 +42,7 @@ struct cls_subtable {
 /* These fields are accessed by readers who care about wildcarding. */
 const tag_type tag;   /* Tag generated from mask for partitioning. */
 const uint8_t n_indices;   /* How many indices to use. */
-const uint8_t index_ofs[CLS_MAX_INDICES];  /* u64 segment boundaries. */
+const struct miniflow index_maps[CLS_MAX_INDICES + 1]; /* Stage maps. */
 unsigned int trie_plen[CLS_MAX_TRIES];  /* Trie prefix length in 'mask'
  * (runtime configurable). */
 const int ports_mask_len;
@@ -226,54 +226,6 @@ struct trie_node {
  * These are only used by the classifier, so place them here to allow
  * for better optimization. */
 
-/* Initializes 'map->tnl_map' and 'map->pkt_map' with a subset of 'miniflow'
- * that includes only the portions with u64-offset 'i' such that start <= i <
- * end.  Does not copy any data from 'miniflow' to 'map'.
- *
- * TODO: Ensure that 'start' and 'end' are compile-time constants. */
-static inline unsigned int /* offset */
-miniflow_get_map_in_range(const struct miniflow *miniflow,
-  uint8_t start, uint8_t end, struct miniflow *map)
-{
-unsigned int offset = 0;
-
-map->tnl_map = miniflow->tnl_map;
-map->pkt_map = miniflow->pkt_map;
-
-if (start >= FLOW_TNL_U64S) {
-offset += count_1bits(map->tnl_map);
-map->tnl_map = 0;
-if (start > FLOW_TNL_U64S) {
-/* Clear 'start - FLOW_TNL_U64S' LSBs from pkt_map. */
-start -= FLOW_TNL_U64S;
-uint64_t msk = (UINT64_C(1) << start) - 1;
-
-offset += count_1bits(map->pkt_map & msk);
-map->pkt_map &= ~msk;
-}
-} else if (start > 0) {
-/* Clear 'start' LSBs from tnl_map. */
-uint64_t msk = (UINT64_C(1) << start) - 1;
-
-offset += count_1bits(map->tnl_map & msk);
-map->tnl_map &= ~msk;
-}
-
-if (end <= FLOW_TNL_U64S) {
-map->pkt_map = 0;
-if (end < FLOW_TNL_U64S) {
-/* Keep 'end' LSBs in tnl_map. */
-map->tnl_map &= (UINT64_C(1) << end) - 1;
-}
-} else {
-if (end < FLOW_U64S) {
-/* Keep 'end - FLOW_TNL_U64S' LSBs in pkt_map. */
-map->pkt_map &= (UINT64_C(1) << (end - FLOW_TNL_U64S)) - 1;
-}
-}
-return offset;
-}
-
 /* Returns a hash value for the bits of 'flow' where there are 1-bits in
  * 'mask', given 'basis'.
  *
@@ -325,36 +277,43 @@ miniflow_hash_in_minimask(const struct miniflow *flow,
 return hash_finish(hash, (p - mask_values) * 8);
 }
 
-/* Returns a hash value for the bits of range [start, end) in 'flow',
- * where there are 1-bits in 'mask', given 'hash'.
+/* Returns a hash value for the values of 'flow', indicated by 'range', where
+ * there are 1-bits in 'mask', given 'basis'.  'range' must be a continuous
+ * subset of the bits in 'mask''s map, representing a continuous range of the
+ * minimask's mask data.  '*offset' must be the number of 64-bit units of the
+ * minimask's data to skip to get to the first unit covered by 'range'. On
+ * return '*offset' is updated with the number of 64-bit units of the minimask
+ * consumed.
+ *
+ * Typically this function is called for successive ranges of minimask's masks,
+ * and the first invocation passes '*offset' as zero.
  *
  * The hash values returned by this function are the same as those returned by
  * minimatch_hash_range(), only the form of the arguments differ. */
 static inline uint32_t
 flow_hash_in_minimask_range(const struct flow *flow,
 const struct minimask *mask,
-uint8_t start, uint8_t end, uint32_t *basis)
+const struct miniflow *range,
+unsigned int *offset,
+uint32_t *basis)
 {
 const uint64_t *mask_values = miniflow_get_values(&mask->masks);
+const uint64_t *p = mask_values + *offset;
 const uint64_t *flow_u64 = (const uint64_t *)flow;
-unsigned int offset;
-struct miniflow map;
-const uint64_t *p;
 uint32_t hash = *basis;
 unsigned int idx;
 
-offset = miniflow_get_map_in_range(&mask->masks, start, end, &map);
-p = ma

[ovs-dev] [PATCH v5 1/6] flow: Use unsigned int for counts.

2015-08-21 Thread Jarno Rajahalme
Reserve the use of size_t for sizes of objects in bytes.

Suggested-by: Joe Stringer 
Signed-off-by: Jarno Rajahalme 
---
 lib/classifier-private.h | 10 +-
 lib/classifier.c | 45 +++--
 lib/classifier.h |  6 +++---
 lib/flow.c   | 31 ---
 lib/flow.h   | 30 --
 5 files changed, 63 insertions(+), 59 deletions(-)

diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index 4c5ad80..3b7ea3a 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -287,7 +287,7 @@ flow_hash_in_minimask(const struct flow *flow, const struct 
minimask *mask,
 const uint64_t *flow_u64 = (const uint64_t *)flow;
 const uint64_t *p = mask_values;
 uint32_t hash;
-size_t idx;
+unsigned int idx;
 
 hash = basis;
 MAP_FOR_EACH_INDEX(idx, mask->masks.tnl_map) {
@@ -341,7 +341,7 @@ flow_hash_in_minimask_range(const struct flow *flow,
 struct miniflow map;
 const uint64_t *p;
 uint32_t hash = *basis;
-size_t idx;
+unsigned int idx;
 
 offset = miniflow_get_map_in_range(&mask->masks, start, end, &map);
 p = mask_values + offset;
@@ -375,7 +375,7 @@ flow_wildcards_fold_minimask_range(struct flow_wildcards 
*wc,
 const uint64_t *p = miniflow_get_values(&mask->masks);
 uint64_t *dst_u64 = (uint64_t *)&wc->masks;
 struct miniflow map;
-size_t idx;
+unsigned int idx;
 
 p += miniflow_get_map_in_range(&mask->masks, start, end, &map);
 MAP_FOR_EACH_INDEX(idx, map.tnl_map) {
@@ -392,10 +392,10 @@ static inline uint32_t
 minimask_hash(const struct minimask *mask, uint32_t basis)
 {
 const uint64_t *p = miniflow_get_values(&mask->masks);
-size_t n_values = miniflow_n_values(&mask->masks);
+unsigned int n_values = miniflow_n_values(&mask->masks);
 uint32_t hash = basis;
 
-for (size_t i = 0; i < n_values; i++) {
+for (unsigned int i = 0; i < n_values; i++) {
 hash = hash_add64(hash, *p++);
 }
 hash = hash_add64(hash, mask->masks.tnl_map);
diff --git a/lib/classifier.c b/lib/classifier.c
index 6484a34..8b59cff 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -57,8 +57,8 @@ BUILD_ASSERT_DECL(TP_PORTS_OFS32 == offsetof(struct flow, 
tp_dst) / 4);
 BUILD_ASSERT_DECL(TP_PORTS_OFS32 % 2 == 0);
 #define TP_PORTS_OFS64 (TP_PORTS_OFS32 / 2)
 
-static size_t
-cls_conjunction_set_size(size_t n)
+static unsigned int
+cls_conjunction_set_size(unsigned int n)
 {
 return (sizeof(struct cls_conjunction_set)
 + n * sizeof(struct cls_conjunction));
@@ -66,11 +66,11 @@ cls_conjunction_set_size(size_t n)
 
 static struct cls_conjunction_set *
 cls_conjunction_set_alloc(struct cls_match *match,
-  const struct cls_conjunction conj[], size_t n)
+  const struct cls_conjunction conj[], unsigned int n)
 {
 if (n) {
-size_t min_n_clauses = conj[0].n_clauses;
-for (size_t i = 1; i < n; i++) {
+unsigned int min_n_clauses = conj[0].n_clauses;
+for (unsigned int i = 1; i < n; i++) {
 min_n_clauses = MIN(min_n_clauses, conj[i].n_clauses);
 }
 
@@ -88,9 +88,9 @@ cls_conjunction_set_alloc(struct cls_match *match,
 
 static struct cls_match *
 cls_match_alloc(const struct cls_rule *rule, cls_version_t version,
-const struct cls_conjunction conj[], size_t n)
+const struct cls_conjunction conj[], unsigned int n)
 {
-size_t count = miniflow_n_values(rule->match.flow);
+unsigned int count = miniflow_n_values(rule->match.flow);
 
 struct cls_match *cls_match
 = xmalloc(sizeof *cls_match + MINIFLOW_VALUES_SIZE(count));
@@ -242,7 +242,7 @@ cls_rule_destroy(struct cls_rule *rule)
 
 void
 cls_rule_set_conjunctions(struct cls_rule *cr,
-  const struct cls_conjunction *conj, size_t n)
+  const struct cls_conjunction *conj, unsigned int n)
 {
 struct cls_match *match = cr->cls_match;
 struct cls_conjunction_set *old
@@ -573,14 +573,14 @@ subtable_replace_head_rule(struct classifier *cls 
OVS_UNUSED,
 const struct cls_rule *
 classifier_replace(struct classifier *cls, const struct cls_rule *rule,
cls_version_t version,
-   const struct cls_conjunction *conjs, size_t n_conjs)
+   const struct cls_conjunction *conjs, unsigned int n_conjs)
 {
 struct cls_match *new;
 struct cls_subtable *subtable;
 uint32_t ihash[CLS_MAX_INDICES];
 uint8_t prev_be64ofs = 0;
 struct cls_match *head;
-size_t n_rules = 0;
+unsigned int n_rules = 0;
 uint32_t basis;
 uint32_t hash;
 int i;
@@ -759,7 +759,7 @@ classifier_replace(struct classifier *cls, const struct 
cls_rule *rule,
 void
 classifier_insert(struct classifier *cls, const struct cls_rule *rule,
   cls_version_t version, const struct

[ovs-dev] [PATCH v5 4/6] classifier: Retire partitions.

2015-08-21 Thread Jarno Rajahalme
Classifier partitions allowed skipping subtables when if was known
from the flow's metadata field that the subtable cannot possibly
match.  This functionality was later implemented in a more general
fashion by staged lookup, where the first stage also covers the
metadata field, among the rest of the non-packet fields in the struct
flow.  While in theory skipping a subtable on the basis of the
metadata field alone could produce more effective wildcards, on the
basis of our testsuite coverage it does not seem to be the case, as
removing the partitioning feature did not result in any test failures.

Removing the partitioning feature makes classifier lookups roughly 20%
faster when a wildcard mask is not needed, and roughly 10% faster when
a wildcard mask is needed, as tested with the test-classifier
benchmark with one lookup thread.

Found by profiling with 'perf'.

Signed-off-by: Jarno Rajahalme 
---
 lib/automake.mk  |   2 -
 lib/classifier-private.h |  15 ---
 lib/classifier.c | 107 ---
 lib/tag.c|  64 
 lib/tag.h| 100 ---
 5 files changed, 288 deletions(-)
 delete mode 100644 lib/tag.c
 delete mode 100644 lib/tag.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 15a9373..c262af8 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -245,8 +245,6 @@ lib_libopenvswitch_la_SOURCES = \
lib/syslog-provider.h \
lib/table.c \
lib/table.h \
-   lib/tag.c \
-   lib/tag.h \
lib/timer.c \
lib/timer.h \
lib/timeval.c \
diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index 13d523c..ce082d6 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -21,7 +21,6 @@
 #include "flow.h"
 #include "hash.h"
 #include "rculist.h"
-#include "tag.h"
 
 /* Classifier internal definitions, subject to change at any time. */
 
@@ -40,7 +39,6 @@ struct cls_subtable {
  * following data structures. */
 
 /* These fields are accessed by readers who care about wildcarding. */
-const tag_type tag;   /* Tag generated from mask for partitioning. */
 const uint8_t n_indices;   /* How many indices to use. */
 const struct flowmap index_maps[CLS_MAX_INDICES + 1]; /* Stage maps. */
 unsigned int trie_plen[CLS_MAX_TRIES];  /* Trie prefix length in 'mask'
@@ -55,16 +53,6 @@ struct cls_subtable {
 /* 'mask' must be the last field. */
 };
 
-/* Associates a metadata value (that is, a value of the OpenFlow 1.1+ metadata
- * field) with tags for the "cls_subtable"s that contain rules that match that
- * metadata value.  */
-struct cls_partition {
-struct cmap_node cmap_node; /* In struct classifier's 'partitions' map. */
-ovs_be64 metadata;  /* metadata value for this partition. */
-tag_type tags;  /* OR of each flow's cls_subtable tag. */
-struct tag_tracker tracker; /* Tracks the bits in 'tags'. */
-};
-
 /* Internal representation of a rule in a "struct cls_subtable".
  *
  * The 'next' member is an element in a singly linked, null-terminated list.
@@ -77,9 +65,6 @@ struct cls_match {
 OVSRCU_TYPE(struct cls_match *) next; /* Equal, lower-priority matches. */
 OVSRCU_TYPE(struct cls_conjunction_set *) conj_set;
 
-/* Accessed only by writers. */
-struct cls_partition *partition;
-
 /* Accessed by readers interested in wildcarding. */
 const int priority; /* Larger numbers are higher priorities. */
 struct cmap_node index_nodes[CLS_MAX_INDICES]; /* Within subtable's
diff --git a/lib/classifier.c b/lib/classifier.c
index f8880a6..c5f98d7 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -321,7 +321,6 @@ classifier_init(struct classifier *cls, const uint8_t 
*flow_segments)
 cls->n_rules = 0;
 cmap_init(&cls->subtables_map);
 pvector_init(&cls->subtables);
-cmap_init(&cls->partitions);
 cls->n_flow_segments = 0;
 if (flow_segments) {
 while (cls->n_flow_segments < CLS_MAX_INDICES
@@ -343,7 +342,6 @@ void
 classifier_destroy(struct classifier *cls)
 {
 if (cls) {
-struct cls_partition *partition;
 struct cls_subtable *subtable;
 int i;
 
@@ -356,11 +354,6 @@ classifier_destroy(struct classifier *cls)
 }
 cmap_destroy(&cls->subtables_map);
 
-CMAP_FOR_EACH (partition, cmap_node, &cls->partitions) {
-ovsrcu_postpone(free, partition);
-}
-cmap_destroy(&cls->partitions);
-
 pvector_destroy(&cls->subtables);
 }
 }
@@ -493,43 +486,6 @@ classifier_count(const struct classifier *cls)
 return cls->n_rules;
 }
 
-static uint32_t
-hash_metadata(ovs_be64 metadata)
-{
-return hash_uint64((OVS_FORCE uint64_t) metadata);
-}
-
-static struct cls_partition *
-find_partition(const struct classifier *cls, ovs_be64 metadata, uint32_t hash)
-{
-struct cls_partiti

[ovs-dev] [PATCH v5 5/6] ofproto-dpif-rid: Make lookups cheaper.

2015-08-21 Thread Jarno Rajahalme
Tunnel metadata has grown large since the addition of Geneve options.
Copying tunnel metadata for performing a lookup is not necessary.
Change recirc_metadata to use a pointer to struct flow_tnl, and only
copy the tunnel metadata when needed, and only copy as little of it as
possible.

Signed-off-by: Jarno Rajahalme 
---
 lib/flow.h | 22 +---
 lib/packets.h  | 65 --
 ofproto/ofproto-dpif-rid.c | 30 -
 ofproto/ofproto-dpif-rid.h | 64 +
 4 files changed, 124 insertions(+), 57 deletions(-)

diff --git a/lib/flow.h b/lib/flow.h
index 958e7cb..f9da6b3 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -65,28 +65,8 @@ BUILD_ASSERT_DECL(FLOW_N_REGS % 2 == 0); /* Even. */
 BUILD_ASSERT_DECL(FLOW_NW_FRAG_ANY == NX_IP_FRAG_ANY);
 BUILD_ASSERT_DECL(FLOW_NW_FRAG_LATER == NX_IP_FRAG_LATER);
 
-/* Some flags are exposed through OpenFlow while others are used only
- * internally. */
-
-/* Public flags */
-#define FLOW_TNL_F_OAM (1 << 0)
-
-#define FLOW_TNL_PUB_F_MASK ((1 << 1) - 1)
 BUILD_ASSERT_DECL(FLOW_TNL_F_OAM == NX_TUN_FLAG_OAM);
 
-/* Private flags */
-#define FLOW_TNL_F_DONT_FRAGMENT (1 << 1)
-#define FLOW_TNL_F_CSUM (1 << 2)
-#define FLOW_TNL_F_KEY (1 << 3)
-
-#define FLOW_TNL_F_MASK ((1 << 4) - 1)
-
-/* Purely internal to OVS userspace. These flags should never be exposed to
- * the outside world and so aren't included in the flags mask. */
-
-/* Tunnel information is in userspace datapath format. */
-#define FLOW_TNL_F_UDPIF (1 << 4)
-
 const char *flow_tun_flag_to_string(uint32_t flags);
 
 /* Maximum number of supported MPLS labels. */
@@ -987,7 +967,7 @@ pkt_metadata_from_flow(struct pkt_metadata *md, const 
struct flow *flow)
 {
 md->recirc_id = flow->recirc_id;
 md->dp_hash = flow->dp_hash;
-md->tunnel = flow->tunnel;
+flow_tnl_copy__(&md->tunnel, &flow->tunnel);
 md->skb_priority = flow->skb_priority;
 md->pkt_mark = flow->pkt_mark;
 md->in_port = flow->in_port;
diff --git a/lib/packets.h b/lib/packets.h
index 38af37b..7140c83 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -35,9 +35,9 @@ struct ds;
 
 /* Tunnel information used in flow key and metadata. */
 struct flow_tnl {
-ovs_be64 tun_id;
 ovs_be32 ip_dst;
 ovs_be32 ip_src;
+ovs_be64 tun_id;
 uint16_t flags;
 uint8_t ip_tos;
 uint8_t ip_ttl;
@@ -49,6 +49,65 @@ struct flow_tnl {
 struct tun_metadata metadata;
 };
 
+/* Some flags are exposed through OpenFlow while others are used only
+ * internally. */
+
+/* Public flags */
+#define FLOW_TNL_F_OAM (1 << 0)
+
+#define FLOW_TNL_PUB_F_MASK ((1 << 1) - 1)
+
+/* Private flags */
+#define FLOW_TNL_F_DONT_FRAGMENT (1 << 1)
+#define FLOW_TNL_F_CSUM (1 << 2)
+#define FLOW_TNL_F_KEY (1 << 3)
+
+#define FLOW_TNL_F_MASK ((1 << 4) - 1)
+
+/* Purely internal to OVS userspace. These flags should never be exposed to
+ * the outside world and so aren't included in the flags mask. */
+
+/* Tunnel information is in userspace datapath format. */
+#define FLOW_TNL_F_UDPIF (1 << 4)
+
+/* Returns an offset to 'src' covering all the meaningful fields in 'src'. */
+static inline size_t
+flow_tnl_size(const struct flow_tnl *src)
+{
+if (!src->ip_dst) {
+/* Covers ip_dst only. */
+return offsetof(struct flow_tnl, ip_src);
+}
+if (src->flags & FLOW_TNL_F_UDPIF) {
+/* Datapath format, cover all options we have. */
+return offsetof(struct flow_tnl, metadata.opts)
++ src->metadata.present.len;
+}
+if (!src->metadata.present.map) {
+/* No TLVs, opts is irrelevant. */
+return offsetof(struct flow_tnl, metadata.opts);
+}
+/* Have decoded TLVs, opts is relevant. */
+return sizeof *src;
+}
+
+/* Copy flow_tnl, but avoid copying unused portions of tun_metadata.  Unused
+ * data in 'dst' is NOT cleared, so this must not be used in cases where the
+ * uninitialized portion may be hashed over. */
+static inline void
+flow_tnl_copy__(struct flow_tnl *dst, const struct flow_tnl *src)
+{
+memcpy(dst, src, flow_tnl_size(src));
+}
+
+static inline bool
+flow_tnl_equal(const struct flow_tnl *a, const struct flow_tnl *b)
+{
+size_t a_size = flow_tnl_size(a);
+
+return a_size == flow_tnl_size(b) && !memcmp(a, b, a_size);
+}
+
 /* Unfortunately, a "struct flow" sometimes has to handle OpenFlow port
  * numbers and other times datapath (dpif) port numbers.  This union allows
  * access to both. */
@@ -67,7 +126,9 @@ struct pkt_metadata {
 uint32_t skb_priority;  /* Packet priority for QoS. */
 uint32_t pkt_mark;  /* Packet mark. */
 union flow_in_port in_port; /* Input port. */
-struct flow_tnl tunnel; /* Encapsulating tunnel parameters. */
+struct flow_tnl tunnel; /* Encapsulating tunnel parameters. Note that
+ * if 'ip_dst' == 0, the rest of the fields may
+ 

[ovs-dev] [PATCH v5 6/6] meta-flow: Avoid unnecessary large memset.

2015-08-21 Thread Jarno Rajahalme
mf_mask_field_and_prereqs() used to memset a static variable again and
again.  Now that mf_value is larger (due to tun_metadata field), this
is more expensive.  Avoid this by using static initialization.

mf_mask_field_and_prereqs() is used only for set field and reg move,
which never deal with the tun_metadata field as a whole.

Signed-off-by: Jarno Rajahalme 
---
 lib/flow.h   |  2 ++
 lib/meta-flow.c  | 20 ++--
 lib/meta-flow.h  |  3 ++-
 lib/nx-match.c   |  4 ++--
 ofproto/ofproto-dpif-xlate.c | 10 +-
 5 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/lib/flow.h b/lib/flow.h
index f9da6b3..1751aed 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -318,6 +318,8 @@ struct flow_wildcards {
 
 #define WC_MASK_FIELD(WC, FIELD) \
 memset(&(WC)->masks.FIELD, 0xff, sizeof (WC)->masks.FIELD)
+#define WC_MASK_FIELD_MASK(WC, FIELD, MASK) \
+((WC)->masks.FIELD |= (MASK))
 #define WC_UNMASK_FIELD(WC, FIELD) \
 memset(&(WC)->masks.FIELD, 0, sizeof (WC)->masks.FIELD)
 
diff --git a/lib/meta-flow.c b/lib/meta-flow.c
index 528c109..1b7f9ca 100644
--- a/lib/meta-flow.c
+++ b/lib/meta-flow.c
@@ -390,21 +390,21 @@ mf_are_prereqs_ok(const struct mf_field *mf, const struct 
flow *flow)
 
 /* Set field and it's prerequisities in the mask.
  * This is only ever called for writeable 'mf's, but we do not make the
- * distinction here. */
+ * distinction here.
+ * The widest field this is ever called for an IPv6 address (16 bytes). */
 void
-mf_mask_field_and_prereqs(const struct mf_field *mf, struct flow *mask)
+mf_mask_field_and_prereqs(const struct mf_field *mf, struct flow_wildcards *wc)
 {
-static union mf_value exact_match_mask;
+static union mf_value exact_match_mask = { .ipv6 = IN6ADDR_EXACT_INIT };
 
-memset(&exact_match_mask, 0xff, sizeof exact_match_mask);
-mf_set_flow_value(mf, &exact_match_mask, mask);
+mf_set_flow_value(mf, &exact_match_mask, &wc->masks);
 
 switch (mf->prereqs) {
 case MFP_ND:
 case MFP_ND_SOLICIT:
 case MFP_ND_ADVERT:
-mask->tp_src = OVS_BE16_MAX;
-mask->tp_dst = OVS_BE16_MAX;
+WC_MASK_FIELD(wc, tp_src);
+WC_MASK_FIELD(wc, tp_dst);
 /* Fall through. */
 case MFP_TCP:
 case MFP_UDP:
@@ -412,17 +412,17 @@ mf_mask_field_and_prereqs(const struct mf_field *mf, 
struct flow *mask)
 case MFP_ICMPV4:
 case MFP_ICMPV6:
 /* nw_frag always unwildcarded. */
-mask->nw_proto = 0xff;
+WC_MASK_FIELD(wc, nw_proto);
 /* Fall through. */
 case MFP_ARP:
 case MFP_IPV4:
 case MFP_IPV6:
 case MFP_MPLS:
 case MFP_IP_ANY:
-mask->dl_type = OVS_BE16_MAX;
+/* dl_type always unwildcarded. */
 break;
 case MFP_VLAN_VID:
-mask->vlan_tci |= htons(VLAN_CFI);
+WC_MASK_FIELD_MASK(wc, vlan_tci, htons(VLAN_CFI));
 break;
 case MFP_NONE:
 break;
diff --git a/lib/meta-flow.h b/lib/meta-flow.h
index 8feefcc..3dc342d 100644
--- a/lib/meta-flow.h
+++ b/lib/meta-flow.h
@@ -1824,7 +1824,8 @@ void mf_get_mask(const struct mf_field *, const struct 
flow_wildcards *,
 
 /* Prerequisites. */
 bool mf_are_prereqs_ok(const struct mf_field *, const struct flow *);
-void mf_mask_field_and_prereqs(const struct mf_field *, struct flow *mask);
+void mf_mask_field_and_prereqs(const struct mf_field *,
+   struct flow_wildcards *);
 void mf_bitmap_set_field_and_prereqs(const struct mf_field *mf, struct
  mf_bitmap *bm);
 
diff --git a/lib/nx-match.c b/lib/nx-match.c
index a8b1183..54645df 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -1577,8 +1577,8 @@ nxm_execute_reg_move(const struct ofpact_reg_move *move,
 union mf_value src_value;
 union mf_value dst_value;
 
-mf_mask_field_and_prereqs(move->dst.field, &wc->masks);
-mf_mask_field_and_prereqs(move->src.field, &wc->masks);
+mf_mask_field_and_prereqs(move->dst.field, wc);
+mf_mask_field_and_prereqs(move->src.field, wc);
 
 /* A flow may wildcard nw_frag.  Do nothing if setting a transport
  * header field on a packet that does not have them. */
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index a34b5f9..6b9713f 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4327,7 +4327,7 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t 
ofpacts_len,
 }
 /* A flow may wildcard nw_frag.  Do nothing if setting a trasport
  * header field on a packet that does not have them. */
-mf_mask_field_and_prereqs(mf, &wc->masks);
+mf_mask_field_and_prereqs(mf, wc);
 if (mf_are_prereqs_ok(mf, flow)) {
 mf_set_flow_value_masked(mf, &set_field->value,
  &set_field->mask, flow);
@@ -4689,16 +4689,16 @@ xlate_wc_init(struct xlate_ctx *ct

[ovs-dev] [PATCH v5 3/6] flow: Add struct flowmap.

2015-08-21 Thread Jarno Rajahalme
Struct miniflow is now sometimes used just as a map.  Define a new
struct flowmap for that purpose.  The flowmap is defined as an array of
maps, and it is automatically sized according to the size of struct
flow, so it will be easier to maintain in the future.

It would have been tempting to use the existing struct bitmap for this
purpose. The main reason this is not feasible at the moment is that
some flowmap algorithms are simpler when it can be assumed that no
struct flow member requires more bits than can fit to a single map
unit. The tunnel member already requires more than 32 bits, so the map
unit needs to be 64 bits wide.

Performance critical algorithms enumerate the flowmap array units
explicitly, as it is easier for the compiler to optimize, compared to
the normal iterator.  Without this optimization a classifier lookup
without wildcard masks would be about 25% slower.

With this more general (and maintainable) algorithm the classifier
lookups are about 5% slower, when the struct flow actually becomes big
enough to require a second map.  This negates the performance gained
in the "Pre-compute stage masks" patch earlier in the series.

Requested-by: Ben Pfaff 
Signed-off-by: Jarno Rajahalme 
---
 lib/classifier-private.h |  89 -
 lib/classifier.c | 228 ++
 lib/dpif-netdev.c|  93 +++--
 lib/flow.c   | 355 +++---
 lib/flow.h   | 483 ---
 lib/match.c  |   7 +-
 tests/test-classifier.c  |  32 ++--
 7 files changed, 666 insertions(+), 621 deletions(-)

diff --git a/lib/classifier-private.h b/lib/classifier-private.h
index a3c0f8c..13d523c 100644
--- a/lib/classifier-private.h
+++ b/lib/classifier-private.h
@@ -42,7 +42,7 @@ struct cls_subtable {
 /* These fields are accessed by readers who care about wildcarding. */
 const tag_type tag;   /* Tag generated from mask for partitioning. */
 const uint8_t n_indices;   /* How many indices to use. */
-const struct miniflow index_maps[CLS_MAX_INDICES + 1]; /* Stage maps. */
+const struct flowmap index_maps[CLS_MAX_INDICES + 1]; /* Stage maps. */
 unsigned int trie_plen[CLS_MAX_TRIES];  /* Trie prefix length in 'mask'
  * (runtime configurable). */
 const int ports_mask_len;
@@ -238,16 +238,16 @@ flow_hash_in_minimask(const struct flow *flow, const 
struct minimask *mask,
 const uint64_t *mask_values = miniflow_get_values(&mask->masks);
 const uint64_t *flow_u64 = (const uint64_t *)flow;
 const uint64_t *p = mask_values;
-uint32_t hash;
-unsigned int idx;
+uint32_t hash = basis;
+map_t map;
 
-hash = basis;
-MAP_FOR_EACH_INDEX(idx, mask->masks.tnl_map) {
-hash = hash_add64(hash, flow_u64[idx] & *p++);
-}
-flow_u64 += FLOW_TNL_U64S;
-MAP_FOR_EACH_INDEX(idx, mask->masks.pkt_map) {
-hash = hash_add64(hash, flow_u64[idx] & *p++);
+FLOWMAP_FOR_EACH_MAP (map, mask->masks.map) {
+unsigned int idx;
+
+MAP_FOR_EACH_INDEX (idx, map) {
+hash = hash_add64(hash, flow_u64[idx] & *p++);
+}
+flow_u64 += MAP_T_BITS;
 }
 
 return hash_finish(hash, (p - mask_values) * 8);
@@ -265,13 +265,10 @@ miniflow_hash_in_minimask(const struct miniflow *flow,
 const uint64_t *mask_values = miniflow_get_values(&mask->masks);
 const uint64_t *p = mask_values;
 uint32_t hash = basis;
-uint64_t flow_u64;
+uint64_t value;
 
-MINIFLOW_FOR_EACH_IN_TNL_MAP(flow_u64, flow, mask->masks) {
-hash = hash_add64(hash, flow_u64 & *p++);
-}
-MINIFLOW_FOR_EACH_IN_PKT_MAP(flow_u64, flow, mask->masks) {
-hash = hash_add64(hash, flow_u64 & *p++);
+MINIFLOW_FOR_EACH_IN_FLOWMAP(value, flow, mask->masks.map) {
+hash = hash_add64(hash, value & *p++);
 }
 
 return hash_finish(hash, (p - mask_values) * 8);
@@ -293,22 +290,23 @@ miniflow_hash_in_minimask(const struct miniflow *flow,
 static inline uint32_t
 flow_hash_in_minimask_range(const struct flow *flow,
 const struct minimask *mask,
-const struct miniflow *range,
+const struct flowmap range,
 unsigned int *offset,
 uint32_t *basis)
 {
 const uint64_t *mask_values = miniflow_get_values(&mask->masks);
-const uint64_t *p = mask_values + *offset;
 const uint64_t *flow_u64 = (const uint64_t *)flow;
+const uint64_t *p = mask_values + *offset;
 uint32_t hash = *basis;
-unsigned int idx;
+map_t map;
 
-MAP_FOR_EACH_INDEX(idx, range->tnl_map) {
-hash = hash_add64(hash, flow_u64[idx] & *p++);
-}
-flow_u64 += FLOW_TNL_U64S;
-MAP_FOR_EACH_INDEX(idx, range->pkt_map) {
-hash = hash_add64(hash, flow_u64[idx] & *p++);
+FLOWMAP_FOR_EACH_MAP (map, range

Re: [ovs-dev] [PATCH v4 4/9] classifier: Pre-compute stage masks.

2015-08-21 Thread Jarno Rajahalme
I sent a v5 of the rest of the series (with some new patches, too).

  Jarno

> On Aug 21, 2015, at 2:36 PM, Jarno Rajahalme  wrote:
> 
> 
>> On Aug 21, 2015, at 8:30 AM, Ben Pfaff > > wrote:
>> 
>> On Fri, Aug 07, 2015 at 04:57:37PM -0700, Jarno Rajahalme wrote:
>>> This makes stage mask computation happen only when a subtable is
>>> inserted and allows simplification of the main lookup function.
>>> 
>>> Classifier benchmark shows that this speeds up the classification
>>> (with wildcards) about 5%.
>>> 
>>> Signed-off-by: Jarno Rajahalme >> >
>> 
>> I don't think that flow_hash_in_minimask_range() actually does what its
>> name implies anymore (where's the range?).  It at least needs a comment
>> update since the comment still talks about parameters it doesn't have.
>> Ditto for minimatch_hash_range().
>> 
> 
> The ‘maps’ argument really needs to be a successive/continuous subset of the 
> bits in the masks map. I renamed it as “range” and fixed and elaborated the 
> comments.
> 
>> In a construction like this:
>>if (map->tnl_map) {
>>MAP_FOR_EACH_INDEX(idx, map->tnl_map) {
>>dst_u64[idx] |= *p++;
>>}
>>}
>> it appears to me that the 'if' is redundant.  In one case I see there's
>> an OVS_UNLIKELY on the 'if' but I wonder whether that's sufficient
>> benefit.
>> 
>> If we believe that tnl_map is usually 0, then it would be profitable to
>> test pkt_map first below to allow short-circuiting to bail out earlier:
>> 
>>static inline bool
>>miniflow_equal_maps(const struct miniflow *a, const struct miniflow *b)
>>{
>>return a->tnl_map == b->tnl_map && a->pkt_map == b->pkt_map;
>>}
>> 
> 
> Right, it has likely only very marginal benefit, so I removed the extra if’s.
> 
>> 
>> While reading code, I came up with some additional comments that are not
>> directly related to the changes here.
>> 
>> First, classifier.c has a lot of "static inline" functions.  Does the
>> "inline" actually produce measurable performance improvement?  If not,
>> then it's better to avoid "inline" in .c files since it suppresses
>> otherwise useful "unused function" warnings.
>> 
> 
> Currently, the compiler inlines all these functions as intended. Earlier I 
> noticed from the disassembly that inlining was not happening without the 
> explicit “inline” keywords. Now that the classifier benchmark is in, we can 
> test if it makes a difference in performance. I’ll do this sometime later.
> 
>> Second, the amount of duplicated code because of tnl_map versus pkt_map
>> is starting to bug me.  If these were just a 2-element array, for
>> example, then miniflow_and_mask_matches_flow_wc() could be in my opinion
>> tons cleaner as something like this:
>> 
> 
> A later patch of the series does this in a general way (sizing the array 
> automatically).
> 
>> {
>>const uint64_t *flowp = miniflow_get_values(flow);
>>const uint64_t *maskp = miniflow_get_values(&mask->masks);
>>const uint64_t *target_u64 = (const uint64_t *)target;
>>uint64_t *wc_u64 = (uint64_t *)&wc->masks;
>> 
>>for (int i = 0; i < 2; i++) {
>>size_t idx;
>> 
>>MAP_FOR_EACH_INDEX(idx, mask->masks.maps[i]) {
>>uint64_t msk = *maskp++;
>> 
>>uint64_t diff = (*flowp++ ^ target_u64[idx]) & msk;
>>if (diff) {
>>/* Only unwildcard if none of the differing bits is already
>> * exact-matched. */
>>if (!(wc_u64[idx] & diff)) {
>>/* Keep one bit of the difference.  The selected bit may 
>> be
>> * different in big-endian v.s. little-endian systems. */
>>wc_u64[idx] |= rightmost_1bit(diff);
>>}
>>return false;
>>}
>> 
>>/* Fill in the bits that were looked at. */
>>wc_u64[idx] |= msk;
>>}
>>target_u64 += FLOW_TNL_U64S;
>>wc_u64 += FLOW_TNL_U64S;
>>}
>> 
>>return true;
>> }
>> 
>> I noticed that MINIFLOW_IN_MAP could be a function, so I send out a
>> patch (which applies following this one):
>>http://openvswitch.org/pipermail/dev/2015-August/059029.html 
>> 
> 
> 
> There might be other macros that could be functions as well in the series, 
> I’ll check for that.
> 
>   Jarno
> 
> 

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 2/3] ovsdb-server: Add the ability to push peer-cert.

2015-08-21 Thread Gurucharan Shetty
On Fri, Aug 21, 2015 at 2:43 PM, Ben Pfaff  wrote:
> On Thu, Aug 20, 2015 at 10:03:39AM -0700, Gurucharan Shetty wrote:
>> In OVN, ovsdb-server is the daemon that manages the databases
>> and can be called as the central controller. So it would be
>> nice for ovsdb-server to be able to push its self-signed
>> certificate to all the other nodes where ovn-controller runs.
>>
>> Signed-off-by: Gurucharan Shetty 
>
> This is a good idea, especially the test.
>
> The test passes a plain --log-file option and a --log-file option with a
> full path.  The first one fails:
>
> 2015-08-21T21:34:19Z|1|vlog|WARN|failed to open
> /var/log/openvswitch/ovsdb-server.log for logging: No such file or
> directory
>
> so I'd replace it by the one with the full path.
>
> The log message from ovs-vsctl on the disconnection is confusing:
>
> ovs-vsctl: ssl:127.0.0.1:34766: database connection failed ()
>
> It looks like this improves it, at least to "(Protocol error)":
>
> diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c
> index ae51b42..1e312a2 100644
> --- a/lib/jsonrpc.c
> +++ b/lib/jsonrpc.c
> @@ -948,6 +948,7 @@ jsonrpc_session_run(struct jsonrpc_session *s)
>  reconnect_connect_failed(s->reconnect, time_msec(), error);
>  stream_close(s->stream);
>  s->stream = NULL;
> +s->last_error = error;
>  }
>  }
>
>
> The test runs over 10x faster on my system with 1024-bit keys:
>
> diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
> index 6a2189a..caaa497 100644
> --- a/tests/ovs-vsctl.at
> +++ b/tests/ovs-vsctl.at
> @@ -1315,9 +1315,9 @@ AT_KEYWORDS([ovs-vsctl ssl])
>  AT_SKIP_IF([test "$HAVE_OPENSSL" = no])
>  PKIDIR=`pwd`
>  OVS_PKI="sh $abs_top_srcdir/utilities/ovs-pki.in --dir=$PKIDIR/pki 
> --log=$PKIDIR/ovs-pki.log"
> -$OVS_PKI init && \
> -$OVS_PKI req+sign vsctl switch && \
> -$OVS_PKI req ovsdbserver && $OVS_PKI self-sign ovsdbserver
> +$OVS_PKI -B 1024 init && \
> +$OVS_PKI -B 1024 req+sign vsctl switch && \
> +$OVS_PKI -B 1024 req ovsdbserver && $OVS_PKI self-sign ovsdbserver
>
>  dnl Create database.
>  touch .conf.db.~lock~
>
> Acked-by: Ben Pfaff 

I took all your suggestions and pushed the series. I have one question
for you though.
In lib/stream-ssl.c there is this piece of code:

/* Check that 'cert' is self-signed.  Otherwise it is not a CA
 * certificate and we should not attempt to use it as one. */
error = X509_check_issued(cert, cert);
if (error) {
VLOG_ERR("could not bootstrap CA cert: obtained certificate is "
 "not self-signed (%s)",
 X509_verify_cert_error_string(error));
if (sk_X509_num(chain) < 2) {
VLOG_ERR("only one certificate was received, so probably the peer "
 "is not configured to send its CA certificate");
}
return EPROTO;
}


Now, what the above does is that it will only let boot-strap happen if
the controller certificate is self-signed (which is what the unit test
in this commit does). The bootstrap fails if the controller
certificate is signed by a CA. The check looks to be explicit and was
present many years ago, so there must have been a reason for that. Do
you remember why? The man pages do not mandate this requirement and
makes you believe that CA certificates are OK.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Returned mail: Data format error

2015-08-21 Thread researcher


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] News24,

2015-08-21 Thread HSBC BANK
Kindly View Attached File
Yours Faithfully,Dr/ Mrs. Catherine.T.ButchHSBC Bank LondonKindly Contact Dr. 
Daniel Mminele by via Email: daniel.mmin...@mailbox.co.za Dr. Daniel Mminele 
Tel No: +27 837 425 342

  
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/3] bridge: Relax the whitelist format for punix path.

2015-08-21 Thread Alex Wang
This commit relaxes the whitelist format for punix path for
service controller.  Instead of only allowing
punix:/.controller, the new format
allows any suffix, like punix:/.*.

Signed-off-by: Alex Wang 
---
 tests/ovs-vswitchd.at |   11 +++
 vswitchd/bridge.c |   11 ++-
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at
index 3b7c516..a42c272 100644
--- a/tests/ovs-vswitchd.at
+++ b/tests/ovs-vswitchd.at
@@ -153,3 +153,14 @@ AT_CHECK([sed -n "
 ])
 
 AT_CLEANUP
+
+dnl --
+AT_SETUP([ovs-vswitchd -- set service controller])
+AT_SKIP_IF([test "$IS_WIN32" = "yes"])
+OVS_VSWITCHD_START
+
+AT_CHECK([ovs-vsctl set-controller br0 punix:$(pwd)/br0.void])
+OVS_WAIT_UNTIL([test -e br0.void])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index f021360..b95610d 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -3559,18 +3559,19 @@ bridge_configure_remotes(struct bridge *br,
 continue;
 }
 } else {
-   whitelist = xasprintf("punix:%s/%s.controller",
+   whitelist = xasprintf("punix:%s/%s.",
  ovs_rundir(), br->name);
-   if (!equal_pathnames(c->target, whitelist, SIZE_MAX)) {
+   if (!equal_pathnames(c->target, whitelist, strlen(whitelist))) {
/* Prevent remote ovsdb-server users from accessing
 * arbitrary Unix domain sockets and overwriting arbitrary
 * local files. */
VLOG_ERR_RL(&rl, "bridge %s: Not adding Unix domain socket "
   "controller \"%s\" due to possibility of "
   "overwriting local files. Instead, specify "
-  "whitelisted \"%s\" or connect to "
-  "\"unix:%s/%s.mgmt\" (which is always "
-  "available without special configuration).",
+  "path in whitelisted format \"%s*\" or "
+  "connect to \"unix:%s/%s.mgmt\" (which is "
+  "always available without special "
+  "configuration).",
   br->name, c->target, whitelist,
   ovs_rundir(), br->name);
free(whitelist);
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/3] service controller: Allow change to punix socket file group ownership.

2015-08-21 Thread Alex Wang
This commit adds a new key-value pair, 'punix_file_group=',
to the 'other_config' column in the 'Controller' table.  This new config
allows user to change the punix socket file's group ownership, so that
non-root process can also connect to ovs bridge.

Signed-off-by: Alex Wang 
---
 lib/socket-util-unix.c |   70 
 lib/socket-util.h  |7 +
 vswitchd/bridge.c  |   17 
 vswitchd/vswitch.xml   |   16 +++
 4 files changed, 110 insertions(+)

diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
index afab195..40105cb 100644
--- a/lib/socket-util-unix.c
+++ b/lib/socket-util-unix.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -49,6 +50,75 @@ VLOG_DEFINE_THIS_MODULE(socket_util_unix);
  * space for a null terminator. */
 #define MAX_UN_LEN (sizeof(((struct sockaddr_un *) 0)->sun_path) - 1)
 
+/* Given 'group_name', returns true if the corresponding 'group'
+ * is found and saves the group id into 'gid'.  Returns false if
+ * cannot find 'group'. */
+static bool
+query_group_gid(const char *group_name, gid_t *gid)
+{
+struct group group;
+struct group *result;
+size_t buflen;
+char *buf;
+
+if (!group_name) {
+return false;
+}
+
+buflen = sysconf(_SC_GETGR_R_SIZE_MAX);
+buf = xmalloc(buflen * sizeof *buf);
+getgrnam_r(group_name, &group, buf, buflen, &result);
+
+if (!result) {
+VLOG_ERR("cannot retrieve group info for specified group "
+ "(%s) : %s",
+ group_name,
+ errno ? ovs_strerror(errno) : "group not found");
+} else {
+*gid = result->gr_gid;
+}
+free(buf);
+
+return result;
+}
+
+/* Sets the unix domain socket file's group ownership to 'group'. */
+void
+unix_socket_set_file_group(const char *path, const char *group)
+{
+struct stat info;
+gid_t gid = 0;
+
+if (stat(path, &info)) {
+VLOG_WARN("set (%s) group fails : cannot find file (%s)",
+  path, path);
+return;
+}
+
+if (!query_group_gid(group ? group : "root", &gid)) {
+VLOG_WARN("set (%s) group fails : cannot find group (%s)",
+  path, group ? group : "root");
+return;
+}
+
+/* Returns if current group id is same as desired group id. */
+if (gid == info.st_gid) {
+return;
+} else {
+/* Changes group ownership to 'gid'. */
+if (chown(path, -1, gid)) {
+VLOG_WARN("chown (%s) group to (%s) fails: %s",
+  path, group ? group : "root", ovs_strerror(errno));
+return;
+}
+if (chmod(path, 0770)) {
+VLOG_WARN("chmod (%s) permission to 0700 fails: %s",
+  path, ovs_strerror(errno));
+return;
+}
+}
+}
+
 void
 xpipe(int fds[2])
 {
diff --git a/lib/socket-util.h b/lib/socket-util.h
index 8015c7f..77c251c 100644
--- a/lib/socket-util.h
+++ b/lib/socket-util.h
@@ -81,6 +81,7 @@ void xpipe_nonblocking(int fds[2]);
 
 int drain_rcvbuf(int fd);
 
+void unix_socket_set_file_group(const char *path, const char *group);
 int make_unix_socket(int style, bool nonblock,
  const char *bind_path, const char *connect_path);
 int get_unix_name_len(socklen_t sun_len);
@@ -95,6 +96,12 @@ int af_inet_ifreq_ioctl(const char *name, struct ifreq *,
 #endif
 
 #ifdef _WIN32
+static inline void unix_socket_set_file_group(const char *path,
+  const char *group)
+{
+return;
+}
+
 static inline int make_unix_socket(int style, bool nonblock,
const char *bind_path,
const char *connect_path)
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index b95610d..c931247 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -3507,6 +3507,7 @@ bridge_configure_remotes(struct bridge *br,
 
 enum ofproto_fail_mode fail_mode;
 
+const char **punix_file_groups;
 struct ofproto_controller *ocs;
 size_t n_ocs;
 size_t i;
@@ -3528,9 +3529,12 @@ bridge_configure_remotes(struct bridge *br,
 
 n_controllers = bridge_get_controllers(br, &controllers);
 
+punix_file_groups =
+xmalloc((n_controllers + 1) * sizeof *punix_file_groups);
 ocs = xmalloc((n_controllers + 1) * sizeof *ocs);
 n_ocs = 0;
 
+punix_file_groups[n_ocs] = NULL;
 bridge_ofproto_controller_for_mgmt(br, &ocs[n_ocs++]);
 for (i = 0; i < n_controllers; i++) {
 struct ovsrec_controller *c = controllers[i];
@@ -3584,6 +3588,8 @@ bridge_configure_remotes(struct bridge *br,
 
 bridge_configure_local_iface_netdev(br, c);
 bridge_ofproto_controller_from_ovsrec(c, &ocs[n_ocs]);
+punix_file_groups[n_ocs] = smap_get(&c->other_config,
+"punix_file_group");
 if (disable_in_band)

[ovs-dev] [PATCH RFC 3/3] manager: Allow change to punix socket file group ownership.

2015-08-21 Thread Alex Wang
This commit adds a new key-value pair, 'punix_file_group=',
to the 'other_config' column in the 'Manager' table.  This new config
allows user to change the punix socket file's group ownership, so that
non-root process can also connect to ovsdb-server.

Signed-off-by: Alex Wang 
---
 ovsdb/jsonrpc-server.c |6 ++
 ovsdb/jsonrpc-server.h |1 +
 ovsdb/ovsdb-server.c   |2 ++
 vswitchd/vswitch.xml   |   16 
 4 files changed, 25 insertions(+)

diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
index fffcb73..387a7a0 100644
--- a/ovsdb/jsonrpc-server.c
+++ b/ovsdb/jsonrpc-server.c
@@ -32,6 +32,7 @@
 #include "row.h"
 #include "server.h"
 #include "simap.h"
+#include "socket-util.h"
 #include "stream.h"
 #include "table.h"
 #include "timeval.h"
@@ -227,6 +228,11 @@ ovsdb_jsonrpc_server_set_remotes(struct 
ovsdb_jsonrpc_server *svr,
 }
 
 ovsdb_jsonrpc_session_set_all_options(remote, options);
+
+if (!strncmp(node->name, "punix:", 6)) {
+unix_socket_set_file_group(node->name + 6,
+   options->punix_file_group);
+}
 }
 }
 
diff --git a/ovsdb/jsonrpc-server.h b/ovsdb/jsonrpc-server.h
index fce8b7b..36a15f3 100644
--- a/ovsdb/jsonrpc-server.h
+++ b/ovsdb/jsonrpc-server.h
@@ -35,6 +35,7 @@ struct ovsdb_jsonrpc_options {
 int max_backoff;/* Maximum reconnection backoff, in msec. */
 int probe_interval; /* Max idle time before probing, in msec. */
 int dscp;   /* Dscp value for manager connections */
+const char *punix_file_group; /* For setting the punix file's group. */
 };
 struct ovsdb_jsonrpc_options *
 ovsdb_jsonrpc_default_options(const char *target);
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index cd13b0d..8dca006 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -770,6 +770,8 @@ add_manager_options(struct shash *remotes, const struct 
ovsdb_row *row)
 options->dscp = dscp;
 }
 }
+options->punix_file_group = read_map_string_column(row, "other_config",
+   "punix_file_group");
 }
 
 static void
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 6f6e0ed..ae7abfb 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -4286,6 +4286,22 @@
 default value of 48 is chosen.  Valid DSCP values must be in the range
 0 to 63.
   
+
+  
+
+  When connection method in  is
+  punix, this config specifies the user group to which
+  the group ownership for 'punix' (unix domain socket) file created
+  by ovsdb will be applied.  Also, the file's access permission will be
+  changed to '0770'.
+
+
+  By default, the 'punix' file is associated with the 'root'
+  group and have access permission '0700'.  If this config is
+  not specified or specified as 'root', the default is restored.
+
+  
 
 
 
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC 3/3] manager: Allow change to punix socket file group ownership.

2015-08-21 Thread Alex Wang
If we want to make ovsdb-server non-root, this change may not be need,~

On Fri, Aug 21, 2015 at 11:10 PM, Alex Wang  wrote:

> This commit adds a new key-value pair, 'punix_file_group=',
> to the 'other_config' column in the 'Manager' table.  This new config
> allows user to change the punix socket file's group ownership, so that
> non-root process can also connect to ovsdb-server.
>
> Signed-off-by: Alex Wang 
> ---
>  ovsdb/jsonrpc-server.c |6 ++
>  ovsdb/jsonrpc-server.h |1 +
>  ovsdb/ovsdb-server.c   |2 ++
>  vswitchd/vswitch.xml   |   16 
>  4 files changed, 25 insertions(+)
>
> diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
> index fffcb73..387a7a0 100644
> --- a/ovsdb/jsonrpc-server.c
> +++ b/ovsdb/jsonrpc-server.c
> @@ -32,6 +32,7 @@
>  #include "row.h"
>  #include "server.h"
>  #include "simap.h"
> +#include "socket-util.h"
>  #include "stream.h"
>  #include "table.h"
>  #include "timeval.h"
> @@ -227,6 +228,11 @@ ovsdb_jsonrpc_server_set_remotes(struct
> ovsdb_jsonrpc_server *svr,
>  }
>
>  ovsdb_jsonrpc_session_set_all_options(remote, options);
> +
> +if (!strncmp(node->name, "punix:", 6)) {
> +unix_socket_set_file_group(node->name + 6,
> +   options->punix_file_group);
> +}
>  }
>  }
>
> diff --git a/ovsdb/jsonrpc-server.h b/ovsdb/jsonrpc-server.h
> index fce8b7b..36a15f3 100644
> --- a/ovsdb/jsonrpc-server.h
> +++ b/ovsdb/jsonrpc-server.h
> @@ -35,6 +35,7 @@ struct ovsdb_jsonrpc_options {
>  int max_backoff;/* Maximum reconnection backoff, in msec.
> */
>  int probe_interval; /* Max idle time before probing, in msec.
> */
>  int dscp;   /* Dscp value for manager connections */
> +const char *punix_file_group; /* For setting the punix file's group.
> */
>  };
>  struct ovsdb_jsonrpc_options *
>  ovsdb_jsonrpc_default_options(const char *target);
> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> index cd13b0d..8dca006 100644
> --- a/ovsdb/ovsdb-server.c
> +++ b/ovsdb/ovsdb-server.c
> @@ -770,6 +770,8 @@ add_manager_options(struct shash *remotes, const
> struct ovsdb_row *row)
>  options->dscp = dscp;
>  }
>  }
> +options->punix_file_group = read_map_string_column(row,
> "other_config",
> +
>  "punix_file_group");
>  }
>
>  static void
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 6f6e0ed..ae7abfb 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -4286,6 +4286,22 @@
>  default value of 48 is chosen.  Valid DSCP values must be in the
> range
>  0 to 63.
>
> +
> +   +type='{"type": "string"}'>
> +
> +  When connection method in  is
> +  punix, this config specifies the user group to
> which
> +  the group ownership for 'punix' (unix domain socket) file
> created
> +  by ovsdb will be applied.  Also, the file's access permission
> will be
> +  changed to '0770'.
> +
> +
> +  By default, the 'punix' file is associated with the 'root'
> +  group and have access permission '0700'.  If this config is
> +  not specified or specified as 'root', the default is restored.
> +
> +  
>  
>
>  
> --
> 1.7.9.5
>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev