Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> This hashtable implementation is using hlist buckets to provide a simple
> hashtable to prevent it from getting reimplemented all over the kernel.
> 
> Signed-off-by: Sasha Levin 
> ---
> 
> Sorry for the long delay, I was busy with a bunch of personal things.
> 
> Changes since v6:
> 
>  - Use macros that point to internal static inline functions instead of
>  implementing everything as a macro.
>  - Rebase on latest -next.
>  - Resending the enter patch series on request.
>  - Break early from hash_empty() if found to be non-empty.
>  - DECLARE_HASHTABLE/DEFINE_HASHTABLE.
> 
> 
>  include/linux/hashtable.h | 193 
> ++
>  1 file changed, 193 insertions(+)
>  create mode 100644 include/linux/hashtable.h
> 
> diff --git a/include/linux/hashtable.h b/include/linux/hashtable.h
> new file mode 100644
> index 000..1fb8c97
> --- /dev/null
> +++ b/include/linux/hashtable.h
> @@ -0,0 +1,193 @@
> +/*
> + * Statically sized hash table implementation
> + * (C) 2012  Sasha Levin 
> + */
> +
> +#ifndef _LINUX_HASHTABLE_H
> +#define _LINUX_HASHTABLE_H
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DEFINE_HASHTABLE(name, bits) 
> \
> + struct hlist_head name[1 << bits] = 
> \
> + { [0 ... ((1 << bits) - 1)] = HLIST_HEAD_INIT }

Although it's unlikely that someone would use this with a binary
operator with lower precedence than "<<" (see e.g.
http://www.swansontec.com/sopc.html) as "bits", lack of parenthesis
around "bits" would be unexpected by the caller, and could introduce
bugs. Please review all macros with the precedence table in mind, and
ask yourself if lack of parenthesis could introduce a subtle bug.

> +
> +#define DECLARE_HASHTABLE(name, bits)
> \
> + struct hlist_head name[1 << (bits)]

Here, you have parenthesis around "bits", but not above (inconsistency).

> +
> +#define HASH_SIZE(name) (ARRAY_SIZE(name))
> +#define HASH_BITS(name) ilog2(HASH_SIZE(name))
> +
> +/* Use hash_32 when possible to allow for fast 32bit hashing in 64bit 
> kernels. */
> +#define hash_min(val, bits)  
> \
> +({   
> \
> + sizeof(val) <= 4 ?  
> \
> + hash_32(val, bits) :
> \
> + hash_long(val, bits);   
> \
> +})
> +
> +static inline void __hash_init(struct hlist_head *ht, int sz)

int -> unsigned int.

> +{
> + int i;

int -> unsigned int.

> +
> + for (i = 0; i < sz; i++)
> + INIT_HLIST_HEAD(&ht[sz]);

ouch. How did this work ? Has it been tested at all ?

sz -> i


> +}
> +
> +/**
> + * hash_init - initialize a hash table
> + * @hashtable: hashtable to be initialized
> + *
> + * Calculates the size of the hashtable from the given parameter, otherwise
> + * same as hash_init_size.
> + *
> + * This has to be a macro since HASH_BITS() will not work on pointers since
> + * it calculates the size during preprocessing.
> + */
> +#define hash_init(hashtable) __hash_init(hashtable, HASH_SIZE(hashtable))
> +
> +/**
> + * hash_add - add an object to a hashtable
> + * @hashtable: hashtable to add to
> + * @node: the &struct hlist_node of the object to be added
> + * @key: the key of the object to be added
> + */
> +#define hash_add(hashtable, node, key)   
> \
> + hlist_add_head(node, &hashtable[hash_min(key, HASH_BITS(hashtable))]);

extra ";" at the end to remove.

> +
> +/**
> + * hash_add_rcu - add an object to a rcu enabled hashtable
> + * @hashtable: hashtable to add to
> + * @node: the &struct hlist_node of the object to be added
> + * @key: the key of the object to be added
> + */
> +#define hash_add_rcu(hashtable, node, key)   
> \
> + hlist_add_head_rcu(node, &hashtable[hash_min(key, 
> HASH_BITS(hashtable))]);

extra ";" at the end to remove.

> +
> +/**
> + * hash_hashed - check whether an object is in any hashtable
> + * @node: the &struct hlist_node of the object to be checked
> + */
> +#define hash_hashed(node) (!hlist_unhashed(node))

Please use a static inline for this instead of a macro.

> +
> +static inline bool __hash_empty(struct hlist_head *ht, int sz)

int -> unsigned int.

> +{
> + int i;

int -> unsigned int.

> +
> + for (i = 0; i < sz; i++)
> + if (!hlist_empty(&ht[i]))
> + return false;
> +
> + return true;
> +}
> +
> +/**
> + * hash_empty - check whether a hashtable is empty
> + * @hashtable: hashtable to check
> + *
> + * This has to be a macro since HASH_BITS() will not work on pointers since
> + * it calculate

Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Switch tracepoints to use the new hashtable implementation. This reduces the 
> amount of
> generic unrelated code in the tracepoints.
> 
> Signed-off-by: Sasha Levin 
> ---
>  kernel/tracepoint.c | 27 +++
>  1 file changed, 11 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index d96ba22..854df92 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  extern struct tracepoint * const __start___tracepoints_ptrs[];
>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
>   * Protected by tracepoints_mutex.
>   */
>  #define TRACEPOINT_HASH_BITS 6
> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
>  
[...]
>  
> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
>  
>  static int init_tracepoints(void)
>  {
> + hash_init(tracepoint_table);
> +
>   return register_module_notifier(&tracepoint_module_nb);
>  }
>  __initcall(init_tracepoints);

So we have a hash table defined in .bss (therefore entirely initialized
to NULL), and you add a call to "hash_init", which iterates on the whole
array and initialize it to NULL (again) ?

This extra initialization is redundant. I think it should be removed
from here, and hashtable.h should document that hash_init() don't need
to be called on zeroed memory (which includes static/global variables,
kzalloc'd memory, etc).

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 07/16] net, 9p: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Switch 9p error table to use the new hashtable implementation. This reduces 
> the amount of
> generic unrelated code in 9p.
> 
> Signed-off-by: Sasha Levin 
> ---
>  net/9p/error.c | 21 ++---
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/net/9p/error.c b/net/9p/error.c
> index 2ab2de7..a5cc7dd 100644
> --- a/net/9p/error.c
> +++ b/net/9p/error.c
> @@ -34,7 +34,7 @@
>  #include 
>  #include 
>  #include 
> -
> +#include 

missing newline.

>  /**
>   * struct errormap - map string errors from Plan 9 to Linux numeric ids
>   * @name: string sent over 9P
> @@ -50,8 +50,8 @@ struct errormap {
>   struct hlist_node list;
>  };
>  
> -#define ERRHASHSZ32
> -static struct hlist_head hash_errmap[ERRHASHSZ];
> +#define ERR_HASH_BITS 5
> +static DEFINE_HASHTABLE(hash_errmap, ERR_HASH_BITS);
>  
>  /* FixMe - reduce to a reasonable size */
>  static struct errormap errmap[] = {
> @@ -193,18 +193,17 @@ static struct errormap errmap[] = {
>  int p9_error_init(void)
>  {
>   struct errormap *c;
> - int bucket;
> + u32 hash;
>  
>   /* initialize hash table */
> - for (bucket = 0; bucket < ERRHASHSZ; bucket++)
> - INIT_HLIST_HEAD(&hash_errmap[bucket]);
> + hash_init(hash_errmap);

As for most of the other patches in this series, the hash_init is
redundant for a statically defined hash table.

Thanks,

Mathieu

>  
>   /* load initial error map into hash table */
>   for (c = errmap; c->name != NULL; c++) {
>   c->namelen = strlen(c->name);
> - bucket = jhash(c->name, c->namelen, 0) % ERRHASHSZ;
> + hash = jhash(c->name, c->namelen, 0);
>   INIT_HLIST_NODE(&c->list);
> - hlist_add_head(&c->list, &hash_errmap[bucket]);
> + hash_add(hash_errmap, &c->list, hash);
>   }
>  
>   return 1;
> @@ -223,13 +222,13 @@ int p9_errstr2errno(char *errstr, int len)
>   int errno;
>   struct hlist_node *p;
>   struct errormap *c;
> - int bucket;
> + u32 hash;
>  
>   errno = 0;
>   p = NULL;
>   c = NULL;
> - bucket = jhash(errstr, len, 0) % ERRHASHSZ;
> - hlist_for_each_entry(c, p, &hash_errmap[bucket], list) {
> + hash = jhash(errstr, len, 0);
> + hash_for_each_possible(hash_errmap, c, p, list, hash) {
>   if (c->namelen == len && !memcmp(c->name, errstr, len)) {
>   errno = c->val;
>   break;
> -- 
> 1.7.12.4
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 08/16] block, elevator: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
[...]
> @@ -96,6 +97,8 @@ struct elevator_type
>   struct list_head list;
>  };
>  
> +#define ELV_HASH_BITS 6
> +
>  /*
>   * each queue has an elevator_queue associated with it
>   */
> @@ -105,7 +108,7 @@ struct elevator_queue
>   void *elevator_data;
>   struct kobject kobj;
>   struct mutex sysfs_lock;
> - struct hlist_head *hash;
> + DECLARE_HASHTABLE(hash, ELV_HASH_BITS);
>   unsigned int registered:1;

Hrm, so this is moving "registered" out of the elevator_queue first
cache-line by turning the pointer into a 256 or 512 bytes hash table.

Maybe we should consider moving "registered" before the "hash" field ?

Thanks,

Mathieu

>  };
>  
> -- 
> 1.7.12.4
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Switch cache to use the new hashtable implementation. This reduces the amount 
> of
> generic unrelated code in the cache implementation.
> 
> Signed-off-by: Sasha Levin 
> ---
>  net/sunrpc/cache.c | 20 +---
>  1 file changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index fc2f7aa..0490546 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -524,19 +525,18 @@ EXPORT_SYMBOL_GPL(cache_purge);
>   * it to be revisited when cache info is available
>   */
>  
> -#define  DFR_HASHSIZE(PAGE_SIZE/sizeof(struct list_head))
> -#define  DFR_HASH(item)  long)item)>>4 ^ (((long)item)>>13)) % 
> DFR_HASHSIZE)
> +#define  DFR_HASH_BITS   9

If we look at a bit of history, mainly commit:

commit 1117449276bb909b029ed0b9ba13f53e4784db9d
Author: NeilBrown 
Date:   Thu Aug 12 17:04:08 2010 +1000

sunrpc/cache: change deferred-request hash table to use hlist.


we'll notice that the only reason why the prior DFR_HASHSIZE was using

  (PAGE_SIZE/sizeof(struct list_head))

instead of

  (PAGE_SIZE/sizeof(struct hlist_head))

is because it has been forgotten in that commit. The intent there is to
make the hash table array fit the page size.

By defining DFR_HASH_BITS arbitrarily to "9", this indeed fulfills this
purpose on architectures with 4kB page size and 64-bit pointers, but not
on some powerpc configurations, and Tile architectures, which have more
exotic 64kB page size, and of course on the far less exotic 32-bit
pointer architectures.

So defining e.g.:

#include 

#define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))

would keep the intended behavior in all cases: use one page for the hash
array.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 10/16] dlm: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
[...]
> @@ -158,34 +159,21 @@ static int dlm_allow_conn;
>  static struct workqueue_struct *recv_workqueue;
>  static struct workqueue_struct *send_workqueue;
>  
> -static struct hlist_head connection_hash[CONN_HASH_SIZE];
> +static struct hlist_head connection_hash[CONN_HASH_BITS];
>  static DEFINE_MUTEX(connections_lock);
>  static struct kmem_cache *con_cache;
>  
>  static void process_recv_sockets(struct work_struct *work);
>  static void process_send_sockets(struct work_struct *work);
>  
> -
> -/* This is deliberately very simple because most clusters have simple
> -   sequential nodeids, so we should be able to go straight to a connection
> -   struct in the array */
> -static inline int nodeid_hash(int nodeid)
> -{
> - return nodeid & (CONN_HASH_SIZE-1);
> -}

There is one thing I dislike about this change: you remove a useful
comment. It's good to be informed of the reason why a direct mapping
"value -> hash" without any dispersion function is preferred here.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 11/16] net, l2tp: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
[...]  
> -/* Session hash global list for L2TPv3.
> - * The session_id SHOULD be random according to RFC3931, but several
> - * L2TP implementations use incrementing session_ids.  So we do a real
> - * hash on the session_id, rather than a simple bitmask.
> - */
> -static inline struct hlist_head *
> -l2tp_session_id_hash_2(struct l2tp_net *pn, u32 session_id)
> -{
> - return &pn->l2tp_session_hlist[hash_32(session_id, L2TP_HASH_BITS_2)];
> -
> -}

I understand that you removed this hash function, as well as
"l2tp_session_id_hash" below, but is there any way we could leave those
comments in place ? They look useful.

> -/* Session hash list.
> - * The session_id SHOULD be random according to RFC2661, but several
> - * L2TP implementations (Cisco and Microsoft) use incrementing
> - * session_ids.  So we do a real hash on the session_id, rather than a
> - * simple bitmask.

Ditto.

> - */
> -static inline struct hlist_head *
> -l2tp_session_id_hash(struct l2tp_tunnel *tunnel, u32 session_id)
> -{
> - return &tunnel->session_hlist[hash_32(session_id, L2TP_HASH_BITS)];
> -}
> -
>  /* Lookup a session by id
>   */

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 10/16] dlm: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Mathieu Desnoyers (mathieu.desnoy...@efficios.com) wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
> [...]
> > @@ -158,34 +159,21 @@ static int dlm_allow_conn;
> >  static struct workqueue_struct *recv_workqueue;
> >  static struct workqueue_struct *send_workqueue;
> >  
> > -static struct hlist_head connection_hash[CONN_HASH_SIZE];
> > +static struct hlist_head connection_hash[CONN_HASH_BITS];
> >  static DEFINE_MUTEX(connections_lock);
> >  static struct kmem_cache *con_cache;
> >  
> >  static void process_recv_sockets(struct work_struct *work);
> >  static void process_send_sockets(struct work_struct *work);
> >  
> > -
> > -/* This is deliberately very simple because most clusters have simple
> > -   sequential nodeids, so we should be able to go straight to a connection
> > -   struct in the array */
> > -static inline int nodeid_hash(int nodeid)
> > -{
> > -   return nodeid & (CONN_HASH_SIZE-1);
> > -}
> 
> There is one thing I dislike about this change: you remove a useful
> comment. It's good to be informed of the reason why a direct mapping
> "value -> hash" without any dispersion function is preferred here.

And now that I come to think of it: you're changing the behavior : you
will now use a dispersion function on the key, which goes against the
intent expressed in this comment.

It might be good to change hash_add(), hash_add_rcu(),
hash_for_each_possible*() key parameter for a "hash" parameter, and let
the caller provide the hash value computed by the function they like as
parameter, rather than enforcing hash_32/hash_64.

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 13/16] lockd: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Switch lockd to use the new hashtable implementation. This reduces the amount 
> of
> generic unrelated code in lockd.
> 
> Signed-off-by: Sasha Levin 
> ---
>  fs/lockd/svcsubs.c | 66 
> +-
>  1 file changed, 36 insertions(+), 30 deletions(-)
> 
> diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
> index 0deb5f6..d223a1f 100644
> --- a/fs/lockd/svcsubs.c
> +++ b/fs/lockd/svcsubs.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define NLMDBG_FACILITY  NLMDBG_SVCSUBS
>  
> @@ -28,8 +29,7 @@
>   * Global file hash table
>   */
>  #define FILE_HASH_BITS   7
> -#define FILE_NRHASH  (1< -static struct hlist_head nlm_files[FILE_NRHASH];
> +static DEFINE_HASHTABLE(nlm_files, FILE_HASH_BITS);
>  static DEFINE_MUTEX(nlm_file_mutex);
>  
>  #ifdef NFSD_DEBUG
> @@ -68,7 +68,7 @@ static inline unsigned int file_hash(struct nfs_fh *f)
>   int i;
>   for (i=0; i   tmp += f->data[i];
> - return tmp & (FILE_NRHASH - 1);
> + return tmp;
>  }
>  
>  /*
> @@ -86,17 +86,17 @@ nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file 
> **result,
>  {
>   struct hlist_node *pos;
>   struct nlm_file *file;
> - unsigned inthash;
> + unsigned intkey;
>   __be32  nfserr;
>  
>   nlm_debug_print_fh("nlm_lookup_file", f);
>  
> - hash = file_hash(f);
> + key = file_hash(f);
>  
>   /* Lock file table */
>   mutex_lock(&nlm_file_mutex);
>  
> - hlist_for_each_entry(file, pos, &nlm_files[hash], f_list)
> + hash_for_each_possible(nlm_files, file, pos, f_list, file_hash(f))

we have a nice example of weirdness about key vs hash here:

1) "key" is computed from file_hash(f)
2) file_hash(f) is computed again and again in hash_for_each_possible()

>   if (!nfs_compare_fh(&file->f_handle, f))
>   goto found;
>  
> @@ -123,7 +123,7 @@ nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file 
> **result,
>   goto out_free;
>   }
>  
> - hlist_add_head(&file->f_list, &nlm_files[hash]);
> + hash_add(nlm_files, &file->f_list, key);

3) then we use "key" as parameter to hash_add.

Moreover, we're adding dispersion to the file_hash() with the hash_32()
called under the hook within hashtable.h. Is it an intended behavior ?
This should at the very least be documented in the changelog.

[...]

> +static int __init nlm_init(void)
> +{
> + hash_init(nlm_files);

Useless.

Thanks,

Mathieu

> + return 0;
> +}
> +
> +module_init(nlm_init);
> -- 
> 1.7.12.4
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 14/16] net, rds: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Switch rds to use the new hashtable implementation. This reduces the amount of
> generic unrelated code in rds.
> 
> Signed-off-by: Sasha Levin 
> ---
>  net/rds/bind.c   |  28 +-
>  net/rds/connection.c | 102 
> +++
>  2 files changed, 63 insertions(+), 67 deletions(-)
> 
> diff --git a/net/rds/bind.c b/net/rds/bind.c
> index 637bde5..79d65ce 100644
> --- a/net/rds/bind.c
> +++ b/net/rds/bind.c
> @@ -36,16 +36,16 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "rds.h"
>  
> -#define BIND_HASH_SIZE 1024
> -static struct hlist_head bind_hash_table[BIND_HASH_SIZE];
> +#define BIND_HASH_BITS 10
> +static DEFINE_HASHTABLE(bind_hash_table, BIND_HASH_BITS);
>  static DEFINE_SPINLOCK(rds_bind_lock);
>  
> -static struct hlist_head *hash_to_bucket(__be32 addr, __be16 port)
> +static u32 rds_hash(__be32 addr, __be16 port)
>  {
> - return bind_hash_table + (jhash_2words((u32)addr, (u32)port, 0) &
> -   (BIND_HASH_SIZE - 1));
> + return jhash_2words((u32)addr, (u32)port, 0);
>  }
>  
>  static struct rds_sock *rds_bind_lookup(__be32 addr, __be16 port,
> @@ -53,12 +53,12 @@ static struct rds_sock *rds_bind_lookup(__be32 addr, 
> __be16 port,
>  {
>   struct rds_sock *rs;
>   struct hlist_node *node;
> - struct hlist_head *head = hash_to_bucket(addr, port);
> + u32 key = rds_hash(addr, port);
>   u64 cmp;
>   u64 needle = ((u64)be32_to_cpu(addr) << 32) | be16_to_cpu(port);
>  
>   rcu_read_lock();
> - hlist_for_each_entry_rcu(rs, node, head, rs_bound_node) {
> + hash_for_each_possible_rcu(bind_hash_table, rs, node, rs_bound_node, 
> key) {

here too, key will be hashed twice:

- once by jhash_2words,
- once by hash_32(),

is this intended ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
[...]
> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
> -{
> - unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
> - return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> -}
> -
>  /**
>   *   ovs_vport_locate - find a port that has already been created
>   *
> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, 
> const char *name)
>   */
>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>  {
> - struct hlist_head *bucket = hash_bucket(net, name);
>   struct vport *vport;
>   struct hlist_node *node;
> + int key = full_name_hash(name, strlen(name));
>  
> - hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> - if (!strcmp(name, vport->ops->get_name(vport)) &&
> - net_eq(ovs_dp_get_net(vport->dp), net))
> + hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)

Is applying hash_32() on top of full_name_hash() needed and expected ?

Thanks,

Mathieu

> + if (!strcmp(name, vport->ops->get_name(vport)))
>   return vport;
>  
>   return NULL;
> @@ -174,7 +165,8 @@ struct vport *ovs_vport_add(const struct vport_parms 
> *parms)
>  
>   for (i = 0; i < ARRAY_SIZE(vport_ops_list); i++) {
>   if (vport_ops_list[i]->type == parms->type) {
> - struct hlist_head *bucket;
> + int key;
> + const char *name;
>  
>   vport = vport_ops_list[i]->create(parms);
>   if (IS_ERR(vport)) {
> @@ -182,9 +174,9 @@ struct vport *ovs_vport_add(const struct vport_parms 
> *parms)
>   goto out;
>   }
>  
> - bucket = hash_bucket(ovs_dp_get_net(vport->dp),
> -  vport->ops->get_name(vport));
> - hlist_add_head_rcu(&vport->hash_node, bucket);
> + name = vport->ops->get_name(vport);
> + key = full_name_hash(name, strlen(name));
> + hash_add_rcu(dev_table, &vport->hash_node, key);
>   return vport;
>   }
>   }
> @@ -225,7 +217,7 @@ void ovs_vport_del(struct vport *vport)
>  {
>   ASSERT_RTNL();
>  
> - hlist_del_rcu(&vport->hash_node);
> + hash_del_rcu(&vport->hash_node);
>  
>   vport->ops->destroy(vport);
>  }
> -- 
> 1.7.12.4
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread Linus Torvalds
On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
 wrote:
>
> So defining e.g.:
>
> #include 
>
> #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
>
> would keep the intended behavior in all cases: use one page for the hash
> array.

Well, since that wasn't true before either because of the long-time
bug you point out, clearly the page size isn't all that important. I
think it's more important to have small and simple code, and "9" is
certainly that, compared to playing ilog2 games with not-so-obvious
things.

Because there's no reason to believe that '9' is in any way a worse
random number than something page-shift-related, is there? And getting
away from *previous* overly-complicated size calculations that had
been broken because they were too complicated and random, sounds like
a good idea.

Linus
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] User-Space MPLS actions and matches

2012-10-29 Thread Isaku Yamahata
I needed the following patch to pass unit tests.

>From 55e62f64c984accf578a3ae8e1b315b7752c7ebe Mon Sep 17 00:00:00 2001
Message-Id: 
<55e62f64c984accf578a3ae8e1b315b7752c7ebe.1351522305.git.yamah...@valinux.co.jp>
From: Isaku Yamahata 
Date: Mon, 29 Oct 2012 23:43:25 +0900
Subject: [PATCH] unittest/mpls: OF12 openflow dump-tables

fb1d6089402a4d108133bd39820bd0b345da0e9a adds OFPXMT12_OFB_MPLS_BOS so that
OFPXMT12_OFB_MAX is increased by one.
Thus OF12 dump-tables needs update.

Signed-off-by: Isaku Yamahata 
---
 tests/ofproto.at |   12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/ofproto.at b/tests/ofproto.at
index 3865fc5..95c2fa8 100644
--- a/tests/ofproto.at
+++ b/tests/ofproto.at
@@ -575,13 +575,13 @@ AT_CLEANUP
 AT_SETUP([ofproto - flow table configuration (OpenFlow 1.2)])
 OVS_VSWITCHD_START
 # Check the default configuration.
-(mid="wild=0xf, max=100,"
+(mid="wild=0x1f, max=100,"
  tail="
lookup=0, matched=0
-   match=0xf, instructions=0x0007, config=0x0003
+   match=0x1f, instructions=0x0007, config=0x0003
write_actions=0x, apply_actions=0x
-   write_setfields=0x000f
-   apply_setfields=0x000f
+   write_setfields=0x001f
+   apply_setfields=0x001f
metadata_match=0x
metadata_write=0x"
  echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
@@ -607,9 +607,9 @@ AT_CHECK(
 # Check that the configuration was updated.
 mv expout orig-expout
 (echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
-  0: main: wild=0xf, max=100, active=0"
+  0: main: wild=0x1f, max=100, active=0"
  tail -n +3 orig-expout | head -7
- echo "  1: table1  : wild=0xf, max=  1024, active=0"
+ echo "  1: table1  : wild=0x1f, max=  1024, active=0"
  tail -n +11 orig-expout) > expout
 AT_CHECK([ovs-ofctl --allowed-ofp-versions OpenFlow12 dump-tables br0], [0], 
[expout])
 OVS_VSWITCHD_STOP
-- 
1.7.10.4
-- 
yamahata
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Linus Torvalds (torva...@linux-foundation.org) wrote:
> On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
>  wrote:
> >
> > So defining e.g.:
> >
> > #include 
> >
> > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> >
> > would keep the intended behavior in all cases: use one page for the hash
> > array.
> 
> Well, since that wasn't true before either because of the long-time
> bug you point out, clearly the page size isn't all that important. I
> think it's more important to have small and simple code, and "9" is
> certainly that, compared to playing ilog2 games with not-so-obvious
> things.
> 
> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there? And getting
> away from *previous* overly-complicated size calculations that had
> been broken because they were too complicated and random, sounds like
> a good idea.

Good point. I agree that unless we really care about the precise number
of TLB entries and cache lines used by this hash table, we might want to
stay away from page-size and pointer-size based calculation.

It might not hurt to explain this in the patch changelog though.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread J. Bruce Fields
On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> * Linus Torvalds (torva...@linux-foundation.org) wrote:
> > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> >  wrote:
> > >
> > > So defining e.g.:
> > >
> > > #include 
> > >
> > > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > >
> > > would keep the intended behavior in all cases: use one page for the hash
> > > array.
> > 
> > Well, since that wasn't true before either because of the long-time
> > bug you point out, clearly the page size isn't all that important. I
> > think it's more important to have small and simple code, and "9" is
> > certainly that, compared to playing ilog2 games with not-so-obvious
> > things.
> > 
> > Because there's no reason to believe that '9' is in any way a worse
> > random number than something page-shift-related, is there? And getting
> > away from *previous* overly-complicated size calculations that had
> > been broken because they were too complicated and random, sounds like
> > a good idea.
> 
> Good point. I agree that unless we really care about the precise number
> of TLB entries and cache lines used by this hash table, we might want to
> stay away from page-size and pointer-size based calculation.
>
> It might not hurt to explain this in the patch changelog though.

I'd also be happy to take that as a separate patch now.

--b.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* J. Bruce Fields (bfie...@fieldses.org) wrote:
> On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> > * Linus Torvalds (torva...@linux-foundation.org) wrote:
> > > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> > >  wrote:
> > > >
> > > > So defining e.g.:
> > > >
> > > > #include 
> > > >
> > > > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > > >
> > > > would keep the intended behavior in all cases: use one page for the hash
> > > > array.
> > > 
> > > Well, since that wasn't true before either because of the long-time
> > > bug you point out, clearly the page size isn't all that important. I
> > > think it's more important to have small and simple code, and "9" is
> > > certainly that, compared to playing ilog2 games with not-so-obvious
> > > things.
> > > 
> > > Because there's no reason to believe that '9' is in any way a worse
> > > random number than something page-shift-related, is there? And getting
> > > away from *previous* overly-complicated size calculations that had
> > > been broken because they were too complicated and random, sounds like
> > > a good idea.
> > 
> > Good point. I agree that unless we really care about the precise number
> > of TLB entries and cache lines used by this hash table, we might want to
> > stay away from page-size and pointer-size based calculation.
> >
> > It might not hurt to explain this in the patch changelog though.
> 
> I'd also be happy to take that as a separate patch now.

FYIW: I've made a nice boo-boo above. It should have been:

#define DFR_HASH_BITS   (PAGE_SHIFT - ilog2(sizeof(struct hlist_head)))

Because we happen to have a memory indexed in bytes, not in bits. I
guess this goes a long way proving Linus' point about virtues of trivial
code. ;-)

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Sasha Levin
Hi Mathieu,

On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
> [...]
>> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
>> -{
>> - unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
>> - return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
>> -}
>> -
>>  /**
>>   *   ovs_vport_locate - find a port that has already been created
>>   *
>> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, 
>> const char *name)
>>   */
>>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>>  {
>> - struct hlist_head *bucket = hash_bucket(net, name);
>>   struct vport *vport;
>>   struct hlist_node *node;
>> + int key = full_name_hash(name, strlen(name));
>>
>> - hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
>> - if (!strcmp(name, vport->ops->get_name(vport)) &&
>> - net_eq(ovs_dp_get_net(vport->dp), net))
>> + hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
>
> Is applying hash_32() on top of full_name_hash() needed and expected ?

Since this was pointed out in several of the patches, I'll answer it
just once here.

I've intentionally "allowed" double hashing with hash_32 to keep the
code simple.

hash_32() is pretty simple and gcc optimizes it to be almost nothing,
so doing that costs us a multiplication and a shift. On the other
hand, we benefit from keeping our code simple - how would we avoid
doing this double hash? adding a different hashtable function for
strings? or a new function for already hashed keys? I think we benefit
a lot from having to mul/shr instead of adding extra lines of code
here.


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 10/16] dlm: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 9:07 AM, Mathieu Desnoyers
 wrote:
> * Mathieu Desnoyers (mathieu.desnoy...@efficios.com) wrote:
>> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> [...]
>> > @@ -158,34 +159,21 @@ static int dlm_allow_conn;
>> >  static struct workqueue_struct *recv_workqueue;
>> >  static struct workqueue_struct *send_workqueue;
>> >
>> > -static struct hlist_head connection_hash[CONN_HASH_SIZE];
>> > +static struct hlist_head connection_hash[CONN_HASH_BITS];
>> >  static DEFINE_MUTEX(connections_lock);
>> >  static struct kmem_cache *con_cache;
>> >
>> >  static void process_recv_sockets(struct work_struct *work);
>> >  static void process_send_sockets(struct work_struct *work);
>> >
>> > -
>> > -/* This is deliberately very simple because most clusters have simple
>> > -   sequential nodeids, so we should be able to go straight to a connection
>> > -   struct in the array */
>> > -static inline int nodeid_hash(int nodeid)
>> > -{
>> > -   return nodeid & (CONN_HASH_SIZE-1);
>> > -}
>>
>> There is one thing I dislike about this change: you remove a useful
>> comment. It's good to be informed of the reason why a direct mapping
>> "value -> hash" without any dispersion function is preferred here.

Yes, I've removed the comment because it's no longer true with the patch :)

> And now that I come to think of it: you're changing the behavior : you
> will now use a dispersion function on the key, which goes against the
> intent expressed in this comment.

The comment gave us the information that nodeids are mostly
sequential, we no longer need to rely on that.

> It might be good to change hash_add(), hash_add_rcu(),
> hash_for_each_possible*() key parameter for a "hash" parameter, and let
> the caller provide the hash value computed by the function they like as
> parameter, rather than enforcing hash_32/hash_64.

Why? We already proved that hash_32() is more than enough as a hashing
function, why complicate things?

Even doing hash_32() on top of another hash is probably a good idea to
keep things simple.

Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> Hi Mathieu,
> 
> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> > [...]
> >> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
> >> -{
> >> - unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
> >> - return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> >> -}
> >> -
> >>  /**
> >>   *   ovs_vport_locate - find a port that has already been created
> >>   *
> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, 
> >> const char *name)
> >>   */
> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
> >>  {
> >> - struct hlist_head *bucket = hash_bucket(net, name);
> >>   struct vport *vport;
> >>   struct hlist_node *node;
> >> + int key = full_name_hash(name, strlen(name));
> >>
> >> - hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> >> - if (!strcmp(name, vport->ops->get_name(vport)) &&
> >> - net_eq(ovs_dp_get_net(vport->dp), net))
> >> + hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
> >
> > Is applying hash_32() on top of full_name_hash() needed and expected ?
> 
> Since this was pointed out in several of the patches, I'll answer it
> just once here.
> 
> I've intentionally "allowed" double hashing with hash_32 to keep the
> code simple.
> 
> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
> so doing that costs us a multiplication and a shift. On the other
> hand, we benefit from keeping our code simple - how would we avoid
> doing this double hash? adding a different hashtable function for
> strings? or a new function for already hashed keys? I think we benefit
> a lot from having to mul/shr instead of adding extra lines of code
> here.

This could be done, as I pointed out in another email within this
thread, by changing the "key" argument from add/for_each_possible to an
expected "hash" value, and let the caller invoke hash_32() if they want.
I doubt this would add a significant amount of complexity for users of
this API, but would allow much more flexibility to choose hash
functions.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 7:29 AM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> +
>> + for (i = 0; i < sz; i++)
>> + INIT_HLIST_HEAD(&ht[sz]);
>
> ouch. How did this work ? Has it been tested at all ?
>
> sz -> i

Funny enough, it works perfectly. Generally as a test I boot the
kernel in a VM and let it fuzz with trinity for a bit, doing that with
the code above worked flawlessly.

While it works, it's obviously wrong. Why does it work though? Usually
there's a list op happening pretty soon after that which brings the
list into proper state.

I've been playing with a patch that adds a magic value into list_head
if CONFIG_DEBUG_LIST is set, and checks that magic in the list debug
code in lib/list_debug.c.

Does it sound like something useful? If so I'll send that patch out.


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 10/16] dlm: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 9:07 AM, Mathieu Desnoyers
>  wrote:
> > * Mathieu Desnoyers (mathieu.desnoy...@efficios.com) wrote:
> >> * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> [...]
> >> > @@ -158,34 +159,21 @@ static int dlm_allow_conn;
> >> >  static struct workqueue_struct *recv_workqueue;
> >> >  static struct workqueue_struct *send_workqueue;
> >> >
> >> > -static struct hlist_head connection_hash[CONN_HASH_SIZE];
> >> > +static struct hlist_head connection_hash[CONN_HASH_BITS];
> >> >  static DEFINE_MUTEX(connections_lock);
> >> >  static struct kmem_cache *con_cache;
> >> >
> >> >  static void process_recv_sockets(struct work_struct *work);
> >> >  static void process_send_sockets(struct work_struct *work);
> >> >
> >> > -
> >> > -/* This is deliberately very simple because most clusters have simple
> >> > -   sequential nodeids, so we should be able to go straight to a 
> >> > connection
> >> > -   struct in the array */
> >> > -static inline int nodeid_hash(int nodeid)
> >> > -{
> >> > -   return nodeid & (CONN_HASH_SIZE-1);
> >> > -}
> >>
> >> There is one thing I dislike about this change: you remove a useful
> >> comment. It's good to be informed of the reason why a direct mapping
> >> "value -> hash" without any dispersion function is preferred here.
> 
> Yes, I've removed the comment because it's no longer true with the patch :)
> 
> > And now that I come to think of it: you're changing the behavior : you
> > will now use a dispersion function on the key, which goes against the
> > intent expressed in this comment.
> 
> The comment gave us the information that nodeids are mostly
> sequential, we no longer need to rely on that.

I'm fine with turning a direct + modulo mapping into a dispersed hash as
long as there are no underlying assumptions about sequentiality of value
accesses.

If the access pattern would happen to be typically sequential, then
adding dispersion could hurt performances significantly, turning a
frequent L1 access into a L2 access for instance.

> 
> > It might be good to change hash_add(), hash_add_rcu(),
> > hash_for_each_possible*() key parameter for a "hash" parameter, and let
> > the caller provide the hash value computed by the function they like as
> > parameter, rather than enforcing hash_32/hash_64.
> 
> Why? We already proved that hash_32() is more than enough as a hashing
> function, why complicate things?
> 
> Even doing hash_32() on top of another hash is probably a good idea to
> keep things simple.

All I'm asking is: have you made sure that this hash table is not
deliberately kept sequential (without dispersion) to accelerate specific
access patterns ? This should at least be documented in the changelog.

Thanks,

Mathieu


> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 7:29 AM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> +
> >> + for (i = 0; i < sz; i++)
> >> + INIT_HLIST_HEAD(&ht[sz]);
> >
> > ouch. How did this work ? Has it been tested at all ?
> >
> > sz -> i
> 
> Funny enough, it works perfectly. Generally as a test I boot the
> kernel in a VM and let it fuzz with trinity for a bit, doing that with
> the code above worked flawlessly.
> 
> While it works, it's obviously wrong. Why does it work though? Usually
> there's a list op happening pretty soon after that which brings the
> list into proper state.
> 
> I've been playing with a patch that adds a magic value into list_head
> if CONFIG_DEBUG_LIST is set, and checks that magic in the list debug
> code in lib/list_debug.c.
> 
> Does it sound like something useful? If so I'll send that patch out.

Most of the calls to this initialization function apply it on zeroed
memory (static/kzalloc'd...), which makes it useless. I'd actually be in
favor of removing those redundant calls (as I pointed out in another
email), and document that zeroed memory don't need to be explicitly
initialized.

Those sites that need to really reinitialize memory, or initialize it
(if located on the stack or in non-zeroed dynamically allocated memory)
could use a memset to 0, which will likely be faster than setting to
NULL on many architectures.

About testing, I'd recommend taking the few sites that still need the
initialization function, and just initialize the array with garbage
before calling the initialization function. Things should blow up quite
quickly. Doing it as a one-off thing might be enough to catch any issue.
I don't think we need extra magic numbers to catch issues in this rather
obvious init function.

Thanks,

Mathieu

> 
> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Tejun Heo
Hello,

On Mon, Oct 29, 2012 at 12:14:12PM -0400, Mathieu Desnoyers wrote:
> Most of the calls to this initialization function apply it on zeroed
> memory (static/kzalloc'd...), which makes it useless. I'd actually be in
> favor of removing those redundant calls (as I pointed out in another
> email), and document that zeroed memory don't need to be explicitly
> initialized.
> 
> Those sites that need to really reinitialize memory, or initialize it
> (if located on the stack or in non-zeroed dynamically allocated memory)
> could use a memset to 0, which will likely be faster than setting to
> NULL on many architectures.

I don't think it's a good idea to optimize out the basic encapsulation
there.  We're talking about re-zeroing some static memory areas which
are pretty small.  It's just not worth optimizing out at the cost of
proper initializtion.  e.g. We might add debug fields to list_head
later.

Thanks.

-- 
tejun
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Mathieu Desnoyers
* Tejun Heo (t...@kernel.org) wrote:
> Hello,
> 
> On Mon, Oct 29, 2012 at 12:14:12PM -0400, Mathieu Desnoyers wrote:
> > Most of the calls to this initialization function apply it on zeroed
> > memory (static/kzalloc'd...), which makes it useless. I'd actually be in
> > favor of removing those redundant calls (as I pointed out in another
> > email), and document that zeroed memory don't need to be explicitly
> > initialized.
> > 
> > Those sites that need to really reinitialize memory, or initialize it
> > (if located on the stack or in non-zeroed dynamically allocated memory)
> > could use a memset to 0, which will likely be faster than setting to
> > NULL on many architectures.
> 
> I don't think it's a good idea to optimize out the basic encapsulation
> there.  We're talking about re-zeroing some static memory areas which
> are pretty small.  It's just not worth optimizing out at the cost of
> proper initializtion.  e.g. We might add debug fields to list_head
> later.

Future-proofness for debugging fields is indeed a very compelling
argument. Fair enough!

We might want to document this intent at the top of the initialization
function though, just in case anyone want to short-circuit it.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] netdev-vport: Warn users that pmtud is deprecated

2012-10-29 Thread Ansis Atteka
This is a complementary patch to the other one I sent out previously.
The pupose of it is to make it more obvious that tunnel pmtud is
deprecated and will be removed soon.

Requested-by: Jesse Gross 
Signed-off-by: Ansis Atteka 
---
 NEWS |4 +++-
 lib/netdev-vport.c   |4 
 utilities/ovs-save   |2 +-
 vswitchd/vswitch.xml |3 ++-
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index fa0a249..203a662 100644
--- a/NEWS
+++ b/NEWS
@@ -32,7 +32,8 @@ v1.9.0 - xx xxx 
   are true, but because we do not know of any users for this
   feature it seems better on balance to remove it.  (The ovs-pki-cgi
   program was not included in distribution packaging.)
-- Tunnel Path MTU Discovery default value was set to 'disabled'.
+- Tunnel Path MTU Discovery default value was set to 'disabled'.  This
+  feature is deprecated and will be removed soon.
 - ovsdb-server now enforces the immutability of immutable columns.  This
   was not enforced in earlier versions due to an oversight.
 - New support for a nonstandard form of GRE that supports a 64-bit key.
@@ -43,6 +44,7 @@ v1.9.0 - xx xxx 
 - The autopath action.
 - Interface type "null".
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
+- Tunnel Path MTU Discovery.
 
 
 v1.8.0 - xx xxx 
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 4168959..ee9cb04 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -648,6 +648,10 @@ parse_tunnel_config(const char *name, const char *type,
 }
 } else if (!strcmp(node->key, "pmtud")) {
 if (!strcmp(node->value, "true")) {
+VLOG_WARN_ONCE("%s: The tunnel Path MTU discovery is "
+   "deprecated and may be removed in February "
+   "2013. Please email dev@openvswitch.org with "
+   "concerns.", name);
 flags |= TNL_F_PMTUD;
 }
 } else if (!strcmp(node->key, "header_cache")) {
diff --git a/utilities/ovs-save b/utilities/ovs-save
index 01e5791..2f70221 100755
--- a/utilities/ovs-save
+++ b/utilities/ovs-save
@@ -199,7 +199,7 @@ save_datapaths () {
 # lookups: hit:0 missed:0 lost:0
 # flows: 0
 # port 0: br1 (internal)
-# port 2: gre2886795521 (ipsec_gre: key=flow, pmtud=false, 
remote_ip=172.17.1.1, tos=inherit)
+# port 2: gre2886795521 (ipsec_gre: key=flow, 
remote_ip=172.17.1.1, tos=inherit)
 # port 3: gre1 (ipsec_gre: remote_ip=192.168.113.1)
 # port 14: gre2 (gre: remote_ip=192.168.115.1)
 # port 15: gre3 (gre64: remote_ip=192.168.116.1)
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 0bc4ccd..f486d6a 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1366,7 +1366,8 @@
 of the tunnel headers.  Note that this option causes behavior that is
 typically reserved for routers and therefore is not entirely in
 compliance with the IEEE 802.1D specification for bridges.  Default is
-disabled; set to true to enable.
+disabled; set to true to enable.  This feature is
+deprecated and will be removed soon.
   
 
   
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 10/16] dlm: use new hashtable implementation

2012-10-29 Thread David Teigland
On Mon, Oct 29, 2012 at 12:07:10PM -0400, Mathieu Desnoyers wrote:
> I'm fine with turning a direct + modulo mapping into a dispersed hash as
> long as there are no underlying assumptions about sequentiality of value
> accesses.
> 
> If the access pattern would happen to be typically sequential, then
> adding dispersion could hurt performances significantly, turning a
> frequent L1 access into a L2 access for instance.
  
> All I'm asking is: have you made sure that this hash table is not
> deliberately kept sequential (without dispersion) to accelerate specific
> access patterns ? This should at least be documented in the changelog.

It was not intentional.  I don't expect any benefit would be lost by
making it non-sequential.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] netdev-vport: Warn users that pmtud is deprecated

2012-10-29 Thread Ben Pfaff
On Mon, Oct 29, 2012 at 06:22:54PM +0200, Ansis Atteka wrote:
> This is a complementary patch to the other one I sent out previously.
> The pupose of it is to make it more obvious that tunnel pmtud is
> deprecated and will be removed soon.
> 
> Requested-by: Jesse Gross 
> Signed-off-by: Ansis Atteka 

Looks good to me.

s/pupose/purpose/ in the commit message.

I'd wait for Jesse's review on this.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 12:14 PM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> On Mon, Oct 29, 2012 at 7:29 AM, Mathieu Desnoyers
>>  wrote:
>> > * Sasha Levin (levinsasha...@gmail.com) wrote:
>> >> +
>> >> + for (i = 0; i < sz; i++)
>> >> + INIT_HLIST_HEAD(&ht[sz]);
>> >
>> > ouch. How did this work ? Has it been tested at all ?
>> >
>> > sz -> i
>>
>> Funny enough, it works perfectly. Generally as a test I boot the
>> kernel in a VM and let it fuzz with trinity for a bit, doing that with
>> the code above worked flawlessly.
>>
>> While it works, it's obviously wrong. Why does it work though? Usually
>> there's a list op happening pretty soon after that which brings the
>> list into proper state.
>>
>> I've been playing with a patch that adds a magic value into list_head
>> if CONFIG_DEBUG_LIST is set, and checks that magic in the list debug
>> code in lib/list_debug.c.
>>
>> Does it sound like something useful? If so I'll send that patch out.
>
> Most of the calls to this initialization function apply it on zeroed
> memory (static/kzalloc'd...), which makes it useless. I'd actually be in
> favor of removing those redundant calls (as I pointed out in another
> email), and document that zeroed memory don't need to be explicitly
> initialized.

Why would that make it useless? The idea is that the init functions
will set the magic field to something random, like:

.magic = 0xBADBEEF0;

And have list_add() and friends WARN(.magic != 0xBADBEEF0, "Using an
uninitialized list\n");

This way we'll catch all places that don't go through list initialization code.


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 09/16] SUNRPC/cache: use new hashtable implementation

2012-10-29 Thread Andrew Morton
On Mon, 29 Oct 2012 07:49:42 -0700 Linus Torvalds 
 wrote:

> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there?

9 is much better than PAGE_SHIFT.  PAGE_SIZE can vary by a factor of
16, depending on config.

Everyone thinks 4k, and tests only for that.  There's potential for
very large performance and behavior changes when their code gets run
on a 64k PAGE_SIZE machine.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 01/16] hashtable: introduce a small and naive hashtable

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 12:14 PM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> On Mon, Oct 29, 2012 at 7:29 AM, Mathieu Desnoyers
> >>  wrote:
> >> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> >> +
> >> >> + for (i = 0; i < sz; i++)
> >> >> + INIT_HLIST_HEAD(&ht[sz]);
> >> >
> >> > ouch. How did this work ? Has it been tested at all ?
> >> >
> >> > sz -> i
> >>
> >> Funny enough, it works perfectly. Generally as a test I boot the
> >> kernel in a VM and let it fuzz with trinity for a bit, doing that with
> >> the code above worked flawlessly.
> >>
> >> While it works, it's obviously wrong. Why does it work though? Usually
> >> there's a list op happening pretty soon after that which brings the
> >> list into proper state.
> >>
> >> I've been playing with a patch that adds a magic value into list_head
> >> if CONFIG_DEBUG_LIST is set, and checks that magic in the list debug
> >> code in lib/list_debug.c.
> >>
> >> Does it sound like something useful? If so I'll send that patch out.
> >
> > Most of the calls to this initialization function apply it on zeroed
> > memory (static/kzalloc'd...), which makes it useless. I'd actually be in
> > favor of removing those redundant calls (as I pointed out in another
> > email), and document that zeroed memory don't need to be explicitly
> > initialized.
> 
> Why would that make it useless? The idea is that the init functions
> will set the magic field to something random, like:
> 
> .magic = 0xBADBEEF0;
> 
> And have list_add() and friends WARN(.magic != 0xBADBEEF0, "Using an
> uninitialized list\n");
> 
> This way we'll catch all places that don't go through list initialization 
> code.

As I replied to Tejun Heo already, I agree that keeping the
initialization in place makes sense for future-proofness. This intent
should probably be documented in a comment about the initialization
function though, just to make sure nobody will try to skip it.

Thanks,

Mathieu

> 
> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovs-vsctl: Allow command-specific options to mingle with global options.

2012-10-29 Thread Ben Pfaff
Until now, a command like "ovs-vsctl --may-exist add-br br0" yielded a
confusing error message.  Users had to realize that the correct form was
"ovs-vsctl -- --may-exist add-br br0", but instead they often reported a
bug or gave up in frustration.  Even though the behavior was documented, it
was counterintuitive.

This commit allows command-specific options to be mixed with global
options, making both forms of the command listed above equally acceptable.

CC: 691...@bugs.debian.org
Reported-by: Adam Heath 
Signed-off-by: Ben Pfaff 
---
 AUTHORS  |1 +
 tests/ovs-vsctl.at   |   13 -
 utilities/ovs-vsctl.8.in |   11 ++--
 utilities/ovs-vsctl.c|  131 +
 4 files changed, 137 insertions(+), 19 deletions(-)

diff --git a/AUTHORS b/AUTHORS
index 4687865..83e6bb5 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -81,6 +81,7 @@ The following additional people are mentioned in commit logs 
as having
 provided helpful bug reports or suggestions.
 
 Aaron M. Ucko   u...@debian.org
+Adam Heath  doo...@brainfood.com
 Ahmed Bilal numan...@gmail.com
 Alan Shieh  ash...@nicira.com
 Alban Browaeys  pra...@yahoo.com
diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index e903619..cb12f31 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -15,7 +15,7 @@ dnl RUN_OVS_VSCTL(COMMAND, ...)
 dnl
 dnl Executes each ovs-vsctl COMMAND.
 m4_define([RUN_OVS_VSCTL],
-  [m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
-vreconnect:emer --db=unix:socket -- command
+  [m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
-vreconnect:emer --db=unix:socket command
 ])])
 m4_define([RUN_OVS_VSCTL_ONELINE],
   [m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
-vreconnect:emer --db=unix:socket --oneline -- command
@@ -655,6 +655,17 @@ AT_CLEANUP
 AT_SETUP([database commands -- negative checks])
 AT_KEYWORDS([ovs-vsctl])
 OVS_VSCTL_SETUP
+
+AT_CHECK([ovs-vsctl --may-exist],
+  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
+], [OVS_VSCTL_CLEANUP])
+AT_CHECK([ovs-vsctl --may-exist --],
+  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
+], [OVS_VSCTL_CLEANUP])
+AT_CHECK([ovs-vsctl -- --may-exist],
+  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
+], [OVS_VSCTL_CLEANUP])
+
 AT_CHECK([RUN_OVS_VSCTL([add-br br0])],
   [0], [ignore], [], [OVS_VSCTL_CLEANUP])
 AT_CHECK([RUN_OVS_VSCTL([add-br br1])], 
diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
index 1b80d05..8f70d6b 100644
--- a/utilities/ovs-vsctl.8.in
+++ b/utilities/ovs-vsctl.8.in
@@ -43,9 +43,9 @@ implemented as a single atomic transaction against the 
database.
 The \fBovs\-vsctl\fR command line begins with global options (see
 \fBOPTIONS\fR below for details).  The global options are followed by
 one or more commands.  Each command should begin with \fB\-\-\fR by
-itself as a command-line argument, to separate it from the global
-options and following commands.  (If the first command does not have
-any options, then the first \fB\-\-\fR may be omitted.)  The command
+itself as a command-line argument, to separate it from the following
+commands.  (The \fB\-\-\fR before the first command is optional.)  The
+command
 itself starts with command-specific options, if any, followed by the
 command name and any arguments.  See \fBEXAMPLES\fR below for syntax
 examples.
@@ -769,10 +769,9 @@ Delete bridge \fBbr0\fR, reporting an error if it does not 
exist:
 .IP
 .B "ovs\-vsctl del\-br br0"
 .PP
-Delete bridge \fBbr0\fR if it exists (the \fB\-\-\fR is required to
-separate \fBdel\-br\fR's options from the global options):
+Delete bridge \fBbr0\fR if it exists:
 .IP
-.B "ovs\-vsctl \-\- \-\-if\-exists del\-br br0"
+.B "ovs\-vsctl \-\-if\-exists del\-br br0"
 .PP
 Set the \fBqos\fR column of the \fBPort\fR record for \fBeth0\fR to
 point to a new \fBQoS\fR record, which in turn points with its queue 0
diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c
index fda3a89..1745418 100644
--- a/utilities/ovs-vsctl.c
+++ b/utilities/ovs-vsctl.c
@@ -131,12 +131,14 @@ static void vsctl_exit(int status) NO_RETURN;
 static void vsctl_fatal(const char *, ...) PRINTF_FORMAT(1, 2) NO_RETURN;
 static char *default_db(void);
 static void usage(void) NO_RETURN;
-static void parse_options(int argc, char *argv[]);
+static void parse_options(int argc, char *argv[], struct shash *local_options);
 static bool might_write_to_db(char **argv);
 
 static struct vsctl_command *parse_commands(int argc, char *argv[],
+struct shash *local_options,
 size_t *n_commandsp);
-static void parse_command(int argc, char *argv[], struct vsctl_command *);
+static void parse_command(int argc, char *argv[], struct shash *local_options,
+  struct vsctl_command *);
 static const struct vs

Re: [ovs-dev] [PATCH 1/8] flow: Set ttl in flow_compose().

2012-10-29 Thread Ben Pfaff
On Sat, Oct 27, 2012 at 03:05:55PM +0900, Simon Horman wrote:
> From: Justin Pettit 
> 
> Thanks to Ben Pfaff for immediately pinpointing the likely location of
> the issue.
> 
> Signed-off-by: Justin Pettit 
> Signed-off-by: Simon Horman 

I applied this one to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/8] nx-match: Do not check pre-requisites for load actions

2012-10-29 Thread Ben Pfaff
On Sat, Oct 27, 2012 at 03:05:57PM +0900, Simon Horman wrote:
> There are (or at least will be) cases where this check can produce false
> positives.  For example, a flow which matches a non-MPLS packet and then
> applies an MPLS push action followed by an action to load the MPLS label.

So, I remember the discussion, but I don't remember coming to exactly
this conclusion.

The problem seems to be, at this point, MPLS specific, because for
most prerequisites, whether it is satisfied cannot be affected by
actions.  For example, no action can transform a TCP packet into a
non-TCP packet, and vice versa.

My thought is, then, for now we should define a new function to test
whether a prerequisite could ever be satisfied given a particular
flow.  For all prerequisites other than MPLS, this would return the
same thing as mf_are_prereqs_ok(); for MPLS, it could just return
true.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> Switch tracepoints to use the new hashtable implementation. This reduces the 
>> amount of
>> generic unrelated code in the tracepoints.
>>
>> Signed-off-by: Sasha Levin 
>> ---
>>  kernel/tracepoint.c | 27 +++
>>  1 file changed, 11 insertions(+), 16 deletions(-)
>>
>> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>> index d96ba22..854df92 100644
>> --- a/kernel/tracepoint.c
>> +++ b/kernel/tracepoint.c
>> @@ -26,6 +26,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  extern struct tracepoint * const __start___tracepoints_ptrs[];
>>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
>> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
>>   * Protected by tracepoints_mutex.
>>   */
>>  #define TRACEPOINT_HASH_BITS 6
>> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
>> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
>> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
>>
> [...]
>>
>> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
>>
>>  static int init_tracepoints(void)
>>  {
>> + hash_init(tracepoint_table);
>> +
>>   return register_module_notifier(&tracepoint_module_nb);
>>  }
>>  __initcall(init_tracepoints);
>
> So we have a hash table defined in .bss (therefore entirely initialized
> to NULL), and you add a call to "hash_init", which iterates on the whole
> array and initialize it to NULL (again) ?
>
> This extra initialization is redundant. I think it should be removed
> from here, and hashtable.h should document that hash_init() don't need
> to be called on zeroed memory (which includes static/global variables,
> kzalloc'd memory, etc).

This was discussed in the previous series, the conclusion was to call
hash_init() either way to keep the encapsulation and consistency.

It's cheap enough and happens only once, so why not?


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 11:59 AM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> Hi Mathieu,
>>
>> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
>>  wrote:
>> > * Sasha Levin (levinsasha...@gmail.com) wrote:
>> > [...]
>> >> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
>> >> -{
>> >> - unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
>> >> - return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
>> >> -}
>> >> -
>> >>  /**
>> >>   *   ovs_vport_locate - find a port that has already been created
>> >>   *
>> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net 
>> >> *net, const char *name)
>> >>   */
>> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>> >>  {
>> >> - struct hlist_head *bucket = hash_bucket(net, name);
>> >>   struct vport *vport;
>> >>   struct hlist_node *node;
>> >> + int key = full_name_hash(name, strlen(name));
>> >>
>> >> - hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
>> >> - if (!strcmp(name, vport->ops->get_name(vport)) &&
>> >> - net_eq(ovs_dp_get_net(vport->dp), net))
>> >> + hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
>> >
>> > Is applying hash_32() on top of full_name_hash() needed and expected ?
>>
>> Since this was pointed out in several of the patches, I'll answer it
>> just once here.
>>
>> I've intentionally "allowed" double hashing with hash_32 to keep the
>> code simple.
>>
>> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
>> so doing that costs us a multiplication and a shift. On the other
>> hand, we benefit from keeping our code simple - how would we avoid
>> doing this double hash? adding a different hashtable function for
>> strings? or a new function for already hashed keys? I think we benefit
>> a lot from having to mul/shr instead of adding extra lines of code
>> here.
>
> This could be done, as I pointed out in another email within this
> thread, by changing the "key" argument from add/for_each_possible to an
> expected "hash" value, and let the caller invoke hash_32() if they want.
> I doubt this would add a significant amount of complexity for users of
> this API, but would allow much more flexibility to choose hash
> functions.

Most callers do need to do the hashing though, so why add an
additional step for all callers instead of doing another hash_32 for
the ones that don't really need it?

Another question is why do you need flexibility? I think that
simplicity wins over flexibility here.

Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> Switch tracepoints to use the new hashtable implementation. This reduces 
> >> the amount of
> >> generic unrelated code in the tracepoints.
> >>
> >> Signed-off-by: Sasha Levin 
> >> ---
> >>  kernel/tracepoint.c | 27 +++
> >>  1 file changed, 11 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> >> index d96ba22..854df92 100644
> >> --- a/kernel/tracepoint.c
> >> +++ b/kernel/tracepoint.c
> >> @@ -26,6 +26,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>
> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
> >>   * Protected by tracepoints_mutex.
> >>   */
> >>  #define TRACEPOINT_HASH_BITS 6
> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
> >>
> > [...]
> >>
> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
> >>
> >>  static int init_tracepoints(void)
> >>  {
> >> + hash_init(tracepoint_table);
> >> +
> >>   return register_module_notifier(&tracepoint_module_nb);
> >>  }
> >>  __initcall(init_tracepoints);
> >
> > So we have a hash table defined in .bss (therefore entirely initialized
> > to NULL), and you add a call to "hash_init", which iterates on the whole
> > array and initialize it to NULL (again) ?
> >
> > This extra initialization is redundant. I think it should be removed
> > from here, and hashtable.h should document that hash_init() don't need
> > to be called on zeroed memory (which includes static/global variables,
> > kzalloc'd memory, etc).
> 
> This was discussed in the previous series, the conclusion was to call
> hash_init() either way to keep the encapsulation and consistency.

Agreed,

Thanks,

Mathieu

> 
> It's cheap enough and happens only once, so why not?
> 
> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 11:59 AM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> Hi Mathieu,
> >>
> >> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
> >>  wrote:
> >> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> > [...]
> >> >> -static struct hlist_head *hash_bucket(struct net *net, const char 
> >> >> *name)
> >> >> -{
> >> >> - unsigned int hash = jhash(name, strlen(name), (unsigned long) 
> >> >> net);
> >> >> - return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> >> >> -}
> >> >> -
> >> >>  /**
> >> >>   *   ovs_vport_locate - find a port that has already been created
> >> >>   *
> >> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net 
> >> >> *net, const char *name)
> >> >>   */
> >> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
> >> >>  {
> >> >> - struct hlist_head *bucket = hash_bucket(net, name);
> >> >>   struct vport *vport;
> >> >>   struct hlist_node *node;
> >> >> + int key = full_name_hash(name, strlen(name));
> >> >>
> >> >> - hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> >> >> - if (!strcmp(name, vport->ops->get_name(vport)) &&
> >> >> - net_eq(ovs_dp_get_net(vport->dp), net))
> >> >> + hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
> >> >
> >> > Is applying hash_32() on top of full_name_hash() needed and expected ?
> >>
> >> Since this was pointed out in several of the patches, I'll answer it
> >> just once here.
> >>
> >> I've intentionally "allowed" double hashing with hash_32 to keep the
> >> code simple.
> >>
> >> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
> >> so doing that costs us a multiplication and a shift. On the other
> >> hand, we benefit from keeping our code simple - how would we avoid
> >> doing this double hash? adding a different hashtable function for
> >> strings? or a new function for already hashed keys? I think we benefit
> >> a lot from having to mul/shr instead of adding extra lines of code
> >> here.
> >
> > This could be done, as I pointed out in another email within this
> > thread, by changing the "key" argument from add/for_each_possible to an
> > expected "hash" value, and let the caller invoke hash_32() if they want.
> > I doubt this would add a significant amount of complexity for users of
> > this API, but would allow much more flexibility to choose hash
> > functions.
> 
> Most callers do need to do the hashing though, so why add an
> additional step for all callers instead of doing another hash_32 for
> the ones that don't really need it?
> 
> Another question is why do you need flexibility? I think that
> simplicity wins over flexibility here.

I usually try to make things as simple as possible, but not simplistic
compared to the problem tackled. In this case, I would ask the following
question: by standardizing the hash function of all those pieces of
kernel infrastructure to "hash_32()", including submodules part of the
kernel network infrastructure, parts of the kernel that can be fed
values coming from user-space (through the VFS), how can you guarantee
that hash_32() won't be the cause of a DoS attack based on the fact that
this algorithm is a) known by an attacker, and b) does not have any
randomness. It's been a recent trend to perform DoS attacks on poorly
implemented hashing functions.

This is just one example in an attempt to show why different hash table
users may have different constraints: for a hash table entirely
populated by keys generated internally by the kernel, a random seed
might not be required, but for cases where values are fed by user-space
and from the NIC, I would argue that flexibility to implement a
randomizable hash function beats implementation simplicity any time.

And you could keep the basic use-case simple by providing hints to the
hash_32()/hash_64()/hash_ulong() helpers in comments.

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Tejun Heo
Hello,

On Mon, Oct 29, 2012 at 02:16:48PM -0400, Mathieu Desnoyers wrote:
> This is just one example in an attempt to show why different hash table
> users may have different constraints: for a hash table entirely
> populated by keys generated internally by the kernel, a random seed
> might not be required, but for cases where values are fed by user-space
> and from the NIC, I would argue that flexibility to implement a
> randomizable hash function beats implementation simplicity any time.
> 
> And you could keep the basic use-case simple by providing hints to the
> hash_32()/hash_64()/hash_ulong() helpers in comments.

If all you need is throwing in a salt value to avoid attacks, can't
you just do that from caller side?  Scrambling the key before feeding
it into hash_*() should work, no?

Thanks.

-- 
tejun
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] User-Space MPLS actions and matches

2012-10-29 Thread Ben Pfaff
On Sat, Oct 27, 2012 at 03:05:58PM +0900, Simon Horman wrote:
> This patch implements use-space datapath and non-datapath code
> to match and use the datapath API set out in Leo Alterman's patch
> "user-space datapath: Add basic MPLS support to kernel".
> 
> The resulting MPLS implementation supports:
> * Pushing a single MPLS label
> * Poping a single MPLS label
> * Modifying an MPLS lable using set-field or load actions
>   that act on the label value, tc and bos bit.
> * There is no support for manipulating the TTL
>   this is considered future work.
> 
> The single-level push pop limitation is implemented by processing
> push, pop and set-field/load actions in order and discarding information
> that would require multiple levels of push/pop to be supported.
> 
> e.g.
>push,push -> the first push is discarded
>pop,pop -> the first pop is discarded
> 
> This patch is based heavily on work by Ravi K.
> 
> Cc: Isaku Yamahata 
> Cc: Ravi K 
> Signed-off-by: Simon Horman 

Why does this modify datapath/datapath.c and datapath/flow.h?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Josh Triplett
On Mon, Oct 29, 2012 at 01:29:24PM -0400, Sasha Levin wrote:
> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> Switch tracepoints to use the new hashtable implementation. This reduces 
> >> the amount of
> >> generic unrelated code in the tracepoints.
> >>
> >> Signed-off-by: Sasha Levin 
> >> ---
> >>  kernel/tracepoint.c | 27 +++
> >>  1 file changed, 11 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> >> index d96ba22..854df92 100644
> >> --- a/kernel/tracepoint.c
> >> +++ b/kernel/tracepoint.c
> >> @@ -26,6 +26,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>
> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
> >>   * Protected by tracepoints_mutex.
> >>   */
> >>  #define TRACEPOINT_HASH_BITS 6
> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
> >>
> > [...]
> >>
> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
> >>
> >>  static int init_tracepoints(void)
> >>  {
> >> + hash_init(tracepoint_table);
> >> +
> >>   return register_module_notifier(&tracepoint_module_nb);
> >>  }
> >>  __initcall(init_tracepoints);
> >
> > So we have a hash table defined in .bss (therefore entirely initialized
> > to NULL), and you add a call to "hash_init", which iterates on the whole
> > array and initialize it to NULL (again) ?
> >
> > This extra initialization is redundant. I think it should be removed
> > from here, and hashtable.h should document that hash_init() don't need
> > to be called on zeroed memory (which includes static/global variables,
> > kzalloc'd memory, etc).
> 
> This was discussed in the previous series, the conclusion was to call
> hash_init() either way to keep the encapsulation and consistency.
> 
> It's cheap enough and happens only once, so why not?

Unnecessary work adds up.  Better not to do it unnecessarily, even if by
itself it doesn't cost that much.

It doesn't seem that difficult for future fields to have 0 as their
initialized state.

- Josh Triplett
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] netdev-vport: Warn users that pmtud is deprecated

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 9:22 AM, Ansis Atteka  wrote:
> This is a complementary patch to the other one I sent out previously.
> The pupose of it is to make it more obvious that tunnel pmtud is
> deprecated and will be removed soon.
>
> Requested-by: Jesse Gross 
> Signed-off-by: Ansis Atteka 

Looks good to me.  Since we've already branched for 1.9, you might
want to put the changes to NEWS in the Debian changelog as well.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 15/16] openvswitch: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Tejun Heo (t...@kernel.org) wrote:
> Hello,
> 
> On Mon, Oct 29, 2012 at 02:16:48PM -0400, Mathieu Desnoyers wrote:
> > This is just one example in an attempt to show why different hash table
> > users may have different constraints: for a hash table entirely
> > populated by keys generated internally by the kernel, a random seed
> > might not be required, but for cases where values are fed by user-space
> > and from the NIC, I would argue that flexibility to implement a
> > randomizable hash function beats implementation simplicity any time.
> > 
> > And you could keep the basic use-case simple by providing hints to the
> > hash_32()/hash_64()/hash_ulong() helpers in comments.
> 
> If all you need is throwing in a salt value to avoid attacks, can't
> you just do that from caller side?  Scrambling the key before feeding
> it into hash_*() should work, no?

Yes, I think salting the "key" parameter would work.

Thanks,

Mathieu

> 
> Thanks.
> 
> -- 
> tejun

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 2:31 PM, Josh Triplett  wrote:
> On Mon, Oct 29, 2012 at 01:29:24PM -0400, Sasha Levin wrote:
>> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
>>  wrote:
>> > * Sasha Levin (levinsasha...@gmail.com) wrote:
>> >> Switch tracepoints to use the new hashtable implementation. This reduces 
>> >> the amount of
>> >> generic unrelated code in the tracepoints.
>> >>
>> >> Signed-off-by: Sasha Levin 
>> >> ---
>> >>  kernel/tracepoint.c | 27 +++
>> >>  1 file changed, 11 insertions(+), 16 deletions(-)
>> >>
>> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>> >> index d96ba22..854df92 100644
>> >> --- a/kernel/tracepoint.c
>> >> +++ b/kernel/tracepoint.c
>> >> @@ -26,6 +26,7 @@
>> >>  #include 
>> >>  #include 
>> >>  #include 
>> >> +#include 
>> >>
>> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
>> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
>> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
>> >>   * Protected by tracepoints_mutex.
>> >>   */
>> >>  #define TRACEPOINT_HASH_BITS 6
>> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
>> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
>> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
>> >>
>> > [...]
>> >>
>> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
>> >>
>> >>  static int init_tracepoints(void)
>> >>  {
>> >> + hash_init(tracepoint_table);
>> >> +
>> >>   return register_module_notifier(&tracepoint_module_nb);
>> >>  }
>> >>  __initcall(init_tracepoints);
>> >
>> > So we have a hash table defined in .bss (therefore entirely initialized
>> > to NULL), and you add a call to "hash_init", which iterates on the whole
>> > array and initialize it to NULL (again) ?
>> >
>> > This extra initialization is redundant. I think it should be removed
>> > from here, and hashtable.h should document that hash_init() don't need
>> > to be called on zeroed memory (which includes static/global variables,
>> > kzalloc'd memory, etc).
>>
>> This was discussed in the previous series, the conclusion was to call
>> hash_init() either way to keep the encapsulation and consistency.
>>
>> It's cheap enough and happens only once, so why not?
>
> Unnecessary work adds up.  Better not to do it unnecessarily, even if by
> itself it doesn't cost that much.
>
> It doesn't seem that difficult for future fields to have 0 as their
> initialized state.

Let's put it this way: hlist requires the user to initialize hlist
head before usage, therefore as a hlist user, hashtable implementation
must do that.

We do it automatically when the hashtable user does
DEFINE_HASHTABLE(), but we can't do that if he does
DECLARE_HASHTABLE(). This means that the hashtable user must call
hash_init() whenever he uses DECLARE_HASHTABLE() to create his
hashtable.

There are two options here, either we specify that hash_init() should
only be called if DECLARE_HASHTABLE() was called, which is confusing,
inconsistent and prone to errors, or we can just say that it should be
called whenever a hashtable is used.

The only way to work around it IMO is to get hlist to not require
initializing before usage, and there are good reasons that that won't
happen.


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 2:31 PM, Josh Triplett  wrote:
> > On Mon, Oct 29, 2012 at 01:29:24PM -0400, Sasha Levin wrote:
> >> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
> >>  wrote:
> >> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> >> Switch tracepoints to use the new hashtable implementation. This 
> >> >> reduces the amount of
> >> >> generic unrelated code in the tracepoints.
> >> >>
> >> >> Signed-off-by: Sasha Levin 
> >> >> ---
> >> >>  kernel/tracepoint.c | 27 +++
> >> >>  1 file changed, 11 insertions(+), 16 deletions(-)
> >> >>
> >> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> >> >> index d96ba22..854df92 100644
> >> >> --- a/kernel/tracepoint.c
> >> >> +++ b/kernel/tracepoint.c
> >> >> @@ -26,6 +26,7 @@
> >> >>  #include 
> >> >>  #include 
> >> >>  #include 
> >> >> +#include 
> >> >>
> >> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
> >> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
> >> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
> >> >>   * Protected by tracepoints_mutex.
> >> >>   */
> >> >>  #define TRACEPOINT_HASH_BITS 6
> >> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
> >> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
> >> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
> >> >>
> >> > [...]
> >> >>
> >> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
> >> >>
> >> >>  static int init_tracepoints(void)
> >> >>  {
> >> >> + hash_init(tracepoint_table);
> >> >> +
> >> >>   return register_module_notifier(&tracepoint_module_nb);
> >> >>  }
> >> >>  __initcall(init_tracepoints);
> >> >
> >> > So we have a hash table defined in .bss (therefore entirely initialized
> >> > to NULL), and you add a call to "hash_init", which iterates on the whole
> >> > array and initialize it to NULL (again) ?
> >> >
> >> > This extra initialization is redundant. I think it should be removed
> >> > from here, and hashtable.h should document that hash_init() don't need
> >> > to be called on zeroed memory (which includes static/global variables,
> >> > kzalloc'd memory, etc).
> >>
> >> This was discussed in the previous series, the conclusion was to call
> >> hash_init() either way to keep the encapsulation and consistency.
> >>
> >> It's cheap enough and happens only once, so why not?
> >
> > Unnecessary work adds up.  Better not to do it unnecessarily, even if by
> > itself it doesn't cost that much.
> >
> > It doesn't seem that difficult for future fields to have 0 as their
> > initialized state.
> 
> Let's put it this way: hlist requires the user to initialize hlist
> head before usage, therefore as a hlist user, hashtable implementation
> must do that.
> 
> We do it automatically when the hashtable user does
> DEFINE_HASHTABLE(), but we can't do that if he does
> DECLARE_HASHTABLE(). This means that the hashtable user must call
> hash_init() whenever he uses DECLARE_HASHTABLE() to create his
> hashtable.
> 
> There are two options here, either we specify that hash_init() should
> only be called if DECLARE_HASHTABLE() was called, which is confusing,
> inconsistent and prone to errors, or we can just say that it should be
> called whenever a hashtable is used.
> 
> The only way to work around it IMO is to get hlist to not require
> initializing before usage, and there are good reasons that that won't
> happen.

Hrm, just a second here.

The argument about hash_init being useful to add magic values in the
future only works for the cases where a hash table is declared with
DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
because we could initialize any debugging variables from within
DEFINE_HASHTABLE().

So I take my "Agreed" back. I disagree with initializing the hash table
twice redundantly. There should be at least "DEFINE_HASHTABLE()" or a
hash_init() (for DECLARE_HASHTABLE()), but not useless execution
initialization on top of an already statically initialized hash table.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Tejun Heo
On Mon, Oct 29, 2012 at 02:53:19PM -0400, Mathieu Desnoyers wrote:
> The argument about hash_init being useful to add magic values in the
> future only works for the cases where a hash table is declared with
> DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
> because we could initialize any debugging variables from within
> DEFINE_HASHTABLE().

You can do that with [0 .. HASH_SIZE - 1] initializer.

Thanks.

-- 
tejun
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Tejun Heo
On Mon, Oct 29, 2012 at 11:58:14AM -0700, Tejun Heo wrote:
> On Mon, Oct 29, 2012 at 02:53:19PM -0400, Mathieu Desnoyers wrote:
> > The argument about hash_init being useful to add magic values in the
> > future only works for the cases where a hash table is declared with
> > DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
> > because we could initialize any debugging variables from within
> > DEFINE_HASHTABLE().
> 
> You can do that with [0 .. HASH_SIZE - 1] initializer.

And in general, let's please try not to do optimizations which are
pointless.  Just stick to the usual semantics.  You have an abstract
data structure - invoke the initializer before using it.  Sure,
optimize it if it shows up somewhere.  And here, if we do the
initializers properly, it shouldn't cause any more actual overhead -
ie. DEFINE_HASHTABLE() will basicallly boil down to all zero
assignments and the compiler will put the whole thing in .bss anyway.

Thanks.

-- 
tejun
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 2:53 PM, Mathieu Desnoyers
 wrote:
> * Sasha Levin (levinsasha...@gmail.com) wrote:
>> On Mon, Oct 29, 2012 at 2:31 PM, Josh Triplett  wrote:
>> > On Mon, Oct 29, 2012 at 01:29:24PM -0400, Sasha Levin wrote:
>> >> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
>> >>  wrote:
>> >> > * Sasha Levin (levinsasha...@gmail.com) wrote:
>> >> >> Switch tracepoints to use the new hashtable implementation. This 
>> >> >> reduces the amount of
>> >> >> generic unrelated code in the tracepoints.
>> >> >>
>> >> >> Signed-off-by: Sasha Levin 
>> >> >> ---
>> >> >>  kernel/tracepoint.c | 27 +++
>> >> >>  1 file changed, 11 insertions(+), 16 deletions(-)
>> >> >>
>> >> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>> >> >> index d96ba22..854df92 100644
>> >> >> --- a/kernel/tracepoint.c
>> >> >> +++ b/kernel/tracepoint.c
>> >> >> @@ -26,6 +26,7 @@
>> >> >>  #include 
>> >> >>  #include 
>> >> >>  #include 
>> >> >> +#include 
>> >> >>
>> >> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
>> >> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
>> >> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
>> >> >>   * Protected by tracepoints_mutex.
>> >> >>   */
>> >> >>  #define TRACEPOINT_HASH_BITS 6
>> >> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
>> >> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
>> >> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
>> >> >>
>> >> > [...]
>> >> >>
>> >> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
>> >> >>
>> >> >>  static int init_tracepoints(void)
>> >> >>  {
>> >> >> + hash_init(tracepoint_table);
>> >> >> +
>> >> >>   return register_module_notifier(&tracepoint_module_nb);
>> >> >>  }
>> >> >>  __initcall(init_tracepoints);
>> >> >
>> >> > So we have a hash table defined in .bss (therefore entirely initialized
>> >> > to NULL), and you add a call to "hash_init", which iterates on the whole
>> >> > array and initialize it to NULL (again) ?
>> >> >
>> >> > This extra initialization is redundant. I think it should be removed
>> >> > from here, and hashtable.h should document that hash_init() don't need
>> >> > to be called on zeroed memory (which includes static/global variables,
>> >> > kzalloc'd memory, etc).
>> >>
>> >> This was discussed in the previous series, the conclusion was to call
>> >> hash_init() either way to keep the encapsulation and consistency.
>> >>
>> >> It's cheap enough and happens only once, so why not?
>> >
>> > Unnecessary work adds up.  Better not to do it unnecessarily, even if by
>> > itself it doesn't cost that much.
>> >
>> > It doesn't seem that difficult for future fields to have 0 as their
>> > initialized state.
>>
>> Let's put it this way: hlist requires the user to initialize hlist
>> head before usage, therefore as a hlist user, hashtable implementation
>> must do that.
>>
>> We do it automatically when the hashtable user does
>> DEFINE_HASHTABLE(), but we can't do that if he does
>> DECLARE_HASHTABLE(). This means that the hashtable user must call
>> hash_init() whenever he uses DECLARE_HASHTABLE() to create his
>> hashtable.
>>
>> There are two options here, either we specify that hash_init() should
>> only be called if DECLARE_HASHTABLE() was called, which is confusing,
>> inconsistent and prone to errors, or we can just say that it should be
>> called whenever a hashtable is used.
>>
>> The only way to work around it IMO is to get hlist to not require
>> initializing before usage, and there are good reasons that that won't
>> happen.
>
> Hrm, just a second here.
>
> The argument about hash_init being useful to add magic values in the
> future only works for the cases where a hash table is declared with
> DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
> because we could initialize any debugging variables from within
> DEFINE_HASHTABLE().
>
> So I take my "Agreed" back. I disagree with initializing the hash table
> twice redundantly. There should be at least "DEFINE_HASHTABLE()" or a
> hash_init() (for DECLARE_HASHTABLE()), but not useless execution
> initialization on top of an already statically initialized hash table.

The "magic values" argument was used to point out that some sort of
initialization *must* occur, either by hash_init() or by a proper
initialization in DEFINE_HASHTABLE(), and we can't simply memset() it
to 0. It appears that we all agree on that.

The other thing is whether hash_init() should be called for hashtables
that were created with DEFINE_HASHTABLE(). That point was raised by
Neil Brown last time this series went around, and it seems that no one
objected to the point that it should be consistent across the code.

Even if we ignore hash_init() being mostly optimized out, is it really
worth it taking the risk that some future patch would move a hashtable
that user DEFINE_HASHTABLE() into a struct a

Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Tejun Heo (t...@kernel.org) wrote:
> On Mon, Oct 29, 2012 at 11:58:14AM -0700, Tejun Heo wrote:
> > On Mon, Oct 29, 2012 at 02:53:19PM -0400, Mathieu Desnoyers wrote:
> > > The argument about hash_init being useful to add magic values in the
> > > future only works for the cases where a hash table is declared with
> > > DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
> > > because we could initialize any debugging variables from within
> > > DEFINE_HASHTABLE().
> > 
> > You can do that with [0 .. HASH_SIZE - 1] initializer.
> 
> And in general, let's please try not to do optimizations which are
> pointless.  Just stick to the usual semantics.  You have an abstract
> data structure - invoke the initializer before using it.  Sure,
> optimize it if it shows up somewhere.  And here, if we do the
> initializers properly, it shouldn't cause any more actual overhead -
> ie. DEFINE_HASHTABLE() will basicallly boil down to all zero
> assignments and the compiler will put the whole thing in .bss anyway.

Yes, agreed. I was going too far in optimization land by proposing
assumptions on zeroed memory. All I actually really care about is that
we don't end up calling hash_init() on a statically defined (and thus
already initialized) hash table.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Tejun Heo
On Mon, Oct 29, 2012 at 03:09:36PM -0400, Sasha Levin wrote:
> The other thing is whether hash_init() should be called for hashtables
> that were created with DEFINE_HASHTABLE(). That point was raised by
> Neil Brown last time this series went around, and it seems that no one
> objected to the point that it should be consistent across the code.

Hmmm?  If something is DEFINE_XXX()'d, you definitely shouldn't be
calling XXX_init() on it.  That's how it is with most other abstract
data types and you need *VERY* strong rationale to deviate from that.

Thanks.

-- 
tejun
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Mathieu Desnoyers
* Sasha Levin (levinsasha...@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 2:53 PM, Mathieu Desnoyers
>  wrote:
> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> On Mon, Oct 29, 2012 at 2:31 PM, Josh Triplett  
> >> wrote:
> >> > On Mon, Oct 29, 2012 at 01:29:24PM -0400, Sasha Levin wrote:
> >> >> On Mon, Oct 29, 2012 at 7:35 AM, Mathieu Desnoyers
> >> >>  wrote:
> >> >> > * Sasha Levin (levinsasha...@gmail.com) wrote:
> >> >> >> Switch tracepoints to use the new hashtable implementation. This 
> >> >> >> reduces the amount of
> >> >> >> generic unrelated code in the tracepoints.
> >> >> >>
> >> >> >> Signed-off-by: Sasha Levin 
> >> >> >> ---
> >> >> >>  kernel/tracepoint.c | 27 +++
> >> >> >>  1 file changed, 11 insertions(+), 16 deletions(-)
> >> >> >>
> >> >> >> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> >> >> >> index d96ba22..854df92 100644
> >> >> >> --- a/kernel/tracepoint.c
> >> >> >> +++ b/kernel/tracepoint.c
> >> >> >> @@ -26,6 +26,7 @@
> >> >> >>  #include 
> >> >> >>  #include 
> >> >> >>  #include 
> >> >> >> +#include 
> >> >> >>
> >> >> >>  extern struct tracepoint * const __start___tracepoints_ptrs[];
> >> >> >>  extern struct tracepoint * const __stop___tracepoints_ptrs[];
> >> >> >> @@ -49,8 +50,7 @@ static LIST_HEAD(tracepoint_module_list);
> >> >> >>   * Protected by tracepoints_mutex.
> >> >> >>   */
> >> >> >>  #define TRACEPOINT_HASH_BITS 6
> >> >> >> -#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
> >> >> >> -static struct hlist_head tracepoint_table[TRACEPOINT_TABLE_SIZE];
> >> >> >> +static DEFINE_HASHTABLE(tracepoint_table, TRACEPOINT_HASH_BITS);
> >> >> >>
> >> >> > [...]
> >> >> >>
> >> >> >> @@ -722,6 +715,8 @@ struct notifier_block tracepoint_module_nb = {
> >> >> >>
> >> >> >>  static int init_tracepoints(void)
> >> >> >>  {
> >> >> >> + hash_init(tracepoint_table);
> >> >> >> +
> >> >> >>   return register_module_notifier(&tracepoint_module_nb);
> >> >> >>  }
> >> >> >>  __initcall(init_tracepoints);
> >> >> >
> >> >> > So we have a hash table defined in .bss (therefore entirely 
> >> >> > initialized
> >> >> > to NULL), and you add a call to "hash_init", which iterates on the 
> >> >> > whole
> >> >> > array and initialize it to NULL (again) ?
> >> >> >
> >> >> > This extra initialization is redundant. I think it should be removed
> >> >> > from here, and hashtable.h should document that hash_init() don't need
> >> >> > to be called on zeroed memory (which includes static/global variables,
> >> >> > kzalloc'd memory, etc).
> >> >>
> >> >> This was discussed in the previous series, the conclusion was to call
> >> >> hash_init() either way to keep the encapsulation and consistency.
> >> >>
> >> >> It's cheap enough and happens only once, so why not?
> >> >
> >> > Unnecessary work adds up.  Better not to do it unnecessarily, even if by
> >> > itself it doesn't cost that much.
> >> >
> >> > It doesn't seem that difficult for future fields to have 0 as their
> >> > initialized state.
> >>
> >> Let's put it this way: hlist requires the user to initialize hlist
> >> head before usage, therefore as a hlist user, hashtable implementation
> >> must do that.
> >>
> >> We do it automatically when the hashtable user does
> >> DEFINE_HASHTABLE(), but we can't do that if he does
> >> DECLARE_HASHTABLE(). This means that the hashtable user must call
> >> hash_init() whenever he uses DECLARE_HASHTABLE() to create his
> >> hashtable.
> >>
> >> There are two options here, either we specify that hash_init() should
> >> only be called if DECLARE_HASHTABLE() was called, which is confusing,
> >> inconsistent and prone to errors, or we can just say that it should be
> >> called whenever a hashtable is used.
> >>
> >> The only way to work around it IMO is to get hlist to not require
> >> initializing before usage, and there are good reasons that that won't
> >> happen.
> >
> > Hrm, just a second here.
> >
> > The argument about hash_init being useful to add magic values in the
> > future only works for the cases where a hash table is declared with
> > DECLARE_HASHTABLE(). It's completely pointless with DEFINE_HASHTABLE(),
> > because we could initialize any debugging variables from within
> > DEFINE_HASHTABLE().
> >
> > So I take my "Agreed" back. I disagree with initializing the hash table
> > twice redundantly. There should be at least "DEFINE_HASHTABLE()" or a
> > hash_init() (for DECLARE_HASHTABLE()), but not useless execution
> > initialization on top of an already statically initialized hash table.
> 
> The "magic values" argument was used to point out that some sort of
> initialization *must* occur, either by hash_init() or by a proper
> initialization in DEFINE_HASHTABLE(), and we can't simply memset() it
> to 0. It appears that we all agree on that.

Yes.

> The other thing is whether hash_init() should be called for hashtables
> that were created with DEFINE_HASHTABLE(). That point was raised by
> Neil Brown last time this ser

Re: [ovs-dev] [PATCH v7 06/16] tracepoint: use new hashtable implementation

2012-10-29 Thread Sasha Levin
On Mon, Oct 29, 2012 at 3:12 PM, Tejun Heo  wrote:
> On Mon, Oct 29, 2012 at 03:09:36PM -0400, Sasha Levin wrote:
>> The other thing is whether hash_init() should be called for hashtables
>> that were created with DEFINE_HASHTABLE(). That point was raised by
>> Neil Brown last time this series went around, and it seems that no one
>> objected to the point that it should be consistent across the code.
>
> Hmmm?  If something is DEFINE_XXX()'d, you definitely shouldn't be
> calling XXX_init() on it.  That's how it is with most other abstract
> data types and you need *VERY* strong rationale to deviate from that.

Neil Brown raised that point last time that this series went around,
and suggested that this should be consistent and hash_init() would
appear everywhere, even if DEFINE_HASHTABLE() was used. Since no one
objected to that I thought we're going with that.

I'll chalk it up to me getting confused :)


Thanks,
Sasha
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovs-ctl.in: Do not fail 'restart'.

2012-10-29 Thread Gurucharan Shetty
ovs-ctl restart is called by the debian package upgrade.
We do not want to fail the package upgrade just because
restoring of flows failed.

The error message will still be printed on the console.

Bug #13730.
Signed-off-by: Gurucharan Shetty 
---
 utilities/ovs-ctl.in |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
index e8b72ba..7febd1c 100755
--- a/utilities/ovs-ctl.in
+++ b/utilities/ovs-ctl.in
@@ -420,7 +420,8 @@ restart () {
 stop_forwarding
 start_forwarding
 
-restore_flows
+# Restore the saved flows. Do not return error if restore fails.
+restore_flows || true
 }
 
 ## --- ##
-- 
1.7.9.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] JOIN in this new home based business MCA!!!

2012-10-29 Thread Hu$tleMann

http://www.reverbnation.com/widget_code/html_widget/artist_2782324?widget_id=51&posted_by=artist_2782324&pwc[design]=default&pwc[background_color]=%2333&pwc[size]=custom";
 width="525" height="350" frameborder="0" scrolling="no">

http://getweeklypaychecks.com/cp1.php?id=1620&campaign=videolink10-25-12

Thank You for your time...

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: Fix sparse warning for symbol 'BUILD_BUG_ON_NOT_POWER_OF_2'

2012-10-29 Thread Pravin Shelar
On Fri, Oct 26, 2012 at 2:29 PM, Jesse Gross  wrote:
> On Thu, Oct 25, 2012 at 1:28 PM, Pravin Shelar  wrote:
>> On Thu, Oct 25, 2012 at 1:19 PM, Ben Pfaff  wrote:
>>> On Sat, Oct 27, 2012 at 01:07:24PM -0700, Pravin B Shelar wrote:
 Signed-off-by: Pravin B Shelar 
>>>
>>> I'm curious, what warning does sparse give?  (You could add this to
>>> the commit message.)
>>>
>> This symbol was moved from kernel.h to bug.h in 3.4.
>> warning is as follows, I will add it to commit msg.
>>
>> include/linux/bug.h:15:9: warning: preprocessor token
>> BUILD_BUG_ON_NOT_POWER_OF_2 redefined
>> /root/ovs-cl1/stt-fl/datapath/linux/compat/include/linux/kernel.h:44:9:
>> this was the original definition
>
> Since the "correct" place for this on new kernels is in bug.h, it's
> probably better to just move the whole definition there to make it
> easier to find.  That would also solve this problem since bug.h
> includes kernel.h

ok, I will post new patch.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: Add Upstream id for GRE type.

2012-10-29 Thread Pravin Shelar
On Fri, Oct 26, 2012 at 11:32 AM, Jesse Gross  wrote:
> On Sat, Oct 27, 2012 at 1:06 PM, Pravin B Shelar  wrote:
>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>> index 89feb61..d4a043c 100644
>> --- a/include/linux/openvswitch.h
>> +++ b/include/linux/openvswitch.h
>> @@ -182,6 +182,7 @@ enum ovs_vport_type {
>> OVS_VPORT_TYPE_UNSPEC,
>> OVS_VPORT_TYPE_NETDEV,   /* network device */
>> OVS_VPORT_TYPE_INTERNAL, /* network device implemented by datapath */
>> +   OVS_VPORT_TYPE_FT_GRE,
>
> I'm not sure what FT stands for.  Future?  In any case, can you add a
> comment next to it?
I meant Flow-based Tunneling, I will add a comment.

>
>> OVS_VPORT_TYPE_PATCH = 100, /* virtual tunnel connecting two vports 
>> */
>> OVS_VPORT_TYPE_GRE,  /* GRE tunnel */
>> OVS_VPORT_TYPE_CAPWAP,   /* CAPWAP tunnel */
>
>> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
>> index 621abd1..a864341 100644
>> --- a/lib/netdev-vport.c
>> +++ b/lib/netdev-vport.c
>> @@ -154,6 +154,9 @@ netdev_vport_get_netdev_type(const struct 
>> dpif_linux_vport *vport)
>>  case OVS_VPORT_TYPE_PATCH:
>>  return "patch";
>>
>> +case OVS_VPORT_TYPE_FT_GRE:
>> +return "gre";
>
> I don't think that we want to introduce any uses of the symbolic name
> in userspace.  In the next release, we'll drop the FT part of the name
> (but keep the same constant) so I think it's easier if we don't touch
> userspace.
>
ok.

> Also, did you look at the userspace code to see whether it can handle
> the transition by removing port types that it doesn't understand and
> recreating them?

Yes, I checked it and it works.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Remove tunnel header caching.

2012-10-29 Thread Pravin Shelar
On Fri, Oct 26, 2012 at 5:15 PM, Jesse Gross  wrote:
> On Thu, Oct 25, 2012 at 11:14 AM, Pravin B Shelar  wrote:
>> Tunnel caching was added to reduce CPU utilization on TX path
>> by caching packet header, So performance gain is directly proportional
>> to number of skbs transferred. But with help of offloads skb are getting
>> larger. So there are less number of skbs. Therefore header caching does
>> not shows similar gains we seen in past. And now kernel 3.6 has removed
>> dst caching from networking which makes header caching even more tricky.
>> So this commit removes header caching from OVS tunnelling.
>>
>> Signed-off-by: Pravin B Shelar 
>
> We should probably add a NEWS item.
OK.

>
>> diff --git a/datapath/tunnel.h b/datapath/tunnel.h
>> index 951a6f1..f296ecb 100644
>> --- a/datapath/tunnel.h
>> +++ b/datapath/tunnel.h
>> -   /*
>> -* Updates the cached header of a packet to match the actual packet
>> +* Updates the header of a packet to match the actual packet
>>  * data.  Typical things that might need to be updated are length,
>>  * checksum, etc.  The IP header will have already been updated and 
>> this
>>  * is the final step before transmission.  Returns a linked list of
>
> Is this still really updating the header now?  It's really creating it
> from scratch.
>
right. I will rename it back to build_header.

>> diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
>> index d02d4ec..3a877f3 100644
>> --- a/datapath/vport-gre.c
>> +++ b/datapath/vport-gre.c
>> @@ -196,7 +162,6 @@ static struct sk_buff *gre_update_header(const struct 
>> vport *vport,
>>  * packet originally had DF set.
>>  */
>> skb->local_df = 1;
>> -   __ip_select_ident(ip_hdr(skb), dst, 0);
>
> Why is this no longer needed?

right, It is required.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovs-vsctl: Allow command-specific options to mingle with global options.

2012-10-29 Thread Adam Heath
On 10/29/2012 11:34 AM, Ben Pfaff wrote:
> Until now, a command like "ovs-vsctl --may-exist add-br br0" yielded a
> confusing error message.  Users had to realize that the correct form was
> "ovs-vsctl -- --may-exist add-br br0", but instead they often reported a
> bug or gave up in frustration.  Even though the behavior was documented, it
> was counterintuitive.
> 
> This commit allows command-specific options to be mixed with global
> options, making both forms of the command listed above equally acceptable.

Tbh, I would actually prefer to have command-specific options that
appear in the global area issue an error.

==
one-vsctl: Found command-specific --may-exist in global area, please
use -- instead.
==

See (1) for a reason why I'd prefer it the way I suggested above.

1: http://lwn.net/Articles/520198/
   you might need to wait a few days for it to become publically
   available.


> 
> CC: 691...@bugs.debian.org
> Reported-by: Adam Heath 
> Signed-off-by: Ben Pfaff 
> ---
>  AUTHORS  |1 +
>  tests/ovs-vsctl.at   |   13 -
>  utilities/ovs-vsctl.8.in |   11 ++--
>  utilities/ovs-vsctl.c|  131 +
>  4 files changed, 137 insertions(+), 19 deletions(-)
> 
> diff --git a/AUTHORS b/AUTHORS
> index 4687865..83e6bb5 100644
> --- a/AUTHORS
> +++ b/AUTHORS
> @@ -81,6 +81,7 @@ The following additional people are mentioned in commit 
> logs as having
>  provided helpful bug reports or suggestions.
>  
>  Aaron M. Ucko   u...@debian.org
> +Adam Heath  doo...@brainfood.com
>  Ahmed Bilal numan...@gmail.com
>  Alan Shieh  ash...@nicira.com
>  Alban Browaeys  pra...@yahoo.com
> diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
> index e903619..cb12f31 100644
> --- a/tests/ovs-vsctl.at
> +++ b/tests/ovs-vsctl.at
> @@ -15,7 +15,7 @@ dnl RUN_OVS_VSCTL(COMMAND, ...)
>  dnl
>  dnl Executes each ovs-vsctl COMMAND.
>  m4_define([RUN_OVS_VSCTL],
> -  [m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
> -vreconnect:emer --db=unix:socket -- command
> +  [m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
> -vreconnect:emer --db=unix:socket command
>  ])])
>  m4_define([RUN_OVS_VSCTL_ONELINE],
>[m4_foreach([command], [$@], [ovs-vsctl --timeout=5 --no-wait 
> -vreconnect:emer --db=unix:socket --oneline -- command
> @@ -655,6 +655,17 @@ AT_CLEANUP
>  AT_SETUP([database commands -- negative checks])
>  AT_KEYWORDS([ovs-vsctl])
>  OVS_VSCTL_SETUP
> +
> +AT_CHECK([ovs-vsctl --may-exist],
> +  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
> +], [OVS_VSCTL_CLEANUP])
> +AT_CHECK([ovs-vsctl --may-exist --],
> +  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
> +], [OVS_VSCTL_CLEANUP])
> +AT_CHECK([ovs-vsctl -- --may-exist],
> +  [1], [ignore], [ovs-vsctl: missing command name (use --help for help)
> +], [OVS_VSCTL_CLEANUP])
> +
>  AT_CHECK([RUN_OVS_VSCTL([add-br br0])],
>[0], [ignore], [], [OVS_VSCTL_CLEANUP])
>  AT_CHECK([RUN_OVS_VSCTL([add-br br1])], 
> diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
> index 1b80d05..8f70d6b 100644
> --- a/utilities/ovs-vsctl.8.in
> +++ b/utilities/ovs-vsctl.8.in
> @@ -43,9 +43,9 @@ implemented as a single atomic transaction against the 
> database.
>  The \fBovs\-vsctl\fR command line begins with global options (see
>  \fBOPTIONS\fR below for details).  The global options are followed by
>  one or more commands.  Each command should begin with \fB\-\-\fR by
> -itself as a command-line argument, to separate it from the global
> -options and following commands.  (If the first command does not have
> -any options, then the first \fB\-\-\fR may be omitted.)  The command
> +itself as a command-line argument, to separate it from the following
> +commands.  (The \fB\-\-\fR before the first command is optional.)  The
> +command
>  itself starts with command-specific options, if any, followed by the
>  command name and any arguments.  See \fBEXAMPLES\fR below for syntax
>  examples.
> @@ -769,10 +769,9 @@ Delete bridge \fBbr0\fR, reporting an error if it does 
> not exist:
>  .IP
>  .B "ovs\-vsctl del\-br br0"
>  .PP
> -Delete bridge \fBbr0\fR if it exists (the \fB\-\-\fR is required to
> -separate \fBdel\-br\fR's options from the global options):
> +Delete bridge \fBbr0\fR if it exists:
>  .IP
> -.B "ovs\-vsctl \-\- \-\-if\-exists del\-br br0"
> +.B "ovs\-vsctl \-\-if\-exists del\-br br0"
>  .PP
>  Set the \fBqos\fR column of the \fBPort\fR record for \fBeth0\fR to
>  point to a new \fBQoS\fR record, which in turn points with its queue 0
> diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c
> index fda3a89..1745418 100644
> --- a/utilities/ovs-vsctl.c
> +++ b/utilities/ovs-vsctl.c
> @@ -131,12 +131,14 @@ static void vsctl_exit(int status) NO_RETURN;
>  static void vsctl_fatal(const char *, ...) PRINTF_FORMAT(1, 2) NO_RETURN;
>  static char *default_db(void);
>  static void usage

Re: [ovs-dev] [PATCH] ovs-ctl.in: Do not fail 'restart'.

2012-10-29 Thread Ben Pfaff
On Mon, Oct 29, 2012 at 12:35:34PM -0700, Gurucharan Shetty wrote:
> ovs-ctl restart is called by the debian package upgrade.
> We do not want to fail the package upgrade just because
> restoring of flows failed.
> 
> The error message will still be printed on the console.
> 
> Bug #13730.
> Signed-off-by: Gurucharan Shetty 

Seems reasonable, thanks.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 04/15] Use ODP ports in dpif layer and below.

2012-10-29 Thread Justin Pettit
On Thu, Oct 18, 2012 at 4:42 PM, Ben Pfaff  wrote:

> On Thu, Oct 18, 2012 at 12:51:49PM -0700, Justin Pettit wrote:
> > The current code has a simple mapping between datapath and OpenFlow port
> > numbers (1-to-1 other than the local port).
>
> The important fact is not really that the mapping is one-to-one, it's
> that the port numbers are actually *the same* except OFPP_LOCAL <-> 0,
> and that that is statically known at compile time.
>

Okay, it should be clearer now.


> I think something is inconsistent in dpif_netdev_flow_from_nlattrs().
> The function odp_flow_key_to_flow() that it calls is documented to put
> a datapath port number in the flow that it returns, but
> dpif_netdev_flow_from_nlattrs() then compares the returned port number
> against OFPP_MAX, OFPP_LOCAL, and OFPP_NONE.
>
> I think that the comments could be improved.  Suggestion for
> odp_flow_key_from_flow():
>  * 'flow->in_port' is ignored (since it is likely to be an OpenFlow port
> number
>  * rather than a datapath port number).  Instead, if 'odp_in_port' is
> anything
>  * other than OVSP_NONE, it is included in 'buf' as the input port.
>
> Suggestion for struct flow:
>
> /*
>  * A flow in the network.
>  *
>  * The meaning of 'in_port' is context-dependent.  In most cases, it is a
>  * 16-bit OpenFlow 1.0 port number.  In the software datapath interface
> (dpif)
>  * layer and its implementations (e.g. dpif-linux, dpif-netdev), it is
> instead
>  * a 32-bit datapath port number.
>  */
> struct flow {
> struct flow_tnl tunnel; /* Encapsulating tunnel parameters. */
> ovs_be64 metadata;  /* OpenFlow Metadata. */
> struct in6_addr ipv6_src;   /* IPv6 source address. */
> struct in6_addr ipv6_dst;   /* IPv6 destination address. */
> struct in6_addr nd_target;  /* IPv6 neighbor discovery (ND) target. */
> uint32_t skb_priority;  /* Packet priority for QoS. */
> uint32_t regs[FLOW_N_REGS]; /* Registers. */
> ovs_be32 nw_src;/* IPv4 source address. */
> ovs_be32 nw_dst;/* IPv4 destination address. */
> ovs_be32 ipv6_label;/* IPv6 flow label. */
> uint32_t in_port;   /* Input port (see above). */
>

Great suggestions.  Done.


> In ofproto_unixctl_trace(), it looks very much to me like the conversion
> to an ODP port in the argc == 6 case should not be happening, that is,
> that the ofp_port_to_odp_port call should be omitted here:
>  uint32_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
>

Fixed.  Isn't this a bug in the existing code?  Should I do a separate
patch to fix this in existing branches?


> Are there any callers to ofp_port_to_odp_port() that couldn't handle the
> extra semantics that ofp_port_to_odp_in_port() provides?  I think that
> we could just fold the latter into the former.
>

Good point.  I made that change.

Thanks.

--Justin
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 04/15] Use ODP ports in dpif layer and below.

2012-10-29 Thread Ben Pfaff
On Mon, Oct 29, 2012 at 02:11:51PM -0700, Justin Pettit wrote:
> On Thu, Oct 18, 2012 at 4:42 PM, Ben Pfaff  wrote:
> > In ofproto_unixctl_trace(), it looks very much to me like the conversion
> > to an ODP port in the argc == 6 case should not be happening, that is,
> > that the ofp_port_to_odp_port call should be omitted here:
> >  uint32_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
> >
> 
> Fixed.  Isn't this a bug in the existing code?  Should I do a separate
> patch to fix this in existing branches?

I've had a fix for this on a local branch forever and I finally sent
it out on Friday, so you can just review it here:
http://openvswitch.org/pipermail/dev/2012-October/022319.html

Thanks,

Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [tests 20/22] ofproto-dpif: Fix "ofproto/trace" of flow for local port.

2012-10-29 Thread Justin Pettit
Acked-by: Justin Pettit 

Thanks,

--Justin


On Oct 25, 2012, at 5:02 PM, Ben Pfaff  wrote:

> The 'in_port' variable is an OpenFlow port, not an ODP port, so we should
> not translate it to ODP.
> 
> Signed-off-by: Ben Pfaff 
> ---
> ofproto/ofproto-dpif.c |2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 3e1b7b3..7d914c8 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -6811,7 +6811,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int 
> argc, const char *argv[],
> const char *tun_id_s = argv[3];
> const char *in_port_s = argv[4];
> const char *packet_s = argv[5];
> -uint16_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
> +uint16_t in_port = atoi(in_port_s);
> ovs_be64 tun_id = htonll(strtoull(tun_id_s, NULL, 0));
> uint32_t priority = atoi(priority_s);
> const char *msg;
> -- 
> 1.7.2.5
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 04/15] Use ODP ports in dpif layer and below.

2012-10-29 Thread Justin Pettit
Huh, somehow I missed it in that 22 patch series.  I just reviewed that one 
patch.

Thanks,

--Justin


On Oct 29, 2012, at 2:14 PM, Ben Pfaff  wrote:

> On Mon, Oct 29, 2012 at 02:11:51PM -0700, Justin Pettit wrote:
>> On Thu, Oct 18, 2012 at 4:42 PM, Ben Pfaff  wrote:
>>> In ofproto_unixctl_trace(), it looks very much to me like the conversion
>>> to an ODP port in the argc == 6 case should not be happening, that is,
>>> that the ofp_port_to_odp_port call should be omitted here:
>>> uint32_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
>>> 
>> 
>> Fixed.  Isn't this a bug in the existing code?  Should I do a separate
>> patch to fix this in existing branches?
> 
> I've had a fix for this on a local branch forever and I finally sent
> it out on Friday, so you can just review it here:
>http://openvswitch.org/pipermail/dev/2012-October/022319.html
> 
> Thanks,
> 
> Ben.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] datapath: Add Upstream id for GRE type.

2012-10-29 Thread Pravin B Shelar
v1-v2:
- Dropped userspace changes.
--8<--cut here-->8--

Once GRE is upstream it will have new type to have continuous sequence
of ids for vport type.  Following patch adds this ID to have
compatibility with it.

Signed-off-by: Pravin B Shelar 
---
 datapath/vport-gre.c|   23 +++
 datapath/vport.c|1 +
 datapath/vport.h|1 +
 include/linux/openvswitch.h |1 +
 4 files changed, 26 insertions(+)

diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index d02d4ec..e810173 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -488,6 +488,11 @@ static struct vport *gre_create(const struct vport_parms 
*parms)
return ovs_tnl_create(parms, &ovs_gre_vport_ops, &gre_tnl_ops);
 }
 
+static struct vport *gre_create_ft(const struct vport_parms *parms)
+{
+   return ovs_tnl_create(parms, &ovs_gre_ft_vport_ops, &gre_tnl_ops);
+}
+
 static const struct tnl_ops gre64_tnl_ops = {
.tunnel_type= TNL_T_PROTO_GRE64,
.ipproto= IPPROTO_GRE,
@@ -536,6 +541,24 @@ static void gre_exit(void)
inet_del_protocol(&gre_protocol_handlers, IPPROTO_GRE);
 }
 
+const struct vport_ops ovs_gre_ft_vport_ops = {
+   .type   = OVS_VPORT_TYPE_FT_GRE,
+   .flags  = VPORT_F_TUN_ID,
+   .init   = gre_init,
+   .exit   = gre_exit,
+   .create = gre_create_ft,
+   .destroy= ovs_tnl_destroy,
+   .set_addr   = ovs_tnl_set_addr,
+   .get_name   = ovs_tnl_get_name,
+   .get_addr   = ovs_tnl_get_addr,
+   .get_options= ovs_tnl_get_options,
+   .set_options= ovs_tnl_set_options,
+   .get_dev_flags  = ovs_vport_gen_get_dev_flags,
+   .is_running = ovs_vport_gen_is_running,
+   .get_operstate  = ovs_vport_gen_get_operstate,
+   .send   = ovs_tnl_send,
+};
+
 const struct vport_ops ovs_gre_vport_ops = {
.type   = OVS_VPORT_TYPE_GRE,
.flags  = VPORT_F_TUN_ID,
diff --git a/datapath/vport.c b/datapath/vport.c
index d9c8cfd..bb2bd27 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -41,6 +41,7 @@ static const struct vport_ops *base_vport_ops_list[] = {
&ovs_internal_vport_ops,
&ovs_patch_vport_ops,
&ovs_gre_vport_ops,
+   &ovs_gre_ft_vport_ops,
&ovs_gre64_vport_ops,
 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
&ovs_capwap_vport_ops,
diff --git a/datapath/vport.h b/datapath/vport.h
index 61d5274..5a7caf5 100644
--- a/datapath/vport.h
+++ b/datapath/vport.h
@@ -254,6 +254,7 @@ extern const struct vport_ops ovs_netdev_vport_ops;
 extern const struct vport_ops ovs_internal_vport_ops;
 extern const struct vport_ops ovs_patch_vport_ops;
 extern const struct vport_ops ovs_gre_vport_ops;
+extern const struct vport_ops ovs_gre_ft_vport_ops;
 extern const struct vport_ops ovs_gre64_vport_ops;
 extern const struct vport_ops ovs_capwap_vport_ops;
 
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index 89feb61..e761357 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -182,6 +182,7 @@ enum ovs_vport_type {
OVS_VPORT_TYPE_UNSPEC,
OVS_VPORT_TYPE_NETDEV,   /* network device */
OVS_VPORT_TYPE_INTERNAL, /* network device implemented by datapath */
+   OVS_VPORT_TYPE_FT_GRE,   /* Flow based GRE tunnel. */
OVS_VPORT_TYPE_PATCH = 100, /* virtual tunnel connecting two vports */
OVS_VPORT_TYPE_GRE,  /* GRE tunnel */
OVS_VPORT_TYPE_CAPWAP,   /* CAPWAP tunnel */
-- 
1.7.10

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] datapath: Fix sparse warning for symbol 'BUILD_BUG_ON_NOT_POWER_OF_2'

2012-10-29 Thread Pravin B Shelar
v1-v2:
- Moved BUILD_BUG_ON_NOT_POWER_OF_2 symbol definition for bug.h
--8<--cut here-->8--

BUILD_BUG_ON_NOT_POWER_OF_2 symbol is moved from kernel.h to
bug.h in 3.4.  Therefore sparse is giving warning:

include/linux/bug.h:15:9: warning: preprocessor token
BUILD_BUG_ON_NOT_POWER_OF_2 redefined
ovs/datapath/linux/compat/include/linux/kernel.h:44:9:
this was the original definition

Signed-off-by: Pravin B Shelar 
---
 datapath/linux/compat/include/linux/bug.h|   12 
 datapath/linux/compat/include/linux/kernel.h |   10 --
 2 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 datapath/linux/compat/include/linux/bug.h

diff --git a/datapath/linux/compat/include/linux/bug.h 
b/datapath/linux/compat/include/linux/bug.h
new file mode 100644
index 000..d24e68e
--- /dev/null
+++ b/datapath/linux/compat/include/linux/bug.h
@@ -0,0 +1,12 @@
+#ifndef __BUG_H_WRAPPER
+#define __BUG_H_WRAPPER 1
+
+#include_next 
+
+#ifndef BUILD_BUG_ON_NOT_POWER_OF_2
+/* Force a compilation error if a constant expression is not a power of 2 */
+#define BUILD_BUG_ON_NOT_POWER_OF_2(n) \
+   BUILD_BUG_ON((n) == 0 || (((n) & ((n) - 1)) != 0))
+#endif
+
+#endif
diff --git a/datapath/linux/compat/include/linux/kernel.h 
b/datapath/linux/compat/include/linux/kernel.h
index 812f213..069839b 100644
--- a/datapath/linux/compat/include/linux/kernel.h
+++ b/datapath/linux/compat/include/linux/kernel.h
@@ -7,7 +7,11 @@
 #endif
 
 #include 
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,4,0)
+/* BUILD_BUG_ON_NOT_POWER_OF_2 definition */
 #include 
+#endif
+
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
 #undef pr_emerg
 #define pr_emerg(fmt, ...) \
@@ -39,12 +43,6 @@
 #define pr_warn pr_warning
 #endif
 
-#ifndef BUILD_BUG_ON_NOT_POWER_OF_2
-/* Force a compilation error if a constant expression is not a power of 2 */
-#define BUILD_BUG_ON_NOT_POWER_OF_2(n) \
-   BUILD_BUG_ON((n) == 0 || (((n) & ((n) - 1)) != 0))
-#endif
-
 #if defined(CONFIG_PREEMPT) && LINUX_VERSION_CODE < KERNEL_VERSION(2,6,21)
 #error "CONFIG_PREEMPT is broken before 2.6.21--see commit 4498121ca3, 
\"[NET]: Handle disabled preemption in gfp_any()\""
 #endif
-- 
1.7.10

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] datapath: Deprecate CAPWAP support.

2012-10-29 Thread Pravin B Shelar
v1-v2:
- Added userspace warning.
- Added a NEWS item.
--8<--cut here-->8--

The CAPWAP implementation is just the encapsulation format and
therefore really not the full protocol.  While there were some
uses of it (primarily hardware support and UDP transport).  But
these are most likely better provided by VXLAN.  As a result,
CAPWAP will be removed no earlier than February 2013.

Signed-off-by: Pravin B Shelar 
---
 NEWS|1 +
 datapath/vport-capwap.c |1 +
 lib/netdev-vport.c  |4 
 vswitchd/vswitch.xml|3 +++
 4 files changed, 9 insertions(+)

diff --git a/NEWS b/NEWS
index fa0a249..99f7c4b 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,7 @@ v1.9.0 - xx xxx 
 - The autopath action.
 - Interface type "null".
 - Numeric values for reserved ports (see "ovs-ofctl" note above).
+- CAPWAP tunnel support.
 
 
 v1.8.0 - xx xxx 
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index 39aec42..518d5cc 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -473,6 +473,7 @@ static struct vport *capwap_create(const struct vport_parms 
*parms)
struct vport *vport;
int err;
 
+   pr_warn("%s: CAPWAP support is deprecated\n", __func__);
err = init_socket(ovs_dp_get_net(parms->dp));
if (err)
return ERR_PTR(err);
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 4168959..be6dd91 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -590,6 +590,10 @@ parse_tunnel_config(const char *name, const char *type,
 ovs_be32 saddr = htonl(0);
 uint32_t flags;
 
+if (!strcmp(type, "capwap")) {
+VLOG_WARN("CAPWAP tunnel support is deprecated.");
+}
+
 flags = TNL_F_DF_DEFAULT | TNL_F_HDR_CACHE;
 if (!strcmp(type, "gre") || !strcmp(type, "gre64")) {
 is_gre = true;
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 0bc4ccd..9510db8 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1216,6 +1216,9 @@
 
   capwap
   
+As of oct 2012, CAPWAP support is deprecated. will be removed no
+earlier than February 2013.
+
 An Ethernet tunnel over the UDP transport portion of CAPWAP (RFC
 5415).  This allows interoperability with certain switches that do
 not support GRE.  Only the tunneling component of the protocol is
-- 
1.7.10

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] datapath: enable encap for capwap.

2012-10-29 Thread Pravin B Shelar
v1-v2:
- enable encap before incrementing count.
--8<--cut here-->8--

kernel 3.5 added a switch to turn on UDP encap, capwap needs
to enable it.

Signed-off-by: Pravin B Shelar 
---
 datapath/linux/compat/include/linux/udp.h |5 +
 datapath/vport-capwap.c   |2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/datapath/linux/compat/include/linux/udp.h 
b/datapath/linux/compat/include/linux/udp.h
index 6fe4721..6a805b5 100644
--- a/datapath/linux/compat/include/linux/udp.h
+++ b/datapath/linux/compat/include/linux/udp.h
@@ -10,4 +10,9 @@ static inline struct udphdr *udp_hdr(const struct sk_buff 
*skb)
 }
 #endif /* HAVE_SKBUFF_HEADER_HELPERS */
 
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,5,0)
+static inline void udp_encap_enable(void)
+{
+}
+#endif
 #endif
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index 39aec42..4061c7b 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -445,7 +445,7 @@ static int init_socket(struct net *net)
capwap_net->frag_state.low_thresh   = CAPWAP_FRAG_PRUNE_MEM;
 
inet_frags_init_net(&capwap_net->frag_state);
-
+   udp_encap_enable();
capwap_net->n_tunnels++;
return 0;
 
-- 
1.7.10

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3] datapath: Remove tunnel header caching.

2012-10-29 Thread Pravin B Shelar
v2-v3:
- Fixed according to comments from Jesse.
- Dropped flow-refcount related changes, Jesse will
  post seperate patch for same.
v1-v2:
- Fixed capwap fragment case.
- simplified tnl_send.
--8<--cut here-->8--

Tunnel caching was added to reduce CPU utilization on TX path
by caching packet header, So performance gain is directly proportional
to number of skbs transferred.  But with help of offloads skb are getting
larger. So there are less number of skbs.  Therefore header caching does
not shows similar gains we seen in past.  And now kernel 3.6 has removed
dst caching from networking which makes header caching even more tricky.
So this commit removes header caching from OVS tunnelling.

Signed-off-by: Pravin B Shelar 
---
 NEWS |1 +
 datapath/tunnel.c|  465 +++---
 datapath/tunnel.h|  112 +-
 datapath/vport-capwap.c  |   40 +---
 datapath/vport-gre.c |   60 ++
 include/openvswitch/tunnel.h |1 -
 lib/netdev-vport.c   |   11 +-
 vswitchd/vswitch.xml |   10 -
 8 files changed, 62 insertions(+), 638 deletions(-)

diff --git a/NEWS b/NEWS
index 99f7c4b..7a1fa93 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
 post-v1.9.0
 
+- Tunnel header caching removed.
 
 
 v1.9.0 - xx xxx 
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index bfb09ce..a566ea5 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -52,43 +52,9 @@
 #include "vport-generic.h"
 #include "vport-internal_dev.h"
 
-#ifdef NEED_CACHE_TIMEOUT
-/*
- * On kernels where we can't quickly detect changes in the rest of the system
- * we use an expiration time to invalidate the cache.  A shorter expiration
- * reduces the length of time that we may potentially blackhole packets while
- * a longer time increases performance by reducing the frequency that the
- * cache needs to be rebuilt.  A variety of factors may cause the cache to be
- * invalidated before the expiration time but this is the maximum.  The time
- * is expressed in jiffies.
- */
-#define MAX_CACHE_EXP HZ
-#endif
-
-/*
- * Interval to check for and remove caches that are no longer valid.  Caches
- * are checked for validity before they are used for packet encapsulation and
- * old caches are removed at that time.  However, if no packets are sent 
through
- * the tunnel then the cache will never be destroyed.  Since it holds
- * references to a number of system objects, the cache will continue to use
- * system resources by not allowing those objects to be destroyed.  The cache
- * cleaner is periodically run to free invalid caches.  It does not
- * significantly affect system performance.  A lower interval will release
- * resources faster but will itself consume resources by requiring more 
frequent
- * checks.  A longer interval may result in messages being printed to the 
kernel
- * message buffer about unreleased resources.  The interval is expressed in
- * jiffies.
- */
-#define CACHE_CLEANER_INTERVAL (5 * HZ)
-
-#define CACHE_DATA_ALIGN 16
 #define PORT_TABLE_SIZE  1024
 
 static struct hlist_head *port_table __read_mostly;
-static int port_table_count;
-
-static void cache_cleaner(struct work_struct *work);
-static DECLARE_DELAYED_WORK(cache_cleaner_wq, cache_cleaner);
 
 /*
  * These are just used as an optimization: they don't require any kind of
@@ -109,60 +75,17 @@ static unsigned int multicast_ports __read_mostly;
 #define rt_dst(rt) (rt->u.dst)
 #endif
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,1,0)
-static struct hh_cache *rt_hh(struct rtable *rt)
-{
-   struct neighbour *neigh = dst_get_neighbour_noref(&rt->dst);
-   if (!neigh || !(neigh->nud_state & NUD_CONNECTED) ||
-   !neigh->hh.hh_len)
-   return NULL;
-   return &neigh->hh;
-}
-#else
-#define rt_hh(rt) (rt_dst(rt).hh)
-#endif
-
 static struct vport *tnl_vport_to_vport(const struct tnl_vport *tnl_vport)
 {
return vport_from_priv(tnl_vport);
 }
 
-/* This is analogous to rtnl_dereference for the tunnel cache.  It checks that
- * cache_lock is held, so it is only for update side code.
- */
-static struct tnl_cache *cache_dereference(struct tnl_vport *tnl_vport)
-{
-   return rcu_dereference_protected(tnl_vport->cache,
-lockdep_is_held(&tnl_vport->cache_lock));
-}
-
-static void schedule_cache_cleaner(void)
-{
-   schedule_delayed_work(&cache_cleaner_wq, CACHE_CLEANER_INTERVAL);
-}
-
-static void free_cache(struct tnl_cache *cache)
-{
-   if (!cache)
-   return;
-
-   ovs_flow_put(cache->flow);
-   ip_rt_put(cache->rt);
-   kfree(cache);
-}
-
 static void free_config_rcu(struct rcu_head *rcu)
 {
struct tnl_mutable_config *c = container_of(rcu, struct 
tnl_mutable_config, rcu);
kfree(c);
 }
 
-static void free_cache_rcu(struct rcu_head 

[ovs-dev] [PATCH v2] datapath: Add support for 3.6 kernel.

2012-10-29 Thread Pravin B Shelar
Signed-off-by: Pravin B Shelar 
---
 datapath/datapath.c |4 ++--
 datapath/tunnel.c   |   33 +++--
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 65f4dc8..9c253e1 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -61,8 +61,8 @@
 #include "vport-internal_dev.h"
 
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18) || \
-LINUX_VERSION_CODE >= KERNEL_VERSION(3,6,0)
-#error Kernels before 2.6.18 or after 3.5 are not supported by this version of 
Open vSwitch.
+LINUX_VERSION_CODE >= KERNEL_VERSION(3,7,0)
+#error Kernels before 2.6.18 or after 3.6 are not supported by this version of 
Open vSwitch.
 #endif
 
 #define REHASH_FLOW_INTERVAL (10 * 60 * HZ)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index a566ea5..efc8b1c 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -701,32 +701,34 @@ static bool check_mtu(struct sk_buff *skb,
return true;
 }
 
-static struct rtable *find_route(const struct tnl_mutable_config *mutable,
-  __be32 saddr, __be32 daddr, u8 ipproto,
-  u8 tos)
+static struct rtable *find_route(struct net *net,
+   __be32 *saddr, __be32 daddr, u8 ipproto,
+   u8 tos)
 {
+   struct rtable *rt;
/* Tunnel configuration keeps DSCP part of TOS bits, But Linux
 * router expect RT_TOS bits only. */
 
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,39)
struct flowi fl = { .nl_u = { .ip4_u = {
.daddr = daddr,
-   .saddr = saddr,
+   .saddr = *saddr,
.tos   = RT_TOS(tos) } },
.proto = ipproto };
-   struct rtable *rt;
 
-   if (unlikely(ip_route_output_key(port_key_get_net(&mutable->key), &rt, 
&fl)))
+   if (unlikely(ip_route_output_key(net, &rt, &fl)))
return ERR_PTR(-EADDRNOTAVAIL);
-
+   *saddr = fl.nl_u.ip4_u.saddr;
return rt;
 #else
struct flowi4 fl = { .daddr = daddr,
-.saddr = saddr,
+.saddr = *saddr,
 .flowi4_tos = RT_TOS(tos),
 .flowi4_proto = ipproto };
 
-   return ip_route_output_key(port_key_get_net(&mutable->key), &fl);
+   rt = ip_route_output_key(net, &fl);
+   *saddr = fl.saddr;
+   return rt;
 #endif
 }
 
@@ -946,7 +948,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
}
 
/* Route lookup */
-   rt = find_route(mutable, saddr, daddr, tnl_vport->tnl_ops->ipproto, 
tos);
+   rt = find_route(port_key_get_net(&mutable->key), &saddr, daddr,
+ tnl_vport->tnl_ops->ipproto, tos);
if (IS_ERR(rt))
goto error_free;
 
@@ -999,8 +1002,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
iph->version= 4;
iph->ihl= sizeof(struct iphdr) >> 2;
iph->protocol   = tnl_vport->tnl_ops->ipproto;
-   iph->daddr  = rt->rt_dst;
-   iph->saddr  = rt->rt_src;
+   iph->daddr  = daddr;
+   iph->saddr  = saddr;
iph->tos= tos;
iph->ttl= ttl;
iph->frag_off   = frag_off;
@@ -1101,9 +1104,11 @@ static int tnl_set_config(struct net *net, struct nlattr 
*options,
if (ipv4_is_multicast(mutable->key.daddr)) {
struct net_device *dev;
struct rtable *rt;
+   __be32 saddr = mutable->key.saddr;
 
-   rt = find_route(mutable, mutable->key.saddr, mutable->key.daddr,
- tnl_ops->ipproto, mutable->tos);
+   rt = find_route(port_key_get_net(&mutable->key),
+&saddr, mutable->key.daddr,
+tnl_ops->ipproto, mutable->tos);
if (IS_ERR(rt))
return -EADDRNOTAVAIL;
dev = rt_dst(rt).dev;
-- 
1.7.10

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 05/15] Allow the OpenFlow port to be requested for a port.

2012-10-29 Thread Justin Pettit
On Oct 18, 2012, at 9:30 PM, Ben Pfaff  wrote:

> The new comment on the ->port_add member function of ofproto_class
> implies that the caller might pass in a NULL 'ofp_portp'.  I don't think
> that the implementation should have to bother with that possibility,
> because ofproto_port_add(), by design, always passes a non-null value,
> so I'd revise the comment.

Updated.

> struct if_cfg, in bridge.c, now ends with a spurious blank line.

Whoops.

> It looks like the signed 64-bit value from the database gets truncated
> down to an unsigned 16-bit value in iface_do_create().  Actually in
> iface_create() too, in the initialization of 'ofp_port'.  I'd check that
> it's in the valid range [1,OFPP_MAX) and use OFPP_NONE if not.

As you suggested later, I defined constraints in the database definition.  Is 
that sufficient?

> In iface_create(), the variable name 'result' is fairly vague, maybe
> 'ok' instead?

'ok'

> In vswitch.xml, I'm not sure that it makes sense to say that the port
> number can be greater than 0xff00.  I think that OpenFlow reserves
> those.  I guess I'd be inclined to say something more like:
> 
>Requested OpenFlow port number for this interface.  The port
>number must be between 1 and 65279, inclusive.  Some datapaths
>cannot satisfy all requests for particular port numbers.  When
>this column is empty or the request cannot be fulfilled, the
>system will choose a free port.  The 
>column reports + the assigned OpenFlow port number.

Sounds good.

> One could limit the allowed values at the database level, in the schema
> itself.

I didn't originally do this, since the limit varies by datapath implementation. 
 However, as you noted, there are real limits defined by OpenFlow, so I went 
ahead and put them in there.

> I think that this implementation generally considers the ofport
> request_value only if it is set in the same transaction that creates the
> port.  That is, changing ofport_request, or setting it when it was
> initially unset, will not change the OpenFlow port number.  Perhaps this
> should be noted in vswitch.xml too.


Done.

--Justin





___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 04/15] Use ODP ports in dpif layer and below.

2012-10-29 Thread Justin Pettit
On Oct 22, 2012, at 3:24 PM, Ben Pfaff  wrote:

> On Thu, Oct 18, 2012 at 12:51:49PM -0700, Justin Pettit wrote:
>> @@ -89,7 +91,7 @@ struct flow {
>> uint8_t arp_tha[6]; /* ARP/ND target hardware address. */
>> uint8_t nw_ttl; /* IP TTL/Hop Limit. */
>> uint8_t nw_frag;/* FLOW_FRAG_* flags. */
>> -uint8_t zeros[2];   /* Must be zero. */
>> +uint8_t zeros[0];   /* Must be zero. */
> 
> I'd get rid of this zeros[] member entirely.  It looks weird and it
> causes various "sparse" warnings:
> 
>../lib/flow.c:595:11: warning: memset with byte count of 0
>../tests/test-bundle.c:139:15: warning: memset with byte count of 0
>../tests/test-multipath.c:63:19: warning: memset with byte count of 0

Okay.  Done.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 06/15] ofproto: Add initialization function.

2012-10-29 Thread Justin Pettit
On Oct 19, 2012, at 10:53 AM, Ben Pfaff  wrote:

> Hmm, putting "return;" in an otherwise empty function isn't our usual
> practice:
> 
>> static void
>> +init(const struct shash *iface_hints OVS_UNUSED)
>> +{
>> +return;
>> +}

Removed

>> @@ -319,6 +319,15 @@ struct ofproto_class {
>> /* ## - ## */
>> /* ## Factory Functions ## */
>> /* ## - ## */
> 
> It'd be nice to have a blank line here.

Okey-dokey.

>> +/* Iinitializes provider.  The caller may pass in 'iface_hints',
> 
> s/Iinitializes/Initializes/.

Fixed.

>> + * which contains an shash of "iface_hint" elements indexed by the
> 
> Would you mind saying "struct iface_hint" elements?  It took me a minute
> to understand that was what you meant.

Sure.

>> + * interface's name.  The provider may use these hints to describe
>> + * the startup configuration in order to reinitialize its state.
>> + * The caller owns the provided data, so a provider must make copies
>> + * of anything required.  An ofproto provider must remove any
>> + * existing state that is not described by the hint, and may choose
>> + * to remove it all. */
>> +void (*init)(const struct shash *iface_hints);
> 
> Otherwise this looks good, thank you.

Thanks!

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 07/15] tests: Define new ADD_OF_PORTS macro for ofproto tests.

2012-10-29 Thread Justin Pettit

On Oct 19, 2012, at 11:05 AM, Ben Pfaff  wrote:

> On Thu, Oct 18, 2012 at 12:51:52PM -0700, Justin Pettit wrote:
>> A future commit will break the relation between OpenFlow and datapath
>> port numbers.  The new ADD_OF_PORTS macro adds an interface to a bridge
>> and configures it such that both the OpenFlow and datapath port numbers
>> are the same.  This is important in tests that deal with both OpenFlow
>> and datapath port numbers (e.g., ones that use ofproto/trace).
>> 
>> Signed-off-by: Justin Pettit 
> 
> This looks good, but the following definition of ADD_OF_PORTS should
> be faster because it will run ovs-vsctl only once regardless of the
> number of ports to add:
> 
> m4_define([ADD_OF_PORTS],
>  [ovs-vsctl m4_foreach([of_port], m4_cdr($@),
>[ \
> -- add-port $1 p[]of_port -- set Interface p[]of_port type=dummy 
> ofport_request=of_port])])

Awesome!  Much better.  Thank you.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: add ipv6 'set' action

2012-10-29 Thread Jesse Gross
On Sun, Oct 28, 2012 at 2:21 PM, Ansis Atteka  wrote:
> This patch adds ipv6 set action functionality. It allows to change
> traffic class, flow label, hop-limit, ipv6 source and destination
> address fields.
>
> Signed-off-by: Ansis Atteka 

Can you add an item to NEWS?

I got a few sparse errors when I applied this:

  CHECK   /home/jgross/openvswitch/datapath/linux/actions.c
/home/jgross/openvswitch/datapath/linux/actions.c:270:12: warning:
incorrect type in assignment (different base types)
/home/jgross/openvswitch/datapath/linux/actions.c:270:12:expected
restricted __be32 [usertype] fl
/home/jgross/openvswitch/datapath/linux/actions.c:270:12:got int
/home/jgross/openvswitch/datapath/linux/actions.c:273:41: warning:
incorrect type in argument 2 (different base types)
/home/jgross/openvswitch/datapath/linux/actions.c:273:41:expected
unsigned int [unsigned] [usertype] fl
/home/jgross/openvswitch/datapath/linux/actions.c:273:41:got
restricted __be32 const [usertype] ipv6_label

These look like real problems to me on certain architectures because
the constants that you are using for bit manipulations will have
different results depending on endianness.

I think we also need to do some validation of these actions in
datapath.c similar to what we do currently for IPv4 set.  I think
right now these actions just get rejected immediately because they
aren't expected by the validation code.

We could implement the equivalent actions in the userspace datapath as
well.  There aren't any complex dependencies here so it would be ideal
to maintain parity.

> diff --git a/datapath/actions.c b/datapath/actions.c
> index ec9b595..7e60b0f 100644
> --- a/datapath/actions.c
> +++ b/datapath/actions.c
> +static void set_ipv6_addr(struct sk_buff *skb, struct ipv6hdr *nh,
> + __be32 addr[4], const __be32 new_addr[4])
> +{
> +   int transport_len = skb->len - skb_transport_offset(skb);
> +
> +   if (nh->nexthdr == IPPROTO_TCP) {
> +   if (likely(transport_len >= sizeof(struct tcphdr)))
> +   inet_proto_csum_replace16(&tcp_hdr(skb)->check, skb,
> + addr, new_addr, 1);
> +   } else if (nh->nexthdr == IPPROTO_UDP) {
> +   if (likely(transport_len >= sizeof(struct udphdr))) {
> +   struct udphdr *uh = udp_hdr(skb);
> +
> +   if (uh->check ||
> +   get_ip_summed(skb) == OVS_CSUM_PARTIAL) {
> +   inet_proto_csum_replace16(&uh->check, skb,
> + addr, new_addr, 1);
> +   if (!uh->check)
> +   uh->check = CSUM_MANGLED_0;
> +   }
> +   }
> +   }

IPv6 has an extra little twist when it comes to checksums compared to
IPv4: routing headers.  In an IPv6 packet, the destination IP address
used for the purposes of checksum calculation in the pseudo header is
the final destination address.  This means that if you have a routing
header then it is that address that matters and not one in the IP
header.  This is likely to be a pretty rare case but we should handle
it.

Also, due to the presence of extension headers in general, nh->nexthdr
might not point to the L4 protocol.  We should use the extracted L4
protocol in the flow instead.

> +static void set_ipv6_tc(struct ipv6hdr *nh, __u8 tc)

In general, I would prefer the versions of the host byte order types
without underscores (i.e. u8, u32, etc.) for components entirely
within the kernel.

> +{
> +   nh->priority = (nh->priority & 0xf0) | (tc >> 4);

nh->priority is only 4 bits, so you shouldn't have to do the mask.

> +   nh->flow_lbl[0] = (nh->flow_lbl[0] & 0x0f) | ((tc & 0x0f) << 4);
> +}
> +
> +static void set_ipv6_fl(struct ipv6hdr *nh, __u32 fl)
> +{
> +   nh->flow_lbl[0] = (nh->flow_lbl[0] & 0xf0) | ((fl & 0x0f) >> 16);
> +   nh->flow_lbl[1] = (fl & 0x00ff00) >> 8;
> +   nh->flow_lbl[2] = fl & 0xff;

I would make all of these hex constants the full 4 bytes.  Whenever I
see only 3 being used with a 4 byte value I think that I've
miscounted.

> +static int set_ipv6(struct sk_buff *skb, const struct ovs_key_ipv6 *ipv6_key)
> +{
> +   struct ipv6hdr *nh;
> +   int err;
> +   __u8 tc;
> +   __be32 fl;

Based on your usage below, I think that fl is supposed to be in host byte order.

> +   __be32 *saddr;
> +   __be32 *daddr;
> +
> +   err = make_writable(skb, skb_network_offset(skb) +
> +sizeof(struct ipv6hdr));
> +   if (unlikely(err))
> +   return err;
> +
> +   nh = ipv6_hdr(skb);
> +   saddr = (__be32 *)&nh->saddr;
> +   daddr = (__be32 *)&nh->daddr;
> +
> +   if (memcmp(ipv6_key->ipv6_src, saddr, sizeof(ipv6_key->ipv6_src)))
> +   set_ipv6_addr(skb, nh, saddr, ipv6_key->ipv6

Re: [ovs-dev] [tests 20/22] ofproto-dpif: Fix "ofproto/trace" of flow for local port.

2012-10-29 Thread Ben Pfaff
Thanks, I pushed it to master.

On Mon, Oct 29, 2012 at 02:20:52PM -0700, Justin Pettit wrote:
> Acked-by: Justin Pettit 
> 
> Thanks,
> 
> --Justin
> 
> 
> On Oct 25, 2012, at 5:02 PM, Ben Pfaff  wrote:
> 
> > The 'in_port' variable is an OpenFlow port, not an ODP port, so we should
> > not translate it to ODP.
> > 
> > Signed-off-by: Ben Pfaff 
> > ---
> > ofproto/ofproto-dpif.c |2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> > index 3e1b7b3..7d914c8 100644
> > --- a/ofproto/ofproto-dpif.c
> > +++ b/ofproto/ofproto-dpif.c
> > @@ -6811,7 +6811,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int 
> > argc, const char *argv[],
> > const char *tun_id_s = argv[3];
> > const char *in_port_s = argv[4];
> > const char *packet_s = argv[5];
> > -uint16_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
> > +uint16_t in_port = atoi(in_port_s);
> > ovs_be64 tun_id = htonll(strtoull(tun_id_s, NULL, 0));
> > uint32_t priority = atoi(priority_s);
> > const char *msg;
> > -- 
> > 1.7.2.5
> > 
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 04/15] Use ODP ports in dpif layer and below.

2012-10-29 Thread Ben Pfaff
Thanks for the review, I pushed the patch.

On Mon, Oct 29, 2012 at 02:21:52PM -0700, Justin Pettit wrote:
> Huh, somehow I missed it in that 22 patch series.  I just reviewed
> that one patch.
> 
> Thanks,
> 
> --Justin
> 
> 
> On Oct 29, 2012, at 2:14 PM, Ben Pfaff  wrote:
> 
> > On Mon, Oct 29, 2012 at 02:11:51PM -0700, Justin Pettit wrote:
> >> On Thu, Oct 18, 2012 at 4:42 PM, Ben Pfaff  wrote:
> >>> In ofproto_unixctl_trace(), it looks very much to me like the conversion
> >>> to an ODP port in the argc == 6 case should not be happening, that is,
> >>> that the ofp_port_to_odp_port call should be omitted here:
> >>> uint32_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
> >>> 
> >> 
> >> Fixed.  Isn't this a bug in the existing code?  Should I do a separate
> >> patch to fix this in existing branches?
> > 
> > I've had a fix for this on a local branch forever and I finally sent
> > it out on Friday, so you can just review it here:
> >http://openvswitch.org/pipermail/dev/2012-October/022319.html
> > 
> > Thanks,
> > 
> > Ben.
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [Single DP 05/15] Allow the OpenFlow port to be requested for a port.

2012-10-29 Thread Ben Pfaff
On Mon, Oct 29, 2012 at 02:59:46PM -0700, Justin Pettit wrote:
> > It looks like the signed 64-bit value from the database gets truncated
> > down to an unsigned 16-bit value in iface_do_create().  Actually in
> > iface_create() too, in the initialization of 'ofp_port'.  I'd check that
> > it's in the valid range [1,OFPP_MAX) and use OFPP_NONE if not.
> 
> As you suggested later, I defined constraints in the database
> definition.  Is that sufficient?

I think so, thanks.

Everything else sounds good.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/8] nx-match: Do not check pre-requisites for load actions

2012-10-29 Thread Simon Horman
On Mon, Oct 29, 2012 at 09:58:58AM -0700, Ben Pfaff wrote:
> On Sat, Oct 27, 2012 at 03:05:57PM +0900, Simon Horman wrote:
> > There are (or at least will be) cases where this check can produce false
> > positives.  For example, a flow which matches a non-MPLS packet and then
> > applies an MPLS push action followed by an action to load the MPLS label.
> 
> So, I remember the discussion, but I don't remember coming to exactly
> this conclusion.
> 
> The problem seems to be, at this point, MPLS specific, because for
> most prerequisites, whether it is satisfied cannot be affected by
> actions.  For example, no action can transform a TCP packet into a
> non-TCP packet, and vice versa.
> 
> My thought is, then, for now we should define a new function to test
> whether a prerequisite could ever be satisfied given a particular
> flow.  For all prerequisites other than MPLS, this would return the
> same thing as mf_are_prereqs_ok(); for MPLS, it could just return
> true.

I think that I am comfortable with that approach, I will implement it
and see how things go.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] User-Space MPLS actions and matches

2012-10-29 Thread Simon Horman
On Mon, Oct 29, 2012 at 11:28:29AM -0700, Ben Pfaff wrote:
> On Sat, Oct 27, 2012 at 03:05:58PM +0900, Simon Horman wrote:
> > This patch implements use-space datapath and non-datapath code
> > to match and use the datapath API set out in Leo Alterman's patch
> > "user-space datapath: Add basic MPLS support to kernel".
> > 
> > The resulting MPLS implementation supports:
> > * Pushing a single MPLS label
> > * Poping a single MPLS label
> > * Modifying an MPLS lable using set-field or load actions
> >   that act on the label value, tc and bos bit.
> > * There is no support for manipulating the TTL
> >   this is considered future work.
> > 
> > The single-level push pop limitation is implemented by processing
> > push, pop and set-field/load actions in order and discarding information
> > that would require multiple levels of push/pop to be supported.
> > 
> > e.g.
> >push,push -> the first push is discarded
> >pop,pop -> the first pop is discarded
> > 
> > This patch is based heavily on work by Ravi K.
> > 
> > Cc: Isaku Yamahata 
> > Cc: Ravi K 
> > Signed-off-by: Simon Horman 
> 
> Why does this modify datapath/datapath.c and datapath/flow.h?

Sorry, they are some improvements suggested by Yamahata-san
which I should have rolled into an earlier patch in the series.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] User-Space MPLS actions and matches

2012-10-29 Thread Simon Horman
On Mon, Oct 29, 2012 at 11:52:23PM +0900, Isaku Yamahata wrote:
> I needed the following patch to pass unit tests.

Thanks, I have also noticed this. I believe that the need for this
patch relates to the presence of the series "Run-Time Open Flow Version
Configuration" and that the patch below is not required when using the
MPLS series on top of master.

> >From 55e62f64c984accf578a3ae8e1b315b7752c7ebe Mon Sep 17 00:00:00 2001
> Message-Id: 
> <55e62f64c984accf578a3ae8e1b315b7752c7ebe.1351522305.git.yamah...@valinux.co.jp>
> From: Isaku Yamahata 
> Date: Mon, 29 Oct 2012 23:43:25 +0900
> Subject: [PATCH] unittest/mpls: OF12 openflow dump-tables
> 
> fb1d6089402a4d108133bd39820bd0b345da0e9a adds OFPXMT12_OFB_MPLS_BOS so that
> OFPXMT12_OFB_MAX is increased by one.
> Thus OF12 dump-tables needs update.
> 
> Signed-off-by: Isaku Yamahata 
> ---
>  tests/ofproto.at |   12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/ofproto.at b/tests/ofproto.at
> index 3865fc5..95c2fa8 100644
> --- a/tests/ofproto.at
> +++ b/tests/ofproto.at
> @@ -575,13 +575,13 @@ AT_CLEANUP
>  AT_SETUP([ofproto - flow table configuration (OpenFlow 1.2)])
>  OVS_VSWITCHD_START
>  # Check the default configuration.
> -(mid="wild=0xf, max=100,"
> +(mid="wild=0x1f, max=100,"
>   tail="
> lookup=0, matched=0
> -   match=0xf, instructions=0x0007, config=0x0003
> +   match=0x1f, instructions=0x0007, config=0x0003
> write_actions=0x, apply_actions=0x
> -   write_setfields=0x000f
> -   apply_setfields=0x000f
> +   write_setfields=0x001f
> +   apply_setfields=0x001f
> metadata_match=0x
> metadata_write=0x"
>   echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
> @@ -607,9 +607,9 @@ AT_CHECK(
>  # Check that the configuration was updated.
>  mv expout orig-expout
>  (echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
> -  0: main: wild=0xf, max=100, active=0"
> +  0: main: wild=0x1f, max=100, active=0"
>   tail -n +3 orig-expout | head -7
> - echo "  1: table1  : wild=0xf, max=  1024, active=0"
> + echo "  1: table1  : wild=0x1f, max=  1024, active=0"
>   tail -n +11 orig-expout) > expout
>  AT_CHECK([ovs-ofctl --allowed-ofp-versions OpenFlow12 dump-tables br0], [0], 
> [expout])
>  OVS_VSWITCHD_STOP
> -- 
> 1.7.10.4
> -- 
> yamahata
> 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/8] User-Space MPLS actions and matches

2012-10-29 Thread Ben Pfaff
Can you post a revised version of at least this patch?  Thanks.

On Tue, Oct 30, 2012 at 10:12:28AM +0900, Simon Horman wrote:
> On Mon, Oct 29, 2012 at 11:52:23PM +0900, Isaku Yamahata wrote:
> > I needed the following patch to pass unit tests.
> 
> Thanks, I have also noticed this. I believe that the need for this
> patch relates to the presence of the series "Run-Time Open Flow Version
> Configuration" and that the patch below is not required when using the
> MPLS series on top of master.
> 
> > >From 55e62f64c984accf578a3ae8e1b315b7752c7ebe Mon Sep 17 00:00:00 2001
> > Message-Id: 
> > <55e62f64c984accf578a3ae8e1b315b7752c7ebe.1351522305.git.yamah...@valinux.co.jp>
> > From: Isaku Yamahata 
> > Date: Mon, 29 Oct 2012 23:43:25 +0900
> > Subject: [PATCH] unittest/mpls: OF12 openflow dump-tables
> > 
> > fb1d6089402a4d108133bd39820bd0b345da0e9a adds OFPXMT12_OFB_MPLS_BOS so that
> > OFPXMT12_OFB_MAX is increased by one.
> > Thus OF12 dump-tables needs update.
> > 
> > Signed-off-by: Isaku Yamahata 
> > ---
> >  tests/ofproto.at |   12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/tests/ofproto.at b/tests/ofproto.at
> > index 3865fc5..95c2fa8 100644
> > --- a/tests/ofproto.at
> > +++ b/tests/ofproto.at
> > @@ -575,13 +575,13 @@ AT_CLEANUP
> >  AT_SETUP([ofproto - flow table configuration (OpenFlow 1.2)])
> >  OVS_VSWITCHD_START
> >  # Check the default configuration.
> > -(mid="wild=0xf, max=100,"
> > +(mid="wild=0x1f, max=100,"
> >   tail="
> > lookup=0, matched=0
> > -   match=0xf, instructions=0x0007, 
> > config=0x0003
> > +   match=0x1f, instructions=0x0007, 
> > config=0x0003
> > write_actions=0x, apply_actions=0x
> > -   write_setfields=0x000f
> > -   apply_setfields=0x000f
> > +   write_setfields=0x001f
> > +   apply_setfields=0x001f
> > metadata_match=0x
> > metadata_write=0x"
> >   echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
> > @@ -607,9 +607,9 @@ AT_CHECK(
> >  # Check that the configuration was updated.
> >  mv expout orig-expout
> >  (echo "OFPST_TABLE reply (OF1.2) (xid=0x2): 255 tables
> > -  0: main: wild=0xf, max=100, active=0"
> > +  0: main: wild=0x1f, max=100, active=0"
> >   tail -n +3 orig-expout | head -7
> > - echo "  1: table1  : wild=0xf, max=  1024, active=0"
> > + echo "  1: table1  : wild=0x1f, max=  1024, active=0"
> >   tail -n +11 orig-expout) > expout
> >  AT_CHECK([ovs-ofctl --allowed-ofp-versions OpenFlow12 dump-tables br0], 
> > [0], [expout])
> >  OVS_VSWITCHD_STOP
> > -- 
> > 1.7.10.4
> > -- 
> > yamahata
> > 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: Add Upstream id for GRE type.

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 12:57 PM, Pravin Shelar  wrote:
> On Fri, Oct 26, 2012 at 11:32 AM, Jesse Gross  wrote:
>> Also, did you look at the userspace code to see whether it can handle
>> the transition by removing port types that it doesn't understand and
>> recreating them?
>
> Yes, I checked it and it works.

I know that you said it was able to delete the old ports but I thought
that there was an issue recreating them after that?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Fix sparse warning for symbol 'BUILD_BUG_ON_NOT_POWER_OF_2'

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
> v1-v2:
> - Moved BUILD_BUG_ON_NOT_POWER_OF_2 symbol definition for bug.h
> --8<--cut here-->8--
>
> BUILD_BUG_ON_NOT_POWER_OF_2 symbol is moved from kernel.h to
> bug.h in 3.4.  Therefore sparse is giving warning:
>
> include/linux/bug.h:15:9: warning: preprocessor token
> BUILD_BUG_ON_NOT_POWER_OF_2 redefined
> ovs/datapath/linux/compat/include/linux/kernel.h:44:9:
> this was the original definition
>
> Signed-off-by: Pravin B Shelar 

This needs to include bug.h in Modules.mk, otherwise it fails distcheck.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Deprecate CAPWAP support.

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
> diff --git a/NEWS b/NEWS
> index fa0a249..99f7c4b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -43,6 +43,7 @@ v1.9.0 - xx xxx 
>  - The autopath action.
>  - Interface type "null".
>  - Numeric values for reserved ports (see "ovs-ofctl" note above).
> +- CAPWAP tunnel support.

Since this is going into the 1.9 branch, we should add it to the
Debian changelog as well (which has already been copied over).

>  v1.8.0 - xx xxx 
> diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
> index 39aec42..518d5cc 100644
> --- a/datapath/vport-capwap.c
> +++ b/datapath/vport-capwap.c
> @@ -473,6 +473,7 @@ static struct vport *capwap_create(const struct 
> vport_parms *parms)
> struct vport *vport;
> int err;
>
> +   pr_warn("%s: CAPWAP support is deprecated\n", __func__);

I think we probably can skip the kernel warning altogether.

> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 4168959..be6dd91 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -590,6 +590,10 @@ parse_tunnel_config(const char *name, const char *type,
>  ovs_be32 saddr = htonl(0);
>  uint32_t flags;
>
> +if (!strcmp(type, "capwap")) {
> +VLOG_WARN("CAPWAP tunnel support is deprecated.");
> +}

I think WARN_ONCE is probably a little nicer to avoid filling up the logs.

> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 0bc4ccd..9510db8 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1216,6 +1216,9 @@
>
>capwap
>
> +As of oct 2012, CAPWAP support is deprecated. will be removed no
> +earlier than February 2013.

Can we clean up this sentence a little bit and move it to the end of
the CAPWAP description?  Something like:

CAPWAP support is deprecated and will be removed no earlier than February 2013.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: enable encap for capwap.

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
> v1-v2:
> - enable encap before incrementing count.
> --8<--cut here-->8--
>
> kernel 3.5 added a switch to turn on UDP encap, capwap needs
> to enable it.
>
> Signed-off-by: Pravin B Shelar 

Acked-by: Jesse Gross 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Add support for 3.6 kernel.

2012-10-29 Thread Jesse Gross
On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
> Signed-off-by: Pravin B Shelar 
> ---
>  datapath/datapath.c |4 ++--
>  datapath/tunnel.c   |   33 +++--
>  2 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/datapath/datapath.c b/datapath/datapath.c
> index 65f4dc8..9c253e1 100644
> --- a/datapath/datapath.c
> +++ b/datapath/datapath.c
> @@ -61,8 +61,8 @@
>  #include "vport-internal_dev.h"
>
>  #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18) || \
> -LINUX_VERSION_CODE >= KERNEL_VERSION(3,6,0)
> -#error Kernels before 2.6.18 or after 3.5 are not supported by this version 
> of Open vSwitch.
> +LINUX_VERSION_CODE >= KERNEL_VERSION(3,7,0)
> +#error Kernels before 2.6.18 or after 3.6 are not supported by this version 
> of Open vSwitch.
>  #endif
>
>  #define REHASH_FLOW_INTERVAL (10 * 60 * HZ)

Isn't this missing the Netlink PID renames in datapath.c or macros to
handle it?  Did I miss a patch somewhere?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Add support for 3.6 kernel.

2012-10-29 Thread Pravin Shelar
On Mon, Oct 29, 2012 at 7:21 PM, Jesse Gross  wrote:
> On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
>> Signed-off-by: Pravin B Shelar 
>> ---
>>  datapath/datapath.c |4 ++--
>>  datapath/tunnel.c   |   33 +++--
>>  2 files changed, 21 insertions(+), 16 deletions(-)
>>
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index 65f4dc8..9c253e1 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> @@ -61,8 +61,8 @@
>>  #include "vport-internal_dev.h"
>>
>>  #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,18) || \
>> -LINUX_VERSION_CODE >= KERNEL_VERSION(3,6,0)
>> -#error Kernels before 2.6.18 or after 3.5 are not supported by this version 
>> of Open vSwitch.
>> +LINUX_VERSION_CODE >= KERNEL_VERSION(3,7,0)
>> +#error Kernels before 2.6.18 or after 3.6 are not supported by this version 
>> of Open vSwitch.
>>  #endif
>>
>>  #define REHASH_FLOW_INTERVAL (10 * 60 * HZ)
>
> Isn't this missing the Netlink PID renames in datapath.c or macros to
> handle it?  Did I miss a patch somewhere?

I forgot to mention that those changes are in kernel 3.7, so I dropped those.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: Add Upstream id for GRE type.

2012-10-29 Thread Pravin Shelar
On Mon, Oct 29, 2012 at 6:56 PM, Jesse Gross  wrote:
> On Mon, Oct 29, 2012 at 12:57 PM, Pravin Shelar  wrote:
>> On Fri, Oct 26, 2012 at 11:32 AM, Jesse Gross  wrote:
>>> Also, did you look at the userspace code to see whether it can handle
>>> the transition by removing port types that it doesn't understand and
>>> recreating them?
>>
>> Yes, I checked it and it works.
>
> I know that you said it was able to delete the old ports but I thought
> that there was an issue recreating them after that?

Problem that I saw was related to my update script, after fixing it it
worked fine.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath: Fix sparse warning for symbol 'BUILD_BUG_ON_NOT_POWER_OF_2'

2012-10-29 Thread Pravin Shelar
On Mon, Oct 29, 2012 at 7:07 PM, Jesse Gross  wrote:
> On Mon, Oct 29, 2012 at 2:32 PM, Pravin B Shelar  wrote:
>> v1-v2:
>> - Moved BUILD_BUG_ON_NOT_POWER_OF_2 symbol definition for bug.h
>> --8<--cut here-->8--
>>
>> BUILD_BUG_ON_NOT_POWER_OF_2 symbol is moved from kernel.h to
>> bug.h in 3.4.  Therefore sparse is giving warning:
>>
>> include/linux/bug.h:15:9: warning: preprocessor token
>> BUILD_BUG_ON_NOT_POWER_OF_2 redefined
>> ovs/datapath/linux/compat/include/linux/kernel.h:44:9:
>> this was the original definition
>>
>> Signed-off-by: Pravin B Shelar 
>
> This needs to include bug.h in Modules.mk, otherwise it fails distcheck.

OK, I will post updated patch.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev