On Fri, Jan 6, 2012 at 6:27 PM, Jesse Gross <je...@nicira.com> wrote:
> On Thu, Jan 5, 2012 at 7:59 PM, Pravin B Shelar <pshe...@nicira.com> wrote:
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index 17871e4..76bf8f5 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> +static void __dp_destroy(struct datapath *dp)
>> +{
>
> Can you document the locking expectations for the function (i.e. genl mutex)?
ok.

>
>> +       struct vport *vport, *next_vport;
>>
>> +       list_for_each_entry_safe(vport, next_vport, &dp->port_list, node)
>> +               if (vport->port_no != OVSP_LOCAL)
>> +                       ovs_dp_detach_port(vport);
>
> I think that the above block is the only part of destruction that
> needs to hold RTNL.  Can we just take the lock there and drop it from
> callers?  Then we can also move call_rcu in here as well, so it really
> does the full destruction process.
>

RTNL is also required while detaching dp local vport.

>> @@ -2047,15 +2059,20 @@ error:
>>  static int __rehash_flow_table(void *dummy)
>>  {
>>        struct datapath *dp;
>> +       struct net *net;
>>
>> -       list_for_each_entry(dp, &dps, list_node) {
>> -               struct flow_table *old_table = genl_dereference(dp->table);
>> -               struct flow_table *new_table;
>> +       for_each_net(net) {
>> +               struct ovs_net *ovs_net = net_generic(net, ovs_net_id);
>
> I think you need to take RTNL here, otherwise a namespace could
> disappear while you are iterating over them.
>
ok.

>> +static void dp_destroy_all(struct ovs_net *ovs_net)
>> +{
>> +       struct datapath *dp, *dp_next;
>> +       LIST_HEAD(dp_kill_list);
>> +
>> +       rtnl_lock();
>> +       list_for_each_entry_safe(dp, dp_next, &ovs_net->dps, list_node) {
>
> This list is really protected by genl mutex, not RTNL.

right, I was thinking of using only RTLN lock from net_exit().
as genl commands can not race with it. Only rehash can race. In rehash
we can have both genl and rtnl locks. What do you think?

>
>> +               __dp_destroy(dp);
>> +               list_add_tail(&dp->list_node, &dp_kill_list);
>
> I'm not sure that there's much benefit to batching the calls to
> call_rcu outside of RTNL.  However, if you push them both down into
> __dp_destroy() then you can get the same effect for free.
>
>> +       list_for_each_entry(dp, &dp_kill_list, list_node) {
>> +               call_rcu(&dp->rcu, destroy_dp_rcu);
>> +               module_put(THIS_MODULE);
>
> I believe that we can now just drop module refcounting because this
> actually achieves the goal of deleting datapaths when the module is
> unloaded.  When you call unregister_pernet_device() on module unload,
> it calls ovs_exit_net() on all namespaces that are currently active so
> remaining dps automatically get deleted.
>
I have separate patch for this.

> Can you also annotate ovs_init_net() ovs_exit_net() with __init_net
> and __exit_net respectively?
>
ok.
>> @@ -2085,21 +2154,21 @@ static int __init dp_init(void)
> [...]
>> +       err = register_pernet_device(&ovs_net_ops);
>> +       if (err < 0)
>
> Does this ever return a positive value?  Otherwise, we can just check
> for if (err) like in other places.
>
ok.

>> @@ -2131,9 +2200,9 @@ static void dp_cleanup(void)
>>        rcu_barrier();
>>        dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
>>        unregister_netdevice_notifier(&ovs_dp_device_notifier);
>> +       unregister_pernet_device(&ovs_net_ops);
>
> I think we need to move rcu_barrier() down because
> unregister_pernet_device() can generate rcu callbacks (assuming that
> we drop the refcounting) and those should be complete before the
> module exists.  In general, I think the safest thing to do is just
> make it last, that way we don't have to worry about whether any of the
> exit functions might have callbacks.

As we still have module ref for each dp, this can not happen.
I have separate patch to drop reference counting. I will fold it in this patch.

>
>> diff --git a/datapath/datapath.h b/datapath/datapath.h
>> index 27151b9..d5a8e41 100644
>> --- a/datapath/datapath.h
>> +++ b/datapath/datapath.h
>> @@ -29,10 +29,11 @@
>>
>>  #include "checksum.h"
>>  #include "compat.h"
>> -#include "flow.h"
>>  #include "dp_sysfs.h"
>> +#include "flow.h"
>> +#include "tunnel.h"
>>  #include "vlan.h"
>> -
>> +#include "vport.h"
>>  struct vport;
>
> If you include vport.h then you can drop the forward declaration of
> struct vport.
>
ok.

>> +struct ovs_net {
>> +       /* Per Network namespace list of datapaths to enable dumping them
>> +        * all out. Protected by genl_mutex.
>> +        */
>> +       struct list_head dps;
>> +       /* Per network namespace data for tnl. */
>> +       struct tnl_net tnl_net;
>> +       /* Per network namespace data for vport. */
>> +       struct vport_net vport_net;
>> +};
>
> You should use the kernel comment style to document the struct
> members, as is used other places in this file.
>
ok.

>> +extern int ovs_net_id;
>
> Can you put this together with the other global static variables?
ok

>
>> diff --git a/datapath/linux/compat/include/net/genetlink.h 
>> b/datapath/linux/compat/include/net/genetlink.h
>> index a1ff7c1..4e7f1b6 100644
>> --- a/datapath/linux/compat/include/net/genetlink.h
>> +++ b/datapath/linux/compat/include/net/genetlink.h
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,32)
>> +#define SET_NETNSOK
>> +#else
>> +#define SET_NETNSOK    .netnsok = true,
>> +#endif
>
> Can you put this in compat.h since it's not going upstream?
>
ok.
>> diff --git a/datapath/linux/compat/include/net/net_namespace.h 
>> b/datapath/linux/compat/include/net/net_namespace.h
>> index 9ce9fcd..1c25cc6 100644
>> --- a/datapath/linux/compat/include/net/net_namespace.h
>> +++ b/datapath/linux/compat/include/net/net_namespace.h
>>  #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,32)
>> -#define INIT_NET_GENL_SOCK init_net.genl_sock
>> +#define GENL_SOCK(net) ((net)->genl_sock)
>>  #else
>> -#define INIT_NET_GENL_SOCK genl_sock
>> +#define GENL_SOCK(net) (genl_sock)
>> +#endif
>
> While you're at it, can you move this to compat.h as well?
>
ok.

>> diff --git a/datapath/linux/compat/net_namespace.c 
>> b/datapath/linux/compat/net_namespace.c
>> new file mode 100644
>> index 0000000..31c4a6e
>> --- /dev/null
>> +++ b/datapath/linux/compat/net_namespace.c
>> +#if LINUX_VERSION_CODE == KERNEL_VERSION(2,6,32)
>> +
>> +#undef pernet_operations
>> +struct pernet_operations pnet_compat;
>> +
>> +struct rpl_pernet_operations *pnet;
>
> These should be static.
>
ok.
>> +static int __net_init ovs_init_net(struct net *net)
>
> Can you name these something different from the ones in datapath.c?
>
ok

>> +{
>> +       int err;
>> +       void *ovs_net = kzalloc(pnet->size, GFP_KERNEL);
>> +
>
> Extra blank line here.
>
ok

>> +       if (!ovs_net)
>> +               return -ENOMEM;
>> +
>> +       err = net_assign_generic(net, *pnet->id, ovs_net);
>> +       if (err) {
>> +               kfree(ovs_net);
>> +               return err;
>> +       }
>> +
>> +       err = pnet->init(net);
>> +       if (err) {
>> +               kfree(ovs_net);
>> +               return err;
>
> Can you make a common exit path?
>
ok

>> +static void __net_init ovs_exit_net(struct net *net)
>
> This should be __net_exit, not __net_init.
>
ok

>> +int rpl_register_pernet_device(struct rpl_pernet_operations *rpl_pnet)
>> +{
>> +       pnet = rpl_pnet;
>> +       pnet_compat.init = ovs_init_net;
>> +       pnet_compat.exit = ovs_exit_net;
>
> Can you just initialize these statically, like you did in the main
> pernet code in datapath.c?
>
ok.

>> +       return register_pernet_gen_subsys(pnet->id, &pnet_compat);
>
> Shouldn't this be register_pernet_gen_device(), since that's what
> you're emulating.
>
right.

>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,32)
>
> These could just be combined into an #elsif.
>
ok

>> diff --git a/datapath/tunnel.c b/datapath/tunnel.c
>> index 33d2fe9..1a66db3 100644
>> --- a/datapath/tunnel.c
>> +++ b/datapath/tunnel.c
>> @@ -82,23 +84,12 @@
>>  #define CACHE_DATA_ALIGN 16
>>  #define PORT_TABLE_SIZE  1024
>>
>> -static struct hlist_head *port_table __read_mostly;
>> -static int port_table_count;
>>
>
> Extra blank line.
>
ok

>>  static void cache_cleaner(struct work_struct *work)
> [...]
>> +       for_each_net(net) {
>
> I think this needs to be for_each_net_rcu().
>
right.

>> +               struct tnl_net *tnl_net = get_tnl_net(net);
>> +
>> +               for (i = 0; i < PORT_TABLE_SIZE; i++) {
>> +
>
> Extra blank line.
>
ok.

>> @@ -1417,7 +1438,8 @@ static int tnl_set_config(struct nlattr *options, 
>> const struct tnl_ops *tnl_ops,
>>
>>        mutable->tunnel_hlen += sizeof(struct iphdr);
>>
>> -       old_vport = port_table_lookup(&mutable->key, &old_mutable);
>> +       tnl_net = get_tnl_net(net);
>
> Can you initialize tnl_net at the beginning of the function so we
> don't have wonder whether it is set?
>
ok.

>> +int ovs_tnl_init_net(struct net *net)
> [...]
>> +       tnl_net->port_table = kmalloc(PORT_TABLE_SIZE * sizeof(struct 
>> hlist_head *),
>> +                                     GFP_KERNEL);
>
> I think that namespaces might affect the tradeoff of whether it makes
> sense to preallocate a tunnel table at startup.  There could be a lot
> of namespaces, none of which use tunnels, so the wasted memory could
> actually be pretty significant.
>
we can have global shared tnl port table like vport table.

>> diff --git a/datapath/tunnel.h b/datapath/tunnel.h
>> index 6865ae6..f470702 100644
>> --- a/datapath/tunnel.h
>> +++ b/datapath/tunnel.h
>> @@ -20,6 +20,9 @@
>>  #define TUNNEL_H 1
>>
>>  #include <linux/version.h>
>> +#include <net/ip_vs.h>
>
> We shouldn't reference anything in ip_vs.h
>
ok.

>> +struct tnl_net {
>> +       struct hlist_head *port_table;
>> +
>> +       /*
>> +        * These are just used as an optimization: they don't require any 
>> kind
>> +        * of synchronization because we could have just as easily read the
>> +        * value before the port change happened.
>> +        */
>> +
>> +       unsigned int key_local_remote_ports;
>> +       unsigned int key_remote_ports;
>> +       unsigned int key_multicast_ports;
>> +       unsigned int local_remote_ports;
>> +       unsigned int remote_ports;
>> +       unsigned int multicast_ports;
>> +};
>
> Does anything outside of tunnel.c actually use struct tnl_net?
>
it is required in datapath.h for ovs_net definition.

>> diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
>> index 6c1b0da..493405b 100644
>> --- a/datapath/vport-capwap.c
>> +++ b/datapath/vport-capwap.c
>> -static int capwap_init(void)
>> +static int capwap_init_net(struct net *net)
>
> Can you annotate this with __net_init as well (actually we should
> annotate all the net functions).
>
ok.

> However, given that we already wanted to start opening capwap sockets
> on demand I wonder whether it makes sense to have these vport init_net
> and exit_net functions at all and just drive everything off port
> creation (and ports will get automatically cleaned up with the
> namespace goes away).
>
ok. I will create socket on first port addition.

>> diff --git a/datapath/vport-patch.c b/datapath/vport-patch.c
>> index 53b24b0..a7be489 100644
>> --- a/datapath/vport-patch.c
>> +++ b/datapath/vport-patch.c
>> -static void update_peers(const char *name, struct vport *vport)
>> +static void update_peers(const char *name, struct net *net, struct vport 
>> *vport)
>>  {
>>        struct hlist_head *bucket = hash_bucket(name);
>
> It would be nice if we also included the net in the hash.
>
ok.

>> diff --git a/datapath/vport.c b/datapath/vport.c
>> index e9ccdbd..83043ec 100644
>> --- a/datapath/vport.c
>> +++ b/datapath/vport.c
>> +static void vport_exit_net(struct net *net, int max)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < max; i++) {
>> +               const struct vport_ops *ops = vport_ops_list[i];
>> +
>> +               if (ops->exit_net)
>> +                       ops->exit_net(net);
>> +       }
>> +}
>> +
>> +int ovs_vport_init_net(struct net *net)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < n_vport_types; i++) {
>> +               const struct vport_ops *ops = vport_ops_list[i];
>> +               int err;
>> +
>> +               err = 0;
>> +
>> +               if (ops->init_net)
>> +                       err = ops->init_net(net);
>> +
>> +               if (err) {
>> +                       vport_exit_net(net, i);
>> +                       return err;
>> +               }
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> +void ovs_vport_exit_net(struct net *net)
>> +{
>> +       vport_exit_net(net, n_vport_types);
>> +}
>> +
>
> We now have a funny model where things that need to be initialized at
> module load and fail will continue without the vport support but
> things initialized at namespace init fail the net.  Since there's no
> clear distinction (from the outside) why something is initialized in
> one or the other, it's somewhat confusing.  This is probably an even
> stronger reason to do things on demand and just get rid of the
> initialization functions altogether.
>
ok.

>> -struct vport *ovs_vport_locate(const char *name)
>> +struct vport *ovs_vport_locate(struct net *net, const char *name)
>>  {
>>        struct hlist_head *bucket = hash_bucket(name);
>>        struct vport *vport;
>>        struct hlist_node *node;
>>
>>        hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
>> -               if (!strcmp(name, vport->ops->get_name(vport)))
>> +               if (!strcmp(name, vport->ops->get_name(vport)) &&
>> +                   vport->net == net)
>
> It would be nice if we could use the net in the hash here too.
>
ok.

>> @@ -187,6 +229,7 @@ struct vport *ovs_vport_alloc(int priv_size, const 
>> struct vport_ops *ops,
>>
>>        vport->dp = parms->dp;
>>        vport->port_no = parms->port_no;
>> +       vport->net = parms->dp->net;
>
> vport->net is always exactly the same as vport->dp->net, right?
>
> Assuming that's the case, I don't think it is a very good idea to make
> a copy of it.  It's confusing if an internal device is moved to a
> different namespace and generally redundant.  You could create a
> function get this if you want but that probably won't save any typing
> anyways.
>
ok.

>> diff --git a/datapath/vport.h b/datapath/vport.h
>> index 44cf603..44f3bf0 100644
>> --- a/datapath/vport.h
>> +++ b/datapath/vport.h
>> @@ -20,25 +20,32 @@
>>  #define VPORT_H 1
>>
>>  #include <linux/list.h>
>> +#include <linux/netlink.h>
>
> Is netlink.h actually used?
ok.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to