Andy Gospodarek <[EMAIL PROTECTED]> wrote:
[...]
>My initial concern was that a slave device could disappear out from
>under us, but it seems like this certainly isn't the case since all
>calls to bond_release are protected by rtnl-locks, so I think you are
>correct that we are safe.  I'll test this on my setup here and let you
>know if I see any problems.

        Yep, all entries into enslave or remove come in with RTNL, so if
we have RTNL there then slaves can't vanish.

        On further inspection, I don't think it's safe to simply drop
the locks in bond_set_multicast_list, I'm seeing a couple of cases that
could be troublesome:

        bond_set_promiscuity and bond_set_allmulti both reference
curr_active_slave, which isn't protected from change by RTNL, so that
could conflict with a change_active_slave calling bond_mc_swap (which is
also holding the wrong locks for dev_set_promiscuity/dev_set_allmulti).

        It also looks like there are paths (igmp6 for one) into
dev_mc_add that just hold a bunch of regular locks, and not RTNL, so
those wouldn't be safe from having slaves vanish due to concurrent
de-enslavement.

        Looks like read_lock_bh for bond->lock and curr_slave_lock is
needed in bond_set_multicast_list, and some dropping of locks is needed
inside bond_set_promiscuity/bond_set_allmulti.  Methinks that without
any locks, bond_mc_add/delete could race with either a change of active
slave or a de-enslavement of the active slave.

        I'm wondering whether it's worth trying to make this perfect for
2.6.24 (and maybe making things worse), or whether we should instead
just do this:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..8b9e33a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,7 +3937,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
        struct bonding *bond = bond_dev->priv;
        struct dev_mc_list *dmi;
 
-       write_lock_bh(&bond->lock);
+       read_lock_bh(&bond->lock);
 
        /*
         * Do promisc before checking multicast_mode
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
        bond_mc_list_destroy(bond);
        bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-       write_unlock_bh(&bond->lock);
+       read_unlock_bh(&bond->lock);
 }
 
 /*


        This should silence the lockdep warning (if I'm understanding
what everybody's saying), and keep the change set to a minimum.  This
might not even be worth pushing for 2.6.24; I'm not exactly sure how
difficult the lockdep problem would be to trigger.

        The other stuff I mention above can be dealt with later; they're
very low-probability races that would be pretty difficult to hit even on
purpose.

        Thoughts?

        -J

---
        -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]