Andrew Morton wrote:
On Wed, 28 Feb 2007 10:12:01 +0100 (CET) Jaroslav Kysela <[EMAIL PROTECTED]> 
wrote:

Hi,

please, review and apply to mm tree for further testing. The patch is also available at ftp://ftp.alsa-project.org/pub/kernel-patches/bonding-workqueue.patch .

Please cc netdev@vger.kernel.org on net-related patches, thanks.

                                        Thank you,
                                                Jaroslav

==================
bonding: replace system timer with work queue

This patch replaces system timer with work queue in monitor functions.
The reason for this change is that bonding handlers calls various
sleeping functions from the timer handler which is not allowed.

Which sleeping functions?  I'd have expected the kernel to spew runtime
warnings when this happens, but I don't recall any such reports.


Because we cannot share the main workqueue threads (rtnl_lock is used
also in linkwatch_event) - new bond workqueue thread is created.

Signed-off-by: Jaroslav Kysela <[EMAIL PROTECTED]>

diff -rupN linux-2.6.20.orig/drivers/net/bonding/bond_3ad.c 
linux-2.6.20/drivers/net/bonding/bond_3ad.c
--- linux-2.6.20.orig/drivers/net/bonding/bond_3ad.c    2007-02-04 
19:44:54.000000000 +0100
+++ linux-2.6.20/drivers/net/bonding/bond_3ad.c 2007-02-28 09:19:43.831369202 
+0100
@@ -2097,8 +2097,10 @@ void bond_3ad_unbind_slave(struct slave * times out, and it selects an aggregator for the ports that are yet not
  * related to any aggregator, and selects the active aggregator for a bond.
  */
-void bond_3ad_state_machine_handler(struct bonding *bond)
+void bond_3ad_state_machine_handler(struct work_struct *work)
 {
+       struct ad_bond_info *ad_info = container_of(work, struct ad_bond_info, 
ad_work.work);
+       struct bonding *bond = (struct bonding *)((char *)ad_info - 
offsetof(struct bonding, ad_info));

We can use containers_of here too?

-void bond_alb_monitor(struct bonding *bond)
+void bond_alb_monitor(struct work_struct *work)
 {
-       struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+       struct alb_bond_info *bond_info = container_of(work, struct 
alb_bond_info, alb_work.work);
+       struct bonding *bond = (struct bonding *)((char *)bond_info - 
offsetof(struct bonding, alb_info));

And here.

+               cancel_rearming_delayed_workqueue(bond_wq, 
&(BOND_AD_INFO(bond).ad_work));

As I mentioned earlier this call to cancel_rearming_delayed_workqueue can deadlock
with netlink_watch. This happens if:

dev_close
   rtnl_lock                     carrier lost on device
bond_close netlink related workqueue event waiting for rtnl
      cancel_workqueue
         spinning waiting for workq to drain

The agreed upon semantics is to never do any operation that waits for workq
to drain with RTNL held.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to