On Wed, 13 Jun 2007 11:25:37 +0200 Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote: > ... > > Of course such a problem should preferably be fixed by somebody who > > knows the code (alas I don't know netconsole), to be sure all needed > > cancels are still done after this change. I hope Jason's patch is > > right but I'm a little surprised I can't see netdev in cc (I'll try > > to fix this). > > So, I've had a look into netpoll and, unfortunately, I don't > think this patch is right... > > > > From: Jason Wessel <[EMAIL PROTECTED]> > > > > > > Do not call cancel_rearming_delayed_work() if there is no > > > pending work. > > > > > > Signed-off-by: Jason Wessel <[EMAIL PROTECTED]> > > > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> > > > --- > > > > > > net/core/netpoll.c | 6 ++++-- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff -puN net/core/netpoll.c~a net/core/netpoll.c > > > --- a/net/core/netpoll.c~a > > > +++ a/net/core/netpoll.c > > > @@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np) > > > if (atomic_dec_and_test(&npinfo->refcnt)) { > > > skb_queue_purge(&npinfo->arp_tx); > > > skb_queue_purge(&npinfo->txq); > > > - cancel_rearming_delayed_work(&npinfo->tx_work); > > > - flush_scheduled_work(); > > > + if (delayed_work_pending(&npinfo->tx_work)) { > > > + > > > cancel_rearming_delayed_work(&npinfo->tx_work); > > > + flush_scheduled_work(); > > > + } > > > > > > kfree(npinfo); > > > } > > > _ > > There are such possibilities: > > 1. After positive delayed_work_pending(&npinfo->tx_work) test > some work is queued, but there is no guarantee that when running > it'll rearm again, so cancel_rearming_delayed_work can loop again; > > 2. After negative delayed_work_pending(&npinfo->tx_work) test > a work is just running, eg. waiting on netif_tx_lock, while > kfree(npinfo) is done here (oops?!). > > I've found an additional problem here with or without this patch: > after deleting a timer in cancel_rearming_delayed_work() there could > stay a last skb queued in npinfo->txq, and after kfree(npinfo) > we have small memory leak. If I'm right here similar fix is needed > in the current netpoll code: additional npinfo->txq purging only > or maybe the whole cancel_rearming_ changed like this. > > I've tried to eliminate these problems in attached below patch > proposal. I'm not sure it's all right: as I've written earlier I > don't know netconsole enough, but it's probably a little better > than above solution. > > I've some doubts yet (I didn't have time to check this all): > > 1. I hope this other schedule_delayed_work() from netpoll_send_skb() > is not possible when netpoll_cleanup() runs - if I'm wrong additional > check of npinfo->refcnt should be done there; > 2. I also hope npinfo->refcnt before scheduling should be enough here > - if not - another possibility is adding some locking eg.: > netif_tx_lock before cancel for synchronization. > > Of course it would be very nice if somebody could test or verify > this patch more. > > Regards, > Jarek P. > > > Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]> > > --- > > diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c > --- 2.6.21-/net/core/netpoll.c 2007-04-26 15:08:32.000000000 +0200 > +++ 2.6.21/net/core/netpoll.c 2007-06-12 21:05:23.000000000 +0200 > @@ -73,7 +73,8 @@ static void queue_process(struct work_st > netif_tx_unlock(dev); > local_irq_restore(flags); > > - schedule_delayed_work(&npinfo->tx_work, HZ/10); > + if (atomic_read(&npinfo->refcnt)) > + schedule_delayed_work(&npinfo->tx_work, HZ/10); > return; > } > netif_tx_unlock(dev); > @@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np) > if (atomic_dec_and_test(&npinfo->refcnt)) { > skb_queue_purge(&npinfo->arp_tx); > skb_queue_purge(&npinfo->txq); > - cancel_rearming_delayed_work(&npinfo->tx_work); > + cancel_delayed_work(&npinfo->tx_work); > flush_scheduled_work(); > > + /* clean after last, unfinished work */ > + if (!skb_queue_empty(&npinfo->txq)) { > + struct sk_buff *skb; > + skb = __skb_dequeue(&npinfo->txq); > + kfree_skb(skb); > + } > kfree(npinfo); > } > } Everything went quiet? If this patch has been tested and fixes the bug, can you please send a version which is ready for merging? (ie: add a suitable description of what it does). Thanks. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html