Hello!

I've discovered a bug while testing the new multiQ NAPI code. In high-load situations we get a kernel panic when we take down an interface. The oops is below.
From what I see, this happens when the driver calls napi_disable() and NAPI_STATE_SCHED gets cleared. In net_rx_action() there is a check for work == weight, a sort of indirect test, but that is no longer enough to cover the high-load situation, where NAPI_STATE_SCHED has been cleared (by e1000_down in my case) and the full quota has still been used. This is with the latest git, but I'd guess it is the same in all later kernels.

There might be different solutions... one variant is below:

Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>

diff --git a/net/core/dev.c b/net/core/dev.c
index 043e2f8..1031233 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2207,7 +2207,7 @@ static void net_rx_action(struct softirq_action *h)
 		 * still "owns" the NAPI instance and therefore can
 		 * move the instance around on the list at-will.
 		 */
-		if (unlikely(work == weight))
+		if (unlikely(work == weight) && (test_bit(NAPI_STATE_SCHED, &n->state)))
 			list_move_tail(&n->poll_list, list);

 		netpoll_poll_unlock(have);

Cheers
					--ro

labb:/# ifconfig eth0 down
BUG: unable to handle kernel paging request at virtual address 00100104
printing eip: c0433d67 *pde = 00000000
Oops: 0002 [#1] SMP
Modules linked in:

Pid: 4, comm: ksoftirqd/0 Not tainted (2.6.24-rc3bifrost-gb3664d45-dirty #32)
EIP: 0060:[<c0433d67>] EFLAGS: 00010046 CPU: 0
EIP is at net_rx_action+0x107/0x120
EAX: 00100100 EBX: f757d4e0 ECX: c200d334 EDX: 00200200
ESI: 00000040 EDI: c200d334 EBP: 000000ec ESP: f7c6bf78
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process ksoftirqd/0 (pid: 4, ti=f7c6a000 task=f7c58ab0 task.ti=f7c6a000)
Stack: c0236217 c200ce9c c200ce9c 00000000 fffcf892 00000040 00000005 c05b2a98
       c0603e60 00000008 c022a275 00000000 c06066c0 c06066c0 00000246 00000000
       c022a5e0 00000000 c022a327 c06066c0 c022a636 fffffffc 00000000 c02384f2
Call Trace:
 [<c0236217>] __rcu_process_callbacks+0x107/0x190
 [<c022a275>] __do_softirq+0x75/0xf0
 [<c022a5e0>] ksoftirqd+0x0/0xd0
 [<c022a327>] do_softirq+0x37/0x40
 [<c022a636>] ksoftirqd+0x56/0xd0
 [<c02384f2>] kthread+0x42/0x70
 [<c02384b0>] kthread+0x0/0x70
 [<c02039df>] kernel_thread_helper+0x7/0x18
 =======================
Code: 88 8c 52 c0 e8 4b 1d df ff e8 96 0c dd ff c7 05 64 7d 63 c0 01 00 00 00 e9 61 ff ff ff 8d b4 26 00 00 00 00 8b 03 8b 53 04 89 f9 <89> 50 04 89 02 89 d8 8b 57 04 e8 5a 34 eb ff e9 4a ff ff ff 90
EIP: [<c0433d67>] net_rx_action+0x107/0x120 SS:ESP 0068:f7c6bf78
Kernel panic - not syncing: Fatal exception in interrupt
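
For anyone who wants to play with the failure mode outside the kernel, here is a rough user-space sketch of the check discussed above. It is not kernel code. struct napi_model, model_requeue() and simulate_down_under_load() are hypothetical names made up for illustration, and the sketch assumes that by the time net_rx_action() reaches its work == weight test the instance has already been unlinked from the poll list (list_del() poisons the pointers, which would match EAX/EDX and the faulting address in the oops) and NAPI_STATE_SCHED has been cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values written by the kernel's list_del(); they appear as
 * EAX and EDX in the oops. */
#define LIST_POISON1 ((struct list_head *)0x00100100)
#define LIST_POISON2 ((struct list_head *)0x00200200)

/* Hypothetical model of a NAPI instance: only what the check needs. */
struct napi_model {
	struct list_head poll_list;
	bool sched;	/* stands in for NAPI_STATE_SCHED */
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

/* Unlink an entry and poison its pointers, as the kernel does. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = LIST_POISON1;
	e->prev = LIST_POISON2;
}

/* Models the tail of the net_rx_action() loop.
 * Returns 1 if the instance was requeued, 0 if it was left alone,
 * -1 if the move would have touched a poisoned entry (the oops). */
static int model_requeue(struct napi_model *n, struct list_head *list,
			 int work, int weight, bool check_sched)
{
	if (work != weight)
		return 0;
	if (check_sched && !n->sched)
		return 0;	/* the extra test from the patch */
	if (n->poll_list.next == LIST_POISON1)
		return -1;	/* 32-bit kernel would write to 0x00100104 */
	list_del(&n->poll_list);
	list_add_tail(&n->poll_list, list);
	return 1;
}

/* Interface goes down while poll still reports a full quota. */
int simulate_down_under_load(bool check_sched)
{
	struct list_head list;
	struct napi_model n = { .sched = true };
	const int weight = 64;	/* illustrative quota */

	list_init(&list);
	list_add_tail(&n.poll_list, &list);

	/* Assumed down path: instance unlinked, SCHED cleared. */
	list_del(&n.poll_list);
	n.sched = false;
	assert(list.next == &list);	/* poll list is empty again */

	/* work == weight: the full quota was used. */
	return model_requeue(&n, &list, weight, weight, check_sched);
}
```

In this model simulate_down_under_load(false) returns -1, the case that oopses in the kernel, and simulate_down_under_load(true) returns 0 because the SCHED test skips the move. It only illustrates the ordering problem; it says nothing about locking or about whether testing the bit is the best fix.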