Could be this commit (just speculation):
commit 499ab5ccbd42839f40d5572e7a4799c412986a11
Author: akepner <[email protected]>
Date: Wed Mar 13 14:54:58 2013 +0000
ixgbe: in shutdown, do netif_running() under rtnl_lock
During shutdown it's possible for __dev_close() (which holds
rtnl_lock) to clear the __LINK_STATE_START bit, and for ixgbe
to then read that bit (without holding rtnl_lock), and then
fail to free irqs, etc. The result is a crash like this:
------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:313!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1
Pid: 5910, comm: reboot Tainted: P ---------------- 2.6.32 #1 em
RIP: 0010:[<ffffffff81305c2b>] [<ffffffff81305c2b>] free_msi_irqs+0x11b/0x1
RSP: 0018:ffff880185c9bc88 EFLAGS: 00010282
RAX: ffff880219f58bc0 RBX: ffff88021ac53b00 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000246 RDI: 000000000000004a
RBP: ffff880185c9bcc8 R08: 0000000000000002 R09: 0000000000000106
R10: 0000000000000000 R11: 0000000000000006 R12: ffff88021e524778
R13: 0000000000000001 R14: ffff88021e524000 R15: 0000000000000000
FS: 00007f90821b7700(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f90818bd010 CR3: 0000000132c64000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process reboot (pid: 5910, threadinfo ffff880185c9a000, task ffff88021bf04a8
Stack:
ffff880185c9bc98 000000018130529d ffff880185c9bcc8 ffff88021e524000
<0> 0000000000000004 ffff88021948c700 0000000000000000 ffff880185c9bda7
<0> ffff880185c9bce8 ffffffff81305cbd ffff880185c9bce8 ffff88021948c700
Call Trace:
[<ffffffff81305cbd>] pci_disable_msix+0x3d/0x50
[<ffffffffa00501d5>] ixgbe_reset_interrupt_capability+0x65/0x90 [ixgbe]
[<ffffffffa00512f6>] ixgbe_clear_interrupt_scheme+0xb6/0xd0 [ixgbe]
[<ffffffffa005330b>] __ixgbe_shutdown+0x5b/0x200 [ixgbe]
[<ffffffffa00534ca>] ixgbe_shutdown+0x1a/0x60 [ixgbe]
[<ffffffff812f6c7c>] pci_device_shutdown+0x2c/0x50
[<ffffffff813727fb>] device_shutdown+0x4b/0x160
[<ffffffff8107d98c>] kernel_restart_prepare+0x2c/0x40
ehci timer_action, mod_timer io_watchdog
[<ffffffff8107d9e6>] kernel_restart+0x16/0x60
[<ffffffff8107dbfd>] sys_reboot+0x1ad/0x200
[<ffffffff811676cf>] ? __d_free+0x3f/0x60
[<ffffffff81167748>] ? d_free+0x58/0x60
[<ffffffff8116f7c0>] ? mntput_no_expire+0x30/0x100
[<ffffffff81152b11>] ? __fput+0x191/0x200
[<ffffffff816565fe>] ? do_page_fault+0x3e/0xa0
[<ffffffff8100b132>] system_call_fastpath+0x16/0x1b
Code: 4c 89 ef e8 98 8c e3 ff 4d 39 f4 48 8b 43 10 75 cf 48 83 c4
18 5b 41 5
41 5d 41 5e 41 5f c9 c3 49 8b 7d 20 e8 07 5a d3 ff eb c9 <0f> 0b 0f
1f 00 eb
66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
ehci timer_action, mod_timer io_watchdog
RIP [<ffffffff81305c2b>] free_msi_irqs+0x11b/0x130
RSP <ffff880185c9bc88>
---[ end trace 27de882a0fe75593 ]---
(This was seen on a pretty old kernel/driver, but looks like
the same bug is still possible.)
Signed-off-by: <[email protected]>
Tested-by: Phil Schmitt <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/etherne
index 6bd1dd1..48f3fd5 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5123,14 +5123,14 @@ static int __ixgbe_shutdown(struct pci_dev *pdev, bool *
 	netif_device_detach(netdev);
+	rtnl_lock();
 	if (netif_running(netdev)) {
-		rtnl_lock();
 		ixgbe_down(adapter);
 		ixgbe_free_irq(adapter);
 		ixgbe_free_all_tx_resources(adapter);
 		ixgbe_free_all_rx_resources(adapter);
-		rtnl_unlock();
 	}
+	rtnl_unlock();
 	ixgbe_clear_interrupt_scheme(adapter);
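
The race described in the commit message can be modelled in userspace. This is only a sketch of my reading of it, not driver code: every fake_* name is hypothetical, a pthread mutex stands in for rtnl_lock, "running" stands in for __LINK_STATE_START, and "irqs_held" stands in for IRQs that have not been freed yet. The close path is split into two halves so that the window between clearing the flag and freeing the IRQs can be hit deterministically.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not ixgbe code. */
struct fake_dev {
	pthread_mutex_t lock;	/* plays the role of rtnl_lock */
	bool running;		/* plays the role of __LINK_STATE_START */
	bool irqs_held;		/* true while IRQs are still requested */
};

static void fake_dev_init(struct fake_dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->running = true;
	d->irqs_held = true;
}

/* First half of the close path: take the lock and clear the flag. */
static void fake_close_begin(struct fake_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->running = false;
}

/* Second half of the close path: free the IRQs, then drop the lock. */
static void fake_close_finish(struct fake_dev *d)
{
	d->irqs_held = false;
	pthread_mutex_unlock(&d->lock);
}

/*
 * Old ordering: the flag is read before the lock is taken.
 * Returns -1 if IRQs are still held at the point where the interrupt
 * scheme would be cleared (the model's version of the crash).
 */
static int fake_shutdown_unlocked(struct fake_dev *d)
{
	if (d->running) {
		pthread_mutex_lock(&d->lock);
		d->running = false;
		d->irqs_held = false;
		pthread_mutex_unlock(&d->lock);
	}
	return d->irqs_held ? -1 : 0;
}

/*
 * Patched ordering: the lock is taken before the flag is read, so a
 * close that is in progress must finish before the check happens.
 */
static int fake_shutdown_locked(struct fake_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->running) {
		d->running = false;
		d->irqs_held = false;
	}
	pthread_mutex_unlock(&d->lock);
	return d->irqs_held ? -1 : 0;
}
```

Calling fake_close_begin() and then fake_shutdown_unlocked() reports the failure, because the flag is already clear while the IRQs are still held. In a threaded run, fake_shutdown_locked() would instead block on the mutex until the close path has finished.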
On 06/17/2013 09:43 PM, Stephen Hemminger wrote:
> I am seeing this error on shutdown with 3.10 and net-next (on Debian 7.0)
> Looks like a problem with lockdep seeing issues with canceling work in
> ixgbe with rtnl held?
>
> Not a big issue, shutdown still works.
>
> [ 310.416382] INFO: trying to register non-static key.
> [ 310.417499] the code is fine but needs lockdep annotation.
> [ 310.418625] turning off the locking correctness validator.
> [ 310.419769] CPU: 6 PID: 4592 Comm: ip Tainted: G W 3.10.0-rc6-net-next+ #2
> [ 310.420930] Hardware name: System manufacturer System Product Name/P8Z77-V LX, BIOS 0801 07/17/2012
> [ 310.422108] 0000000000000002 ffff88040c36d3f8 ffffffff81599da9 ffff88040c36d4f8
> [ 310.423333] ffffffff810ee889 0000000000000002 ffffffff81a37830 0000000000000000
> [ 310.424564] 0000000000000000 ffff880400000000 0000000000000046 ffff88040c906dc0
> [ 310.425802] Call Trace:
> [ 310.427017] [<ffffffff81599da9>] dump_stack+0x19/0x1b
> [ 310.428234] [<ffffffff810ee889>] __lock_acquire+0x17a9/0x1dd0
> [ 310.429447] [<ffffffff810efe1e>] ? mark_held_locks+0xae/0x120
> [ 310.430680] [<ffffffff815a0100>] ? _raw_spin_unlock_irq+0x30/0x50
> [ 310.431887] [<ffffffff810ef498>] lock_acquire+0x98/0x140
> [ 310.433103] [<ffffffff810b1f45>] ? flush_work+0x5/0x280
> [ 310.434304] [<ffffffff810b1f7d>] flush_work+0x3d/0x280
> [ 310.435490] [<ffffffff810b1f45>] ? flush_work+0x5/0x280
> [ 310.436668] [<ffffffff8159d04d>] ? __mutex_unlock_slowpath+0xdd/0x1a0
> [ 310.437844] [<ffffffff810efe1e>] ? mark_held_locks+0xae/0x120
> [ 310.439029] [<ffffffff810b47f4>] ? __cancel_work_timer+0x74/0x120
> [ 310.440208] [<ffffffff810effad>] ? trace_hardirqs_on_caller+0x11d/0x1e0
> [ 310.441379] [<ffffffff810b4807>] __cancel_work_timer+0x87/0x120
> [ 310.442561] [<ffffffff810b48d0>] cancel_work_sync+0x10/0x20
> [ 310.443765] [<ffffffffa005b50c>] ixgbe_ptp_stop+0x2c/0x90 [ixgbe]
> [ 310.444977] [<ffffffffa00448aa>] ixgbe_close+0x2a/0x100 [ixgbe]
> [ 310.446225] [<ffffffff81497fa5>] __dev_close_many+0x95/0xe0
> [ 310.447444] [<ffffffff81498034>] __dev_close+0x44/0x90
> [ 310.448671] [<ffffffff8149fe61>] __dev_change_flags+0xa1/0x180
> [ 310.449866] [<ffffffff814a0008>] dev_change_flags+0x28/0x70
> [ 310.451068] [<ffffffff814ae411>] do_setlink+0x371/0x960
> [ 310.452269] [<ffffffff812fa011>] ? nla_parse+0x31/0xe0
> [ 310.453470] [<ffffffff814af0d9>] rtnl_newlink+0x369/0x580
> [ 310.454674] [<ffffffff814aebd4>] rtnetlink_rcv_msg+0xa4/0x240
> [ 310.455882] [<ffffffff8159ce6d>] ? mutex_lock_nested+0x2bd/0x3c0
> [ 310.457094] [<ffffffff814ab067>] ? rtnl_lock+0x17/0x20
> [ 310.458318] [<ffffffff814aeb30>] ? __rtnl_unlock+0x20/0x20
> [ 310.459519] [<ffffffff814cc1c1>] netlink_rcv_skb+0xb1/0xc0
> [ 310.460711] [<ffffffff814ab095>] rtnetlink_rcv+0x25/0x40
> [ 310.461905] [<ffffffff814cb75d>] netlink_unicast+0x10d/0x190
> [ 310.463106] [<ffffffff814cbaf1>] netlink_sendmsg+0x311/0x740
> [ 310.464327] [<ffffffff81484486>] sock_sendmsg+0xa6/0xd0
> [ 310.465514] [<ffffffff810eef50>] ? lock_release_non_nested+0xa0/0x310
> [ 310.466704] [<ffffffff8148483c>] ___sys_sendmsg+0x38c/0x3a0
> [ 310.467893] [<ffffffff810bfda3>] ? up_read+0x23/0x40
> [ 310.469076] [<ffffffff81079754>] ? __do_page_fault+0x214/0x4b0
> [ 310.470278] [<ffffffff8118caf8>] ? do_brk+0x2b8/0x330
> [ 310.471454] [<ffffffff8118ca0b>] ? do_brk+0x1cb/0x330
> [ 310.472619] [<ffffffff81485a69>] __sys_sendmsg+0x49/0x90
> [ 310.473786] [<ffffffff81485ac2>] SyS_sendmsg+0x12/0x20
> [ 310.474961] [<ffffffff815a1359>] system_call_fastpath+0x16/0x1b