Hi Alex,
At 09/01/2016 10:12 AM, Alex Williamson wrote:
[...]
I had to move to a different system where I could actually inject an
aer error and created a config similar to above but with the 82576
ports downstream of the ioh3420 root port. When I inject a malformed
TLP uncorrectable error, my RHEL7.2 guest does this:
[ 35.995645] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal) error
received: id=0200
[ 35.998483] igb 0000:02:00.0: PCIe Bus Error: severity=Uncorrected (Fatal),
type=Unaccessible, id=0200(Unregistered Agent ID)
[ 36.001965] igb 0000:02:00.0 enp2s0f0: PCIe link lost, device now detached
[ 36.015092] igb 0000:02:00.1 enp2s0f1: PCIe link lost, device now detached
[ 39.133185] igb 0000:02:00.0: enabling device (0000 -> 0002)
[ 40.071245] igb 0000:02:00.1: enabling device (0000 -> 0002)
[ 41.014451] BUG: unable to handle kernel paging request at 0000000000003818
[ 41.015969] IP: [<ffffffffa02b438d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[ 41.017507] PGD 367e2067 PUD 7ae56067 PMD 0
[ 41.018497] Oops: 0002 [#1] SMP
[ 41.019242] Modules linked in: ip6t_rpfilter ip6t_REJECT ipt_REJECT
xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_security iptable_raw iptable_filter snd_hda_codec_generic snd_hda_intel
snd_hda_codec ppdev snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt
iTCO_vendor_support bochs_drm snd_pcm syscopyarea sysfillrect sysimgblt ttm
virtio_balloon snd_timer snd igb drm_kms_helper soundcore ptp pps_core
i2c_algo_bit i2c_i801 dca drm shpchp lpc_ich mfd_core pcspkr i2c_core
parport_pc parport ip_tables xfs libcrc32c virtio_blk virtio_console virtio_net
ahci libahci crc32c_intel serio_raw libata virtio_pci virtio_ring virtio
dm_mirror dm_region_hash dm_log dm_mod
[ 41.040590] CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted
3.10.0-327.el7.x86_64 #1
[ 41.042180] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
[ 41.044635] Workqueue: events aer_isr
[ 41.045478] task: ffff880179435080 ti: ffff880179680000 task.ti:
ffff880179680000
[ 41.047097] RIP: 0010:[<ffffffffa02b438d>] [<ffffffffa02b438d>]
igb_configure_tx_ring+0x14d/0x280 [igb]
[ 41.049151] RSP: 0018:ffff880179683bf8 EFLAGS: 00010246
[ 41.050260] RAX: 0000000000003818 RBX: 0000000000000000 RCX: 0000000000003818
[ 41.051747] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 00000000002896b3
[ 41.053268] RBP: ffff880179683c20 R08: 0000000001010100 R09: 00000000ffffffe7
[ 41.054730] R10: ffffea0001eb6100 R11: ffffffffa02afa31 R12: 0000000000000000
[ 41.056201] R13: ffff880035dbc8c0 R14: ffff880175d03f80 R15: 000000017716e000
[ 41.057673] FS: 0000000000000000(0000) GS:ffff88017fc00000(0000)
knlGS:0000000000000000
[ 41.059337] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 41.060548] CR2: 0000000000003818 CR3: 0000000178331000 CR4: 00000000000006f0
[ 41.062028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 41.063534] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 41.065025] Stack:
[ 41.065473] ffff880035dbc8c0 ffff880035dbce70 0000000000000001
ffff880035dbc8c8
[ 41.067119] ffff880035dbce70 ffff880179683c80 ffffffffa02b8a77
fefdf27269fb3cd8
[ 41.068781] 2009f9ee3386436f eb9e4e66756bbfdd 34002f8114a5d65f
9535990856231c4b
[ 41.094179] Call Trace:
[ 41.118688] [<ffffffffa02b8a77>] igb_configure+0x267/0x450 [igb]
[ 41.144286] [<ffffffffa02b94f1>] igb_up+0x21/0x1a0 [igb]
[ 41.170606] [<ffffffffa02b96a7>] igb_io_resume+0x37/0x70 [igb]
[ 41.195846] [<ffffffff813381e0>] ?
pci_cleanup_aer_uncorrect_error_status+0x90/0x90
[ 41.221767] [<ffffffff81338228>] report_resume+0x48/0x60
[ 41.246455] [<ffffffff8131e359>] pci_walk_bus+0x79/0xa0
[ 41.270722] [<ffffffff813381e0>] ?
pci_cleanup_aer_uncorrect_error_status+0x90/0x90
[ 41.296747] [<ffffffff813382f0>] broadcast_error_message+0xb0/0x100
[ 41.321552] [<ffffffff81338509>] do_recovery+0x1c9/0x280
[ 41.345507] [<ffffffff81338f58>] aer_isr+0x348/0x430
[ 41.368851] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[ 41.392157] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[ 41.416852] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[ 41.441577] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[ 41.465029] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[ 41.488341] [<ffffffff81645858>] ret_from_fork+0x58/0x90
[ 41.511247] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[ 41.535442] Code: c1 49 89 4e 30 49 8b 85 b8 05 00 00 48 85 c0 0f 84 39 01 00 00
81 c2 10 38 00 00 48 63 d2 48 01 d0 31 d2 89 10 49 8b 46 30 31 d2 <89> 10 41 8b
95 3c 06 00 00 b8 14 01 10 02 83 fa 05 74 0b 83 fa
[ 41.587718] RIP [<ffffffffa02b438d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[ 41.610872] RSP <ffff880179683bf8>
[ 41.632301] CR2: 0000000000003818
And then it reboots. So what RAS improvement have we bought ourselves
here? What endpoints have you tested with this? Which ones recovered
reliably? Thanks,
I am working on it.
The endpoints I used to test are:
-device ioh3420,bus=pcie.0,addr=1c.0,port=1,id=bridge1,chassis=1 \
-device vfio-pci,host=06:00.1,bus=bridge1,addr=00.1,id=net1,aer=true \
-device vfio-pci,host=06:00.0,bus=bridge1,addr=00.0,id=net0,aer=true,multifunction=on \
When I tested them, my guest sometimes even hit a kernel panic.
I found that the bug is related to the state of the devices.
When enp1s0f1 was not up, as shown below:
ifconfig -a
...
enp1s0f1: flags=4098<BROADCAST,MULTICAST> mtu 1500
ether 00:1b:21:67:3b:bd txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0xfe420000-fe43ffff
...
Then I injected fatal and nonfatal errors, and the guest recovered reliably.
But when I ran this command to bring the interface up:
ifconfig enp1s0f1 up
dmesg:
...
[ 34.109886] IPv6: ADDRCONF(NETDEV_UP): enp1s0f1: link is not ready
[ 36.232118] igb 0000:01:00.1 enp1s0f1: igb: enp1s0f1 NIC Link is Up
1000 Mbps Full Duplex, Flow Control: RX/TX
[ 36.232397] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0f1: link becomes ready
...
ifconfig:
...
enp1s0f1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
...
Then I repeated the tests and got a kernel panic:
[ 910.702002] ---[ end trace 3cd6a977a579a0bd ]---
[ 910.702002] Kernel panic - not syncing: Fatal exception
[ 910.702002] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[ 910.702002] ---[ end Kernel panic - not syncing: Fatal exception
----------------------
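For reference, the two flags values in the ifconfig output above (4098 and
4163) can be decoded with a small sketch like the one below. The bit values
are the IFF_* constants from the Linux headers; the helper name decode_flags
is only for illustration and is not part of any tool. It shows that the two
states differ only in the UP and RUNNING bits.

```python
# Subset of the interface flag bits defined in Linux <linux/if.h>.
IFF_FLAGS = {
    0x1: "UP",
    0x2: "BROADCAST",
    0x40: "RUNNING",
    0x1000: "MULTICAST",
}


def decode_flags(value):
    """Return the names of the known flag bits set in value, lowest bit first.

    decode_flags is a hypothetical helper written for this mail only.
    """
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


# The two values reported by ifconfig for enp1s0f1 in this mail.
for value in (4098, 4163):
    print(value, hex(value), ",".join(decode_flags(value)))
```

So the recovery succeeded while the interface was administratively down, and
the panic appeared only once it was UP and RUNNING.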
In the guest the failure shows up as "PCIe link lost, device now detached",
but both the host and QEMU reset the devices successfully.
At first I suspected an igb bug [1] and spent a lot of time on it,
but that did not resolve the problem.
So now I am wondering: should I put the devices into a non-working state
before QEMU resets them?
Or do I need to investigate the bug in the igb driver?
I am out of ideas at the moment. Could you give me some advice?
[1]:http://www.gossamer-threads.com/lists/linux/kernel/2274363
Thanks,
Dou