On 20.08.2018 05:47, Jian-Hong Pan wrote:
> 2018-08-20 4:34 GMT+08:00 Heiner Kallweit <hkallwe...@gmail.com>:
>> The three of you reported an MSI-X-related error when the system
>> resumes from suspend. This has been fixed for now by disabling MSI-X
>> on certain chip versions. However more versions may be affected.
>>
>> I checked with Realtek and they confirmed that on certain chip
>> versions a MSIX-related value in PCI config space is reset when
>> resuming from S3.
>>
>> I would appreciate if you could test the following experimental patch
>> and whether warning "MSIX address lost, re-configuring" appears in
>> your dmesg output after resume from suspend.
>>
>> Thanks a lot for your efforts.
>
> Tested with the experiment patch on ASUS X441UAR.
>
> This is the information before suspend:
>
> dev@endless:~$ dmesg | grep r8169
> [ 10.279565] libphy: r8169: probed
> [ 10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [ 10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [ 15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [ 17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
>
> dev@endless:~$ ip addr show enp2s0
> 4: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
> link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
> inet 10.100.13.152/24 brd 10.100.13.255 scope global noprefixroute
> dynamic enp2s0
> valid_lft 86347sec preferred_lft 86347sec
> inet6 fe80::2873:a2a9:6ca1:c79d/64 scope link noprefixroute
> valid_lft forever preferred_lft forever
>
> This is the information after resume:
>
> dev@endless:~$ dmesg | grep r8169
> [ 10.279565] libphy: r8169: probed
> [ 10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [ 10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [ 15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [ 17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [ 95.594265] r8169 0000:02:00.0 enp2s0: Link is Down
> [ 96.242074] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
>
> dev@endless:~$ ip addr show enp2s0
> 4: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc
> pfifo_fast state DOWN group default qlen 1000
> link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
>
> There is no "MSIX address lost, re-configuring" in dmesg.
> The ethernet interface is still down after resume.
>
Thanks a lot for testing. Unfortunately I don't have test hardware
affected by this MSI-X issue, so maybe you can help me understand the
issue a little better. Below is a patch that prints the MSI-X table
entry in different contexts; it's not supposed to fix anything.
Could you please let me know what the output is on your system?
I want to get an idea of whether the issue clears the complete entry
or just corrupts certain parts.

This is what I get on my system (RTL8168E-VL). In your case you'll
only get as far as the first suspend.

[ 3.743404] r8169 0000:03:00.0: MSI-X entry: context probe: fee01004 0 40ef 1
[ 29.539250] r8169 0000:03:00.0: MSI-X entry: context suspend: fee02004 0 4028 0
[ 29.837457] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0
[ 36.921370] r8169 0000:03:00.0: MSI-X entry: context suspend: fee01004 0 402b 0
[ 37.239407] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 54f53c8c0..f32645119 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/delay.h>
@@ -6822,6 +6823,20 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	pm_runtime_put_noidle(&pdev->dev);
 }
 
+static void rtl_print_msix_entry(struct rtl8169_private *tp, const char *context)
+{
+	struct msi_desc *desc = first_pci_msi_entry(tp->pci_dev);
+	u32 data[4];
+
+	data[0] = readl(desc->mask_base + PCI_MSIX_ENTRY_LOWER_ADDR);
+	data[1] = readl(desc->mask_base + PCI_MSIX_ENTRY_UPPER_ADDR);
+	data[2] = readl(desc->mask_base + PCI_MSIX_ENTRY_DATA);
+	data[3] = readl(desc->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL);
+
+	dev_info(tp_to_dev(tp), "MSI-X entry: context %s: %x %x %x %x\n",
+		 context, data[0], data[1], data[2], data[3]);
+}
+
 static void rtl8169_net_suspend(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -6846,9 +6861,12 @@ static int rtl8169_suspend(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
 
 	rtl8169_net_suspend(dev);
 
+	rtl_print_msix_entry(tp, "suspend");
+
 	return 0;
 }
 
@@ -6875,6 +6893,9 @@ static int rtl8169_resume(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	rtl_print_msix_entry(tp, "resume");
 
 	if (netif_running(dev))
 		__rtl8169_resume(dev);
@@ -7075,11 +7096,6 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
 		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~MSIEnable);
 		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
 		flags = PCI_IRQ_LEGACY;
-	} else if (tp->mac_version == RTL_GIGA_MAC_VER_40) {
-		/* This version was reported to have issues with resume
-		 * from suspend when using MSI-X
-		 */
-		flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;
 	} else {
 		flags = PCI_IRQ_ALL_TYPES;
 	}
@@ -7354,6 +7370,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return rc;
 	}
 
+	rtl_print_msix_entry(tp, "probe");
+
 	tp->saved_wolopts = __rtl8169_get_wol(tp);
 
 	mutex_init(&tp->wk.mutex);
-- 
2.18.0
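
For anyone comparing their own output against the lines above, here is a
rough user-space sketch (not part of the patch) that takes the four hex
words printed by rtl_print_msix_entry() and splits them into fields. The
struct and function names are made up for illustration. The field layout
is an assumption: it applies the usual x86 local-APIC MSI format
(address 0xFEExxxxx, destination ID in address bits 19:12, vector in data
bits 7:0) plus the MSI-X per-vector mask in bit 0 of the vector control
word. On other architectures the address and data words mean something
else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* The four 32-bit words of one MSI-X table entry, in the order the
 * debug patch prints them. Struct name is hypothetical. */
struct msix_entry_dump {
	uint32_t addr_lo;	/* PCI_MSIX_ENTRY_LOWER_ADDR */
	uint32_t addr_hi;	/* PCI_MSIX_ENTRY_UPPER_ADDR */
	uint32_t data;		/* PCI_MSIX_ENTRY_DATA */
	uint32_t ctrl;		/* PCI_MSIX_ENTRY_VECTOR_CTRL */
};

/* Parse the tail of a dmesg line, e.g. "fee01004 0 40ef 1".
 * Returns false unless exactly four hex words were read. */
bool msix_parse(const char *s, struct msix_entry_dump *e)
{
	unsigned int a, b, c, d;

	assert(s != NULL && e != NULL);
	if (sscanf(s, "%x %x %x %x", &a, &b, &c, &d) != 4)
		return false;
	e->addr_lo = a;
	e->addr_hi = b;
	e->data = c;
	e->ctrl = d;
	return true;
}

/* Bit 0 of the vector control word is the per-vector mask bit. */
bool msix_is_masked(const struct msix_entry_dump *e)
{
	return (e->ctrl & 0x1u) != 0;
}

/* True if address and data are all zero, i.e. the entry looks wiped. */
bool msix_is_cleared(const struct msix_entry_dump *e)
{
	return e->addr_lo == 0 && e->addr_hi == 0 && e->data == 0;
}

/* x86 assumption: MSI addresses live in the 0xFEExxxxx window. */
bool msix_x86_addr_plausible(const struct msix_entry_dump *e)
{
	return e->addr_hi == 0 && (e->addr_lo >> 20) == 0xfeeu;
}

/* x86 assumption: destination APIC ID is in address bits 19:12. */
unsigned int msix_x86_dest_id(const struct msix_entry_dump *e)
{
	return (e->addr_lo >> 12) & 0xffu;
}

/* x86 assumption: interrupt vector is in data bits 7:0. */
unsigned int msix_x86_vector(const struct msix_entry_dump *e)
{
	return e->data & 0xffu;
}
```

Feeding it the "probe" line from my system ("fee01004 0 40ef 1") should
report a masked entry with a plausible address, destination ID 0x01 and
vector 0xef. An entry that msix_is_cleared() flags after resume would
point to the whole entry being reset rather than only parts of it being
corrupted.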