Re: [PATCH] PCI: Add link_change error handler and vfio-pci user

2019-04-24 Thread Alex_Gagniuc
On 4/23/2019 5:42 PM, Alex Williamson wrote: > The PCIe bandwidth notification service generates logging any time a > link changes speed or width to a state that is considered downgraded. > Unfortunately, it cannot differentiate signal integrity related link > changes from those intentionally initi

Re: [PATCH v2] PCI: pciehp: Report degraded links via link bandwidth notification

2019-02-27 Thread Alex_Gagniuc
On 2/24/19 8:29 PM, Lukas Wunner wrote: > On Fri, Dec 07, 2018 at 12:20:00PM -0600, Alexandru Gagniuc wrote: > > >> Q: Why is this unconditionally compiled in? >> A: The symmetrical check in pci probe() is also always compiled in. > > Hm, it looks like the convention is to provide a separate Kco

Re: [PATCH] nvme-pci: Prevent mmio reads if pci channel offline

2019-02-27 Thread Alex_Gagniuc
On 2/27/19 11:51 AM, Keith Busch wrote: > I can't tell where you're going with this. It doesn't sound like you're > talking about hotplug anymore, at least. We're trying to fix an issue related to hotplug. However, the proposed fixes may have unintended consequences and side-effects. I want to ma

Re: [PATCH] nvme-pci: Prevent mmio reads if pci channel offline

2019-02-27 Thread Alex_Gagniuc
On 2/26/19 7:02 PM, Linus Torvalds wrote: > On Tue, Feb 26, 2019 at 2:37 PM wrote: >> >> Then nobody gets the (error) message. You can go a bit further and try >> 'pcie_ports=native'. Again, nobody gets the memo. ): > > So? The error was bogus to begin with. Why would we care? Of course nobody c

Re: [PATCH] nvme-pci: Prevent mmio reads if pci channel offline

2019-02-26 Thread Alex_Gagniuc
On 2/25/19 9:55 AM, Keith Busch wrote: > On Sun, Feb 24, 2019 at 03:27:09PM -0800, alex_gagn...@dellteam.com wrote: >> [ 57.680494] {1}[Hardware Error]: Hardware error from APEI Generic >> Hardware Error Source: 1 >> [ 57.680495] {1}[Hardware Error]: event severity: fatal >> [ 57.680496] {1

Re: [PATCH] nvme-pci: Prevent mmio reads if pci channel offline

2019-02-24 Thread Alex_Gagniuc
On 2/24/19 4:42 PM, Linus Torvalds wrote: > On Sun, Feb 24, 2019 at 12:37 PM wrote: >> >> Dell r740xd to name one. r640 is even worse -- they probably didn't give >> me one because I'd have too much stuff to complain about. >> >> On the above machines, firmware-first (FFS) tries to guess when ther

Re: [PATCH RFC v2 2/4] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-24 Thread Alex_Gagniuc
On 2/23/19 12:50 AM, Lukas Wunner wrote: > > [EXTERNAL EMAIL] > > On Fri, Feb 22, 2019 at 07:56:28PM +, alex_gagn...@dellteam.com wrote: >> On 2/21/19 1:36 AM, Lukas Wunner wrote: >>> On Tue, Feb 19, 2019 at 07:20:28PM -0600, Alexandru Gagniuc wrote: mutex_lock(&ctrl->state_lo

Re: [PATCH] nvme-pci: Prevent mmio reads if pci channel offline

2019-02-24 Thread Alex_Gagniuc
On 2/22/19 3:29 PM, Linus Torvalds wrote: > On Thu, Feb 21, 2019 at 5:07 PM Jon Derrick > wrote: >> >> Some platforms don't seem to easily tolerate non-posted mmio reads on >> lost (hot removed) devices. This has been noted in previous >> modifications to other layers where an mmio read to a lost

Re: [PATCH RFC v2 2/4] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-22 Thread Alex_Gagniuc
On 2/21/19 1:36 AM, Lukas Wunner wrote: > On Tue, Feb 19, 2019 at 07:20:28PM -0600, Alexandru Gagniuc wrote: >> mutex_lock(&ctrl->state_lock); >> +present = pciehp_card_present(ctrl); >> +link_active = pciehp_check_link_active(ctrl); >> switch (ctrl->state) { > > These two assign

Re: [PATCH RFC v2 4/4] PCI: hotplug: Add quirk For Dell nvme pcie switches

2019-02-22 Thread Alex_Gagniuc
On 2/21/19 8:05 PM, Oliver wrote: > On Fri, Feb 22, 2019 at 5:38 AM wrote: >> On 2/21/19 1:57 AM, Lukas Wunner wrote: [snip] >>> If the quirk is x86-specific, please enclose it in "#ifdef CONFIG_X86" >>> to reduce kernel footprint on other arches. >> >> That's a tricky one. If you look at p. 185 o

Re: [PATCH RFC v2 4/4] PCI: hotplug: Add quirk For Dell nvme pcie switches

2019-02-21 Thread Alex_Gagniuc
On 2/21/19 1:57 AM, Lukas Wunner wrote: > On Tue, Feb 19, 2019 at 07:20:30PM -0600, Alexandru Gagniuc wrote: >> --- a/drivers/pci/hotplug/pciehp_hpc.c >> +++ b/drivers/pci/hotplug/pciehp_hpc.c >> @@ -952,3 +952,23 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_QCOM, >> 0x0

Re: [PATCH RFC v2 1/4] PCI: hotplug: Add support for disabling in-band presence

2019-02-21 Thread Alex_Gagniuc
On 2/21/19 1:20 AM, Lukas Wunner wrote: > On Tue, Feb 19, 2019 at 07:20:27PM -0600, Alexandru Gagniuc wrote: >> @@ -846,6 +846,9 @@ struct controller *pcie_init(struct pcie_device *dev) >> if (pdev->is_thunderbolt) >> slot_cap |= PCI_EXP_SLTCAP_NCCS; >>

Re: [PATCH] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-14 Thread Alex_Gagniuc
On 2/14/19 1:01 AM, Lukas Wunner wrote: > On Wed, Feb 13, 2019 at 06:55:46PM +, alex_gagn...@dellteam.com wrote: >> On 2/13/19 2:36 AM, Lukas Wunner wrote: (*) A bit hypothetical: There is no hardware yet implementing the ECN. >>> >>> Hm, this contradicts Austin Bolen's e-mail of Jan 25 th

Re: [PATCH] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-13 Thread Alex_Gagniuc
On 2/13/19 2:36 AM, Lukas Wunner wrote: >> (*) A bit hypothetical: There is no hardware yet implementing the ECN. > > Hm, this contradicts Austin Bolen's e-mail of Jan 25 that "Yes, this > platform disables in-band presence" (when asked whether your host > controller already adheres to the ECN).

Re: [PATCH] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-12 Thread Alex_Gagniuc
On 2/12/19 2:30 AM, Lukas Wunner wrote: > The PCI SIG > should probably consider granting access to specs to open source > developers to ease adoption of new features in Linux et al. Don't get me started on just how ridiculous I think PCI-SIG's policy is with respect to public availability of sta

Re: [PATCH] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-02-11 Thread Alex_Gagniuc
On 2/9/19 5:58 AM, Lukas Wunner wrote: > On Tue, Feb 05, 2019 at 03:06:56PM -0600, Alexandru Gagniuc wrote: >> According to PCIe 3.0, the presence detect state is a logical OR of >> in-band and out-of-band presence. > > Since Bjorn asked for a spec reference: > > PCIe r4.

Re: [RFC] PCI / ACPI: Implementing Type 3 _HPX records

2019-01-17 Thread Alex_Gagniuc
Hi Bjorn On 1/14/19 2:01 PM, Bjorn Helgaas wrote: > On Thu, Jan 10, 2019 at 05:11:27PM -0600, Alexandru Gagniuc wrote: >> _HPX Type 3 is intended to be more generic and allow configuration of >> settings not possible with Type 2 tables. For example, FW could ensure >> that the completion timeout v

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-12-27 Thread Alex_Gagniuc
On 11/8/18 2:09 PM, Bjorn Helgaas wrote: > [+cc Jonathan, Greg, Lukas, Russell, Sam, Oliver for discussion about > PCI error recovery in general] > > On Wed, Nov 07, 2018 at 05:

Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Alex_Gagniuc
On 11/29/2018 5:05 PM, Bjorn Helgaas wrote: > On Thu, Nov 29, 2018 at 08:13:12PM +0100, Lukas Wunner wrote: >> I guess the interrupt is shared with hotplug and PME? In that case write >> a separate pcie_port_service_driver and request the interrupt with >> IRQF_SHARED. Define a new service type i

Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Alex_Gagniuc
On 11/29/2018 10:06 AM, Mika Westerberg wrote: >> @@ -515,7 +515,8 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id) >> struct controller *ctrl = (struct controller *)dev_id; >> struct pci_dev *pdev = ctrl_dev(ctrl); >> struct device *parent = pdev->dev.parent; >> -u16 stat

Re: [PATCH] PCI: pciehp: Report degraded links via link bandwidth notification

2018-11-29 Thread Alex_Gagniuc
On 11/29/2018 11:36 AM, Bjorn Helgaas wrote: > On Wed, Nov 28, 2018 at 06:08:24PM -0600, Alexandru Gagniuc wrote: >> A warning is generated when a PCIe device is probed with a degraded >> link, but there was no similar mechanism to warn when the link becomes >> degraded after probing. The Link Band

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-11-08 Thread Alex_Gagniuc
On 11/08/2018 02:09 PM, Bjorn Helgaas wrote: > [+cc Jonathan, Greg, Lukas, Russell, Sam, Oliver for discussion about > PCI error recovery in general] Has anyone seen the EC

Re: [PATCH] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-09-18 Thread Alex_Gagniuc
On 9/12/2018 4:28 PM, Bjorn Helgaas wrote: > On Mon, Jul 30, 2018 at 04:21:44PM -0500, Alexandru Gagniuc wrote: >> When a PCI device is gone, we don't want to send IO to it if we can >> avoid it. We expose functionality via the irq_chip structure. As >> users of that structure may not know about th

Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

2018-08-09 Thread Alex_Gagniuc
On 08/09/2018 09:16 AM, Bjorn Helgaas wrote: > On Tue, Jul 17, 2018 at 10:31:23AM -0500, Alexandru Gagniuc wrote: >> When we don't own AER, we shouldn't touch the AER error bits. This >> happens unconditionally on device probe(). Clearing AER bits >> willy-nilly might cause firmware to miss errors.

Re: [PATCH v5] PCI: Check for PCIe downtraining conditions

2018-08-06 Thread Alex_Gagniuc
On 08/05/2018 02:06 AM, Tal Gilboa wrote: > On 7/31/2018 6:10 PM, Alex G. wrote: >> On 07/31/2018 01:40 AM, Tal Gilboa wrote: >> [snip] >> @@ -2240,6 +2258,9 @@ static void pci_init_capabilities(struct >> pci_dev *dev) >> /* Advanced Error Reporting */ >> pci_aer_init(

Re: [PATCH v5] PCI: Check for PCIe downtraining conditions

2018-07-30 Thread Alex_Gagniuc
On 07/24/2018 08:40 AM, Tal Gilboa wrote: > On 7/24/2018 2:59 AM, Alex G. wrote: >> >> >> On 07/23/2018 05:14 PM, Jakub Kicinski wrote: >>> On Tue, 24 Jul 2018 00:52:22 +0300, Tal Gilboa wrote: On 7/24/2018 12:01 AM, Jakub Kicinski wrote: > On Mon, 23 Jul 2018 15:03:38 -0500, Alexandru Gag

Re: [PATCH v3] PCI: Check for PCIe downtraining conditions

2018-07-16 Thread Alex_Gagniuc
On 7/16/2018 4:17 PM, Bjorn Helgaas wrote: > [+cc maintainers of drivers that already use pcie_print_link_status() > and GPU folks] Thanks for finding them! [snip] >> identifying this from userspace is neither intuitive, nor straigh >> forward. > > s/straigh/straight/ > In this context, I think

Re: [PATCH] PCI: access.c: Piggyback user config access on pci_read/write_*()

2018-06-04 Thread Alex_Gagniuc
On 6/4/2018 11:09 AM, Keith Busch wrote: > On Mon, Jun 04, 2018 at 10:48:02AM -0500, Alexandru Gagniuc wrote: >> +++ b/drivers/pci/access.c >> @@ -223,16 +223,9 @@ int pci_user_read_config_##size >> \ >> (struct pci_dev *dev, int pos, type *val)

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex_Gagniuc
On 5/31/2018 10:28 AM, Sinan Kaya wrote: > On 5/31/2018 11:05 AM, Alexandru Gagniuc wrote: >> +static void pcie_max_link_cap(struct pci_dev *dev, enum pci_bus_speed >> *speed, >> +enum pcie_link_width *width) >> +{ >> +uint32_t lnkcap; >> + >> +pcie_capability_r