Hi Casey,

On Thu, May 07, 2015 at 12:34:51AM +0000, Casey Leedom wrote:
> Hello,
> 
>   I've included both the Network and PCI Development mailing lists because 
> this crosses both domains.  If this violates protocols, I apologise in 
> advance.  I believe that this ~probably~ will be a "PCI Development" issue 
> but since it affects a Network Device, I figured input from that quarter 
> should be solicited as well.
> 
>   We have a Network Device, T5, which unfortunately has a PCI Compliance bug 
> in it[1].  Doubly unfortunately, the PCI SIG 3.0 Compliance Tests which we 
> dutifully ranb before releasing the silicon two years ago didn't include a 
> vector for this, so it wasn't discovered till very recently.  In order to 
> solve this bug, we need to modify the Root Complex's PCI Express Capability 
> Device Control register in order to turn off the Enable No Snoop and Enable 
> Relaxed Ordering features.
> 
>   I've cons'ed up demonstration code within the Device Driver for T5, cxgb4, 
> to traverse the PCI graph to the Root Complex for the T5 Device and perform 
> this modification[2].  This demonstrates the "workability" of doing the 
> operation, but I'm betting that this is _not_ where either the Network or PCI 
> Development teams would like such code to go.  My expectation is that this 
> code needs to go into drivers/pci/quirks.c.
> 
>   So, this mail is a request for advice on how the Network and PCI 
> Development teams would like this handled.  Once I receive this input, I will 
> submit a patch to the correct team for inclusion in the kernel.  Thank you 
> for your time and attention.

There are a lot of fixups in drivers/pci/quirks.c.  For things that have to
be worked around either before a driver claims the device or if there is no
driver at all, the fixup *has* to go in drivers/pci/quirks.c

But for things like this, where the problem can only occur after a driver
claims the device, I think it makes more sense to put the fixup in the
driver itself.  The only wrinkle here is that the fixup has to be done on a
separate device, not the device claimed by the driver.  But I think it
probably still makes sense to put this fixup in the driver.

> Casey
> 
> [1] Chelsio T5 PCI-E Compliance Bug:
> 
>     The bug is that when the Root Complex send a Transaction Layer Packet 
> (TLP)
>     Request downstream to a Device,the TLP may contain Attributes.  The PCI
>     Specification states that two of these Attributes, No Snoop and Relaxed
>     Ordering, must be included in the Device's TLP Response.  Further, the PCI
>     Specification "encourages" Root Complexes to drop TLP Responses which
>     are out of compliance with this rule.

Can you include a pointer to the relevant part of the spec?

> [2] Demonstration Code for clearing Root Complex No Snoop and Relaxed 
> Ordering:
> 
> --- a/drivers/net/ethernet/chelsio/cxgb4_main.c       Mon Apr 06 09:27:21 
> 2015 -0700
> +++ b/drivers/net/ethernet/chelsio/cxgb4_main.c       Tue Apr 07 13:39:05 
> 2015 -0700
> @@ -9956,6 +9956,36 @@ static void enable_pcie_relaxed_ordering
>       pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
>  }
>  
> +/*
> + * Find the highest PCI-Express bridge above a PCI Device.  If found, that's
> + * the Root Complex PCI-PCI Bridge for the PCI Device.  If we find the Root
> + * Comples, clear the Enable Relaxed Ordering and Enable No Snoop bits in 
> that

s/Comples/Complex/, but the Root Complex itself does not appear as a PCI
device, so we'll never actually find *it*.  But I think we should *always*
find a Root Port.  Your code and text suggests that it's possible we
wouldn't (since you say "*If* found, ...").  Is there a case you're
thinking of where we wouldn't find a Root Port?

> + * bridge's PCI-E Capability Device Control register.  This will prevent the
> + * Root Complex from setting those attributes in the Transaction Layer 
> Packets
> + * of the Requests which it sends down stream to the PCI Device.
> + */
> +static void clear_root_complex_tlp_attributes(struct pci_dev *pdev)
> +{
> +     struct pci_bus *bus = pdev->bus;
> +     struct pci_dev *highest_pcie_bridge = NULL;
> +
> +     while (bus) {
> +             struct pci_dev *bridge = bus->self;
> +
> +             if (!bridge || !bridge->pcie_cap)
> +                     break;
> +             highest_pcie_bridge = bridge;
> +             bus = bus->parent;
> +     }

Can you use pci_upstream_bridge() here?  There are a couple places where we
want to find the Root Port, so we might factor that out someday.  It'll be
easier to find all those places if they use with pci_upstream_bridge().

> +
> +     if (highest_pcie_bridge)
> +             pcie_capability_clear_and_set_word(highest_pcie_bridge,
> +                                                PCI_EXP_DEVCTL,
> +                                                PCI_EXP_DEVCTL_RELAX_EN |
> +                                                PCI_EXP_DEVCTL_NOSNOOP_EN,
> +                                                0);

Please include a dmesg note here, especially since the driver is changing
the config of a device other than its own.

> +}
> +
>  static int init_one(struct pci_dev *pdev,
>                             const struct pci_device_id *ent)
>  {
> @@ -9973,6 +10003,19 @@ static int init_one(struct pci_dev *pdev
>               ++version_printed;
>       }
>  
> +     /*
> +      * T5 has a PCI-E Compliance bug in it where it doesn't copy the
> +      * Transaction Layer Packet Attributes from downstream Requests into
> +      * it's upstream Responses.  Most Root Complexes are fine with this

s/it's/its/

> +      * but a few get prissy and drop the non-compliant T5 Responses
> +      * leading to endless Device Timeouts when TLP Attributes are set.  So
> +      * if we're a T5, attempt to clear our Root Complex's enable bits for
> +      * TLP Attributes ...
> +      */
> +     if (CHELSIO_PCI_ID_VER(pdev->device) == CHELSIO_T5 ||
> +         CHELSIO_PCI_ID_VER(pdev->device) == CHELSIO_T5_FPGA)
> +             clear_root_complex_tlp_attributes(pdev);
> +
>       err = pci_request_regions(pdev, KBUILD_MODNAME);
>       if (err) {
>               /* Just info, some other driver may have claimed the device. 
> */--

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to