On Fri, 19 May 2017 08:43:38 -0700
Alexander Duyck <alexander.du...@gmail.com> wrote:

> On Mon, May 15, 2017 at 10:53 AM, Alex Williamson
> <alex.william...@redhat.com> wrote:
> > On Mon, 15 May 2017 08:19:28 +0000
> > "Cheng, Collins" <collins.ch...@amd.com> wrote:
> >  
> >> Hi Williamson,
> >>
> >> We cannot assume BIOS supports SR-IOV, actually only newer server 
> >> motherboard BIOS supports SR-IOV. Normal desktop motherboard BIOS or older 
> >> server motherboard BIOS doesn't support SR-IOV. This issue would happen if 
> >> an user plugs our AMD SR-IOV capable GPU card to a normal desktop 
> >> motherboard.  
> >
> > Servers should be supporting SR-IOV for a long time now.  What really
> > is there to a BIOS supporting SR-IOV anyway, it's simply reserving
> > sufficient bus number and MMIO resources such that we can enable the
> > VFs.  This process isn't exclusively reserved for the BIOS.  Some
> > platforms may choose to only initialize boot devices, leaving the rest
> > for the OS to program.  The initial proposal here to disable SR-IOV if
> > not programmed at OS hand-off disables even the possibility of the OS
> > reallocating resources for this device.  
> 
> There are differences between supporting SR-IOV and supporting SR-IOV
> on devices with massive resources. I know I have seen NICs that will
> keep a system from completing POST if SR-IOV is enabled, and MMIO
> beyond 4G is not. My guess would be that the issues being seen are
> probably that they disable SR-IOV in the BIOS in such a setup and end
> up running into issues when they try to boot into the Linux kernel as
> it goes through and tries to allocate resources for SR-IOV even though
> it was disabled in the BIOS.
> 
> It might make sense to add a kernel parameter something like a
> "pci=nosriov" that would allow for disabling SR-IOV and related
> resource allocation if that is what we are talking about. That way you
> could plug in these types of devices into a system with a legacy bios
> or that doesn't wan to allocate addresses above 32b for MMIO, and this
> parameter would be all that is needed to disable SR-IOV so you could
> plug in a NIC that has SR-IOV associated with it.

Hi,

a) I think we're still ignoring the real bug that is something goes
wrong with the reallocation leaving the PF without resources.

b) Why does an option to avoid re-allocation need to be sr-iov
specific?  Shouldn't pci=realloc=off cover this?

Thanks,
Alex

> >> I agree that failure to allocate VF resources should leave the device in 
> >> no worse condition than before it tried. I hope kernel could allocate PF 
> >> device resource before allocating VF device resource, and keep PF device 
> >> resource valid and functional if failed to allocate VF device resource.
> >>
> >> I will send out dmesg log lspci info tomorrow. Thanks.  
> >
> > Thanks,
> > Alex
> >  
> >> -----Original Message-----
> >> From: Alex Williamson [mailto:alex.william...@redhat.com]
> >> Sent: Friday, May 12, 2017 10:43 PM
> >> To: Cheng, Collins <collins.ch...@amd.com>
> >> Cc: Bjorn Helgaas <bhelg...@google.com>; linux-...@vger.kernel.org; 
> >> linux-kernel@vger.kernel.org; Deucher, Alexander 
> >> <alexander.deuc...@amd.com>; Zytaruk, Kelly <kelly.zyta...@amd.com>; 
> >> Yinghai Lu <ying...@kernel.org>
> >> Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the SR-IOV 
> >> incapable platform
> >>
> >> On Fri, 12 May 2017 04:51:43 +0000
> >> "Cheng, Collins" <collins.ch...@amd.com> wrote:
> >>  
> >> > Hi Williamson,
> >> >
> >> > I verified the patch is working for both AMD SR-IOV GPU and Intel SR-IOV 
> >> > NIC. I don't think it is redundant to check the VF BAR valid before call 
> >> > sriov_init(), it is safe and saving boot time, also there is no a better 
> >> > method to know if system BIOS has correctly initialized the SR-IOV 
> >> > capability or not.  
> >>
> >> It also masks an underlying bug and creates a maintenance issue that we 
> >> won't know when it's safe to remove this workaround.  I don't think faster 
> >> boot is valid rationale, in one case SR-IOV is completely disabled, the 
> >> other we attempt to allocate the resources the BIOS failed to provide.  I 
> >> expect this is also a corner case, the BIOS should typically support 
> >> SR-IOV, therefore this situation should be an exception.
> >>  
> >> > I did not try to fix the issue from the kernel resource allocation 
> >> > perspective, it is because:
> >> >     1. I am not very familiar with the PCI resource allocation scheme in 
> >> > kernel. For example, in sriov_init(), kernel will re-assign the PCI 
> >> > resource for both VF and PF. I don't understand why kernel allocates 
> >> > resource for VF firstly, then PF. If it is PF firstly, then this issue 
> >> > could be avoided.
> >> >     2. I am not sure if kernel has error handler if PCI resource 
> >> > allocation failed. In this case, kernel cannot allocate enough resource 
> >> > to PF. It should trigger some error handler to either just keep original 
> >> > BAR values set by system BIOS, or disable this device and log errors.  
> >>
> >> I think these are the issues we should be trying to solve and I'm sure 
> >> folks on the linux-pci list can help us identify the bug.  Minimally, 
> >> failure to allocate VF resources should leave the device in no worse 
> >> condition than before it tried.  Perhaps you could post more details about 
> >> the issue, boot with pci=earlydump, post dmesg of a boot where the PF 
> >> resources are incorrectly re-allocated, and include lspci -vvv for the 
> >> SR-IOV device.  Also, please test with the latest upstream kernel, 
> >> upstream only patches old kernels through stable backports of commits to 
> >> the latest kernel.  Adding Yinghai as a resource allocation expert. Thanks,
> >>
> >> Alex
> >>  
> >> > -----Original Message-----
> >> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> >> > Sent: Friday, May 12, 2017 12:01 PM
> >> > To: Cheng, Collins <collins.ch...@amd.com>
> >> > Cc: Bjorn Helgaas <bhelg...@google.com>; linux-...@vger.kernel.org;
> >> > linux-kernel@vger.kernel.org; Deucher, Alexander
> >> > <alexander.deuc...@amd.com>; Zytaruk, Kelly <kelly.zyta...@amd.com>
> >> > Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the
> >> > SR-IOV incapable platform
> >> >
> >> > On Fri, 12 May 2017 03:42:46 +0000
> >> > "Cheng, Collins" <collins.ch...@amd.com> wrote:
> >> >  
> >> > > Hi Williamson,
> >> > >
> >> > > GPU card needs more BAR aperture resource than other PCI devices. For 
> >> > > example, Intel SR-IOV network card only require 512KB memory resource 
> >> > > for all VFs. AMD SR-IOV GPU card needs 256MB x16 VF = 4GB memory 
> >> > > resource for frame buffer BAR aperture.
> >> > >
> >> > > If the system BIOS supports SR-IOV, it will reserve enough resource 
> >> > > for all VF BARs. If the system BIOS doesn't support SR-IOV or cannot 
> >> > > allocate the enough resource for VF BARs, only PF BAR will be assigned 
> >> > > and VF BARs are empty. Then system boots to Linux kernel and kernel 
> >> > > doesn't check if the VF BARs are empty or valid. Kernel will re-assign 
> >> > > the BAR resources for PF and all VFs. The problem I saw is that kernel 
> >> > > will fail to allocate PF BAR resource because some resources are 
> >> > > assigned to VF, this is not expected. So kernel might need to do some 
> >> > > check before re-assign the PF/VF resource, so that PF device will be 
> >> > > correctly assigned BAR resource and user can use PF device.  
> >> >
> >> > So the problem is that something bad happens when the kernel is trying
> >> > to reallocate resources in order to fulfill the requirements of the
> >> > VFs, leaving the PF resources incorrectly programmed?  Why not just
> >> > fix that bug rather than creating special handling for this
> >> > vendor/class of device which disables any attempt to fixup resources
> >> > for SR-IOV?  IOW, this patch just avoids the problem for your devices
> >> > rather than fixing the bug.  I'd suggest fixing the bug such that the
> >> > PF is left in a functional state if the kernel is unable to allocate
> >> > sufficient resources for the VFs.  Thanks,
> >> >
> >> > Alex
> >> >  
> >> > > -----Original Message-----
> >> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> >> > > Sent: Friday, May 12, 2017 11:21 AM
> >> > > To: Cheng, Collins <collins.ch...@amd.com>
> >> > > Cc: Bjorn Helgaas <bhelg...@google.com>; linux-...@vger.kernel.org;
> >> > > linux-kernel@vger.kernel.org; Deucher, Alexander
> >> > > <alexander.deuc...@amd.com>; Zytaruk, Kelly <kelly.zyta...@amd.com>
> >> > > Subject: Re: [PATCH] PCI: Make SR-IOV capable GPU working on the
> >> > > SR-IOV incapable platform
> >> > >
> >> > > On Fri, 12 May 2017 02:50:32 +0000
> >> > > "Cheng, Collins" <collins.ch...@amd.com> wrote:
> >> > >  
> >> > > > Hi Helgaas,
> >> > > >
> >> > > > Some AMD GPUs have hardware support for graphics SR-IOV.
> >> > > > If the SR-IOV capable GPU is plugged into the SR-IOV incapable
> >> > > > platform. It would cause a problem on PCI resource allocation in
> >> > > > current Linux kernel.
> >> > > >
> >> > > > Therefore in order to allow the PF (Physical Function) device of
> >> > > > SR-IOV capable GPU to work on the SR-IOV incapable platform, it is
> >> > > > required to verify conditions for initializing BAR resources on
> >> > > > AMD SR-IOV capable GPUs.
> >> > > >
> >> > > > If the device is an AMD graphics device and it supports SR-IOV it
> >> > > > will require a large amount of resources.
> >> > > > Before calling sriov_init() must ensure that the system BIOS also
> >> > > > supports SR-IOV and that system BIOS has been able to allocate
> >> > > > enough resources.
> >> > > > If the VF BARs are zero then the system BIOS does not support
> >> > > > SR-IOV or it could not allocate the resources and this platform
> >> > > > will not support AMD graphics SR-IOV.
> >> > > > Therefore do not call sriov_init().
> >> > > > If the system BIOS does support SR-IOV then the VF BARs will be
> >> > > > properly initialized to non-zero values.
> >> > > >
> >> > > > Below is the patch against to Kernel 4.8 & 4.9. Please review.
> >> > > >
> >> > > > I checked the drivers/pci/quirks.c, it looks the workarounds/fixes
> >> > > > in quirks.c are for specific devices and one or more device ID are
> >> > > > defined for the specific devices. However my patch is for all AMD
> >> > > > SR-IOV capable GPUs, that includes all existing and future AMD 
> >> > > > server GPUs.
> >> > > > So it doesn't seem like a good fit to put the fix in quirks.c.  
> >> > >
> >> > >
> >> > > Why is an AMD graphics card unique here?  Doesn't sriov_init()
> >> > > always need to be able to deal with devices of any type where the
> >> > > BIOS hasn't initialized the SR-IOV capability?  Some SR-IOV devices
> >> > > can fit their VFs within a minimum bridge aperture, most cannot.  I
> >> > > don't understand why the VF resource requirements being
> >> > > exceptionally large dictates that they receive special handling.
> >> > > Thanks,
> >> > >
> >> > > Alex
> >> > >  
> >> > > > Signed-off-by: Collins Cheng <collins.ch...@amd.com>
> >> > > > ---
> >> > > >  drivers/pci/iov.c | 63
> >> > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++---
> >> > > >  1 file changed, 60 insertions(+), 3 deletions(-)
> >> > > >
> >> > > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index
> >> > > > e30f05c..e4f1405 100644
> >> > > > --- a/drivers/pci/iov.c
> >> > > > +++ b/drivers/pci/iov.c
> >> > > > @@ -523,6 +523,45 @@ static void sriov_restore_state(struct pci_dev 
> >> > > > *dev)
> >> > > >                 msleep(100);
> >> > > >  }
> >> > > >
> >> > > > +/*
> >> > > > + * pci_vf_bar_valid - check if VF BARs have resource allocated
> >> > > > + * @dev: the PCI device
> >> > > > + * @pos: register offset of SR-IOV capability in PCI config space
> >> > > > + * Returns true any VF BAR has resource allocated, false
> >> > > > + * if all VF BARs are empty.
> >> > > > + */
> >> > > > +static bool pci_vf_bar_valid(struct pci_dev *dev, int pos) {
> >> > > > +       int i;
> >> > > > +       u32 bar_value;
> >> > > > +       u32 bar_size_mask = ~(PCI_BASE_ADDRESS_SPACE |
> >> > > > +                       PCI_BASE_ADDRESS_MEM_TYPE_64 |
> >> > > > +                       PCI_BASE_ADDRESS_MEM_PREFETCH);
> >> > > > +
> >> > > > +       for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> >> > > > +               pci_read_config_dword(dev, pos + PCI_SRIOV_BAR + i * 
> >> > > > 4, &bar_value);
> >> > > > +               if (bar_value & bar_size_mask)
> >> > > > +                       return true;
> >> > > > +       }
> >> > > > +
> >> > > > +       return false;
> >> > > > +}
> >> > > > +
> >> > > > +/*
> >> > > > + * is_amd_display_adapter - check if it is an AMD/ATI GPU device
> >> > > > + * @dev: the PCI device
> >> > > > + *
> >> > > > + * Returns true if device is an AMD/ATI display adapter,
> >> > > > + * otherwise return false.
> >> > > > + */
> >> > > > +
> >> > > > +static bool is_amd_display_adapter(struct pci_dev *dev) {
> >> > > > +       return (((dev->class >> 16) == PCI_BASE_CLASS_DISPLAY) &&
> >> > > > +               (dev->vendor == PCI_VENDOR_ID_ATI ||
> >> > > > +               dev->vendor == PCI_VENDOR_ID_AMD)); }
> >> > > > +
> >> > > >  /**
> >> > > >   * pci_iov_init - initialize the IOV capability
> >> > > >   * @dev: the PCI device
> >> > > > @@ -537,9 +576,27 @@ int pci_iov_init(struct pci_dev *dev)
> >> > > >                 return -ENODEV;
> >> > > >
> >> > > >         pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
> >> > > > -       if (pos)
> >> > > > -               return sriov_init(dev, pos);
> >> > > > -
> >> > > > +       if (pos) {
> >> > > > +       /*
> >> > > > +        * If the device is an AMD graphics device and it supports
> >> > > > +        * SR-IOV it will require a large amount of resources.
> >> > > > +        * Before calling sriov_init() must ensure that the system
> >> > > > +        * BIOS also supports SR-IOV and that system BIOS has been
> >> > > > +        * able to allocate enough resources.
> >> > > > +        * If the VF BARs are zero then the system BIOS does not
> >> > > > +        * support SR-IOV or it could not allocate the resources
> >> > > > +        * and this platform will not support AMD graphics SR-IOV.
> >> > > > +        * Therefore do not call sriov_init().
> >> > > > +        * If the system BIOS does support SR-IOV then the VF BARs
> >> > > > +        * will be properly initialized to non-zero values.
> >> > > > +        */
> >> > > > +               if (is_amd_display_adapter(dev)) {
> >> > > > +                       if (pci_vf_bar_valid(dev, pos))
> >> > > > +                               return sriov_init(dev, pos);
> >> > > > +               } else {
> >> > > > +                       return sriov_init(dev, pos);
> >> > > > +               }
> >> > > > +       }
> >> > > >         return -ENODEV;
> >> > > >  }
> >> > > >  
> >> > >  
> >> >  
> >>  
> >  

Reply via email to