On Wed, Mar 04, 2015 at 11:01:24AM +0800, Wei Yang wrote:
> On Tue, Feb 24, 2015 at 03:00:37AM -0600, Bjorn Helgaas wrote:
> >On Tue, Feb 24, 2015 at 02:34:57AM -0600, Bjorn Helgaas wrote:
> >> From: Wei Yang <weiy...@linux.vnet.ibm.com>
> >> 
> >> On PowerNV platform, resource position in M64 implies the PE# the resource
> >> belongs to.  In some cases, adjustment of a resource is necessary to locate
> >> it to a correct position in M64.
> >> 
> >> Add pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR address
> >> according to an offset.
> >> 
> >> [bhelgaas: rework loops, rework overlap check, index resource[]
> >> conventionally, remove pci_regs.h include, squashed with next patch]
> >> Signed-off-by: Wei Yang <weiy...@linux.vnet.ibm.com>
> >> Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
> >
> >...
> >
> >> +#ifdef CONFIG_PCI_IOV
> >> +static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
> >> +{
> >> +  struct pci_dn *pdn = pci_get_pdn(dev);
> >> +  int i;
> >> +  struct resource *res, res2;
> >> +  resource_size_t size;
> >> +  u16 vf_num;
> >> +
> >> +  if (!dev->is_physfn)
> >> +          return -EINVAL;
> >> +
> >> +  /*
> >> +   * "offset" is in VFs.  The M64 windows are sized so that when they
> >> +   * are segmented, each segment is the same size as the IOV BAR.
> >> +   * Each segment is in a separate PE, and the high order bits of the
> >> +   * address are the PE number.  Therefore, each VF's BAR is in a
> >> +   * separate PE, and changing the IOV BAR start address changes the
> >> +   * range of PEs the VFs are in.
> >> +   */
> >> +  vf_num = pdn->vf_pes;
> >> +  for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> >> +          res = &dev->resource[i + PCI_IOV_RESOURCES];
> >> +          if (!res->flags || !res->parent)
> >> +                  continue;
> >> +
> >> +          if (!pnv_pci_is_mem_pref_64(res->flags))
> >> +                  continue;
> >> +
> >> +          /*
> >> +           * The actual IOV BAR range is determined by the start address
> >> +           * and the actual size for vf_num VFs BAR.  This check is to
> >> +           * make sure that after shifting, the range will not overlap
> >> +           * with another device.
> >> +           */
> >> +          size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
> >> +          res2.flags = res->flags;
> >> +          res2.start = res->start + (size * offset);
> >> +          res2.end = res2.start + (size * vf_num) - 1;
> >> +
> >> +          if (res2.end > res->end) {
> >> +                  dev_err(&dev->dev, "VF BAR%d: %pR would extend past %pR 
> >> (trying to enable %d VFs shifted by %d)\n",
> >> +                          i, &res2, res, vf_num, offset);
> >> +                  return -EBUSY;
> >> +          }
> >> +  }
> >> +
> >> +  for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
> >> +          res = &dev->resource[i + PCI_IOV_RESOURCES];
> >> +          if (!res->flags || !res->parent)
> >> +                  continue;
> >> +
> >> +          if (!pnv_pci_is_mem_pref_64(res->flags))
> >> +                  continue;
> >> +
> >> +          size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
> >> +          res2 = *res;
> >> +          res->start += size * offset;
> >
> >I'm still not happy about this fiddling with res->start.
> >
> >Increasing res->start means that in principle, the "size * offset" bytes
> >that we just removed from res are now available for allocation to somebody
> >else.  I don't think we *will* give that space to anything else because of
> >the alignment restrictions you're enforcing, but "res" now doesn't
> >correctly describe the real resource map.
> >
> >Would you be able to just update the BAR here while leaving the struct
> >resource alone?  In that case, it would look a little funny that lspci
> >would show a BAR value in the middle of the region in /proc/iomem, but
> >the /proc/iomem region would be more correct.
> 
> Bjorn,
> 
> I did some tests, while the result is not good.
> 
> What I did is still write the shifted resource address to the device by
> pci_update_resource(), but I revert the res->start to the original one. If
> this step is not correct, please let me know.
> 
> This can't work since after we revert the res->start, those VFs will be given
> resources from res->start instead of (res->start + offset * size). This is not
> what we expect.

Hmm, yes, I suppose we'd have to have a hook in pci_bus_alloc_from_region()
or something.  That's getting a little messy.  I still don't like messing
with the resource after it's in the resource tree, but I don't have a
better idea right now.  So let's just go with what you have.

> >> +
> >> +          dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (enabling %d 
> >> VFs shifted by %d)\n",
> >> +                   i, &res2, res, vf_num, offset);
> >> +          pci_update_resource(dev, i + PCI_IOV_RESOURCES);
> >> +  }
> >> +  pdn->max_vfs -= offset;
> >> +  return 0;
> >> +}
> >> +#endif /* CONFIG_PCI_IOV */
> 
> -- 
> Richard Yang
> Help you, Help me
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Reply via email to