On Sun, Jul 02, 2023 at 04:55:48AM -0400, Michael S. Tsirkin wrote:
> On Sun, Jul 02, 2023 at 05:46:38PM +0900, Akihiko Odaki wrote:
> > On 2023/07/02 13:58, Michael S. Tsirkin wrote:
> > > On Sat, Jul 01, 2023 at 04:01:22PM +0900, Akihiko Odaki wrote:
> > > > The function number must be lower than the next function number
> > > > advertised with ARI.
> > > >
> > > > Signed-off-by: Akihiko Odaki <[email protected]>
> > >
> > > I don't get this logic at all - where is the limitation coming from?
> > >
> > > All I see in the spec is:
> > > Next Function Number - With non-VFs, this field indicates the Function
> > > Number of the next higher
> > > numbered Function in the Device, or 00h if there are no higher numbered
> > > Functions. Function 0 starts
> > > this linked list of Functions.
> > > The presence of Shadow Functions does not affect this field.
> > > For VFs, this field is undefined since VFs are located using First VF
> > > Offset (see Section 9.3.3.9) and VF
> > > Stride (see Section 9.3.3.10).
> > >
> > > and
> > >
> > > To improve the enumeration performance and create a more deterministic
> > > solution, software can
> > > enumerate Functions through a linked list of Function Numbers. The next
> > > linked list element is
> > > communicated through each Function’s ARI Capability Register.
> > > i. Function 0 acts as the head of a linked list of Function Numbers.
> > > Software detects a
> > > non-Zero Next Function Number field within the ARI Capability Register
> > > as the next
> > > Function within the linked list. Software issues a configuration probe
> > > using the Bus Number
> > > captured by the Device and the Function Number derived from the ARI
> > > Capability Register
> > > to locate the next associated Function’s configuration space.
> > > ii. Function Numbers may be sparse and non-sequential in their
> > > consumption by an ARI
> > > Device.
> >
> > The statement "With non-VFs, this field indicates the Function Number of the
> > next higher numbered Function in the Device, or 00h if there are no higher
> > numbered Functions." implies the Function Number of the device should be
> > lower than the value advertised by the field (for non-VFs; this patch does
> > not check if it's VF or not.)
>
>
> Now I get it. Good point! I'd say if we want this check we should add
> it in pcie_ari_init, making that return int.
> But for now it's dead code since you are changing it to 0.
> So maybe a comment in pcie_ari_init is enough:
>
> /*
>  * Note: nextfn must be the Function Number of the next higher numbered
>  * Function in the Device, or 00h if there are no higher numbered
>  * Functions.
>  * TODO: validate this.
>  */
Or add an assert, with a comment along these lines:
TODO: in case this can ever come from the command line, we'll have
to replace the assert below with a runtime check.
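
Roughly like the sketch below. This is only an illustration of the
rule quoted from the spec, not actual QEMU code: the helper names
ari_nextfn_is_valid() and ari_check_nextfn() are made up for this
example, and the real check would presumably live in pcie_ari_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing QEMU function).
 * Returns true if nextfn is an acceptable ARI Next Function Number
 * for a non-VF function numbered fn.
 */
static bool ari_nextfn_is_valid(uint8_t fn, uint8_t nextfn)
{
    /* 00h means "no higher numbered Functions", which is always fine. */
    return nextfn == 0 || fn < nextfn;
}

/* Hypothetical stand-in for the check that could sit in pcie_ari_init(). */
static void ari_check_nextfn(uint8_t fn, uint8_t nextfn)
{
    /*
     * TODO: in case nextfn can ever come from the command line, replace
     * this assert with a runtime check that reports an error.
     */
    assert(ari_nextfn_is_valid(fn, nextfn));
}
```

With the assert, a wrong value from a device model is caught at
development time, while user-facing error reporting stays out of the
picture until the value can actually be user-controlled.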
> > >
> > > > ---
> > > > hw/pci/pci.c | 15 +++++++++++++++
> > > > 1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index e2eb4c3b4a..568665ee42 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev,
> > > > Error **errp)
> > > > Error *local_err = NULL;
> > > > bool is_default_rom;
> > > > uint16_t class_id;
> > > > + uint16_t ari;
> > > > + uint16_t nextfn;
> > > > /*
> > > > * capped by systemd (see: udev-builtin-net_id.c)
> > > > @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev,
> > > > Error **errp)
> > > > }
> > > > }
> > > > +    if (pci_is_express(pci_dev)) {
> > > > +        ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> > > > +        if (ari) {
> > > > +            nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> > > > +            if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> > > > +                error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> > > > +                           pci_dev->devfn & 0xff, nextfn);
> > > > +                pci_qdev_unrealize(DEVICE(pci_dev));
> > > > +                return;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > if (pci_dev->failover_pair_id) {
> > > > if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> > > > error_setg(errp, "failover primary device must be on "
> > > > --
> > > > 2.41.0
> > >