> On 05-Jul-2023, at 7:09 AM, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
> 
> 
> 
> On 2023/07/05 0:07, Ani Sinha wrote:
>>> On 04-Jul-2023, at 7:58 PM, Igor Mammedov <imamm...@redhat.com> wrote:
>>> 
>>> On Tue, 4 Jul 2023 19:20:00 +0530
>>> Ani Sinha <anisi...@redhat.com> wrote:
>>> 
>>>>> On 04-Jul-2023, at 6:18 PM, Igor Mammedov <imamm...@redhat.com> wrote:
>>>>> 
>>>>> On Tue, 4 Jul 2023 21:02:09 +0900
>>>>> Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>>>>> 
>>>>>> On 2023/07/04 20:59, Ani Sinha wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> On 04-Jul-2023, at 5:24 PM, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>>>>>>>> 
>>>>>>>> On 2023/07/04 20:25, Ani Sinha wrote:
>>>>>>>>> PCI Express ports only have one slot, so PCI Express devices can only be
>>>>>>>>> plugged into slot 0 on a PCIE port. Add a warning to let users know when the
>>>>>>>>> invalid configuration is used. We may enforce this more strongly later on once
>>>>>>>>> we get more clarity on whether we are introducing a bad regression for users
>>>>>>>>> currently using the wrong configuration.
>>>>>>>>> The change has been tested to not break or alter behaviors of ARI capable
>>>>>>>>> devices by instantiating seven vfs on an emulated igb device (the maximum
>>>>>>>>> number of vfs the linux igb driver supports). The vfs instantiated correctly
>>>>>>>>> and are seen to have non-zero device/slot numbers in the conventional PCI BDF
>>>>>>>>> representation.
>>>>>>>>> CC: jus...@redhat.com
>>>>>>>>> CC: imamm...@redhat.com
>>>>>>>>> CC: m...@redhat.com
>>>>>>>>> CC: akihiko.od...@daynix.com
>>>>>>>>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>>>>>>>>> Signed-off-by: Ani Sinha <anisi...@redhat.com>
>>>>>>>>> Reviewed-by: Julia Suvorova <jus...@redhat.com>
>>>>>>>>> ---
>>>>>>>>> hw/pci/pci.c | 15 +++++++++++++++
>>>>>>>>> 1 file changed, 15 insertions(+)
>>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>>>>> index e2eb4c3b4a..47517ba3db 100644
>>>>>>>>> --- a/hw/pci/pci.c
>>>>>>>>> +++ b/hw/pci/pci.c
>>>>>>>>> @@ -65,6 +65,7 @@ bool pci_available = true;
>>>>>>>>> static char *pcibus_get_dev_path(DeviceState *dev);
>>>>>>>>> static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>>>>>>>> static void pcibus_reset(BusState *qbus);
>>>>>>>>> +static bool pcie_has_upstream_port(PCIDevice *dev);
>>>>>>>>>   static Property pci_props[] = {
>>>>>>>>>     DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>>>>>>>>> @@ -2121,6 +2122,20 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>> +    /*
>>>>>>>>> +     * With SRIOV and ARI, vfs can have non-zero slot in the conventional
>>>>>>>>> +     * PCI interpretation as all five bits reserved for slot addresses are
>>>>>>>>> +     * also used for function bits for the various vfs. Ignore that case.
>>>>>>>> 
>>>>>>>> You don't have to mention SR/IOV; it affects all ARI-capable devices. 
>>>>>>>> A PF can also have non-zero slot number in the conventional 
>>>>>>>> interpretation so you shouldn't call it vf either.
>>>>>>> 
>>>>>>> Can you please help write a comment that explains this properly for all 
>>>>>>> cases - ARI/non-ARI, PFs and VFs? Once everyone agrees that its clear 
>>>>>>> and correct, I will re-spin.
>>>>>> 
>>>>>> Simply, you can say:
>>>>>> With ARI, the slot number field in the conventional PCI interpretation
>>>>>> can have a non-zero value as the field bits are reused to extend the
>>>>>> function number bits. Ignore that case.
>>>>> 
>>>>> mentioning 'conventional PCI interpretation' in comment and then immediately
>>>>> checking 'pci_is_express(pci_dev)' is confusing. Since comment belongs
>>>>> only to PCIE branch it would be better to talk only about PCIe stuff
>>>>> and refer to relevant portions of spec.
>>>> 
>>>> Ok so how about this?
>>>> 
>>>>   * With ARI, devices can have non-zero slot in the traditional BDF
>>>>     * representation as all five bits reserved for slot addresses are
>>>>     * also used for function bits. Ignore that case.
>>> 
>>> you still refer to traditional (which I misread as 'conventional'),
>>> steal the linux comment and augment it with ARI if necessary,
>>> something like this (probably needs some more massaging):
>> The comment messaging in these patches seems to exceed the value of the 
>> patch itself :-)
>> How about this?
>>     /*
>>      * A PCIe Downstream Port normally leads to a Link with only Device
>>      * 0 on it (PCIe spec r3.1, sec 7.3.1).
>>      * With ARI, PCI_SLOT() can return non-zero value as all five bits
>>      * reserved for slot addresses are also used for function bits.
>>      * Hence, ignore ARI capable devices.
>>      */
> 
> Perhaps: s/normally leads to/must lead to/
> 
> From the kernel perspective, they may need to deal with quirky hardware
> that does not conform to the specification, but from the QEMU perspective, it
> is what we *must* conform to.

PCI base spec 4.0, rev 3, section 7.3.1 says:

"  
Downstream Ports that do not have ARI Forwarding enabled must associate only 
Device 0 with the device attached to the Logical Bus representing the Link from 
the Port. Configuration Requests targeting the Bus Number associated with a
Link specifying Device Number 0 are delivered to the device attached to the 
Link; Configuration Requests specifying all other Device Numbers (1-31) must be 
terminated by the Switch Downstream Port or the Root Port with an Unsupported 
Request Completion Status (equivalent to Master Abort in PCI). Non-ARI Devices 
must not assume that Device Number 0 is associated with their Upstream Port, 
but must capture their assigned Device Number as discussed in Section 2.2.6.2. 
Non-ARI Devices must respond to all Type 0 Configuration Read Requests, 
regardless of the Device Number specified in the Request.

…

With an ARI Device, its Device Number is implied to be 0 rather than specified 
by a field within an ID. The traditional 5-bit Device Number and 3-bit Function 
Number fields in its associated Routing IDs, Requester IDs, and Completer IDs 
are interpreted as a single 8-bit Function Number. See Section 6.13. Any Type 0 
Configuration Request targeting an unimplemented Function in an ARI Device must 
be handled as an Unsupported Request.

"

So it seems they do indeed use the “must” clause. I prefer to keep the wording as
close to the spec as possible. Hence, this is what I am going with, so that we can
be done with this patchset:

    /*
     * A PCIe Downstream Port that does not have ARI Forwarding enabled must
     * associate only Device 0 with the device attached to the bus
     * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3,
     * sec 7.3.1).
     * With ARI, PCI_SLOT() can return a non-zero value as the traditional
     * 5-bit Device Number and 3-bit Function Number fields in its associated
     * Routing IDs, Requester IDs and Completer IDs are interpreted as a
     * single 8-bit Function Number. Hence, ignore ARI capable devices.
     */
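
To make the two interpretations in the comment concrete, here is a minimal
standalone sketch of how the same 8-bit devfn value decodes with and without
ARI. It is only an illustration, not the code in the patch: the SKETCH_*
macros are local stand-ins for the usual slot/function split, and the
sketch_* helpers and the has_ari flag are hypothetical names made up for
this example.

```c
#include <stdint.h>

/*
 * Local stand-ins for the traditional devfn split: the upper 5 bits are
 * the Device (slot) Number, the lower 3 bits are the Function Number.
 * Defined here so the sketch compiles on its own.
 */
#define SKETCH_PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn) ((devfn) & 0x07)

/*
 * Hypothetical helper: with ARI the Device Number is implied to be 0 and
 * all 8 bits of devfn are read as a single Function Number.
 */
static unsigned sketch_ari_function(uint8_t devfn)
{
    return devfn;
}

/*
 * Hypothetical helper modelling only the slot part of the check: a
 * non-zero slot is suspicious for a device without ARI, and is ignored
 * for an ARI capable device. The real check also looks at whether the
 * device is PCIe and sits below an upstream port.
 */
static int sketch_slot_needs_warning(uint8_t devfn, int has_ari)
{
    return !has_ari && SKETCH_PCI_SLOT(devfn) != 0;
}
```

For example, devfn 0x0a reads as slot 1, function 2 in the traditional
split, but as function 10 of device 0 with ARI, so only the non-ARI case
would be flagged.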


> 
> Otherwise looks good to me.
> 
>>> 
>>> 
>>>         /*
>>>         * A PCIe Downstream Port normally leads to a Link with only Device
>>>         * 0 on it (PCIe spec r3.1, sec 7.3.1).
>>>          However PCI_SLOT() is broken if ARI is enabled, hence work around it
>>>          by skipping check if the latter cap is present.
>>>         */
>>> 
>>>> 
>>>> 
>>>>> (for example see how it's done in kernel code: only_one_child(...)
>>>>> 
>>>>> PS:
>>>>> kernel can be forced  to scan for !0 device numbers, but that's rather
>>>>> a hack, so we shouldn't really care about that.
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>>> +     */
>>>>>>>>> +    if (pci_is_express(pci_dev) &&
>>>>>>>>> +        !pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) &&
>>>>>>>>> +        pcie_has_upstream_port(pci_dev) &&
>>>>>>>>> +        PCI_SLOT(pci_dev->devfn)) {
>>>>>>>>> +        warn_report("PCI: slot %d is not valid for %s,"
>>>>>>>>> +                    " parent device only allows plugging into slot 0.",
>>>>>>>>> +                    PCI_SLOT(pci_dev->devfn), pci_dev->name);
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>>     if (pci_dev->failover_pair_id) {
>>>>>>>>>         if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
>>>>>>>>>             error_setg(errp, "failover primary device must be on "
>>>> 
>>> 
> 

