>>> On 20.03.15 at 21:01, <boris.ostrov...@oracle.com> wrote:
> On 03/20/2015 12:26 PM, Jan Beulich wrote:
>>>>> On 19.03.15 at 22:54, <boris.ostrov...@oracle.com> wrote:
>>> --- a/xen/common/sysctl.c
>>> +++ b/xen/common/sysctl.c
>>> @@ -399,6 +399,67 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>>>           break;
>>>   #endif
>>>   
>>> +#ifdef HAS_PCI
>>> +    case XEN_SYSCTL_pcitopoinfo:
>>> +    {
>>> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
>>> +
>>> +        if ( guest_handle_is_null(ti->devs) ||
>>> +             guest_handle_is_null(ti->nodes) ||
>>> +             (ti->first_dev > ti->num_devs) )
>>> +        {
>>> +            ret = -EINVAL;
>>> +            break;
>>> +        }
>>> +
>>> +        while ( ti->first_dev < ti->num_devs )
>>> +        {
>>> +            physdev_pci_device_t dev;
>>> +            uint32_t node;
>>> +            struct pci_dev *pdev;
>>> +
>>> +            if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            spin_lock(&pcidevs_lock);
>>> +            pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
>>> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
>>> +                node = XEN_INVALID_NODE_ID;
>> I really think the two cases folded here should be distinguishable
>> by the caller.
> 
> How about making the ti->devs array an IN/OUT argument and updating the
> entry with -1s (which I think is an invalid PCI device)? This will make 
> the original deviceID disappear though so the callers would be expected 
> to stash the array before making the call if they want to know which 
> devices were not reported.

Sadly all ones in physdev_pci_device_t still could be a valid device.
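
To illustrate why all ones cannot serve as a sentinel, here is a minimal
standalone sketch. It assumes the struct layout matches
physdev_pci_device_t in the public physdev.h (16-bit seg, 8-bit bus,
8-bit devfn); the names pci_dev_id, slot_of, func_of and format_sbdf are
made up for this example and are not Xen code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror physdev_pci_device_t; hypothetical name. */
struct pci_dev_id {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* devfn packs a 5-bit device (slot) number and a 3-bit function number. */
static unsigned int slot_of(uint8_t devfn) { return (devfn >> 3) & 0x1f; }
static unsigned int func_of(uint8_t devfn) { return devfn & 0x7; }

/* Format as ssss:bb:dd.f; returns the snprintf() result. */
static int format_sbdf(const struct pci_dev_id *d, char *buf, size_t len)
{
    assert(d && buf);
    return snprintf(buf, len, "%04x:%02x:%02x.%x",
                    (unsigned int)d->seg, (unsigned int)d->bus,
                    slot_of(d->devfn), func_of(d->devfn));
}
```

With every field set to all ones this formats as ffff:ff:1f.7, i.e.
segment 0xffff, bus 0xff, device 31, function 7, which is a well-formed
address that real hardware could in principle occupy.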

> Alternatively, since node is 32-bit value while nodeid_t is 8-bit, I can 
> add another token that signifies an invalid device. The main problem 
> with this approach is that logically we use 'nodes' array for passing 
> nodeIDs, not information about devices.

I realize that. I wonder whether passing in a bad device shouldn't
simply result in -ENODEV, perhaps with first_dev pointing at the
bad slot?
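
Roughly what I have in mind, as a standalone mock rather than a patch.
Everything here is hypothetical: the table lookup stands in for
pci_get_pdev(), plain arrays stand in for the guest handles, and
NO_NODE / INVALID_NODE_ID are placeholder values for NUMA_NO_NODE and
XEN_INVALID_NODE_ID.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE          0xffu  /* placeholder for NUMA_NO_NODE */
#define INVALID_NODE_ID  (~0u)  /* placeholder for XEN_INVALID_NODE_ID */

struct dev_id { uint16_t seg; uint8_t bus; uint8_t devfn; };
struct known_dev { struct dev_id id; uint8_t node; };

/* Stand-in for pci_get_pdev(): linear search of a table of known devices. */
static const struct known_dev *lookup(const struct known_dev *tbl, size_t n,
                                      const struct dev_id *id)
{
    for ( size_t i = 0; i < n; i++ )
        if ( tbl[i].id.seg == id->seg && tbl[i].id.bus == id->bus &&
             tbl[i].id.devfn == id->devfn )
            return &tbl[i];
    return NULL;
}

/*
 * Returns 0 on success or -ENODEV for an unknown device. *first_dev is
 * only advanced past entries that were handled, so on -ENODEV it is the
 * index of the bad slot.
 */
static int query_nodes(const struct known_dev *tbl, size_t ntbl,
                       const struct dev_id *devs, uint32_t *nodes,
                       uint32_t num_devs, uint32_t *first_dev)
{
    assert(devs && nodes && first_dev);

    while ( *first_dev < num_devs )
    {
        const struct known_dev *k = lookup(tbl, ntbl, &devs[*first_dev]);

        if ( !k )
            return -ENODEV;            /* bad device: distinct error */

        /* Known device without node information: distinct in-band value. */
        nodes[*first_dev] = (k->node == NO_NODE) ? INVALID_NODE_ID : k->node;
        ++*first_dev;
    }

    return 0;
}
```

That way "no such device" and "device has no node" are reported through
different channels, and the caller can skip the bad slot and resume at
first_dev + 1 if it wants to.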

>>> +            else
>>> +                node = pdev->node;
>>> +            spin_unlock(&pcidevs_lock);
>>> +
>>> +            if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            ti->first_dev++;
>>> +
>>> +            if ( hypercall_preempt_check() )
>>> +                break;
>>> +        }
>>> +
>>> +        if ( !ret )
>>> +        {
>>> +            if ( __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
>>> +            {
>>> +                ret = -EFAULT;
>>> +                break;
>>> +            }
>>> +
>>> +            if ( ti->first_dev < ti->num_devs )
>>> +                ret = hypercall_create_continuation(__HYPERVISOR_sysctl,
>>> +                                                    "h", u_sysctl);
>> Considering this is a tools-only interface, enforcing a not too high
>> limit on num_devs would seem better than this not really clean
>> continuation mechanism. The (tool stack) caller(s) can be made to
>> iterate.
> 
> What's a reasonable limit per call? 100?

Commonly we use powers of two for these, even if not strictly
needed to be that way. Hence I'd suggest 64. But please be sure
not to make this implementation detail part of the ABI.
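
A sketch of how a caller could iterate without knowing the limit. This
is a mock, not a proposed interface: it assumes each call reports how
many entries it handled, and mock_query, query_all and MOCK_MAX_PER_CALL
are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-call cap, private to the callee; callers never see it. */
#define MOCK_MAX_PER_CALL 64u

static unsigned int mock_calls;

/*
 * Mock of one bounded call: handles at most MOCK_MAX_PER_CALL entries
 * starting at 'first' and returns the number of entries handled.
 */
static uint32_t mock_query(uint32_t *nodes, uint32_t first, uint32_t num)
{
    uint32_t todo;

    assert(first <= num);
    todo = num - first;
    if ( todo > MOCK_MAX_PER_CALL )
        todo = MOCK_MAX_PER_CALL;

    for ( uint32_t i = 0; i < todo; i++ )
        nodes[first + i] = 0;   /* pretend every device sits on node 0 */

    ++mock_calls;
    return todo;
}

/*
 * Caller-side loop: relies only on the progress each call reports, so
 * the cap can change without the caller noticing.
 */
static int query_all(uint32_t *nodes, uint32_t num)
{
    uint32_t first = 0;

    while ( first < num )
    {
        uint32_t done = mock_query(nodes, first, num);

        if ( done == 0 )
            return -1;          /* no progress: give up rather than spin */
        first += done;
    }

    return 0;
}
```

With 150 devices this takes three calls (64 + 64 + 22), and changing the
cap only changes the number of calls, not the caller's code.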

Jan

