On 2/5/21 12:13 PM, Xueming(Steven) Li wrote: > >> -----Original Message----- >> From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru> >> Sent: Friday, February 5, 2021 3:35 PM >> To: Xueming(Steven) Li <xuemi...@nvidia.com> >> Cc: dev@dpdk.org; Slava Ovsiienko <viachesl...@nvidia.com>; Asaf Penso >> <as...@nvidia.com>; Thomas Monjalon >> <tmonja...@nvidia.com> >> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction >> representor >> >> On 2/4/21 5:15 PM, Xueming(Steven) Li wrote: >>> >>>> -----Original Message----- >>>> From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru> >>>> Sent: Monday, February 1, 2021 4:39 PM >>>> To: Xueming(Steven) Li <xuemi...@nvidia.com> >>>> Cc: dev@dpdk.org; Slava Ovsiienko <viachesl...@nvidia.com>; Asaf >>>> Penso <as...@nvidia.com> >>>> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction >>>> representor >>>> >>>> On 1/28/21 5:31 PM, Xueming(Steven) Li wrote: >>>>> <snip> >>>>>>> The patch of device SF capability, but seems I misunderstood your >>>>>>> suggestion. >>>>>>> Let me explain process to create a SF: >>>>>>> 1. SF can be created on the fly with scripts, unlike VF which is >>>>>>> statically pre-created. >>>>>> >>>>>> Is there a maximum index and maximum total number of SF's created? How >>>>>> to find it? >>>>> >>>>> The maximum index is defined by firmware configuration, all SF's >>>>> information could be found from sysfs. To create a SF, both PCI and sfnum >>>>> have to be specified. >>>> >>>> sysfs is obviously Linux specific. I think the information should be >>>> available via DPDK API. >>> >>> Yes, the new api discussed below should resolve this issue. >>> >>>> >>>>>> >>>>>>> 2. SF is created on a PF with a SF number. SF number is named per PF, >>>>>>> different PF may have same SF number. >>>>>>> 3. For standalone PF, hot plug to DPDK using "PF#_BDF,representor=sf#", >>>>>>> no need to use pf#sf# here. >>>>>>> 4. For bonding netdev, hot plug to DPDK using >>>>>>> "PF0_BDF,representor=pf#sf#" >>>>>>> If using new api to return all representor IDs, need some way >>>>>>> locate the new created SF by PF and SF number, that's why "pf#sf#" is >>>>>>> used in this patch set. >>>>>> >>>>>> I think the API should simply reserve/report space for maximum >>>>>> number of SFs. So, IDs are stable across restart/reboot in >>>>>> assumption that NIC is not reconfigured (changed maximum number of >>>>>> VF or >>>> maximum number of SFs of any PF). >>>>> >>>>> Yes, IDs should be stable as long as no NIC firmware configuration >>>>> change. >>>>> >>>>> Just clarify, this api should be common enough to report all devices that >>>>> a bus device supports: >>>>> 1. name, might contains controller and pf info, example: >>>>> "eth:representor:c0pf1vf" >>>>> 2. ID range, example: 0-127 >>>>> The api describes ID ranges for each sub device type, users have to query >>>>> the api and choose representor ID to probe. >>>>> >>>>> Prototype: >>>>> struct rte_bus_device_range { >>>>> char name[64]; >>>>> uint32_t base; >>>>> uint32_t number; >>>>> } >>>>> /* return number of ranges filled, or number of ranges if list is >>>>> NULL. */ int rte_bus_ dev_range_get(struct rte_bus_device_range >>>>> *list, int n); >>>> >>>> Hm, I thought about more port representor specific API. >>>> For me it is hard to tell if such generic naming is good or bad. I >>>> think it should be proven that such generic API makes sense. Any other >>>> potential users / use cases? >>> >>> I was thinking about SF, but SF is PCI specific, not suitable for this api. >>> So I'm fine to make it as ethdev api. >>> To append new api into eth_dev_ops, is there ABI concern? >> >> No, eth_dev_ops are internal >> >>>> I've considered ethdev API which returns (in similar way as >>>> above) list of possible port representors which could be controlled >>>> by the device. Also I think it would be useful to include type information >>>> (enum with PF, VF, SF), controller ID. >>> >>> Agree. >>> >>> There is a new concern from orchestration side, currently, no >>> interface in openstack and OVS to retrieve representor ID range info, >>> It will take time to adapt this solution. To probe a representor, >>> orchestration need to know how to calculate representor ID, and the ID >>> might vary on different max SF number, i.e. VF4 on PP1 >> might got different ID. Representor ID change before that will break the >> product. >> >> I see. >> >>> Considering both orchestration and testpmd users, how about keeping both >>> solution together? This will bring max flexibility IMHO. >> >> As I said before I don't mind and I really think it is a good idea to add >> suggested interface to specify representor (i.e. cXpfYvfZ), but the >> problem is making bitmap from representor ID. >> >> ethdev API should use new representor info API to make a representor ID from >> controller/PF/{VF,SF}. >> Or do you see any problems with such approach? > > Sorry I thought the user to figure out representor ID from api. > This combination look good, thanks for clarification :) > > So the new api looks like this:
Roughly speaking - yes > struct rte_eth_representor_info { > Enum representor_type; > Uint16_t controller; // -1 for any I'm not sure that I understand what does "any" mean in this case. I think it should be the zero in examples below. I think that API should return caller controller ID and PF ID. It would allow to interpret "vf5" correctly when caller is not controller #0 and/or PF #0. > Uint16_t port; // -1 for any port sounds like physical port, but it should be PF (pf, phys_fn or something like this). It could be many PFs per physical network port. > Uint16_t representor_id; May be base_id? Or rep_base_id? The question is what to do if range for VF or SF is not contiguous. Should we have one more index after phys_fn to represent it? E.g. union { uint16_t vf; uint16_t sf; }; > Uint16_t count; May be id_range which should be 1 to show one function. It could be convenient to treat 0 this way as well, but I doubt that it is a good idea. > char name[N]; > > int rte_eth_representor_info_get(struct rte_eth_representor_info *infos); > - Return number of entries. > - NULL infos just return number of entries supported. > Sample outputs: > VF, -1, 0, 0, 128, "pf0vf" > SF, -1, 0, 128, 2048, "pf0sf" > PF, -1, 0, 32767, 1, "pf" > VF, -1, 1, 32768, 128, "pf1vf" > SF, -1, 0, (32768+128), 2048, "pf1sf" > PF, -1, 0, 65535, 1, "pf" > >> >>> In struct rte_eth_dev_data, reserved bits could be used to define >>> controller and port, this will avoid bitmap. How do you think? >> >> Could you add a bit more on it? Just a bit more details to the idea since I >> don't understand what exactly you mean and how it could >> help. > > The idea is replacing reserved_64s and adding more device location info in > rte_eth-dev_data like this: > Uint16_t representor_id; > Uint16_t port_id; > Uint16_t controller_id; > Enum representor_type; > Compare them all when matching a device, this will also avoid bitmap > encoding. > Reserved_64s[] was added to mitigate ABI conflicts, IIRC. > But seems no need if making representor info API to make ID. > >> >>>> >>>> There is one more bit which is not in the picture yet - >>>> switch_info.port_id. Should it be equal to representor ID? Or different >>>> and provided in the info structure? >>> >>> Not exactly same AFAIK, the id used in e-switch. >>> >>> >