Fri, Mar 15, 2019 at 04:32:24PM CET, pa...@mellanox.com wrote:
>
>
>> -----Original Message-----
>> From: Samudrala, Sridhar <sridhar.samudr...@intel.com>
>> Sent: Friday, March 15, 2019 12:58 AM
>> To: Parav Pandit <pa...@mellanox.com>; Jakub Kicinski
>> <jakub.kicin...@netronome.com>
>> Cc: Jiri Pirko <j...@resnulli.us>; da...@davemloft.net;
>> netdev@vger.kernel.org; oss-driv...@netronome.com
>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
>> ports
>>
>>
>> On 3/14/2019 7:40 PM, Parav Pandit wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Samudrala, Sridhar <sridhar.samudr...@intel.com>
>> >> Sent: Thursday, March 14, 2019 9:16 PM
>> >> To: Parav Pandit <pa...@mellanox.com>; Jakub Kicinski
>> >> <jakub.kicin...@netronome.com>
>> >> Cc: Jiri Pirko <j...@resnulli.us>; da...@davemloft.net;
>> >> netdev@vger.kernel.org; oss-driv...@netronome.com
>> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >> devlink PCI ports
>> >>
>> >>
>> >>
>> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
>> >>>
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Jakub Kicinski <jakub.kicin...@netronome.com>
>> >>>> Sent: Thursday, March 14, 2019 6:39 PM
>> >>>> To: Parav Pandit <pa...@mellanox.com>
>> >>>> Cc: Jiri Pirko <j...@resnulli.us>; da...@davemloft.net;
>> >>>> netdev@vger.kernel.org; oss-driv...@netronome.com
>> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
>> >>>> devlink PCI ports
>> >>>>
>> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
>> >>>>>>> Then instances of flavour pci_vf are going to appear in the same
>> >>>>>>> devlink instance.
>> >>>>>>> Those are the switch ports:
>> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
>> >>>>>>> flavour pci_vf pf 0 vf 0
>> >>>>>>> switch_id 00154d130d2f peer
>> >>>>>>> pci/0000:05:10.1/0
>> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
>> >>>>>>> flavour pci_vf pf 0 vf 0 subport 1
>> >>>>>>> switch_id 00154d130d2f peer
>> >>>>>>> pci/0000:05:10.1/1
>> >>>>>>>
>> >>>>>>> With that, peers are going to appear too, and those are the
>> >>>>>>> actual VF/VF subport:
>> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
>> >>>>>>> peer pci/0000:05:00.0/10002
>> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
>> >>>>>>> peer pci/0000:05:00.0/10003
>> >>>>>>>
>> >>>>>>> Later you can push this VF along with all subports to VM. So in
>> >>>>>>> VM, you are going to see the VF like this:
>> >>>>>>> $ devlink dev
>> >>>>>>> pci/0000:00:08.0
>> >>>>>>> $ devlink port
>> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
>> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
>> >>>>>>>
>> >>>>>>> And back to your question of how are they connected in eswitch.
>> >>>>>>> That is totally up to the original user John who did the creation.
>> >>>>>>> He is in charge of the eswitch on baremetal, he would configure
>> >>>>>>> the forwarding however he likes.
>> >>>>>>
>> >>>>>> Ack, so I think you're saying VM has to communicate to the cloud
>> >>>>>> environment to have this provisioned using some service API, not
>> >>>>>> a kernel API. That's what I wanted to confirm.
>> >>>>>>
>> >>>>>> I don't see any benefit to having the "host ports" under devlink,
>> >>>>>> as such I think it's a matter of preference.
>> >>>>>
>> >>>>> We need 'host ports' to configure parameters of this host port
>> >>>>> which is not exposed by the rep-netdev.
>> >>>>> Such as mac address.
>> >>>>
>> >>>> Please look at the quote of what Jiri wrote above - the host port
>> >>>> gets passed to the VM, you can't use it as a handle to set the MAC.
>> >>>>
>> >>>> The way to set the MAC remains:
>> >>>>
>> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr
>> >>>> 00:11:22:33:44:55
>> >>>>
>> >>> Even though it can be done, I think this is wrong model to program
>> >>> hostport mac address using eswitch port.
>> >>> All devlink objects are control objects, so what is passed to VM is
>> >>> what is represented by devlink.
>> >>> VF in the VM will anyway create its devlink object.
>> >>> What is wrong in programming hostport?
>> >>> It gives a very clear view to users of topology and objects.
>> >>
>> >> The VF or any subport MAC address should be configured by the
>> >> orchestration layer that is running on the hypervisor and when a VF
>> >> is assigned to a VF, the host port is not visible to the hypervisor.
>> > What prevents creation of hostport due to which is not visible?
>> > Hostport is control port to program host side of parameters.
>> > It should be created when user wants to program the parameters.
>> >
>> > Model is really straight forward.
>> > Program host port params using hostport object.
>> > Program switchport params using rep-netdev.
>>
>> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for each port -
>> host facing ports and switch facing ports. This is in addition to the netdevs
>> that are created today.
>>
>I am not proposing any different.
>I am proposing only two changes.
>1. control hostport params via referring hostport (not via indirect peer)

Not really possible. If you pass a VF through into a VM, the hostport goes
along with it.

>2. flavour should not be vf/pf, flavour should be hostport, switchport.
>Because switch is flat and agnostic of pf/vf/mdev.

Not sure. It's good to have this kind of visibility.

>
>> Are you suggesting that all the devlink objects should be visible only at the
>> hypervisor layer?
>>
>Of course not.
>
>Ports and params controlled by hypervisor should be exposed at
>hypervisor/eswitch wherever its parent devlink instance exist.
>Ports which should be visible inside a VM should be exposed inside a VM.
>So for a given VF,
>
>If eswitch is at hypervisor level,
>$ devlink port show
>pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f
>peer pci/0000:05:10.1/0
>pci/0000:05:10.1/0 eth netdev flavour hostport switch_id 00154d130d2f peer
>pci/0000:05:00.0/10002
>
>where VF is enumerated,
>$ devlink port show
>pci/0000:05:10.1/0 eth netdev flavour hostport

So this is what it looks like in the VM, right?

>This is because unprivileged VF doesn't have visibility to eswitch and its
>links.
>
>> I think the terminology need to be defined clearly so that we are all on the
>> same page.
>>
>> >
>> >> Currently we have ndo_set_vf_mac_addr api that works with PF netdev,
>> >> but i think we are trying to move away from that API and do all the
>> >> configuration via the port representor netdevs.
>> > This is fine rep-netdev represents eswitch port.
>> > You normally don't go to switch to program host port params.
>> >
>> >> As the mac address cannot be configured using this netdev, i think
>> >> Jakub is suggesting creating a devlink object for each port
>> >> representor and use that interface to set peer mac address.
>> >
>> > I understand but is convoluted interface.
>> > When you program host NIC mac address you talk to iLo or BIOS.
>> > When you program switch side mac address, you go switch/router/modem.
>> >
>> > Also programming host params on host side, also doesn't make
>> > assumption that its connected to eswitch.
>> > It also doesn't assume that same connectivity for its life.
>> >
>> > If you model around how physical devices are configured, it will almost
>> > never go wrong and still provides same level of flexibility.
>> >
>> >> We should be able use this to configure port vlan too.
>> >>
>> >> Also, instead of subport, can we call vport and support different
>> >> types of vports - sr-iov, siov, vmdq etc.
>> >>
>> > At switch level there are just ports.
>> > sriov, siov, mdev, vmdq are their counterpart (peer) where it is connected.
>> >
>> >>>
>> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
>> >>> It doesn't make sense to define 'mdev' flavour which we are already
>> >>> working.
>> >>> At eswitch level it is just a port, it happen to be connected to vf
>> >>> or pf or other objects, it doesn't matter.
>> >>> Port should be flavoured as 'hostport' or 'switchport'.
>> >>>
>> >>>
>> >>>> (using the port ids from above)