On Thu, 11 Apr 2019 10:24:45 -0700 "Stephen Hemminger via Lists.Fd.Io" <stephen=networkplumber....@lists.fd.io> wrote:
> On Thu, 11 Apr 2019 09:30:12 +0000
> "Benoit Ganne (bganne)" <bga...@cisco.com> wrote:
>
> > Hi Stephen,
> >
> > > The rdma-core stuff is likely to be a problem on Azure. The Mellanox
> > > device is always hidden as a secondary device behind a synthetic virtual
> > > device based on VMBus. There are two different ways DPDK uses this: one
> > > is with vdev_netvsc/failsafe/tap, the other is with the netvsc PMD.
> > > In either case, the virtual device expects to see the MLX device show up
> > > as a VF if enabled.
> > > So unless you want to write native VPP drivers for VMBus as well, I don't
> > > see how having a native rdma-core driver will help. It will only get
> > > confused.
> >
> > I'd be interested to better understand how it works. My current
> > understanding:
> > - the Mellanox device shows up as a PCI SR-IOV VF in the Linux VM ("the VF")
> > - the VF is associated with a VMBus (not PCI) netvsc virtual device
> > - the netvsc and the VF share the same MAC. The netvsc is guaranteed to
> >   always be there, whereas the VF can disappear between migrations, etc.
> > - control-plane traffic (multicast, e.g. ARP; anything else?) always goes
> >   through the netvsc
> > - if the VF is present, it gets all the traffic minus the control plane. If it
> >   is not present, all traffic gets rerouted through the netvsc as a fallback
> >
> > Am I correct? Could you elaborate a little more on what the control-plane
> > traffic is (basically, what we would miss on the VF)?
> > If so, I see at least 2 problems we need to tackle:
> > - we need to receive control-plane traffic from the netvsc because of e.g. ARP.
> >   However, being control-plane/fallback, it could be slow and use e.g. TAP or
> >   even AF_PACKET. We will also need to use it to populate the neighbor
> >   entries of the VF
> > - the VF can appear/disappear without notice in case of migration, etc. That
> >   could be handled in different ways, e.g. by the control-plane agent
> >   (removing the rdma iface from VPP prior to migration, adding it back
> >   post-migration), or by leveraging bonding, or...
> >
> > Thanks for your help,
> > Ben
>
> Very close.
> - In Azure, the first packet of every flow arrives on the netvsc (slow path).
>   Packets arrive at the FPGA; anything with a non-matching flow goes to the
>   Azure host flow control (VFP), which evaluates it against rules. On success,
>   it programs the FPGA and forwards the packet over the netvsc path.
> - The presence of the VF is reported as an event on VMBus.
> - The VF is not enabled until netvsc sends a message to the host, which acts
>   as an ack that the VF is OK.
> - VF removal is reported to netvsc, and it switches the datapath back.
> - The VF should not be manipulated directly; it is hidden in DPDK, and in
>   Linux/FreeBSD it is marked as a slave.

Additionally, Windows and recent versions of Linux associate the netvsc with the
VF via the VMBus serial number. This is simpler and more reliable than the MAC
address.
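To make the vdev_netvsc/failsafe/tap path mentioned above concrete, below is a
minimal sketch of how a DPDK application brings it up. The application name,
the "eth1" interface name and the exact argument layout are placeholders/assumptions;
the authoritative options are in the DPDK vdev_netvsc and fail-safe PMD guides.

/* Sketch only: instantiate the vdev_netvsc/failsafe/tap path on Azure.
 * "eth1" is a placeholder for the synthetic netvsc interface name. */
#include <stdio.h>
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_ethdev.h>

int main(void)
{
    char *eal_args[] = {
        "azure-app",
        /* vdev_netvsc wraps the netvsc interface in a fail-safe port with a
         * tap sub-device and, when present, the Mellanox VF. The application
         * only ever sees the fail-safe port; the VF stays hidden. */
        "--vdev=net_vdev_netvsc0,iface=eth1",
    };
    if (rte_eal_init(2, eal_args) < 0)
        rte_exit(EXIT_FAILURE, "cannot init EAL\n");

    /* List the ports that were enumerated (expect the fail-safe port,
     * not the VF itself). */
    uint16_t port_id;
    RTE_ETH_FOREACH_DEV(port_id) {
        struct rte_eth_dev_info info;
        rte_eth_dev_info_get(port_id, &info);
        printf("port %u: %s\n", port_id, info.driver_name);
    }
    return 0;
}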
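On the control-plane point raised above (receiving ARP and similar traffic from
the netvsc interface, e.g. via AF_PACKET, and using it to populate the neighbor
entries of the VF), a minimal user-space sketch could look like the following.
The "eth1" interface name is an assumption and the neighbor-programming step is
left as a stub; this is not part of any existing VPP or DPDK code.

/* Sketch: listen for ARP on the netvsc synthetic interface via AF_PACKET. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

int main(void)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ARP));
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_ll sll = {
        .sll_family   = AF_PACKET,
        .sll_protocol = htons(ETH_P_ARP),
        .sll_ifindex  = if_nametoindex("eth1"), /* netvsc synthetic interface */
    };
    if (sll.sll_ifindex == 0 ||
        bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        perror("bind"); close(fd); return 1;
    }

    unsigned char buf[2048];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n <= 0) break;
        /* A control-plane agent would parse the ARP payload here and program
         * the corresponding neighbor entry on the VF/rdma interface. */
        printf("got ARP frame, %zd bytes\n", n);
    }
    close(fd);
    return 0;
}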