Hello Ben, hello all, first of all thanks for the prompt reply.
Now let me clarify the questions; the confusion was caused by my own ignorance.

1. How to use the VPP host stack (TCP)

I refined my understanding after watching Florin's presentation (https://www.youtube.com/watch?v=3Pp7ytZeaLk). Up to now the revised diagram would be:

  Process A (TCP client or server) <--> VCL (VPP Comms Library), or VCL+LDP, or ??? <--> VPP (Session -> TCP -> Eth) <--> memif or ??? custom plug-in <--> Process B (our radio stack, including the driver)

i.e. we would like to use the VPP network stack (also termed host stack) as a user-space TCP stack running over our radio stack. Florin was saying that there is an alternative to VCL; since we have no legacy BSD-socket apps, we are free to use the most advanced interface, and we would like to be zero-copy insofar as possible. The "north" side of the TCP stack is the (client/server) apps; the "south" side is IP or Eth frames. Ideally we would like to know the best options for interfacing with VPP north-bound and south-bound. Note that we do not exit into a NIC card, which would instead be:

  Process A --> VCL --> VPP --> DPDK plug-in (or ??? AF_PACKET / AF_XDP) --> NIC

So what are the best possible solutions? (A rough memif sketch of what I have in mind for the south side is in the P.S. below.)

2. VPP multi-instance

I am not asking about multi-threading (which I already use successfully), but about running multiple VPP processes in parallel, of course paying attention to core pinning. My question was: which options do I need to change in startup.conf? (See the P.S. for the kind of per-instance split I am guessing at.)

3. My tests and the RSS mechanism

My set-up is the following: two identical machines, X12s (two Xeons, 38+38 cores), each equipped with one Mellanox 100 Gbps NIC (one ConnectX-4, one ConnectX-5). I use iperf3 with LDP+VCL to interface with VPP, hence the flow is:

  iperf3 client <--> VCL+LDP -> VPP -> DPDK plug-in -> Mlx NIC <-link-> Mlx NIC -> DPDK plug-in -> VPP -> VCL+LDP <--> iperf3 server
  (machine A on the left, machine B on the right)

Distribution is Ubuntu 20.04 LTS with a customised low-latency kernel, all cores isolated except two. VPP version 21.10, recompiled natively on the machines. I am using DPDK, not the RDMA driver.

What I am observing are strange variations in throughput in the following scenario: iperf3 single TCP stream on one isolated core, VPP with 8 cores pinned to 8 NIC queues. Sometimes it is 15 Gbps, sometimes it is 36 Gbps ("show hardware" says 3 queues are used...). Hence I was a bit puzzled about RSS: I was not expecting such large variations from run to run. I am not a VPP expert, so any suggestions on what to look for are welcome :-) (The commands I have been looking at to inspect queue/worker placement are listed in the P.S.)

Thank you in advance for your patience and for your time.

Kind regards
Federico
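P.S. Some concrete sketches of what I mean above; the names, addresses and paths are placeholders I invented for illustration, not something I have already tested.

For question 1, on the south side I imagine a memif link between VPP and Process B, with the VPP side configured roughly like this (Process B would attach as the memif slave through libmemif and receive the frames the host stack emits):

  # VPP side, via vppctl (interface id and IP address are placeholders)
  vppctl create interface memif id 0 master
  vppctl set interface state memif0/0 up
  vppctl set interface ip address memif0/0 10.10.1.1/24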
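For question 2, this is the kind of per-instance startup.conf split I am guessing at: each instance with its own runtime dir, CLI socket, API prefix, non-overlapping cores and its own PCI device. If there are other per-instance resources I am missing, please point them out.

  # /etc/vpp/startup-vpp1.conf (hypothetical first-instance config)
  unix {
    nodaemon
    runtime-dir /run/vpp/vpp1
    cli-listen /run/vpp/vpp1/cli.sock
    log /var/log/vpp/vpp1.log
  }
  # separate shared-memory API prefix per instance
  api-segment {
    prefix vpp1
  }
  # non-overlapping core pinning per instance
  cpu {
    main-core 1
    corelist-workers 2-9
  }
  # each instance owns its own NIC (PCI address is a placeholder)
  dpdk {
    dev 0000:41:00.0
  }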
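For question 3, the queue settings I have in mind in startup.conf and the commands I have been looking at to check where the single flow lands; if there are better ways to see how RSS spreads (or does not spread) one TCP flow across the queues, I would be glad to hear them.

  # startup.conf fragment: 8 RX/TX queues on the Mellanox port (PCI address is a placeholder)
  dpdk {
    dev 0000:c1:00.0 {
      num-rx-queues 8
      num-tx-queues 8
    }
  }

  # inspection commands
  vppctl show hardware-interfaces
  vppctl show interface rx-placement
  vppctl show threads
  vppctl show run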