Hi folks,

I am seeing performance scalability issues with DPDK on a Mellanox ConnectX-3.
Each of our machines has 16 cores and a single-port 40G Mellanox ConnectX-3 EN. We find that server throughput *does not scale* with the number of cores. With a single thread on one core, we get about 2 Mpps with a simple echo server implementation. However, throughput does not increase as we use more cores. Our implementation is based on the l2fwd example.

I'd greatly appreciate it if anyone could provide some insight into what the problem might be and how we can improve performance with the Mellanox ConnectX-3 EN.

Thanks!

Best,
Xiaozhou