As we are about to switch from VPP 19.01 to 19.08, we have encountered a problem with NAT performance. We are trying to use the same settings (as far as possible) for 19.08 as we did for 19.01, on the same computer.
In 19.01 we used 11 worker threads in total, combined with "set nat workers 0-6" so that 7 of the worker threads handled NAT work. That worked fine in 19.01, but when we try the same with 19.08 the performance becomes really bad. The problem seems related to the choice of NAT threads.

Examples to illustrate the issue:

"set nat workers 0-1" --> works fine for both 19.01 and 19.08.
"set nat workers 2-3" --> works fine for 19.01, but gives bad performance for 19.08.

It seems as if, in 19.08, only worker threads 0 and 1 can do NAT work with decent performance; as soon as any other threads are specified, performance drops. In contrast, in 19.01 seemingly any of the threads can be used for NAT without performance problems. "Bad" performance here means roughly a 10x slowdown: VPP starts to drop packets already at about 10% of the traffic it could otherwise handle, so it is a really big difference.

Using gdb I was able to verify that the NAT functions really are executed by the worker threads chosen with "set nat workers", and as long as there is not too much traffic VPP still processes the packets correctly; it just becomes really slow when NAT threads other than 0 and 1 are used.

My best guess is that the problem has something to do with how threads are bound (or not) to certain CPU cores and/or NUMA memory banks. We have not changed any configuration options related to such things, but if the default behavior has changed between 19.01 and 19.08 then that could explain it. The behavior of the current master branch seems to be the same as for 19.08.

Questions:

1. Are there some new configuration options that we need to use to make 19.08 work with good performance using more than 2 NAT threads?
2. Has the default behavior regarding binding of threads to CPU cores changed between VPP versions 19.01 and 19.08?
3. Other ideas about what could be causing this and/or how to troubleshoot further?

(In case that matters, we are using Mellanox hardware interfaces that required "make dpdk-install-dev DPDK_MLX5_PMD=y DPDK_MLX5_PMD_DLOPEN_DEPS=n" when building for VPP 19.01, while for 19.08 the interfaces are set up using "create int rdma host-if ...".)

For reference, rough sketches of our thread configuration, the interface setup, and the checks I intend to run are included below.
Best regards,
Elias