As we are preparing to switch from VPP 19.01 to 19.08, we have run into a
problem with NAT performance. We are trying to use the same settings (as far
as possible) for 19.08 as we did for 19.01, on the same machine.

In 19.01 we used 11 worker threads in total, combined with "set nat
workers 0-6" so that 7 of the worker threads handled NAT work.
That worked fine in 19.01, but when we try the same with 19.08 the
performance becomes really bad. The problem seems related to the choice of
NAT threads.
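
For reference, the relevant parts of the configuration look roughly like
this (a sketch; the exact core numbers here are illustrative, not
necessarily the ones we use):

  # startup.conf
  cpu {
    main-core 0
    corelist-workers 1-11   # 11 worker threads in total
  }

and then at the VPP CLI:

  vpp# set nat workers 0-6   # 7 of the workers handle NAT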

Examples to illustrate the issue:

"set nat workers 0-1" --> works fine for both 19.01 and 19.08.

"set nat workers 2-3" --> works fine for 19.01, but gives bad
performance for 19.08.

It seems as if, in 19.08, only worker threads 0 and 1 can do NAT work
with decent performance; as soon as any other threads are specified,
performance drops badly.
In contrast, in 19.01 seemingly any of the threads can be used for NAT
without performance problems.

"Bad" performance here means that things work something like 10x
slower, e.g. VPP starts to drop packets already at only 10% of the
amount of traffic that it could handle otherwise. So it is really a big
difference.

Using gdb I was able to verify that the NAT functions really are
executed by the worker threads chosen with "set nat workers". As long as
there is not too much traffic, VPP still processes the packets
correctly; it just becomes really slow when NAT threads other than 0
and 1 are used.
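
A lighter-weight way to check the same thing (assuming I remember the
CLI correctly, and that the output is comparable between the two
versions) is:

  vpp# show threads
  vpp# show nat workers

where "show threads" lists each worker thread together with the lcore
it runs on, and "show nat workers" lists the workers assigned to NAT.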

My best guess is that the problem has something to do with how threads
are bound (or not) to particular CPU cores and/or NUMA memory nodes,
but we have not changed any configuration options related to such
things. If the default behavior changed between 19.01 and 19.08, that
could perhaps explain it.
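
One way to compare the actual thread placement under the two versions,
using only standard Linux tools, is:

  # CPU that each VPP thread last ran on
  ps -eLo pid,tid,psr,comm | grep vpp

  # NUMA topology of the machine
  numactl --hardware

If the vpp_wk_* threads end up on cores belonging to a different NUMA
node than the NIC under 19.08, that could perhaps explain the slowdown.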

The behavior for the current master branch seems to be the same as for
19.08.

Questions:

Are there new configuration options that we need to use to get good
performance in 19.08 with more than 2 NAT threads?

Has the default behavior regarding binding of threads to CPU cores
changed between VPP versions 19.01 and 19.08?

Any other ideas about what could be causing this, or how to
troubleshoot it further?

(In case it matters, we are using Mellanox network interfaces, which
required building with "make dpdk-install-dev DPDK_MLX5_PMD=y
DPDK_MLX5_PMD_DLOPEN_DEPS=n" for VPP 19.01, while for 19.08 the
interfaces are set up using "create int rdma host-if ...".)
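
In full, the 19.08 interface creation looks roughly like this (the host
interface name is just an example):

  vpp# create int rdma host-if enp94s0f0 name rdma-0
  vpp# set int state rdma-0 up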

Best regards,
Elias