Hello VPP experts,

In our NAT44 deployment we have been hitting a problem where VPP runs
out of memory. After months of headaches and many out-of-memory
crashes, the cause turns out to be an infinite loop in which VPP keeps
cycling through the three nodes ip4-lookup, ip4-local and
nat44-hairpinning. A single packet is passed around and around between
those three nodes, consuming more and more memory; the worker thread
gets stuck and VPP runs out of memory after a few seconds. (Earlier we
suspected a memory leak, but that now seems not to be the case.)

This affects the current master branch as well as stable/2009 and
earlier VPP versions.

One scenario in which this happens is when a UDP (or TCP) packet is
sent from a client on the inside to a destination IP address that
matches an existing static NAT mapping, where that mapping maps the IP
address on the inside to the same IP address on the outside.

The problem can then be triggered, for example, by running the
following in bash from a client on the inside, where DESTINATION_IP is
the IP address of such a static mapping:

echo hello > /dev/udp/$DESTINATION_IP/33333

Here is the packet trace for the thread that receives the packet at
rdma-input:

----------

Packet 42

00:03:07:636840: rdma-input
  rdma: Interface179 (4) next-node bond-input l2-ok l3-ok l4-ok ip4 udp
00:03:07:636841: bond-input
  src d4:6a:35:52:30:db, dst 02:fe:8d:23:60:a7, Interface179 -> BondEthernet0
00:03:07:636843: ethernet-input
  IP4: d4:6a:35:52:30:db -> 02:fe:8d:23:60:a7 802.1q vlan 1013
00:03:07:636844: ip4-input
  UDP: SOURCE_IP_INSIDE -> DESTINATION_IP
    tos 0x00, ttl 63, length 34, checksum 0xe7e3 dscp CS0 ecn NON_ECN
    fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 48824 -> 33333
    length 14, checksum 0x781e
00:03:07:636846: ip4-sv-reassembly-feature
  [not-fragmented]
00:03:07:636847: nat44-in2out-worker-handoff
  NAT44_IN2OUT_WORKER_HANDOFF : next-worker 8 trace index 41

----------

So it is doing handoff to thread 8 with trace index 41. Nothing wrong
so far, I think.

Here is the beginning of the corresponding packet trace for the
receiving thread:

----------

Packet 57

00:03:07:636850: handoff_trace
  HANDED-OFF: from thread 7 trace index 41
00:03:07:636850: nat44-in2out
  NAT44_IN2OUT_FAST_PATH: sw_if_index 6, next index 3, session -1
00:03:07:636855: nat44-in2out-slowpath
  NAT44_IN2OUT_SLOW_PATH: sw_if_index 6, next index 0, session 11
00:03:07:636927: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x00000000
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
    tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
    fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 33333
    length 14, checksum 0xb40b
00:03:07:636930: ip4-local
    UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
      tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
      fragment id 0x50fe, flags DONT_FRAGMENT
    UDP: 63957 -> 33333
      length 14, checksum 0xb40b
00:03:07:636932: nat44-hairpinning
  new dst addr DESTINATION_IP port 33333 fib-index 0 is-static-mapping
00:03:07:636934: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x00000000
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
    tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
    fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 33333
    length 14, checksum 0xb40b
00:03:07:636936: ip4-local
    UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
      tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
      fragment id 0x50fe, flags DONT_FRAGMENT
    UDP: 63957 -> 33333
      length 14, checksum 0xb40b
00:03:07:636937: nat44-hairpinning
  new dst addr DESTINATION_IP port 33333 fib-index 0 is-static-mapping
00:03:07:636937: ip4-lookup
  fib 0 dpo-idx 577 flow hash: 0x00000000
  UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
    tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
    fragment id 0x50fe, flags DONT_FRAGMENT
  UDP: 63957 -> 33333
    length 14, checksum 0xb40b
00:03:07:636939: ip4-local
    UDP: SOURCE_IP_OUTSIDE -> DESTINATION_IP
      tos 0x00, ttl 63, length 34, checksum 0x5eee dscp CS0 ecn NON_ECN
      fragment id 0x50fe, flags DONT_FRAGMENT
    UDP: 63957 -> 33333
      length 14, checksum 0xb40b
00:03:07:636940: nat44-hairpinning
  new dst addr DESTINATION_IP port 33333 fib-index 0 is-static-mapping

...

... and so on. In principle it never ends. To get this trace I added a
hack in nat44-hairpinning that stops once a debug counter I introduced
exceeds a few thousand. Without that, it appears to loop forever and
the worker thread gets stuck.

What seems to happen is that the nat44-hairpinning node determines
that there is an existing session and decides the packet should go to
the ip4-lookup node. From there it goes to ip4-local and then back to
nat44-hairpinning, which makes the same decision again, so the packet
just goes round and round. Inside the snat_hairpinning() function it
always reaches the "Destination is behind the same NAT, use internal
address and port" part and returns 1 there, which causes ip4-lookup to
be chosen as the next node even though nothing in the packet actually
changed.

I have a fix that changes the snat_hairpinning() function so that it
checks whether anything has changed and returns 0 if not, which breaks
the otherwise infinite loop. I am not convinced that this is a good
solution, though. It feels like an ugly fix, even if it seems to solve
the problem in practice.
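
To illustrate the idea behind that fix, here is a minimal sketch of a
"did anything change?" check. It is not the actual snat_hairpinning()
code: the pkt_dst_t struct and the hairpin_rewrite_dst() function are
hypothetical, simplified stand-ins, and it assumes that only the
destination address and port are rewritten.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields that hairpinning
 * rewrites; not a real VPP structure. */
typedef struct
{
  uint32_t dst_addr; /* IPv4 destination address */
  uint16_t dst_port; /* UDP/TCP destination port */
} pkt_dst_t;

/*
 * Rewrite the destination to the mapped internal address and port.
 * Returns 1 if the packet was modified (so another ip4-lookup makes
 * sense), or 0 if the mapping is an identity mapping and nothing
 * changed (so the caller should not send the packet around again).
 */
int
hairpin_rewrite_dst (pkt_dst_t *pkt, uint32_t new_addr, uint16_t new_port)
{
  if (pkt->dst_addr == new_addr && pkt->dst_port == new_port)
    return 0; /* nothing to rewrite: report "unchanged" */

  pkt->dst_addr = new_addr;
  pkt->dst_port = new_port;
  return 1;
}
```

With a static mapping that maps an address to itself, the first call
already returns 0, so the caller would never pick ip4-lookup as the
next node for an unmodified packet.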

Two questions related to this:

(1) For the specific NAT hairpinning issue, what should be done to
handle it properly?

(2) More generally, VPP can apparently get into an infinite loop where
the same packet is passed around a ridiculously large number of times
among different nodes. Would it be a good idea to detect when that
happens and report an error, instead of hanging or crashing?
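
As a rough illustration of what question (2) has in mind, the sketch
below counts how many nodes a packet has visited and flags it once a
limit is exceeded. Everything here is an assumption made for the
example: pkt_meta_t, pkt_visit_node() and MAX_NODE_VISITS are
hypothetical names, not existing VPP structures or APIs, and the limit
of 64 is arbitrary.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limit; a real value would need tuning. */
#define MAX_NODE_VISITS 64

/* Hypothetical per-packet metadata; not an existing VPP structure. */
typedef struct
{
  uint32_t node_visits; /* number of graph nodes visited so far */
} pkt_meta_t;

/*
 * Call each time a packet is dispatched to a graph node.
 * Returns 1 if the packet may continue, or 0 if it has exceeded the
 * visit limit and should be dropped with an error instead.
 */
int
pkt_visit_node (pkt_meta_t *meta, const char *node_name)
{
  if (++meta->node_visits > MAX_NODE_VISITS)
    {
      fprintf (stderr,
               "possible loop: packet visited %u nodes, last node %s\n",
               (unsigned) meta->node_visits, node_name);
      return 0;
    }
  return 1;
}
```

A counter like this costs one increment and one compare per node visit,
so whether it is acceptable in the fast path is an open question.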

Best regards,
Elias