Moin,

ok, I had a hunch, and I think I got closer to this. I can now semi-reproduce this in a lab environment with six OpenBSD 7.4 hosts. I guess the last missing component is bringing in a Linux router: in a pure OpenBSD setup it is not that bad, because OpenBSD does not send Type 2 ad infinitum (unlike Linux). Still, packets seem to keep looping for some time even after the connections are gone, just far fewer of them.
The issue occurs if I have asymmetric routing with one path going via a lower-MTU link, while bgpd has two routes with equal path length tagged as multipath on the client node. For some reason I do not seem to run into this when that is not the case (or I am holding something wrong there).

The setup has six hosts: rtr-1a1.tst.as59645.net .. rtr-1a6.tst.as59645.net

The configs / routing table state can be found here:
https://rincewind.home.aperture-labs.org/~tfiebig/mtucfg/
(vio0 on all hosts is used for mgmt).

If, in that setup, we execute this on rtr-1a1:

rtr-1a1.tst.as59645.net ~ # dd if=/dev/random of=/tmp/testfile bs=1M count=32
rtr-1a1.tst.as59645.net ~ # cat /tmp/testfile | nc -6 -l 2342

and then on rtr-1a6:

rtr-1a6.tst.as59645.net ~ # nc -6 -s 2a06:d1c4::1a6 2a06:d1c4::1a1 2342 | pv > /dev/null

the connection immediately stalls, and on rtr-1a3 we can see the Type 2 loop going on vio1. (Sample pcap here: https://rincewind.home.aperture-labs.org/~tfiebig/mtucfg/configs_rtr-1a3.tst.as59645.net/ignored_mtu.pcap )

I assume that we would see full link congestion if we replaced rtr-1a3 with a Linux box that is less conservative about resending ICMP6 messages than the OpenBSD box used in this lab case.

So, essentially, these are two issues:

- Linux (at least 6.1.65-amd64) seems to just send out ICMP6 Type 2 repeatedly without rate limiting (which MUST NOT be done per RFC 4443).
- OpenBSD seems to have some corner cases where ICMP6 Type 2 messages are ignored.

With best regards,
Tobias
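
P.S. To make the rate-limiting point concrete: RFC 4443 requires a node to limit the rate of the ICMPv6 error messages it originates, and suggests a token bucket as one way to do so. Below is a rough Python sketch of such a limiter. It is purely illustrative and is not how Linux or OpenBSD implement this; the class name and the rate/burst values are made up for the example.

```python
import time


class IcmpErrorLimiter:
    """Token bucket: up to `burst` errors back to back, refilled at `rate` per second."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)      # tokens added per second (hypothetical value)
        self.burst = float(burst)    # bucket capacity (hypothetical value)
        self.tokens = float(burst)   # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if one more ICMPv6 error may be sent at time `now`."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill for the time that passed, never above the bucket capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    limiter = IcmpErrorLimiter(rate=1, burst=5, now=0.0)
    # 100 oversized packets arriving at the same instant:
    # only the initial burst is answered with a Type 2.
    answered = sum(limiter.allow(now=0.0) for _ in range(100))
    print(answered, "of 100 packets answered")
```

With something like this in the sending path, a loop like the one in the pcap would be capped at `rate` Type 2 messages per second instead of one per offending packet.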