Hi Andre,

When you say you took down the Linux router, what does that mean? Did you
stop the BGP daemon? Reboot it? Or take down the interface it uses to
communicate with the VPP router?

The symptoms you describe are reminiscent of an issue mentioned in a
comment here -
https://github.com/FDio/vpp/blob/master/src/plugins/linux-cp/lcp_router.c#L516.
The gist of it is that in some cases, when the link on an interface goes
down, the kernel automatically removes the routes resolving through that
interface without sending an RTM_DELROUTE, so linux-cp receives no
indication that the routes should be deleted. If your VPP router is
directly attached to your Linux router, and the interface on the Linux
router side went down and took the VPP router interface's link down with
it, this may be your problem.

At the time that comment was added, I believe this affected FRR but did
not affect bird, but it's possible that the behavior of bird or the Linux
kernel has changed since then. An option was added to startup.conf for
linux-cp which automatically deletes routes that resolve through an
interface when the link state of that interface goes down. You could try
adding del-dynamic-on-link-down to the linux-cp section of your
startup.conf and see if that helps.
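
As a rough sketch of what that would look like, assuming your startup.conf
has no linux-cp section yet (if it already has one, add the keyword to the
existing section rather than creating a second one):

```
linux-cp {
  del-dynamic-on-link-down
}
```

startup.conf is only read when VPP starts, so VPP has to be restarted for
the change to take effect.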

If that option doesn't help, you could get more information on exactly
which routes are being added/replaced/deleted via one of the following:
- vppctl set logging class linux-cp log-level debug syslog-level debug
- ip link add dev nlmon0 type nlmon; ip link set dev nlmon0 up;
  tcpdump -w nlmon.pcap -i nlmon0

> I see many log messages like this with "journalctl -xeu vpp", but I'm not
> sure if they are relevant.
>
> Aug 07 12:48:45 router1 vpp[3332459]: linux-cp/router: Failed to delete
> neighbor: <some-ip-address> BondEthernet0

Those messages are noise. When VPP does not have a neighbor cache entry for
some IP address and the kernel announces it has deleted its neighbor entry
for that address, the delete operation in VPP fails. The failure is
inconsequential: there are several legitimate reasons why VPP would not
have a neighbor entry (e.g. Linux's neighbor entry was not fully resolved
yet and was in some intermediate state). The log level for the message
written when that happens should probably be changed from notice to debug.

-Matt



On Thu, Aug 7, 2025 at 8:00 AM Andre Nathan via lists.fd.io <andre=
digirati.com...@lists.fd.io> wrote:

> Hi Stanislav
>
> On 8/7/25 7:27 AM, Stanislav Zaikin via lists.fd.io wrote:
>  > Hello Andre,
>  >
>  > I'd suggest upgrading to the latest stable release.
>
> I'm going to try the upgrade as soon as possible.
>
>  > For wrong ECMP: are you sure you have different metrics in Linux? can
>  > you show the output of ip route?
>
> Those routes are not in Linux, though they remain in the vpp fib, so I
> can't see what they look like:
>
> # ip route | grep 'via a.b.c.d' | wc -l
> 0
> # vppctl show ip fib | grep 'via a.b.c.d' | wc -l
> 788
>
>  > For stale entries: did you adjust sysctl to support big buffers for
>  > netlink socket? Do you see in lcp log messages about re-synchronization?
>
> I have net.core.rmem_default, net.core.wmem_default, net.core.rmem_max
> and net.core.wmem_max all set to 67108864. Do you think increasing them
> further can be helpful? Are there other sysctls I should increase?
>
> I see many log messages like this with "journalctl -xeu vpp", but I'm
> not sure if they are relevant.
>
> Aug 07 12:48:45 router1 vpp[3332459]: linux-cp/router: Failed to delete
> neighbor: <some-ip-address> BondEthernet0
>
> Thanks,
> Andre