Hi Andrey,
FD.io is now using GitHub Issues in place of Jira. Please feel free to
open an Issue there.
Also, please include the VPP version in the issue report ("show version").
Thanks,
-daw-
On 8/18/25 8:07 AM, Andrey Zelentsov via lists.fd.io wrote:
Hello VPP Developers,
We are writing to report a recurring VPP crash.
The crash occurs when we send traffic from the Linux host through an
LCP interface into a GRE tunnel terminated on VPP, for example:
ip netns exec vppDataplane ping 10.88.0.65
We've observed that pinging the tunnel directly from VPP's ping plugin
works correctly without causing a crash.
Here is some additional context about our environment and the steps
we've already taken:
System Details:
- VPP is running on a bare-metal server.
- We were unable to reproduce the issue on servers with a different
CPU, specifically Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz. There, LCP
worked as expected, and ping from Linux was successful.
Troubleshooting Steps Taken:
- We applied the recommended BIOS settings as per the performance
optimization guide on the fd.io wiki
(https://wiki.fd.io/view/VPP/How_To_Optimize_Performance_(System_Tuning)),
but the issue persists.
- We have tried running VPP in single-threaded mode, reducing the
allocated memory, and adjusting various LCP settings. None of these
actions resolved the problem.
This leads us to believe the issue may be related to the interaction
between the LCP interface and the GRE encapsulation process, possibly
specific to certain hardware.
Error logs (newest entries first):
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #5 0x000070f966729c3c __clone + 0x24c
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #5 0x000070f966729c3c __clone + 0x24c
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #4 0x000070f96669caa4 pthread_condattr_setpshared + 0x684
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #4 0x000070f96669caa4 pthread_condattr_setpshared + 0x684
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #3 0x000070f966a7f77e vlib_worker_thread_bootstrap_fn + 0x4e
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #3 0x000070f966a7f77e vlib_worker_thread_bootstrap_fn + 0x4e
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #2 0x000070f966a3c53e vlib_exit_with_status + 0x375e
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #2 0x000070f966a3c53e vlib_exit_with_status + 0x375e
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #1 0x000070f966a395ef vlib_exit_with_status + 0x80f
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #1 0x000070f966a395ef vlib_exit_with_status + 0x80f
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvnet.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #0 0x000070f9681e0347 adj_l2_midchain_node_fn_skx + 0x737
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvnet.so.25.06
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #0 0x000070f9681e0347 adj_l2_midchain_node_fn_skx + 0x737
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: Code: 41 0f b7 4c 1c 46 48 83 f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: Code: 41 0f b7 4c 1c 46 48 83 f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: received signal SIGSEGV, PC 0x70f9681e0347, faulting address 0x71f47e8a37c6
Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: received signal SIGSEGV, PC 0x70f9681e0347, faulting address 0x71f47e8a37c6
Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
The commands I used to configure the GRE tunnel:
create gre tunnel src 10.10.25.5 dst 10.10.35.5 instance 0
set interface state gre0 up
lcp create gre0 host-if gre0@vpp tun
set interface ip address gre0 10.88.0.64/31
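To make the reproduction easier to repeat on another host, the four CLI
lines above could be generated from parameters and sent through vppctl.
This is only a sketch under stated assumptions: the helper names
gre_lcp_commands and apply_commands are hypothetical, the tunnel
interface is assumed to be named gre<instance> (as gre0 is for instance
0 above), and the default host interface name is simply copied from the
report.

```python
import subprocess


def gre_lcp_commands(src, dst, address, instance=0, host_if="gre0@vpp"):
    """Return the VPP CLI lines from the report with the values parameterised.

    Assumption: VPP names the tunnel interface "gre<instance>".
    """
    ifname = f"gre{instance}"
    return [
        f"create gre tunnel src {src} dst {dst} instance {instance}",
        f"set interface state {ifname} up",
        f"lcp create {ifname} host-if {host_if} tun",
        f"set interface ip address {ifname} {address}",
    ]


def _run_vppctl(argv):
    """Default runner: execute vppctl and return whatever it printed."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout + result.stderr


def apply_commands(commands, run=_run_vppctl):
    """Send each CLI line through vppctl and collect (command, output) pairs.

    The output is returned rather than interpreted, so it can be pasted
    into an issue report and checked by eye.
    """
    return [(cmd, run(["vppctl", *cmd.split()])) for cmd in commands]


if __name__ == "__main__":
    cmds = gre_lcp_commands("10.10.25.5", "10.10.35.5", "10.88.0.64/31")
    for cmd, output in apply_commands(cmds):
        print(f"vpp# {cmd}\n{output}")
```

Passing a different run callable makes it possible to print the
commands without touching a live VPP instance.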
The Linux distribution is Ubuntu 24.04.2 LTS.
Driver information for the GRE tunnel's egress interface:
driver: mlx5_core
version: 6.14.0-27-generic
firmware-version: 16.31.1014 (MT_0000000013)
expansion-rom-version:
bus-info: 0000:d8:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
Affected host CPU info:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
BIOS Model name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz CPU @ 2.1GHz
BIOS CPU family: 179
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 20
Socket(s): 2
Stepping: 7
CPU(s) scaling MHz: 71%
CPU max MHz: 3900.0000
CPU min MHz: 800.0000
BogoMIPS: 4200.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic
movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 intel_ppin ssbd mba ibrs
ibpb stibp ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad
fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a
avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd
avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc
cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window
hwp_epp hwp_pkg_req vnmi pku ospke avx512_vnni md_clear flush_l1d
arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 1.3 MiB (40 instances)
L1i: 1.3 MiB (40 instances)
L2: 40 MiB (40 instances)
L3: 55 MiB (2 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0-19
NUMA node1 CPU(s): 20-39
Vulnerabilities:
Gather data sampling: Vulnerable
Ghostwrite: Not affected
Itlb multihit: KVM: Mitigation: Split huge pages
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT disabled
Reg file data sampling: Not affected
Retbleed: Mitigation; Enhanced IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
Srbds: Not affected
Tsx async abort: Mitigation; TSX disabled
root@localhost:~# vppctl show version verbose cmdline
Version: v25.06-release
Compiled by: root
Compile host: e29c327af67c
Compile date: 2025-06-25T13:23:10
Compile location: /w/workspace/vpp-merge-2506-ubuntu2404-x86_64
Compiler: Clang/LLVM 18.1.3 (1ubuntu1)
Current PID: 26500
Command line arguments:
The full output of the vppctl show version verbose command is in the
attached file.
P.S. I have not created a Jira ticket because jira.fd.io fails DNS
resolution.
Thank you for your time and consideration.
--
Kind regards,
Andrey Zelentsov
Network Engineer
View/Reply Online (#26279): https://lists.fd.io/g/vpp-dev/message/26279