Hello Andrey,

Can you install gdb and vpp-dbg, attach gdb to the running VPP process, and run `bt full` after it crashes?
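Roughly, that could look like the sketch below. It is only an outline under a few assumptions: a Debian/Ubuntu host where the packages are called gdb and vpp-dbg, a single process named vpp, and two illustrative file paths under /tmp that are not part of any VPP tooling.

```shell
#!/bin/sh
# Sketch only: package names are taken from the request above;
# the two file paths are hypothetical and can be changed freely.

GDB_CMDS=/tmp/vpp-bt.gdb    # hypothetical path for the gdb command file
BT_OUT=/tmp/vpp-bt.txt      # hypothetical path for the captured backtrace

# Commands gdb runs after attaching: resume VPP, then print the full
# backtrace once the process stops on the fatal signal.
cat > "$GDB_CMDS" <<'EOF'
set pagination off
continue
bt full
EOF

# Install the debugger and the VPP debug symbols (Debian/Ubuntu names).
install_debug_tools() {
    sudo apt-get install -y gdb vpp-dbg
}

# Attach to the running VPP process and wait for the crash.
# Assumes exactly one process named "vpp" is running.
capture_vpp_backtrace() {
    sudo gdb -p "$(pidof vpp)" -batch -x "$GDB_CMDS" 2>&1 | tee "$BT_OUT"
}
```

Run `install_debug_tools` once, start `capture_vpp_backtrace` in one terminal, then trigger the crash from another terminal with the ping from the report. When VPP receives SIGSEGV, gdb should stop it, print the backtrace, and leave a copy in the output file.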
On Mon, 18 Aug 2025 at 14:07, Andrey Zelentsov via lists.fd.io <andrey.zelentsov=my.ga...@lists.fd.io> wrote:

> Hello VPP Developers,
>
> We are writing to report a recurring VPP crash.
>
> The issue occurs when we attempt to send traffic from the Linux host
> system through an LCP interface into a GRE tunnel terminated on VPP, for
> example
>
> ip netns exec vppDataplane ping 10.88.0.65
>
> We've observed that pinging the tunnel directly from VPP's ping plugin
> works correctly without causing a crash.
>
> Here is some additional context about our environment and the steps we've
> already taken:
>
> System Details:
>
> - VPP is running on a bare-metal server.
>
> - We were unable to reproduce the issue on servers with a different CPU,
>   specifically Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz. LCP worked as
>   expected, and ping from linux was successful.
>
> Troubleshooting Steps Taken:
>
> - We applied the recommended BIOS settings as per the performance
>   optimization guide on the fd.io wiki
>   (https://wiki.fd.io/view/VPP/How_To_Optimize_Performance_(System_Tuning)),
>   but the issue persists.
>
> - We have tried running VPP in single-threaded mode, reducing the
>   allocated memory, and adjusting various LCP settings. None of these
>   actions resolved the problem.
>
> This leads us to believe the issue may be related to the interaction
> between the LCP interface and the GRE encapsulation process, possibly
> specific to certain hardware.
>
> Error logs
>
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #5 0x000070f966729c3c __clone + 0x24c
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #5 0x000070f966729c3c __clone + 0x24c
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #4 0x000070f96669caa4 pthread_condattr_setpshared + 0x684
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #4 0x000070f96669caa4 pthread_condattr_setpshared + 0x684
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #3 0x000070f966a7f77e vlib_worker_thread_bootstrap_fn + 0x4e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #3 0x000070f966a7f77e vlib_worker_thread_bootstrap_fn + 0x4e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #2 0x000070f966a3c53e vlib_exit_with_status + 0x375e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #2 0x000070f966a3c53e vlib_exit_with_status + 0x375e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #1 0x000070f966a395ef vlib_exit_with_status + 0x80f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #1 0x000070f966a395ef vlib_exit_with_status + 0x80f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: from /lib/x86_64-linux-gnu/libvnet.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #0 0x000070f9681e0347 adj_l2_midchain_node_fn_skx + 0x737
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: from /lib/x86_64-linux-gnu/libvnet.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #0 0x000070f9681e0347 adj_l2_midchain_node_fn_skx + 0x737
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: Code: 41 0f b7 4c 1c 46 48 83 f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: Code: 41 0f b7 4c 1c 46 48 83 f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: received signal SIGSEGV, PC 0x70f9681e0347, faulting address 0x71f47e8a37c6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: received signal SIGSEGV, PC 0x70f9681e0347, faulting address 0x71f47e8a37c6
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error polling netlink socket (fd 1698)
>
> The commands that I've used to configure the gre tunnel
>
> create gre tunnel src 10.10.25.5 dst 10.10.35.5 instance 0
> set interface state gre0 up
> lcp create gre0 host-if gre0@vpp tun
> set interface ip address gre0 10.88.0.64/31
>
> Linux distro is Ubuntu 24.04.2 LTS
>
> exit interface for gre tunnel info
>
> driver: mlx5_core
> version: 6.14.0-27-generic
> firmware-version: 16.31.1014 (MT_0000000013)
> expansion-rom-version:
> bus-info: 0000:d8:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: yes
>
> Affected host CPU info:
>
> Architecture:            x86_64
> CPU op-mode(s):          32-bit, 64-bit
> Address sizes:           46 bits physical, 48 bits virtual
> Byte Order:              Little Endian
> CPU(s):                  40
> On-line CPU(s) list:     0-39
> Vendor ID:               GenuineIntel
> BIOS Vendor ID:          Intel(R) Corporation
> Model name:              Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
> BIOS Model name:         Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz CPU @ 2.1GHz
> BIOS CPU family:         179
> CPU family:              6
> Model:                   85
> Thread(s) per core:      1
> Core(s) per socket:      20
> Socket(s):               2
> Stepping:                7
> CPU(s) scaling MHz:      71%
> CPU max MHz:             3900.0000
> CPU min MHz:             800.0000
> BogoMIPS:                4200.00
> Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>     pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>     syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
>     rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
>     dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm
>     pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
>     xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3
>     cdp_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow
>     flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2
>     erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt
>     clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
>     xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida
>     arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req vnmi pku ospke
>     avx512_vnni md_clear flush_l1d arch_capabilities
> Virtualization features:
>   Virtualization:        VT-x
> Caches (sum of all):
>   L1d:                   1.3 MiB (40 instances)
>   L1i:                   1.3 MiB (40 instances)
>   L2:                    40 MiB (40 instances)
>   L3:                    55 MiB (2 instances)
> NUMA:
>   NUMA node(s):          2
>   NUMA node0 CPU(s):     0-19
>   NUMA node1 CPU(s):     20-39
> Vulnerabilities:
>   Gather data sampling:  Vulnerable
>   Ghostwrite:            Not affected
>   Itlb multihit:         KVM: Mitigation: Split huge pages
>   L1tf:                  Not affected
>   Mds:                   Not affected
>   Meltdown:              Not affected
>   Mmio stale data:       Mitigation; Clear CPU buffers; SMT disabled
>   Reg file data sampling: Not affected
>   Retbleed:              Mitigation; Enhanced IBRS
>   Spec rstack overflow:  Not affected
>   Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
>   Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
>   Spectre v2:            Mitigation; Enhanced / Automatic IBRS; IBPB conditional; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
>   Srbds:                 Not affected
>   Tsx async abort:       Mitigation; TSX disabled
>
> root@localhost:~# vppctl show version verbose cmdline
> Version: v25.06-release
> Compiled by: root
> Compile host: e29c327af67c
> Compile date: 2025-06-25T13:23:10
> Compile location: /w/workspace/vpp-merge-2506-ubuntu2404-x86_64
> Compiler: Clang/LLVM 18.1.3 (1ubuntu1)
> Current PID: 26500
> Command line arguments:
> vppctl show version verbose command extensive output in attached file
>
> P.S: I have not created a Jira ticket, because jira.fd.io fails in dns
> resolution.
>
> Thank you for your time and consideration.
>
> --
> Kind regards,
> Andrey Zelentsov
> Network Engineer

--
Best regards
Stanislav Zaikin