Hello Andrey,

Could you install gdb and vpp-dbg, attach gdb to the running VPP process,
and run `bt full` after it crashes?
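
A minimal sketch of that sequence, in case it helps. It assumes the Ubuntu
packages are named gdb and vpp-dbg and that VPP runs as a process named vpp.
The temporary command file is only one way to do it; typing the same
commands at the gdb prompt works just as well.

```shell
#!/bin/sh
# Sketch only. Install the debugger and VPP debug symbols first, e.g.:
#   sudo apt-get install gdb vpp-dbg
# (package names are assumed from the request above)

# gdb commands to run once attached: resume VPP, wait for the crash,
# then print the full backtrace of the faulting thread.
GDB_CMDS=$(mktemp)
cat > "$GDB_CMDS" <<'EOF'
set pagination off
continue
bt full
EOF

# Attach to the running VPP process if there is one. While gdb sits in
# "continue", reproduce the crash from another shell (the ping from Linux).
if pid=$(pidof -s vpp 2>/dev/null); then
    sudo gdb -p "$pid" -x "$GDB_CMDS"
else
    echo "vpp is not running; start it before attaching gdb" >&2
fi
```

gdb stops the process on the SIGSEGV before VPP's own signal handler runs,
so `bt full` should show the faulting frame with its local variables.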

On Mon, 18 Aug 2025 at 14:07, Andrey Zelentsov via lists.fd.io
<andrey.zelentsov=my.ga...@lists.fd.io> wrote:

> Hello VPP Developers,
>
> We are writing to report a recurring VPP crash.
>
> The issue occurs when we send traffic from the Linux host system through
> an LCP interface into a GRE tunnel terminated on VPP, for example:
>
>     ip netns exec vppDataplane ping 10.88.0.65
>
> We've observed that pinging the tunnel directly from VPP's ping plugin
> works correctly without causing a crash.
>
> Here is some additional context about our environment and the steps we've
> already taken:
>
> System Details:
>
> - VPP is running on a bare-metal server.
>
> - We were unable to reproduce the issue on servers with a different CPU,
> specifically Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz. There, LCP worked
> as expected, and the ping from Linux was successful.
>
> Troubleshooting Steps Taken:
>
> - We applied the recommended BIOS settings as per the performance
> optimization guide on the fd.io wiki (
> https://wiki.fd.io/view/VPP/How_To_Optimize_Performance_(System_Tuning)),
> but the issue persists.
>
> - We have tried running VPP in single-threaded mode, reducing the
> allocated memory, and adjusting various LCP settings. None of these actions
> resolved the problem.
>
> This leads us to believe the issue may be related to the interaction
> between the LCP interface and the GRE encapsulation process, possibly
> specific to certain hardware.
>
> Error logs (newest entries first):
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #5  0x000070f966729c3c __clone
> + 0x24c
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #5
>  0x000070f966729c3c __clone + 0x24c
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #4  0x000070f96669caa4
> pthread_condattr_setpshared + 0x684
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libc.so.6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #4
>  0x000070f96669caa4 pthread_condattr_setpshared + 0x684
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #3  0x000070f966a7f77e
> vlib_worker_thread_bootstrap_fn + 0x4e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #3
>  0x000070f966a7f77e vlib_worker_thread_bootstrap_fn + 0x4e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #2  0x000070f966a3c53e
> vlib_exit_with_status + 0x375e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #2
>  0x000070f966a3c53e vlib_exit_with_status + 0x375e
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #1  0x000070f966a395ef
> vlib_exit_with_status + 0x80f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvlib.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #1
>  0x000070f966a395ef vlib_exit_with_status + 0x80f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvnet.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: #0  0x000070f9681e0347
> adj_l2_midchain_node_fn_skx + 0x737
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]:      from
> /lib/x86_64-linux-gnu/libvnet.so.25.06
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: #0
>  0x000070f9681e0347 adj_l2_midchain_node_fn_skx + 0x737
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: Code:  41 0f b7 4c 1c 46 48 83
> f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: Code:  41 0f b7 4c
> 1c 46 48 83 f9 14 0f 85 ce 00 00 00 c4 c1 7a 6f
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: received signal SIGSEGV, PC
> 0x70f9681e0347, faulting address 0x71f47e8a37c6
> Aug 18 11:23:12 net-chgr-vpp03 vpp[14019]: vpp[14019]: received signal
> SIGSEGV, PC 0x70f9681e0347, faulting address 0x71f47e8a37c6
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vlib/file: file error:
> nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file
> error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink
> socket (fd 1698)
> Aug 18 11:21:32 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error
> polling netlink socket (fd 1698)
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vlib/file: file error:
> nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink
> socket (fd 1698)
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file
> error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:30 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error
> polling netlink socket (fd 1698)
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vlib/file: file error:
> nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: nl/nl: Error polling netlink
> socket (fd 1698)
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: vlib/file: file
> error: nl_route_error_cb: Error polling netlink socket 1698
> Aug 18 11:21:28 net-chgr-vpp03 vpp[14019]: vpp[14019]: nl/nl: Error
> polling netlink socket (fd 1698)
>
> The commands I used to configure the GRE tunnel:
>
> create gre tunnel src 10.10.25.5 dst 10.10.35.5 instance 0
>
> set interface state gre0 up
>
> lcp create gre0 host-if gre0@vpp tun
>
> set interface ip address gre0 10.88.0.64/31
>
>
> Linux distro is Ubuntu 24.04.2 LTS
>
> Driver info for the GRE tunnel's egress interface:
> driver: mlx5_core
> version: 6.14.0-27-generic
> firmware-version: 16.31.1014 (MT_0000000013)
> expansion-rom-version:
> bus-info: 0000:d8:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: yes
>
> Affected host CPU info:
>
> Architecture:             x86_64
>   CPU op-mode(s):         32-bit, 64-bit
>   Address sizes:          46 bits physical, 48 bits virtual
>   Byte Order:             Little Endian
> CPU(s):                   40
>   On-line CPU(s) list:    0-39
> Vendor ID:                GenuineIntel
>   BIOS Vendor ID:         Intel(R) Corporation
>   Model name:             Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
>     BIOS Model name:      Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz  CPU @
> 2.1GHz
>     BIOS CPU family:      179
>     CPU family:           6
>     Model:                85
>     Thread(s) per core:   1
>     Core(s) per socket:   20
>     Socket(s):            2
>     Stepping:             7
>     CPU(s) scaling MHz:   71%
>     CPU max MHz:          3900.0000
>     CPU min MHz:          800.0000
>     BogoMIPS:             4200.00
>     Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>       pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>       syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
>       rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
>       dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm
>       pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
>       xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3
>       cdp_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow
>       flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2
>       erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt
>       clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
>       xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida
>       arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req vnmi pku ospke
>       avx512_vnni md_clear flush_l1d arch_capabilities
> Virtualization features:
>   Virtualization:         VT-x
> Caches (sum of all):
>   L1d:                    1.3 MiB (40 instances)
>   L1i:                    1.3 MiB (40 instances)
>   L2:                     40 MiB (40 instances)
>   L3:                     55 MiB (2 instances)
> NUMA:
>   NUMA node(s):           2
>   NUMA node0 CPU(s):      0-19
>   NUMA node1 CPU(s):      20-39
> Vulnerabilities:
>   Gather data sampling:   Vulnerable
>   Ghostwrite:             Not affected
>   Itlb multihit:          KVM: Mitigation: Split huge pages
>   L1tf:                   Not affected
>   Mds:                    Not affected
>   Meltdown:               Not affected
>   Mmio stale data:        Mitigation; Clear CPU buffers; SMT disabled
>   Reg file data sampling: Not affected
>   Retbleed:               Mitigation; Enhanced IBRS
>   Spec rstack overflow:   Not affected
>   Spec store bypass:      Mitigation; Speculative Store Bypass disabled
> via prctl
>   Spectre v1:             Mitigation; usercopy/swapgs barriers and __user
> pointer sanitization
>   Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB
> conditional; PBRSB-eIBRS SW sequence; BHI SW loop, KVM SW loop
>   Srbds:                  Not affected
>   Tsx async abort:        Mitigation; TSX disabled
>
> root@localhost:~# vppctl show version verbose cmdline
> Version: v25.06-release
> Compiled by: root
> Compile host: e29c327af67c
> Compile date: 2025-06-25T13:23:10
> Compile location: /w/workspace/vpp-merge-2506-ubuntu2404-x86_64
> Compiler: Clang/LLVM 18.1.3 (1ubuntu1)
> Current PID: 26500
> Command line arguments:
> (The full output of `vppctl show version verbose cmdline` is in the
> attached file.)
>
> P.S. I have not created a Jira ticket because jira.fd.io fails DNS
> resolution.
>
> Thank you for your time and consideration.
>
> --
>
> Kind regards,
>
> Andrey Zelentsov
>
> Network Engineer

-- 
Best regards
Stanislav Zaikin
View/Reply Online (#26274): https://lists.fd.io/g/vpp-dev/message/26274