The following crash is observed in a Contiv deployment.

Version: vpp v19.08.1-282~ga6a98b546-dirty built by root on c4100e93fc1d at Wed Aug 12 22:44:26 UTC 2020
Deployment: Contiv-vswitch, running our own configuration agent alongside the Contiv components.

The crash occurs during system instantiation. I believe our proprietary configuration agent tries to push configuration while VPP is still starting up. The route add operation then accesses dpo_nodes (a NULL pointer) before it has been initialized. This could be because configuration is accepted through the API while the system is in the middle of initialization. We are working on adding a wait, or a status check, before pushing the configuration. However, is it possible to protect the system from this in the VPP code itself?

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/vpp -c /etc/vpp/contiv-vswitch.conf'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7ff2b41e3780 (LWP 15194))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ff2b1c978b1 in __GI_abort () at abort.c:79
#2  0x0000557eb07500b2 in os_exit (code=1)
    at /opt/vpp-agent/dev/vpp/src/vpp/vnet/main.c:379
#3  0x00007ff2b266506d in unix_signal_handler (signum=11, si=0x7ff271262430, uc=0x7ff271262300)
    at /opt/vpp-agent/dev/vpp/src/vlib/unix/main.c:183
#4  <signal handler called>
#5  0x00007ff2b375265f in dpo_default_get_next_node (dpo=0x7ff27226e040)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/dpo.c:293
#6  0x00007ff2b37536a4 in dpo_get_next_node (child_type=DPO_LOAD_BALANCE, child_proto=DPO_PROTO_IP4, parent_dpo=0x7ff27226e040)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/dpo.c:428
#7  0x00007ff2b3753a73 in dpo_stack (child_type=DPO_LOAD_BALANCE, child_proto=DPO_PROTO_IP4, dpo=0x7ff2722706e0, parent=0x7ff27226e040)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/dpo.c:521
#8  0x00007ff2b3760a95 in load_balance_set_bucket_i (lb=0x7ff2722706c0, bucket=0, buckets=0x7ff2722706e0, next=0x7ff27226e040)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/load_balance.c:252
#9  0x00007ff2b3761424 in load_balance_fill_buckets_norm (lb=0x7ff2722706c0, nhs=0x7ff27226e040, buckets=0x7ff2722706e0, n_buckets=1)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/load_balance.c:525
#10 0x00007ff2b3761847 in load_balance_fill_buckets (lb=0x7ff2722706c0, nhs=0x7ff27226e040, buckets=0x7ff2722706e0, n_buckets=1, flags=LOAD_BALANCE_FLAG_NONE)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/load_balance.c:589
#11 0x00007ff2b3761bf4 in load_balance_multipath_update (dpo=0x7ff27226e788, raw_nhs=0x7ff272274930, flags=LOAD_BALANCE_FLAG_NONE)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/load_balance.c:654
#12 0x00007ff2b36fc03b in fib_entry_src_mk_lb (fib_entry=0x7ff27226e760, esrc=0x7ff2722738e0, fct=FIB_FORW_CHAIN_TYPE_UNICAST_IP4, dpo_lb=0x7ff27226e788)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_entry_src.c:602
#13 0x00007ff2b36fc1af in fib_entry_src_action_install (fib_entry=0x7ff27226e760, source=FIB_SOURCE_API)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_entry_src.c:662
#14 0x00007ff2b36fcc4a in fib_entry_src_action_activate (fib_entry=0x7ff27226e760, source=FIB_SOURCE_API)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_entry_src.c:1035
#15 0x00007ff2b36f31d4 in fib_entry_create (fib_index=0, prefix=0x7ff271262d60, source=FIB_SOURCE_API, flags=FIB_ENTRY_FLAG_NONE, paths=0x7ff27226ca00)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_entry.c:755
#16 0x00007ff2b36dfeac in fib_table_entry_path_add2 (fib_index=0, prefix=0x7ff271262d60, source=FIB_SOURCE_API, flags=FIB_ENTRY_FLAG_NONE, rpaths=0x7ff27226ca00)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_table.c:587
#17 0x00007ff2b3726142 in fib_api_route_add_del (is_add=1 '\001', is_multipath=1 '\001', fib_index=0, prefix=0x7ff271262d60, entry_flags=FIB_ENTRY_FLAG_NONE, rpaths=0x7ff27226ca00)
    at /opt/vpp-agent/dev/vpp/src/vnet/fib/fib_api.c:469
#18 0x00007ff2b3171299 in ip_route_add_del_t_handler (mp=0x7ff272274ae0, stats_index=0x7ff271262da8)
    at /opt/vpp-agent/dev/vpp/src/vnet/ip/ip_api.c:688
#19 0x00007ff2b317133d in vl_api_ip_route_add_del_t_handler (mp=0x7ff272274ae0)
    at /opt/vpp-agent/dev/vpp/src/vnet/ip/ip_api.c:708
#20 0x00007ff2b3db6b85 in msg_handler_internal (am=0x7ff2b3fc7cc0 <api_main>, the_msg=0x7ff272274ae0, trace_it=1, do_it=1, free_it=0)
    at /opt/vpp-agent/dev/vpp/src/vlibapi/api_shared.c:479
#21 0x00007ff2b3db73f8 in vl_msg_api_socket_handler (the_msg=0x7ff272274ae0)
    at /opt/vpp-agent/dev/vpp/src/vlibapi/api_shared.c:732
#22 0x00007ff2b3d953dc in vl_socket_process_api_msg (uf=0x7ff272277b88, rp=0x7ff2722282f0, input_v=0x7ff272274ad0 "")
    at /opt/vpp-agent/dev/vpp/src/vlibmemory/socket_api.c:201
#23 0x00007ff2b3da2091 in vl_api_clnt_process (vm=0x7ff2b289edc0 <vlib_global_main>, node=0x7ff27125a000, f=0x0)
    at /opt/vpp-agent/dev/vpp/src/vlibmemory/vlib_api.c:389
#24 0x00007ff2b25fbf6b in vlib_process_bootstrap (_a=140679262125024)
    at /opt/vpp-agent/dev/vpp/src/vlib/main.c:1472
#25 0x00007ff2b209cd40 in clib_calljmp () from /usr/lib/x86_64-linux-gnu/libvppinfra.so.19.08.1
#26 0x00007ff271723bb0 in ?? ()
#27 0x00007ff2b25fc073 in vlib_process_startup (vm=0xffffffffffffffff, p=0xea00000000, f=0x7ff27125a000)
    at /opt/vpp-agent/dev/vpp/src/vlib/main.c:1494
#28 0x00007ff2721fee28 in ?? ()
#29 0x00007ff2b289ef18 in vlib_global_main () from /usr/lib/x86_64-linux-gnu/libvlib.so.19.08.1
#30 0x00007310b5cbe5c0 in ?? ()
#31 0x00007ff27125a000 in ?? ()
#32 0x00007ff27210d368 in ?? ()
#33 0x00007ff27210d188 in ?? ()
#34 0x0000000000000012 in ?? ()
#35 0x00007ff27210d368 in ?? ()
#36 0x00007ff27125a000 in ?? ()
#37 0x00007ff2713fa914 in ?? ()
#38 0x0000000000000000 in ?? ()
(gdb) f 5
#5  0x00007ff2b375265f in dpo_default_get_next_node (dpo=0x7ff27226e040)
    at /opt/vpp-agent/dev/vpp/src/vnet/dpo/dpo.c:293
293     /opt/vpp-agent/dev/vpp/src/vnet/dpo/dpo.c: No such file or directory.
(gdb) info local
node_indices = 0x0
node_name = 0x58 <error: Cannot access memory at address 0x58>
ii = 0
__FUNCTION__ = "dpo_default_get_next_node"
(gdb) info locals
node_indices = 0x0
node_name = 0x58 <error: Cannot access memory at address 0x58>
ii = 0
__FUNCTION__ = "dpo_default_get_next_node"
(gdb) p dpo->dpoi_type
$1 = DPO_ADJACENCY_MIDCHAIN
(gdb) p dpo->dpoi_proto
$2 = 19
(gdb) p *dpo_nodes
$3 = (const char * const * const *) 0x0
(gdb)
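
For illustration, the agent-side wait mentioned above could look roughly like the sketch below. It is not taken from our agent or from VPP: `wait_until_ready` and the `probe` callback are hypothetical names, and what makes a suitable probe (for example, some cheap read-only API request that only succeeds after initialization completes) is an assumption that would need to be confirmed against VPP.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    `probe` is a hypothetical, caller-supplied readiness check.
    Exceptions raised by the probe are treated as "not ready yet".
    Returns True if the probe succeeded, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # connection or API errors: keep waiting
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Hypothetical usage: only push routes once the probe succeeds.
# if wait_until_ready(my_probe):
#     push_configuration()
```

The `sleep` and `clock` parameters are injectable only so the loop can be exercised without real delays. A wait like this reduces the window for the race but does not replace a guard inside VPP.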