As I mentioned earlier, the hqos thread runs only the QoS scheduler. In the configuration below, I don't see any logical core assigned to a main/worker thread?

From: David Hotham [mailto:david.hot...@metaswitch.com]
Sent: Tuesday, October 11, 2016 3:37 PM
To: vpp-dev@lists.fd.io; Singh, Jasvinder <jasvinder.si...@intel.com>; Dumitrescu, Cristian <cristian.dumitre...@intel.com>
Subject: RE: crash in new QoS code

So back on the original crash that started this thread...

I've been doing some digging, have a decent idea of what's going wrong, and would appreciate input on what the right approach / next steps towards fixing it should be.

When the GRE tunnel is configured, VPP's main control thread causes an ARP packet to be transmitted - stack below.

Then when we hit the null pointer - stack per original email, a little further below - we're still on CPU 0.

But IIUC, the HQoS assumes that we will only ever transmit from a worker thread - so it has not set up any configuration for CPU 0.

Comments?

Thanks!

David

#0  vlib_put_frame_to_node (vm=0x1087700 <vlib_global_main>, to_node_index=204, f=0x2aaaaec2f900)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:195
#1  0x00002aaaabbf6c52 in adj_ip4_nbr_probe (adj=0x2aaaaec3a498)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/adj/adj_nbr.c:192
#2  0x00002aaaabbf713d in adj_nbr_add_or_lock (nh_proto=FIB_PROTOCOL_IP4, link_type=FIB_LINK_IP4, nh_addr=0x2aaaaec25a4c, sw_if_index=1)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/adj/adj_nbr.c:350
#3  0x00002aaaabbe0813 in fib_path_attached_next_hop_get_adj (path=0x2aaaaec25a24, link=FIB_LINK_IP4)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:522
#4  0x00002aaaabbe089c in fib_path_attached_next_hop_set (path=0x2aaaaec25a24)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:542
#5  0x00002aaaabbe2346 in fib_path_resolve (path_index=13)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:1377
#6  0x00002aaaabbdd31b in fib_path_list_resolve (path_list=0x2aaaaebd8a70)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path_list.c:582
#7  0x00002aaaabbdd6ee in fib_path_list_create (flags=FIB_PATH_LIST_FLAG_NONE, rpaths=0x2aaaaec2dfcc)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path_list.c:720
#8  0x00002aaaabbd1bae in fib_entry_src_rr_resolve_via_connected (src=0x2aaaaebf02d4, fib_entry=0x2aaaaebd8024, cover=0x2aaaaebd7f44)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src_rr.c:54
#9  0x00002aaaabbd1d43 in fib_entry_src_rr_activate (src=0x2aaaaebf02d4, fib_entry=0x2aaaaebd8024)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src_rr.c:99
#10 0x00002aaaabbcf41e in fib_entry_src_action_activate (fib_entry=0x2aaaaebd8024, source=FIB_SOURCE_RR)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src.c:512
#11 0x00002aaaabbc66cf in fib_entry_create_special (fib_index=0, prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE, dpo=0x2aaaaeb8f7f0)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry.c:741
#12 0x00002aaaabbb3b3d in fib_table_entry_special_dpo_add (fib_index=0, prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE, dpo=0x2aaaaeb8f7f0)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_table.c:302
#13 0x00002aaaabbb3ca5 in fib_table_entry_special_add (fib_index=0, prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE, adj_index=4294967295)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_table.c:348
#14 0x00002aaaab97df36 in vnet_gre_tunnel_add (a=0x2aaaaeb8f9d0, sw_if_indexp=0x2aaaaeb8f964)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/gre/interface.c:388
#15 0x00002aaaab97ebae in create_gre_tunnel_command_fn (vm=0x1087700 <vlib_global_main>, input=0x2aaaaeb8fec0, cmd=0x2aaaae1f21ac)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/gre/interface.c:588
#16 0x00002aaaab354f98 in vlib_cli_dispatch_sub_commands (vm=0x1087700 <vlib_global_main>, cm=0x1087968 <vlib_global_main+616>, input=0x2aaaaeb8fec0, parent_command_index=273)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:483
#17 0x00002aaaab354ea8 in vlib_cli_dispatch_sub_commands (vm=0x1087700 <vlib_global_main>, cm=0x1087968 <vlib_global_main+616>, input=0x2aaaaeb8fec0, parent_command_index=115)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:461
#18 0x00002aaaab354ea8 in vlib_cli_dispatch_sub_commands (vm=0x1087700 <vlib_global_main>, cm=0x1087968 <vlib_global_main+616>, input=0x2aaaaeb8fec0, parent_command_index=0)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:461
#19 0x00002aaaab355280 in vlib_cli_input (vm=0x1087700 <vlib_global_main>, input=0x2aaaaeb8fec0, function=0x2aaaab11c57c <unix_vlib_cli_output>, function_arg=0)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:557
#20 0x00002aaaab120dfe in unix_cli_process_input (cm=0x2aaaab3422e0 <unix_cli_main>, cli_file_index=0)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/cli.c:2033
#21 0x00002aaaab121870 in unix_cli_process (vm=0x1087700 <vlib_global_main>, rt=0x2aaaaeb7f000, f=0x0)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/cli.c:2130
#22 0x00002aaaab37c7ed in vlib_process_bootstrap (_a=46912544967216)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1191
#23 0x00002aaaac5e2b3c in clib_calljmp ()
    at /home/ubuntu/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110
#24 0x00002aaaad940a00 in ?? ()
#25 0x00002aaaab37c922 in vlib_process_startup (vm=0x0, p=0x2aaaaec30bbc, f=0x2aaaaebfc324)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1213

From: David Hotham
Sent: 07 October 2016 17:15
To: vpp-dev@lists.fd.io
Subject: crash in new QoS code

I'm experimenting with the new QoS code and have hit a crash. Below is a short gdb session that shows

- my startup configuration
- what I'm configuring
- the crash

Am I doing something wrong, or is this a bug that wants debugging and fixing?

Thanks!

David

ubuntu@dch-test:~/vpp$ make debug STARTUP_CONF=/etc/vpp/startup.conf
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ubuntu/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp...done.
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) run
Starting program: /home/ubuntu/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp unix \{ interactive nodaemon log /tmp/vpp.log full-coredump \} api-trace \{ o n \} api-segment \{ gid vpp \} dpdk \{ socket-mem 1024 dev 0000:00:05.0 \{ num-rx-queues 2 hqos \} dev 0000:00:06.0 \{ num-rx-queues 2 hqos \} num-mbufs 100000 \} cpu \{ corelist-hqos-threads 1 \} plugin_path /home/ubuntu/vpp/build-root/install-vpp_debug-native/plugins/lib64/vpp_plugins
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
vlib_plugin_early_init:213: plugin path /home/ubuntu/vpp/build-root/install-vpp_debug-native/plugins/lib64/vpp_plugins
EAL: Detected 2 lcore(s)
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
PMD: bnxt_rte_pmd_init() called for (null)
[New Thread 0x2aab0ab54700 (LWP 8214)]
[New Thread 0x2aab0ad55700 (LWP 8215)]
EAL: PCI device 0000:00:05.0 on NUMA socket -1
EAL: probe driver: 8086:10ed rte_ixgbevf_pmd
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL: probe driver: 8086:10ed rte_ixgbevf_pmd
DPDK physical memory layout:
Segment 0: phys:0x61000000, len:1073741824, virt:0x2aab10000000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
1: vlib_set_thread_name:102: pthread_setname_np returned 34
[New Thread 0x2aab0af56700 (LWP 8216)]
0: svm_client_scan_this_region_nolock:1139: /vpe-api: cleanup ghost pid 8173
0: svm_client_scan_this_region_nolock:1139: /global_vm: cleanup ghost pid 8173
0: dpdk_lib_init:314: DPDK drivers found 2 ports...
PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip
PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip
    _______    _        _   _____  ___
 __/ __/ _ \  (_)__    | | / / _ \/ _ \
 _/ _// // / / / _ \   | |/ / ___/ ___/
 /_/ /____(_)_/\___/   |___/_/  /_/

DBGvpp# set int ip addr TenGigabitEthernet0/5/0 10.35.1.1/16
DBGvpp# set interface mtu 1500 TenGigabitEthernet0/5/0
PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip
DBGvpp# set int state TenGigabitEthernet0/5/0 up
DBGvpp# create gre tunnel teb src 10.35.1.1 dst 10.35.2.1
gre0
DBGvpp#
Program received signal SIGSEGV, Segmentation fault.
0x00002aaaabad1b51 in __rte_ring_sp_do_enqueue (behavior=RTE_RING_QUEUE_VARIABLE, n=1, obj_table=0x2aaaae9a3bc0, r=0x0)
    at /home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:546
546       uint32_t mask = r->prod.mask;
(gdb) bt
#0  0x00002aaaabad1b51 in __rte_ring_sp_do_enqueue (behavior=RTE_RING_QUEUE_VARIABLE, n=1, obj_table=0x2aaaae9a3bc0, r=0x0)
    at /home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:546
#1  rte_ring_sp_enqueue_burst (n=1, obj_table=0x2aaaae9a3bc0, r=0x0)
    at /home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:1168
#2  tx_burst_vector_internal (vm=0x1086700 <vlib_global_main>, xd=0x2aaaae9ab940, tx_vector=0x2aaaae9a3bc0)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:345
#3  0x00002aaaabad3950 in dpdk_interface_tx (vm=0x1086700 <vlib_global_main>, node=0x2aaaad72c540, f=0x2aaaae9c0600)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:892
#4  0x00002aaaab37c17a in dispatch_node (vm=0x1086700 <vlib_global_main>, node=0x2aaaad72c540, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x2aaaae9c0600, last_time_stamp=912728663188572)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:996
#5  0x00002aaaab37c596 in dispatch_pending_node (vm=0x1086700 <vlib_global_main>, p=0x2aaaae997a58, last_time_stamp=912728663188572)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1134
#6  0x00002aaaab37e3d8 in vlib_main_loop (vm=0x1086700 <vlib_global_main>)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1518
#7  0x00002aaaab37ea56 in vlib_main (vm=0x1086700 <vlib_global_main>, input=0x2aaaad919fb0)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1653
#8  0x00002aaaab128b33 in thread0 (arg=17327872)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/main.c:485
#9  0x00002aaaac5bbb3c in clib_calljmp ()
    at /home/ubuntu/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110
#10 0x00007fffffffcfe0 in ?? ()
#11 0x00002aaaab128f93 in vlib_unix_main (argc=46, argv=0x7fffffffe298)
    at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/main.c:545
#12 0x0000000000a9ca2c in main (argc=46, argv=0x7fffffffe298)
    at /home/ubuntu/vpp/build-data/../vpp/vnet/main.c:259
(gdb) frame 2
#2  tx_burst_vector_internal (vm=0x1086700 <vlib_global_main>, xd=0x2aaaae9ab940, tx_vector=0x2aaaae9a3bc0)
    at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:345
345           rv = rte_ring_sp_enqueue_burst (hqos->swq,
(gdb) print hqos->swq
$1 = (struct rte_ring *) 0x0
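
For illustration only: the failure mode described in this thread (a transmit attempted from thread 0, for which no HQoS software queue was set up, so hqos->swq is NULL) can be modeled in a few lines of standalone C. Everything in this sketch is hypothetical and heavily simplified: toy_ring_t, toy_hqos_dev_t, toy_tx and the three-thread layout are made-up stand-ins, not VPP or DPDK code. The bypass fallback is only one possible approach, assumed here for the sake of the example; whether it is the right fix is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: thread 0 = main, 1 = worker, 2 = hqos scheduler. */
#define TOY_N_THREADS 3

/* Stand-in for struct rte_ring: it only counts enqueued packets. */
typedef struct {
  unsigned count;
} toy_ring_t;

/* Stand-in for per-device HQoS state: one software queue per thread.
 * A NULL entry means no queue was configured for that thread. */
typedef struct {
  toy_ring_t *swq[TOY_N_THREADS];
  unsigned direct_tx; /* packets sent without going through HQoS */
} toy_hqos_dev_t;

/* Transmit n packets from the given thread; returns packets accepted.
 * The NULL check is the point of the sketch: without it, a transmit from
 * a thread with no queue dereferences a NULL ring, as in the backtrace. */
static unsigned
toy_tx (toy_hqos_dev_t *dev, unsigned thread_index, unsigned n)
{
  assert (thread_index < TOY_N_THREADS);

  toy_ring_t *r = dev->swq[thread_index];
  if (r == NULL)
    {
      /* Assumed fallback: bypass the scheduler for this thread. */
      dev->direct_tx += n;
      return n;
    }

  r->count += n; /* normal path: hand packets to the HQoS thread */
  return n;
}
```

With only the worker's queue populated, a call such as toy_tx (&dev, 0, 1) from the main thread takes the fallback branch instead of faulting, while toy_tx (&dev, 1, 4) goes through the worker's queue as before.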
_______________________________________________ vpp-dev mailing list vpp-dev@lists.fd.io https://lists.fd.io/mailman/listinfo/vpp-dev