Hi John,

The ARP packets in the bridge-domain are not generated by the VPP control plane (i.e. thread 0); instead they are either received on a bridge port or generated by ip4_arp_inline - in both cases from a worker thread - so they should be OK.

The ARP request in the traceback below, on further inspection, is generated by FIB in the control plane to try to resolve the adjacency through which the tunnel's destination is reachable.

These pre-emptive ARP probes are not the only packets sent by the control plane; others I can think of - IPv6 RAs and the DHCP client - also use thread 0.
It strikes me as unreasonable to mandate that only worker threads can send packets.

regards,
neale


On 11/10/16 20:47, John Lo (loj) wrote:
Hi Neale,

I believe David is using a GRE tunnel in transparent ethernet bridging (TEB)
mode, to carry Ethernet packets to a bridge domain. For the TEB case, the GRE
tunnel is created like an Ethernet interface with calls to
register_ethernet_interface using a MAC that's (d00b:eed0:0000 +
sw_if_index). So I wonder how ARP should be handled in this case.

Regards, John

-----Original Message-----
From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On Behalf Of Neale Ranns (nranns)
Sent: Tuesday, October 11, 2016 1:16 PM
To: David Hotham; vpp-dev@lists.fd.io; Singh, Jasvinder (jasvinder.si...@intel.com); Dumitrescu, Cristian (cristian.dumitre...@intel.com)
Subject: Re: [vpp-dev] crash in new QoS code

Hi David,

On 11/10/16 15:36, David Hotham via vpp-dev wrote:
So back on the original crash that started this thread...  I've been doing
some digging, have a decent idea of what's going wrong, and would
appreciate input on what the right approach / next steps towards fixing it
should be.



When the GRE tunnel is configured, VPP's main control thread causes an ARP
packet to be transmitted - stack below.

Sending an ARP packet on a GRE interface is a bug; I'm working on fixing that
now. But we also send ARP/ND requests on Ethernet interfaces in the same way,
which is legit.

/neale



Then when we hit the null pointer - stack per original email, a little
further below - we're still on CPU 0.



But IIUC, the HQoS code assumes that we will only ever transmit from a worker
thread - so it has not set up any configuration for CPU 0.



Comments?



Thanks!



David







#0  vlib_put_frame_to_node (vm=0x1087700 <vlib_global_main>,
to_node_index=204, f=0x2aaaaec2f900) at
/home/ubuntu/vpp/build-data/../vlib/vlib/main.c:195

#1  0x00002aaaabbf6c52 in adj_ip4_nbr_probe (adj=0x2aaaaec3a498) at
/home/ubuntu/vpp/build-data/../vnet/vnet/adj/adj_nbr.c:192

#2  0x00002aaaabbf713d in adj_nbr_add_or_lock (nh_proto=FIB_PROTOCOL_IP4,
link_type=FIB_LINK_IP4, nh_addr=0x2aaaaec25a4c, sw_if_index=1)

at /home/ubuntu/vpp/build-data/../vnet/vnet/adj/adj_nbr.c:350

#3  0x00002aaaabbe0813 in fib_path_attached_next_hop_get_adj
(path=0x2aaaaec25a24, link=FIB_LINK_IP4)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:522

#4  0x00002aaaabbe089c in fib_path_attached_next_hop_set
(path=0x2aaaaec25a24) at
/home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:542

#5  0x00002aaaabbe2346 in fib_path_resolve (path_index=13) at
/home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path.c:1377

#6  0x00002aaaabbdd31b in fib_path_list_resolve (path_list=0x2aaaaebd8a70)
at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path_list.c:582

#7  0x00002aaaabbdd6ee in fib_path_list_create
(flags=FIB_PATH_LIST_FLAG_NONE, rpaths=0x2aaaaec2dfcc)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_path_list.c:720

#8  0x00002aaaabbd1bae in fib_entry_src_rr_resolve_via_connected
(src=0x2aaaaebf02d4, fib_entry=0x2aaaaebd8024, cover=0x2aaaaebd7f44)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src_rr.c:54

#9  0x00002aaaabbd1d43 in fib_entry_src_rr_activate (src=0x2aaaaebf02d4,
fib_entry=0x2aaaaebd8024)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src_rr.c:99

#10 0x00002aaaabbcf41e in fib_entry_src_action_activate
(fib_entry=0x2aaaaebd8024, source=FIB_SOURCE_RR)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry_src.c:512

#11 0x00002aaaabbc66cf in fib_entry_create_special (fib_index=0,
prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE,
dpo=0x2aaaaeb8f7f0)

at /home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_entry.c:741

#12 0x00002aaaabbb3b3d in fib_table_entry_special_dpo_add (fib_index=0,
prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE,

dpo=0x2aaaaeb8f7f0) at
/home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_table.c:302

#13 0x00002aaaabbb3ca5 in fib_table_entry_special_add (fib_index=0,
prefix=0x2aaaaeb8f8e0, source=FIB_SOURCE_RR, flags=FIB_ENTRY_FLAG_NONE,

adj_index=4294967295) at
/home/ubuntu/vpp/build-data/../vnet/vnet/fib/fib_table.c:348

#14 0x00002aaaab97df36 in vnet_gre_tunnel_add (a=0x2aaaaeb8f9d0,
sw_if_indexp=0x2aaaaeb8f964) at
/home/ubuntu/vpp/build-data/../vnet/vnet/gre/interface.c:388

#15 0x00002aaaab97ebae in create_gre_tunnel_command_fn (vm=0x1087700
<vlib_global_main>, input=0x2aaaaeb8fec0, cmd=0x2aaaae1f21ac)

at /home/ubuntu/vpp/build-data/../vnet/vnet/gre/interface.c:588

#16 0x00002aaaab354f98 in vlib_cli_dispatch_sub_commands (vm=0x1087700
<vlib_global_main>, cm=0x1087968 <vlib_global_main+616>,
input=0x2aaaaeb8fec0,

parent_command_index=273) at
/home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:483

#17 0x00002aaaab354ea8 in vlib_cli_dispatch_sub_commands (vm=0x1087700
<vlib_global_main>, cm=0x1087968 <vlib_global_main+616>,
input=0x2aaaaeb8fec0,

parent_command_index=115) at
/home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:461

#18 0x00002aaaab354ea8 in vlib_cli_dispatch_sub_commands (vm=0x1087700
<vlib_global_main>, cm=0x1087968 <vlib_global_main+616>,
input=0x2aaaaeb8fec0,

parent_command_index=0) at
/home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:461

#19 0x00002aaaab355280 in vlib_cli_input (vm=0x1087700 <vlib_global_main>,
input=0x2aaaaeb8fec0, function=0x2aaaab11c57c <unix_vlib_cli_output>,

function_arg=0) at /home/ubuntu/vpp/build-data/../vlib/vlib/cli.c:557

#20 0x00002aaaab120dfe in unix_cli_process_input (cm=0x2aaaab3422e0
<unix_cli_main>, cli_file_index=0)

at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/cli.c:2033

#21 0x00002aaaab121870 in unix_cli_process (vm=0x1087700
<vlib_global_main>, rt=0x2aaaaeb7f000, f=0x0)

at /home/ubuntu/vpp/build-data/../vlib/vlib/unix/cli.c:2130

#22 0x00002aaaab37c7ed in vlib_process_bootstrap (_a=46912544967216) at
/home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1191

#23 0x00002aaaac5e2b3c in clib_calljmp () at
/home/ubuntu/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110

#24 0x00002aaaad940a00 in ?? ()

#25 0x00002aaaab37c922 in vlib_process_startup (vm=0x0, p=0x2aaaaec30bbc,
f=0x2aaaaebfc324) at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1213









*From:* David Hotham
*Sent:* 07 October 2016 17:15
*To:* vpp-dev@lists.fd.io
*Subject:* crash in new QoS code



I'm experimenting with the new QoS code and have hit a crash.  Below is a
short gdb session that shows



- my startup configuration
- what I'm configuring
- the crash



Am I doing something wrong, or is this a bug that wants debugging and
fixing?


Thanks!



David





ubuntu@dch-test:~/vpp$ make debug STARTUP_CONF=/etc/vpp/startup.conf

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1

Copyright (C) 2014 Free Software Foundation, Inc.

License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.  Type "show copying"

and "show warranty" for details.

This GDB was configured as "x86_64-linux-gnu".

Type "show configuration" for configuration details.

For bug reporting instructions, please see:

<http://www.gnu.org/software/gdb/bugs/>.

Find the GDB manual and other documentation resources online at:

<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".

Type "apropos word" to search for commands related to "word"...

Reading symbols from
/home/ubuntu/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp...done.

Signal        Stop      Print   Pass to program Description

SIGUSR1       No        No      Yes             User defined signal 1

(gdb) run

Starting program:
/home/ubuntu/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp unix \{
interactive nodaemon log /tmp/vpp.log full-coredump \} api-trace \{ on \}
api-segment \{ gid vpp \} dpdk \{ socket-mem 1024 dev 0000:00:05.0 \{
num-rx-queues 2 hqos \} dev 0000:00:06.0 \{ num-rx-queues 2 hqos \}
num-mbufs 100000 \} cpu \{ corelist-hqos-threads 1 \} plugin_path
/home/ubuntu/vpp/build-root/install-vpp_debug-native/plugins/lib64/vpp_plugins

[Thread debugging using libthread_db enabled]

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

vlib_plugin_early_init:213: plugin path
/home/ubuntu/vpp/build-root/install-vpp_debug-native/plugins/lib64/vpp_plugins

EAL: Detected 2 lcore(s)

EAL: Probing VFIO support...

EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable
clock cycles !

PMD: bnxt_rte_pmd_init() called for (null)

[New Thread 0x2aab0ab54700 (LWP 8214)]

[New Thread 0x2aab0ad55700 (LWP 8215)]

EAL: PCI device 0000:00:05.0 on NUMA socket -1

EAL:   probe driver: 8086:10ed rte_ixgbevf_pmd

EAL: PCI device 0000:00:06.0 on NUMA socket -1

EAL:   probe driver: 8086:10ed rte_ixgbevf_pmd

DPDK physical memory layout:

Segment 0: phys:0x61000000, len:1073741824, virt:0x2aab10000000,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0

1: vlib_set_thread_name:102: pthread_setname_np returned 34

[New Thread 0x2aab0af56700 (LWP 8216)]

0: svm_client_scan_this_region_nolock:1139: /vpe-api: cleanup ghost pid
8173

0: svm_client_scan_this_region_nolock:1139: /global_vm: cleanup ghost pid
8173

0: dpdk_lib_init:314: DPDK drivers found 2 ports...

PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip

PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip

    _______    _        _   _____  ___
 __/ __/ _ \  (_)__    | | / / _ \/ _ \
 _/ _// // / / / _ \   | |/ / ___/ ___/
 /_/ /____(_)_/\___/   |___/_/  /_/



DBGvpp# set int ip addr TenGigabitEthernet0/5/0 10.35.1.1/16

DBGvpp# set interface mtu 1500 TenGigabitEthernet0/5/0

PMD: ixgbevf_dev_configure(): VF can't disable HW CRC Strip

DBGvpp# set int state TenGigabitEthernet0/5/0 up

DBGvpp# create gre tunnel teb src 10.35.1.1 dst 10.35.2.1

gre0

DBGvpp#

Program received signal SIGSEGV, Segmentation fault.

0x00002aaaabad1b51 in __rte_ring_sp_do_enqueue
(behavior=RTE_RING_QUEUE_VARIABLE, n=1, obj_table=0x2aaaae9a3bc0, r=0x0)

at /home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:546

546             uint32_t mask = r->prod.mask;

(gdb) bt

#0  0x00002aaaabad1b51 in __rte_ring_sp_do_enqueue
(behavior=RTE_RING_QUEUE_VARIABLE, n=1, obj_table=0x2aaaae9a3bc0, r=0x0)

at /home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:546

#1  rte_ring_sp_enqueue_burst (n=1, obj_table=0x2aaaae9a3bc0, r=0x0) at
/home/ubuntu/vpp/build-root/install-vpp_debug-native/dpdk/include/rte_ring.h:1168

#2  tx_burst_vector_internal (vm=0x1086700 <vlib_global_main>,
xd=0x2aaaae9ab940, tx_vector=0x2aaaae9a3bc0)

at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:345

#3  0x00002aaaabad3950 in dpdk_interface_tx (vm=0x1086700
<vlib_global_main>, node=0x2aaaad72c540, f=0x2aaaae9c0600)

at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:892

#4  0x00002aaaab37c17a in dispatch_node (vm=0x1086700 <vlib_global_main>,
node=0x2aaaad72c540, type=VLIB_NODE_TYPE_INTERNAL,

dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x2aaaae9c0600,
last_time_stamp=912728663188572) at
/home/ubuntu/vpp/build-data/../vlib/vlib/main.c:996

#5  0x00002aaaab37c596 in dispatch_pending_node (vm=0x1086700
<vlib_global_main>, p=0x2aaaae997a58, last_time_stamp=912728663188572)

at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1134

#6  0x00002aaaab37e3d8 in vlib_main_loop (vm=0x1086700 <vlib_global_main>)
at /home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1518

#7  0x00002aaaab37ea56 in vlib_main (vm=0x1086700 <vlib_global_main>,
input=0x2aaaad919fb0) at
/home/ubuntu/vpp/build-data/../vlib/vlib/main.c:1653

#8  0x00002aaaab128b33 in thread0 (arg=17327872) at
/home/ubuntu/vpp/build-data/../vlib/vlib/unix/main.c:485

#9  0x00002aaaac5bbb3c in clib_calljmp () at
/home/ubuntu/vpp/build-data/../vppinfra/vppinfra/longjmp.S:110

#10 0x00007fffffffcfe0 in ?? ()

#11 0x00002aaaab128f93 in vlib_unix_main (argc=46, argv=0x7fffffffe298) at
/home/ubuntu/vpp/build-data/../vlib/vlib/unix/main.c:545

#12 0x0000000000a9ca2c in main (argc=46, argv=0x7fffffffe298) at
/home/ubuntu/vpp/build-data/../vpp/vnet/main.c:259

(gdb) frame 2

#2  tx_burst_vector_internal (vm=0x1086700 <vlib_global_main>,
xd=0x2aaaae9ab940, tx_vector=0x2aaaae9a3bc0)

at /home/ubuntu/vpp/build-data/../vnet/vnet/devices/dpdk/device.c:345

345                   rv = rte_ring_sp_enqueue_burst (hqos->swq,

(gdb) print hqos->swq

$1 = (struct rte_ring *) 0x0



_______________________________________________ vpp-dev mailing list
vpp-dev@lists.fd.io https://lists.fd.io/mailman/listinfo/vpp-dev
