Thanks for the feedback Florin. I am just now getting back to this. Perhaps I am doing something wrong to enable the TCP debug, but it looks pretty simple.
I enable the flags in tcp_debug.h: diff --git a/src/vnet/tcp/tcp_debug.h b/src/vnet/tcp/tcp_debug.h index f4e7c3971..8806d8e14 100644 --- a/src/vnet/tcp/tcp_debug.h +++ b/src/vnet/tcp/tcp_debug.h @@ -23,14 +23,14 @@ * debug configuration. Comes with some overhead so it's not recommended for * production/performance scenarios. Takes priority over TCP_DEBUG_ENABLE. */ -#define TCP_DEBUG_ALWAYS (0) +#define TCP_DEBUG_ALWAYS (1) /** * Build debugging infra only if enabled. Debug components controlled via * macros that follow. */ #define TCP_DEBUG_ENABLE (0) -#define TCP_DEBUG_SM (0) +#define TCP_DEBUG_SM (1) #define TCP_DEBUG_CC (0) #define TCP_DEBUG_CS (0) #define TCP_DEBUG_LC (0 || TCP_DEBUG_SM || TCP_DEBUG_CC || TCP_DEBUG_CS) Do a full rebuild (I have tried incremental and full). Then enable TCP debug via the CLI: spoke1# tcp debug ? tcp debug tcp [show] [debug group <N> level <N>] spoke1# tcp debug show Index Group Level 0 life cycle 0 1 state machine 0 2 congestion control 0 3 cc stats 0 spoke1# tcp debug group 1 level 10 spoke1# tcp debug show Index Group Level 0 life cycle 0 1 state machine 10 2 congestion control 0 3 cc stats 0 <---- Start traffic ----> spoke1# /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 (elog_track_register) assertion `t->name' fails Thread 1 "vpp_main" received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) up #1 0x00007ffff46c68b1 in __GI_abort () at abort.c:79 79 abort.c: No such file or directory. (gdb) #2 0x0000000000407113 in os_panic () at /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368 368 abort (); (gdb) #3 0x00007ffff55a5ad9 in debugger () at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84 84 os_panic (); (gdb) #4 0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails") at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143 143 debugger (); (gdb) #5 0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40 <vlib_global_main+960>, t=0x7fffbd40e260) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 208 ASSERT (t->name); (gdb) p /x *t *$1 = {name = 0x0, track_index_plus_one = 0x3}* (gdb) where #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x00007ffff46c68b1 in __GI_abort () at abort.c:79 #2 0x0000000000407113 in os_panic () at /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368 #3 0x00007ffff55a5ad9 in debugger () at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84 #4 0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails") at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143 #5 0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40 <vlib_global_main+960>, t=0x7fffbd40e260) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 #6 0x00007ffff5597f49 in elog_event_data_inline (em=0x7ffff6432f40 <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, track=0x7fffbd40e260, cpu_time=2715086657921106) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:312 #7 0x00007ffff5597e5d in elog_event_data (em=0x7ffff6432f40 <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, track=0x7fffbd40e260, cpu_time=2715086657921106) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:68 #8 0x00007ffff707a9e0 in elog_event_data_not_inline (em=0x7ffff6432f40 <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, track=0x7fffbd40e260, cpu_time=2715086657921106) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:358 #9 0x00007ffff706d135 in elog_data (em=0x7ffff6432f40 <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, track=0x7fffbd40e260) at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:428 #10 0x00007ffff706db40 in tcp_send_synack (tc=0x7fffbd40e200) at /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_output.c:856 #11 0x00007ffff7082cd3 in tcp46_listen_inline (vm=0x7ffff6432b80 <vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40, is_ip4=1) at /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2641 #12 0x00007ffff70826ba in tcp4_listen_node_fn (vm=0x7ffff6432b80 <vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40) at /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2667 #13 0x00007ffff6178d36 in dispatch_node (vm=0x7ffff6432b80 <vlib_global_main>, node=0x7fffb61bbe80, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fffbd3b7a40, last_time_stamp=2715086657730646) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1194 #14 0x00007ffff6179817 in dispatch_pending_node (vm=0x7ffff6432b80 <vlib_global_main>, pending_frame_index=6, last_time_stamp=2715086657730646) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1353 #15 0x00007ffff6173400 in vlib_main_or_worker_loop (vm=0x7ffff6432b80 <vlib_global_main>, is_main=1) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1846 #16 0x00007ffff617518a in vlib_main_loop (vm=0x7ffff6432b80 <vlib_global_main>) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1974 #17 0x00007ffff6174f57 in vlib_main (vm=0x7ffff6432b80 <vlib_global_main>, input=0x7fffb6bf3fa8) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:2220 #18 0x00007ffff61fd84e in thread0 (arg=140737324985216) at /home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:660 #19 0x00007ffff55c9c64 in clib_calljmp () at /home/bvaughan/espace/grid_vpp/src/vppinfra/longjmp.S:123 #20 0x00007fffffffd190 in ?? () #21 0x00007ffff61fd3c3 in vlib_unix_main (argc=71, argv=0x71e510) at /home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:733 #22 0x00000000004067f6 in main (argc=71, argv=0x71e510) at /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:293 (gdb) t->name is set in the TCPEVT_INIT_HANDLER macro which does not appear to be getting called. I added a print to that macro and it was never hit. Regarding the main issue of the the tcp dispatch error: tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT > flags 0x12 SYN ACK > > You mentioned that this could be a threading issue. I have been running only with the main thread for all of this testing (no workers). I plan to add those, but I want to test for correctness first then add performance. My understanding is that if I do not have any workers, then VPP runs everything out of the main thread including the Process Nodes. I am using the Process Nodes right now without locks since I don't have threading enabled. Please let me know if my threading assumption is incorrect. Also, the client on the other end of the TCP connection is from Ubuntu 18.04. I am testing with ssh/scp/iperf3. Thanks, -Bill On Sun, Nov 22, 2020 at 1:24 PM Florin Coras <fcoras.li...@gmail.com> wrote: > Hi Bill, > > Inline. > > On Nov 22, 2020, at 6:17 AM, Bill Vaughan <billvaug...@gmail.com> wrote: > > I should have stated that I am running 2009. > > On Sun, Nov 22, 2020 at 8:48 AM Bill Vaughan via lists.fd.io <billvaughan= > gmail....@lists.fd.io> wrote: > >> Hi all, >> >> I am working on a project where I need to initiate TCP connections from >> the VPP session layer. In my overnight test runs I am seeing a TCP dispatch >> error that I am trying to track down. This happens after thousands of test >> runs. It looks like I am able to issue a Connect call but the SYN-ACK from >> the other side is coming back to a TCP conn that is in a TIME_WAIT state. >> >> tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT >> flags 0x12 SYN ACK >> >> > That’s somewhat unexpected, if the peer is correctly implemented. The only > thing that comes to mind is to check that your app closes sessions from the > right threads. > > >> I wanted to get some TCP state machine debug data but it appears the >> tcp_debug macros are not hooked up. >> > > That’s on purpose because of performance degradation although there’s an > option of leaving it always on by enabling TCP_DEBUG_ALWAYS and then > configuring the features you want debugged. > > I have enabled all of the compile time debug flags in tcp_debug.h >> but tcp_evt_track_register() is not getting called and this causes an >> assert in the elog code. Looking through the TCP connection setup code I >> don't see where the debug handler macros are getting called >> (TCP_EVT_OPEN_HANDLER, TCP_EVT_CLOSE_HANDLER, etc). I tried >> adding TCP_EVT_OPEN_HANDLER to tcp_session_open() >> and tcp_evt_track_register() was hit, but then it asserted somewhere else. >> > > TCP_EVT (TCP_EVT_OPEN, tc) is called in tcp_session_open and TCP_EVT > (TCP_EVT_SYN_RCVD, child, 1) is called in tcp46_listen_inline. That is, > when connections are set up, that’s when we setup the elog track. > > Regards, > Florin > > >> Before I spend too much more time going down this rabbit hole, I wanted >> to see if anyone knows why the debug macros are not hooked up. >> >> Thanks, >> -Bill >> >> >> >> >> >> > > > >
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18128): https://lists.fd.io/g/vpp-dev/message/18128 Mute This Topic: https://lists.fd.io/mt/78430523/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-