Thank you Florin! On Tue, Nov 24, 2020 at 8:54 PM Florin Coras <fcoras.li...@gmail.com> wrote:
> Hi Bill, > > Ah, you must manually enable “life cycle” events since those are > responsible for track initialization, i.e., tcp debug group 0 level 1. To > avoid the hassle, here’s a patch [1] that enables them whenever events for > another group are enabled. > > Regards, > Florin > > [1] https://gerrit.fd.io/r/c/vpp/+/30116 > > On Nov 24, 2020, at 7:35 AM, Bill Vaughan <billvaug...@gmail.com> wrote: > > Thanks for the feedback Florin. I am just now getting back to this. > > Perhaps I am doing something wrong to enable the TCP debug, but it looks > pretty simple. > > I enable the flags in tcp_debug.h: > > diff --git a/src/vnet/tcp/tcp_debug.h b/src/vnet/tcp/tcp_debug.h > index f4e7c3971..8806d8e14 100644 > --- a/src/vnet/tcp/tcp_debug.h > +++ b/src/vnet/tcp/tcp_debug.h > @@ -23,14 +23,14 @@ > * debug configuration. Comes with some overhead so it's not recommended > for > * production/performance scenarios. Takes priority over TCP_DEBUG_ENABLE. > */ > -#define TCP_DEBUG_ALWAYS (0) > +#define TCP_DEBUG_ALWAYS (1) > /** > * Build debugging infra only if enabled. Debug components controlled via > * macros that follow. > */ > #define TCP_DEBUG_ENABLE (0) > > -#define TCP_DEBUG_SM (0) > +#define TCP_DEBUG_SM (1) > #define TCP_DEBUG_CC (0) > #define TCP_DEBUG_CS (0) > #define TCP_DEBUG_LC (0 || TCP_DEBUG_SM || TCP_DEBUG_CC || TCP_DEBUG_CS) > > > Do a full rebuild (I have tried incremental and full). > > Then enable TCP debug via the CLI: > > spoke1# tcp debug ? > tcp debug tcp [show] [debug group <N> > level <N>] > spoke1# tcp debug show > Index Group Level > 0 life cycle 0 > 1 state machine 0 > 2 congestion control 0 > 3 cc stats 0 > spoke1# tcp debug group 1 level 10 > spoke1# tcp debug show > Index Group Level > 0 life cycle 0 > 1 state machine 10 > 2 congestion control 0 > 3 cc stats 0 > > <---- Start traffic ----> > > spoke1# /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 > (elog_track_register) assertion `t->name' fails > > Thread 1 "vpp_main" received signal SIGABRT, Aborted. > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > 51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. > (gdb) up > #1 0x00007ffff46c68b1 in __GI_abort () at abort.c:79 > 79 abort.c: No such file or directory. > (gdb) > #2 0x0000000000407113 in os_panic () at > /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368 > 368 abort (); > (gdb) > #3 0x00007ffff55a5ad9 in debugger () at > /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84 > 84 os_panic (); > (gdb) > #4 0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0, > line_number=0, > fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails") > at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143 > 143 debugger (); > (gdb) > #5 0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40 > <vlib_global_main+960>, t=0x7fffbd40e260) > at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 > 208 ASSERT (t->name); > (gdb) p /x *t > *$1 = {name = 0x0, track_index_plus_one = 0x3}* > (gdb) where > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 > #1 0x00007ffff46c68b1 in __GI_abort () at abort.c:79 > #2 0x0000000000407113 in os_panic () at > /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368 > #3 0x00007ffff55a5ad9 in debugger () at > /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84 > #4 0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0, > line_number=0, fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails") at > /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143 > #5 0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40 > <vlib_global_main+960>, t=0x7fffbd40e260) at > /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208 > #6 0x00007ffff5597f49 in elog_event_data_inline (em=0x7ffff6432f40 > <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, > track=0x7fffbd40e260, cpu_time=2715086657921106) at > /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:312 > #7 0x00007ffff5597e5d in elog_event_data (em=0x7ffff6432f40 > <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, > track=0x7fffbd40e260, cpu_time=2715086657921106) at > /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:68 > #8 0x00007ffff707a9e0 in elog_event_data_not_inline (em=0x7ffff6432f40 > <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, > track=0x7fffbd40e260, cpu_time=2715086657921106) at > /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:358 > #9 0x00007ffff706d135 in elog_data (em=0x7ffff6432f40 > <vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>, > track=0x7fffbd40e260) at > /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:428 > #10 0x00007ffff706db40 in tcp_send_synack (tc=0x7fffbd40e200) at > /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_output.c:856 > #11 0x00007ffff7082cd3 in tcp46_listen_inline (vm=0x7ffff6432b80 > <vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40, > is_ip4=1) at /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2641 > #12 0x00007ffff70826ba in tcp4_listen_node_fn (vm=0x7ffff6432b80 > <vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40) at > /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2667 > #13 0x00007ffff6178d36 in dispatch_node (vm=0x7ffff6432b80 > <vlib_global_main>, node=0x7fffb61bbe80, type=VLIB_NODE_TYPE_INTERNAL, > dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fffbd3b7a40, > last_time_stamp=2715086657730646) at > /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1194 > #14 0x00007ffff6179817 in dispatch_pending_node (vm=0x7ffff6432b80 > <vlib_global_main>, pending_frame_index=6, > last_time_stamp=2715086657730646) at > /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1353 > #15 0x00007ffff6173400 in vlib_main_or_worker_loop (vm=0x7ffff6432b80 > <vlib_global_main>, is_main=1) at > /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1846 > #16 0x00007ffff617518a in vlib_main_loop (vm=0x7ffff6432b80 > <vlib_global_main>) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1974 > #17 0x00007ffff6174f57 in vlib_main (vm=0x7ffff6432b80 <vlib_global_main>, > input=0x7fffb6bf3fa8) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:2220 > #18 0x00007ffff61fd84e in thread0 (arg=140737324985216) at > /home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:660 > #19 0x00007ffff55c9c64 in clib_calljmp () at > /home/bvaughan/espace/grid_vpp/src/vppinfra/longjmp.S:123 > #20 0x00007fffffffd190 in ?? () > #21 0x00007ffff61fd3c3 in vlib_unix_main (argc=71, argv=0x71e510) at > /home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:733 > #22 0x00000000004067f6 in main (argc=71, argv=0x71e510) at > /home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:293 > (gdb) > > > > t->name is set in the TCPEVT_INIT_HANDLER macro which does not appear to > be getting called. I added a print to that macro and it was never hit. > > > Regarding the main issue of the the tcp dispatch error: > > tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT >> flags 0x12 SYN ACK >> >> > You mentioned that this could be a threading issue. I have been running > only with the main thread for all of this testing (no workers). I plan to > add those, but I want to test for correctness first then add performance. > My understanding is that if I do not have any workers, then VPP runs > everything out of the main thread including the Process Nodes. I am using > the Process Nodes right now without locks since I don't have threading > enabled. Please let me know if my threading assumption is incorrect. > > Also, the client on the other end of the TCP connection is from Ubuntu > 18.04. I am testing with ssh/scp/iperf3. > > Thanks, > -Bill > > > > > > > > On Sun, Nov 22, 2020 at 1:24 PM Florin Coras <fcoras.li...@gmail.com> > wrote: > >> Hi Bill, >> >> Inline. >> >> On Nov 22, 2020, at 6:17 AM, Bill Vaughan <billvaug...@gmail.com> wrote: >> >> I should have stated that I am running 2009. >> >> On Sun, Nov 22, 2020 at 8:48 AM Bill Vaughan via lists.fd.io >> <billvaughan=gmail....@lists.fd.io> wrote: >> >>> Hi all, >>> >>> I am working on a project where I need to initiate TCP connections from >>> the VPP session layer. In my overnight test runs I am seeing a TCP dispatch >>> error that I am trying to track down. This happens after thousands of test >>> runs. It looks like I am able to issue a Connect call but the SYN-ACK from >>> the other side is coming back to a TCP conn that is in a TIME_WAIT state. >>> >>> tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT >>> flags 0x12 SYN ACK >>> >>> >> That’s somewhat unexpected, if the peer is correctly implemented. The >> only thing that comes to mind is to check that your app closes sessions >> from the right threads. >> >> >>> I wanted to get some TCP state machine debug data but it appears the >>> tcp_debug macros are not hooked up. >>> >> >> That’s on purpose because of performance degradation although there’s an >> option of leaving it always on by enabling TCP_DEBUG_ALWAYS and then >> configuring the features you want debugged. >> >> I have enabled all of the compile time debug flags in tcp_debug.h >>> but tcp_evt_track_register() is not getting called and this causes an >>> assert in the elog code. Looking through the TCP connection setup code I >>> don't see where the debug handler macros are getting called >>> (TCP_EVT_OPEN_HANDLER, TCP_EVT_CLOSE_HANDLER, etc). I tried >>> adding TCP_EVT_OPEN_HANDLER to tcp_session_open() >>> and tcp_evt_track_register() was hit, but then it asserted somewhere else. >>> >> >> TCP_EVT (TCP_EVT_OPEN, tc) is called in tcp_session_open and TCP_EVT >> (TCP_EVT_SYN_RCVD, child, 1) is called in tcp46_listen_inline. That is, >> when connections are set up, that’s when we setup the elog track. >> >> Regards, >> Florin >> >> >>> Before I spend too much more time going down this rabbit hole, I wanted >>> to see if anyone knows why the debug macros are not hooked up. >>> >>> Thanks, >>> -Bill >>> >>> >>> >>> >>> >>> >> >> >> >> >
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#18134): https://lists.fd.io/g/vpp-dev/message/18134 Mute This Topic: https://lists.fd.io/mt/78430523/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-