Thanks for the feedback Florin. I am just now getting back to this.

Perhaps I am doing something wrong to enable the TCP debug, but it looks
pretty simple.

I enable the flags in tcp_debug.h:

diff --git a/src/vnet/tcp/tcp_debug.h b/src/vnet/tcp/tcp_debug.h
index f4e7c3971..8806d8e14 100644
--- a/src/vnet/tcp/tcp_debug.h
+++ b/src/vnet/tcp/tcp_debug.h
@@ -23,14 +23,14 @@
  * debug configuration. Comes with some overhead so it's not recommended
for
  * production/performance scenarios. Takes priority over TCP_DEBUG_ENABLE.
  */
-#define TCP_DEBUG_ALWAYS (0)
+#define TCP_DEBUG_ALWAYS (1)
 /**
  * Build debugging infra only if enabled. Debug components controlled via
  * macros that follow.
  */
 #define TCP_DEBUG_ENABLE (0)

-#define TCP_DEBUG_SM (0)
+#define TCP_DEBUG_SM (1)
 #define TCP_DEBUG_CC (0)
 #define TCP_DEBUG_CS (0)
 #define TCP_DEBUG_LC (0 || TCP_DEBUG_SM || TCP_DEBUG_CC || TCP_DEBUG_CS)


Do a full rebuild (I have tried incremental and full).

Then enable TCP debug via the CLI:

spoke1# tcp debug ?
  tcp debug                                tcp [show] [debug group <N>
level <N>]
spoke1# tcp debug show
Index     Group                         Level
0         life cycle                    0
1         state machine                 0
2         congestion control            0
3         cc stats                      0
spoke1# tcp debug group 1 level 10
spoke1# tcp debug show
Index     Group                         Level
0         life cycle                    0
1         state machine                 10
2         congestion control            0
3         cc stats                      0

<---- Start traffic ---->

spoke1# /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208
(elog_track_register) assertion `t->name' fails

Thread 1 "vpp_main" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) up
#1  0x00007ffff46c68b1 in __GI_abort () at abort.c:79
79 abort.c: No such file or directory.
(gdb)
#2  0x0000000000407113 in os_panic () at
/home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368
368  abort ();
(gdb)
#3  0x00007ffff55a5ad9 in debugger () at
/home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84
84  os_panic ();
(gdb)
#4  0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0,
    fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails")
    at /home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143
143    debugger ();
(gdb)
#5  0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40
<vlib_global_main+960>, t=0x7fffbd40e260)
    at /home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208
208  ASSERT (t->name);
(gdb) p /x *t
*$1 = {name = 0x0, track_index_plus_one = 0x3}*
(gdb) where
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff46c68b1 in __GI_abort () at abort.c:79
#2  0x0000000000407113 in os_panic () at
/home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:368
#3  0x00007ffff55a5ad9 in debugger () at
/home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:84
#4  0x00007ffff55a5857 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0, fmt=0x7ffff56753af "%s:%d (%s) assertion `%s' fails") at
/home/bvaughan/espace/grid_vpp/src/vppinfra/error.c:143
#5  0x00007ffff5598e54 in elog_track_register (em=0x7ffff6432f40
<vlib_global_main+960>, t=0x7fffbd40e260) at
/home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:208
#6  0x00007ffff5597f49 in elog_event_data_inline (em=0x7ffff6432f40
<vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>,
track=0x7fffbd40e260, cpu_time=2715086657921106) at
/home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:312
#7  0x00007ffff5597e5d in elog_event_data (em=0x7ffff6432f40
<vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>,
track=0x7fffbd40e260, cpu_time=2715086657921106) at
/home/bvaughan/espace/grid_vpp/src/vppinfra/elog.c:68
#8  0x00007ffff707a9e0 in elog_event_data_not_inline (em=0x7ffff6432f40
<vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>,
track=0x7fffbd40e260, cpu_time=2715086657921106) at
/home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:358
#9  0x00007ffff706d135 in elog_data (em=0x7ffff6432f40
<vlib_global_main+960>, type=0x7ffff7b34be8 <tcp_send_synack._e>,
track=0x7fffbd40e260) at
/home/bvaughan/espace/grid_vpp/src/vppinfra/elog.h:428
#10 0x00007ffff706db40 in tcp_send_synack (tc=0x7fffbd40e200) at
/home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_output.c:856
#11 0x00007ffff7082cd3 in tcp46_listen_inline (vm=0x7ffff6432b80
<vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40,
is_ip4=1) at /home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2641
#12 0x00007ffff70826ba in tcp4_listen_node_fn (vm=0x7ffff6432b80
<vlib_global_main>, node=0x7fffb61bbe80, from_frame=0x7fffbd3b7a40) at
/home/bvaughan/espace/grid_vpp/src/vnet/tcp/tcp_input.c:2667
#13 0x00007ffff6178d36 in dispatch_node (vm=0x7ffff6432b80
<vlib_global_main>, node=0x7fffb61bbe80, type=VLIB_NODE_TYPE_INTERNAL,
dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fffbd3b7a40,
last_time_stamp=2715086657730646) at
/home/bvaughan/espace/grid_vpp/src/vlib/main.c:1194
#14 0x00007ffff6179817 in dispatch_pending_node (vm=0x7ffff6432b80
<vlib_global_main>, pending_frame_index=6,
last_time_stamp=2715086657730646) at
/home/bvaughan/espace/grid_vpp/src/vlib/main.c:1353
#15 0x00007ffff6173400 in vlib_main_or_worker_loop (vm=0x7ffff6432b80
<vlib_global_main>, is_main=1) at
/home/bvaughan/espace/grid_vpp/src/vlib/main.c:1846
#16 0x00007ffff617518a in vlib_main_loop (vm=0x7ffff6432b80
<vlib_global_main>) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:1974
#17 0x00007ffff6174f57 in vlib_main (vm=0x7ffff6432b80 <vlib_global_main>,
input=0x7fffb6bf3fa8) at /home/bvaughan/espace/grid_vpp/src/vlib/main.c:2220
#18 0x00007ffff61fd84e in thread0 (arg=140737324985216) at
/home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:660
#19 0x00007ffff55c9c64 in clib_calljmp () at
/home/bvaughan/espace/grid_vpp/src/vppinfra/longjmp.S:123
#20 0x00007fffffffd190 in ?? ()
#21 0x00007ffff61fd3c3 in vlib_unix_main (argc=71, argv=0x71e510) at
/home/bvaughan/espace/grid_vpp/src/vlib/unix/main.c:733
#22 0x00000000004067f6 in main (argc=71, argv=0x71e510) at
/home/bvaughan/espace/grid_vpp/src/vpp/vnet/main.c:293
(gdb)



t->name is set in the TCPEVT_INIT_HANDLER macro which does not appear to be
getting called. I added a print to that macro and it was never hit.


Regarding the main issue of the the tcp dispatch error:

tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT
> flags 0x12 SYN ACK
>
>
You mentioned that this could be a threading issue.  I have been running
only with the main thread for all of this testing (no workers).  I plan to
add those, but I want to test for correctness first then add performance.
My understanding is that if I do not have any workers, then VPP runs
everything out of the main thread including the Process Nodes. I am using
the Process Nodes right now without locks since I don't have threading
enabled.  Please let me know if my threading assumption is incorrect.

Also, the client on the other end of the TCP connection is from Ubuntu
18.04. I am testing with ssh/scp/iperf3.

Thanks,
-Bill







On Sun, Nov 22, 2020 at 1:24 PM Florin Coras <fcoras.li...@gmail.com> wrote:

> Hi Bill,
>
> Inline.
>
> On Nov 22, 2020, at 6:17 AM, Bill Vaughan <billvaug...@gmail.com> wrote:
>
> I should have stated that I am running 2009.
>
> On Sun, Nov 22, 2020 at 8:48 AM Bill Vaughan via lists.fd.io <billvaughan=
> gmail....@lists.fd.io> wrote:
>
>> Hi all,
>>
>> I am working on a project where I need to initiate TCP connections from
>> the VPP session layer. In my overnight test runs I am seeing a TCP dispatch
>> error that I am trying to track down. This happens after thousands of test
>> runs. It looks like I am able to issue a Connect call but the SYN-ACK from
>> the other side is coming back to a TCP conn that is in a TIME_WAIT state.
>>
>> tcp_input_dispatch_buffer:2808: tcp conn 138 disp error state TIME_WAIT
>> flags 0x12 SYN ACK
>>
>>
> That’s somewhat unexpected, if the peer is correctly implemented. The only
> thing that comes to mind is to check that your app closes sessions from the
> right threads.
>
>
>> I wanted to get some TCP state machine debug data but it appears the
>> tcp_debug macros are not hooked up.
>>
>
> That’s on purpose because of performance degradation although there’s an
> option of leaving it always on by enabling TCP_DEBUG_ALWAYS and then
> configuring the features you want debugged.
>
> I have enabled all of the compile time debug flags in tcp_debug.h
>> but tcp_evt_track_register() is not getting called and this causes an
>> assert in the elog code. Looking through the TCP connection setup code I
>> don't see where the debug handler macros are getting called
>> (TCP_EVT_OPEN_HANDLER, TCP_EVT_CLOSE_HANDLER, etc).  I tried
>> adding TCP_EVT_OPEN_HANDLER to tcp_session_open()
>> and tcp_evt_track_register() was hit, but then it asserted somewhere else.
>>
>
> TCP_EVT (TCP_EVT_OPEN, tc) is called in tcp_session_open and TCP_EVT
> (TCP_EVT_SYN_RCVD, child, 1) is called in tcp46_listen_inline. That is,
> when connections are set up, that’s when we setup the elog track.
>
> Regards,
> Florin
>
>
>> Before I spend too much more time going down this rabbit hole, I wanted
>> to see if anyone knows why the debug macros are not hooked up.
>>
>> Thanks,
>> -Bill
>>
>>
>>
>>
>>
>>
> 
>
>
>
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18128): https://lists.fd.io/g/vpp-dev/message/18128
Mute This Topic: https://lists.fd.io/mt/78430523/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to