Hey liangbaolin,

I tried to look into this, but the backtrace is a bit strange: the line numbers in it don't match up at all with the 5.8.1 source, see https://github.com/kamailio/kamailio/blob/5.8.1/src/modules/cdp/receiver.c#L942

Maybe I'm looking at a different version than you are, or the core is from a different version of the source code?

Cheers,
-Dragos

On Fri, Oct 11, 2024 at 12:27 PM liangbaolin via sr-dev <sr-dev@lists.kamailio.org> wrote:

> Description
>
> hi, I encountered a problem where the CDP module is extremely prone to
> process crashes. The following are screenshots of the logs and core files.
> I couldn't find the exception code that caused the problem, but I suspect
> that the TCP link was properly established, but the peer did not initialize
> or handle the exception properly, resulting in an exception when the packet
> was parsed incorrectly and disconnected later. In addition, since the
> socket is not a normal peer, it will constantly rebuild the chain, but the
> CDP does not recognize and process it properly. The socket will continue to
> grow, but the number of peers will not increase.
>
> _20241011175120.png (view on web)
> <https://github.com/user-attachments/assets/88a0fc31-9ff6-4f46-9394-f49c787850e8>
>
> Troubleshooting
>
> Reproduction
>
> Debugging Data
>
> Core was generated by `/usr/sbin/kamailio -f /etc/kamailio_dra/kamailio_dra.cfg -P /var/run/kamailio_d'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980, event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
> 304 }
> (gdb) bt
> #0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980, event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
> #1 0x00007f9860fae391 in atomic_get_and_set_int (var=0x776f6e6b6e752072, v=32766) at ../../core/mem/../atomic/atomic_x86.h:242
> #2 0x00007f9860fb16eb in disconnect_serviced_peer (sp=0x7f98e28287d0, locked=0) at receiver.c:232
> #3 0x00007f9860fbd56b in receive_loop (original_peer=0x0) at receiver.c:942
> #4 0x00007f9860fb4c02 in receiver_process (p=0x0) at receiver.c:488
> #5 0x00007f9860f50c80 in diameter_peer_start (blocking=0) at diameter_peer.c:278
> #6 0x00007f9860f413df in cdp_child_init (rank=0) at cdp_mod.c:274
> #7 0x000055720d3df6c2 in init_mod_child (m=0x7f98e279b180, rank=0) at core/sr_module.c:920
> #8 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279be50, rank=0) at core/sr_module.c:912
> #9 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279c2c0, rank=0) at core/sr_module.c:912
> #10 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279cc40, rank=0) at core/sr_module.c:912
> #11 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d130, rank=0) at core/sr_module.c:912
> #12 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d6c0, rank=0) at core/sr_module.c:912
> #13 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279db10, rank=0) at core/sr_module.c:912
> #14 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279dff0, rank=0) at core/sr_module.c:912
> #15 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279fae0, rank=0) at core/sr_module.c:912
> #16 0x000055720d3dfff0 in init_child (rank=0) at core/sr_module.c:999
> #17 0x000055720d23d70a in main_loop () at main.c:1942
> #18 0x000055720d2488f5 in main (argc=16, argv=0x7ffea4fde838) at main.c:3256
> (gdb) bt full
> #0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980, event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
>         x = 0x7f9861b97000
>         ret = 441
>         rc = 1627304660
>         record_type = 32664
>         __func__ = "cc_acc_client_stateful_sm_process"
> #1 0x00007f9860fae391 in atomic_get_and_set_int (var=0x776f6e6b6e752072, v=32766) at ../../core/mem/../atomic/atomic_x86.h:242
> No locals.
> #2 0x00007f9860fb16eb in disconnect_serviced_peer (sp=0x7f98e28287d0, locked=0) at receiver.c:232
>         __llevel = 0
>         __func__ = "disconnect_serviced_peer"
> #3 0x00007f9860fbd56b in receive_loop (original_peer=0x0) at receiver.c:942
>         __llevel = -1526863824
>         rfds = {__fds_bits = {0, 0, 0, 0, 1024, 0 <repeats 11 times>}}
>         efds = {__fds_bits = {0 <repeats 16 times>}}
>         tv = {tv_sec = 0, tv_usec = 883496}
>         n = 1
>         max = 298
>         cnt = 0
>         msg = 0x0
>         sp = 0x7f98e28287d0
>         sp2 = 0x7f98e2827e30
>         p = 0x0
>         fd = 295
>         fd_exchange_pipe_local = 28
>         __func__ = "receive_loop"
> #4 0x00007f9860fb4c02 in receiver_process (p=0x0) at receiver.c:488
>         __llevel = -990730168
>         __func__ = "receiver_process"
> #5 0x00007f9860f50c80 in diameter_peer_start (blocking=0) at diameter_peer.c:278
>         pid = 0
>         k = 1
>         seed = 1112701621
>         p = 0x0
>         __func__ = "diameter_peer_start"
> #6 0x00007f9860f413df in cdp_child_init (rank=0) at cdp_mod.c:274
>         __llevel = 0
>         __func__ = "cdp_child_init"
> #7 0x000055720d3df6c2 in init_mod_child (m=0x7f98e279b180, rank=0) at core/sr_module.c:920
>         ret = 0
>         __func__ = "init_mod_child"
> #8 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279be50, rank=0) at core/sr_module.c:912
>         ret = 1
>         __func__ = "init_mod_child"
> #9 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279c2c0, rank=0) at core/sr_module.c:912
>         ret = 0
>         __func__ = "init_mod_child"
> #10 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279cc40, rank=0) at core/sr_module.c:912
>         ret = 0
>         __func__ = "init_mod_child"
> ---Type <return> to continue, or q <return> to quit---
> #11 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d130, rank=0) at core/sr_module.c:912
>         ret = 0
>         __func__ = "init_mod_child"
> #12 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d6c0, rank=0) at core/sr_module.c:912
>         ret = 0
>         __func__ = "init_mod_child"
> #13 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279db10, rank=0) at core/sr_module.c:912
>         ret = 32766
>         __func__ = "init_mod_child"
> #14 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279dff0, rank=0) at core/sr_module.c:912
>         ret = 21874
>         __func__ = "init_mod_child"
> #15 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279fae0, rank=0) at core/sr_module.c:912
>         ret = 12
>         __func__ = "init_mod_child"
> #16 0x000055720d3dfff0 in init_child (rank=0) at core/sr_module.c:999
>         ret = 0
>         type = 0x55720d6ece7b "PROC_MAIN"
>         __func__ = "init_child"
> #17 0x000055720d23d70a in main_loop () at main.c:1942
>         i = 1639542784
>         pid = 50
>         si = 0x0
>         si_desc = "\240\233\"\rrU\000\000@\361\367a\230\177\000\000\000\343\375\244\376\177\000\000\327~2\rrU\000\000\000\343\375\244\376\177\000\000\025\t>\r\005\000\000\000\000\000\000\000\037\000\000\000\000;\213\222\213\017\r\202h\r\000\000\000\000\000\000\060\000\000\000\000\000\000\000\240\233\"\rrU\000\000\060\350\375\244\376\177", '\000' <repeats 18 times>, "\340\342\375\244\376\177\000\000 \351O\rrU\000"
>         nrprocs = 21874
>         woneinit = 0
>         __func__ = "main_loop"
> #18 0x000055720d2488f5 in main (argc=16, argv=0x7ffea4fde838) at main.c:3256
>         cfg_stream = 0x55720ef75260
>         c = -1
>         r = 0
>         tmp = 0x7ffea4fdfee3 ""
>         tmp_len = 32766
>         port = 5060
>         proto = 0
>         aproto = 0
>         ahost = 0x0
>         aport = 0
>         options = 0x55720d6b3698 ":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
>         ret = -1
>         seed = 2959679812
>         rfd = 4
>         debug_save = 0
>         debug_flag = 0
>         dont_fork_cnt = 2
>         n_lst = 0x0
>         p = 0x7f996220a3d0 ""
>         st = {st_dev = 50, st_ino = 2995745, st_nlink = 1, st_mode = 16877, st_uid = 103, st_gid = 105, __pad0 = 0, st_rdev = 0, st_size = 4096, st_blksize = 4096, st_blocks = 8, st_atim = {tv_sec = 1726638407, tv_nsec = 558385502}, st_mtim = {tv_sec = 1727831142, tv_nsec = 822744119}, st_ctim = {tv_sec = 1727831142, tv_nsec = 822744119}, __glibc_reserved = {0, 0, 0}}
> ---Type <return> to continue, or q <return> to quit---
>         l1 = 2048
>         tbuf = "pQ!b\231\177\000\000\070-\000b\231\177\000\000\020\347\375\244\376\177\000\000\367\344\376a\231\177", '\000' <repeats 18 times>, "\001\000\000\000\000\000\000\000(W!b\231\177\000\000\000Q!b\231\177\000\000\001\000\000\000\000\000\000\000\300\200 b\231\177\000\000\017Q\377a\231\177\000\000\020W!b\231\177", '\000' <repeats 19 times>, "S\376\244\376\177\000\000\300\212\225\001\000\000\000\000\207\026=a\231\177\000\000`\347\375\244\376\177\000\000\220Q\376\244\376\177\000\000\002\000\000\000\231\177\000\000\000\000\000\000\000\000\000\000\300\346\375\244\376\177\000\000\003\000\000\000\000\000\000\000\260\346\375\244\376\177\000\000\000\000\000\000\000\000\000\000"...
>         option_index = 0
>         long_options = {{name = 0x55720d6b59f6 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55720d6b087c "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55720d6b59fb "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55720d6b5a01 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55720d6b5a07 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55720d6b5a10 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55720d6b5a1a "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x55720d6b5a24 "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x55720d6b5a2f "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x55720d6b5a38 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x55720d6b5a43 "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x55720d6b5a49 "cfg-print", has_arg = 0, flag = 0x0, val = 1033}, {name = 0x55720d6b5a53 "atexit", has_arg = 1, flag = 0x0, val = 1034}, {name = 0x55720d6b5a5a "all-errors", has_arg = 0, flag = 0x0, val = 1035}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
>         __func__ = "main"
> (gdb)
> (gdb) info locals
> cfg_stream = 0x55720ef75260
> c = -1
> r = 0
> tmp = 0x7ffea4fdfee3 ""
> tmp_len = 32766
> port = 5060
> proto = 0
> aproto = 0
> ahost = 0x0
> aport = 0
> options = 0x55720d6b3698 ":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
> ret = -1
> seed = 2959679812
> rfd = 4
> debug_save = 0
> debug_flag = 0
> dont_fork_cnt = 2
> n_lst = 0x0
> p = 0x7f996220a3d0 ""
> st = {st_dev = 50, st_ino = 2995745, st_nlink = 1, st_mode = 16877, st_uid = 103, st_gid = 105, __pad0 = 0, st_rdev = 0, st_size = 4096, st_blksize = 4096, st_blocks = 8, st_atim = {tv_sec = 1726638407, tv_nsec = 558385502}, st_mtim = {tv_sec = 1727831142, tv_nsec = 822744119}, st_ctim = {tv_sec = 1727831142, tv_nsec = 822744119}, __glibc_reserved = {0, 0, 0}}
> l1 = 2048
> tbuf = "pQ!b\231\177\000\000\070-\000b\231\177\000\000\020\347\375\244\376\177\000\000\367\344\376a\231\177", '\000' <repeats 18 times>, "\001\000\000\000\000\000\000\000(W!b\231\177\000\000\000Q!b\231\177\000\000\001\000\000\000\000\000\000\000\300\200 b\231\177\000\000\017Q\377a\231\177\000\000\020W!b\231\177", '\000' <repeats 19 times>, "S\376\244\376\177\000\000\300\212\225\001\000\000\000\000\207\026=a\231\177\000\000`\347\375\244\376\177\000\000\220Q\376\244\376\177\000\000\002\000\000\000\231\177\000\000\000\000\000\000\000\000\000\000\300\346\375\244\376\177\000\000\003\000\000\000\000\000\000\000\260\346\375\244\376\177\000\000\000\000\000\000\000\000\000\000"...
> option_index = 0
> long_options = {{name = 0x55720d6b59f6 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55720d6b087c "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55720d6b59fb "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55720d6b5a01 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55720d6b5a07 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55720d6b5a10 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55720d6b5a1a "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x55720d6b5a24 "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x55720d6b5a2f "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x55720d6b5a38 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x55720d6b5a43 "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x55720d6b5a49 "cfg-print", has_arg = 0, flag = 0x0, val = 1033}, {name = 0x55720d6b5a53 "atexit", has_arg = 1, flag = 0x0, val = 1034}, {name = 0x55720d6b5a5a "all-errors", has_arg = 0, flag = 0x0, val = 1035}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
> __func__ = "main"
> (gdb) list
> 299     if(s) {
> 300         AAASessionsUnlock(s->hash);
> 301     }
> 302
> 303     return ret;
> 304 }
>
> Log Messages
>
> 2024-10-11T08:23:20.234947572Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop(): select_recv(): Bad file descriptor
> 2024-10-11T08:23:24.906121404Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop(): select_recv(): Bad file descriptor
> 2024-10-11T08:23:41.857946233Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop(): select_recv(): Bad file descriptor
> 2024-10-11T08:25:18.095639136Z 25(64) WARNING: cdp [peermanager.c:337]: peer_timer(): Inactivity on peer [scscf32.ims.mnc011.mcc460.3gppnetwork.org] and no DWA, Closing peer...
> 2024-10-11T08:43:38.138243609Z 31(70) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 34
> 2024-10-11T08:43:47.476315811Z 0(39) ALERT: <core> [main.c:805]: handle_sigs(): child process 60 exited by a signal 11
> 2024-10-11T08:43:47.476380054Z 0(39) ALERT: <core> [main.c:809]: handle_sigs(): core was generated
> 2024-10-11T08:43:47.503780368Z 0(39) CRITICAL: cdp [diameter_peer.c:447]: diameter_peer_destroy(): destroy_diameter_peer(): Bye Bye from C Diameter Peer test
>
> - *Operating System*:
>
> version: kamailio 5.8.1 (x86_64/linux) 07b761
> flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
> ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_SEND_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
> poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
> id: 07b761
> compiled on 10:00:57 Oct 9 2024 with gcc 7.5.0
>
> View it on GitHub: <https://github.com/kamailio/kamailio/issues/3999>
_______________________________________________
Kamailio (SER) - Development Mailing List
To unsubscribe send an email to sr-dev-le...@lists.kamailio.org