Hello,
the logs show that the core file was generated:
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: <core>
[main.c:778]: handle_sigs(): core was generated
Unless you deleted them, then you should have them somewhere on the file
system.
mem_safety won't affect any kind of processing - it is just a protection
for a double call of free() function. Actually, it is turned on with
f_malloc memory manager which was the default in the past. Now we use
q_malloc as default which is kind of f_malloc with defrag option. We
should make this back on by default, but catching double free() is also
good to solve anyhow.
Cheers,
Daniel
On 09/04/14 17:15, Samuel Ware wrote:
Daniel,
Here is the version information.
r...@tel-vc-fs03.telariscom.com> kamailio -v
version: kamailio 4.1.2 (x86_64/linux) 73ea61
flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 73ea61
compiled on 20:41:21 Apr 1 2014 with gcc 4.4.7
We don’t see this happen in the lab environment. It only happens in production
so we are little concern about it failing. Unfortunately, we don’t have any
core dumps of this occurring; we didn’t have that feature enabled when this
happened.
By enabling “men_safety”, will this affect the processing of any of other
signaling messages when an error occurs or will it only affect the transaction
in which the error occurred. There isn’t much detail in the description of
this flag in the documentation to layout how it affects the overall system in
an otherwise “abort” situation. Your assistance is appreciated and I hope that
it doesn’t sound like we are being difficult but this element is setting in
front of our entire network and failure causes us to block all calls. We were
running close to 800 CPS through it when it failed. We don’t see any load on
the machine that would indicate that the issue isn’t software based. If the
“men_safety” will not affect the rest of the traffic, we can add this as well
as enabling core dumps and try it again when we can closely monitor the system.
Please provide the additional details about “men_safety” so we can assess the
situation. Is there any other information that I can provide to assist your
input like the part of the config file or what else?
Sam
On Apr 9, 2014, at 2:27 AM, Daniel-Constantin Mierla <mico...@gmail.com> wrote:
Can you get the output of kamailio -v?
Also, send the backtrace for the corefiles taken with gdb - the logs show that
the core were generated (check / or the value of -w parameter):
gdb /usr/local/sbin/kamailio /path/to/core
bt full
To protect against such cases, you can use in config file:
mem_safety=1
Cheers,
Daniel
On 08/04/14 17:34, Samuel Ware wrote:
Here are the logs from the first crash
Apr 3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7feac6836310,65572,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7feaace02c98,65533,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7feac6836310,65569,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 13:14:32 tel-vc-fs03 /usr/local/sbin/kamailio[7589]: : <core>
[mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed pointer (0x7feaca6c9738),
called from <core>: mem/shm_mem.c: sh_realloc(88), first free <core>: mem/shm_mem.c:
sh_realloc(88) - aborting
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: <core>
[main.c:775]: handle_sigs(): child process 7589 exited by a signal 6
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: <core>
[main.c:778]: handle_sigs(): core was generated
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: INFO: <core>
[main.c:790]: handle_sigs(): INFO: terminating due to SIGCHLD
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7604]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7597]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7603]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7601]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Logs from second crash
Apr 3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f34ec4f6ed8,65569,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f35242e5d28,65535,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f3516fb6720,65578,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on
branch 0 failed
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: sl
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: sl
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately
error on sending to next hop occurred (477/SL)
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: ERROR: <core>
[udp_server.c:591]: udp_send(): ERROR: udp_send:
sendto(sock,0x7f37d81fa960,65522,0,205.251.172.14:5060,16): Message too long(90)
Apr 3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: ERROR: tm
[../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
Apr 3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3974]: : <core>
[mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed pointer (0x7f34d7d6b720),
called from <core>: mem/shm_mem.c: sh_realloc(88), first free <core>: mem/shm_mem.c:
sh_realloc(88) - aborting
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: ALERT: <core>
[main.c:775]: handle_sigs(): child process 3974 exited by a signal 6
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: ALERT: <core>
[main.c:778]: handle_sigs(): core was generated
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: INFO: <core>
[main.c:790]: handle_sigs(): INFO: terminating due to SIGCHLD
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3985]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3983]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3966]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3988]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3972]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
Apr 3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3984]: INFO: <core>
[main.c:841]: sig_usr(): INFO: signal 15 received
_______________________________________________
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
sr-users@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
--
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
--
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
_______________________________________________
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
sr-users@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users