Daniel,

I was unable to locate the core files.  We made the mem_safety change and put the 
machine back into production.  It has been running for 5 hours without crashing.  
The only error that looks different from before is the following:

Apr 10 17:36:29 tel-vc-fs03 /usr/local/sbin/kamailio[31365]: ERROR: tm 
[t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
branch 0 failed
Apr 10 17:36:29 tel-vc-fs03 /usr/local/sbin/kamailio[31365]: ERROR: sl 
[sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: Unfortunately 
error on sending to next hop occurred (477/SL)
Apr 10 17:37:02 tel-vc-fs03 /usr/local/sbin/kamailio[31368]: : <core> 
[mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed pointer 
(0x7f8c19c22f10), called from tm: h_table.c: free_cell(186), first free tm: 
h_table.c: free_cell(186) - aborting
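
For reference, here is a minimal sketch of the behaviour discussed in this thread: a second free() of the same pointer is detected and logged, and a setting decides whether the process aborts or carries on. This is a toy simulation, not Kamailio code. The class ToyAllocator, the exception DoubleFreeAbort and the counter ignored_double_frees are hypothetical names, and the assumption that mem_safety=0 aborts while mem_safety=1 logs and returns is inferred from the "- aborting" log line and from Daniel's description quoted below.

```python
import logging


class DoubleFreeAbort(RuntimeError):
    """Stands in for abort() in this simulation (hypothetical name)."""


class ToyAllocator:
    """Toy allocator that remembers whether each block is currently allocated."""

    def __init__(self, mem_safety: int = 0):
        self.mem_safety = mem_safety      # assumed semantics: 0 = abort, 1 = log and ignore
        self._next_addr = 0x1000
        self._allocated = {}              # address -> True while allocated, False once freed
        self.ignored_double_frees = 0

    def malloc(self) -> int:
        addr = self._next_addr
        self._next_addr += 0x10
        self._allocated[addr] = True
        return addr

    def free(self, addr: int) -> None:
        if not self._allocated.get(addr, False):
            # The block is already free (or was never handed out): report it.
            logging.critical("BUG: freeing already freed pointer (%#x)", addr)
            if self.mem_safety == 0:
                raise DoubleFreeAbort(hex(addr))
            # With the protection enabled, skip the second free and keep running.
            self.ignored_double_frees += 1
            return
        self._allocated[addr] = False


if __name__ == "__main__":
    alloc = ToyAllocator(mem_safety=1)
    ptr = alloc.malloc()
    alloc.free(ptr)
    alloc.free(ptr)                       # logged, but the process keeps going
    print("ignored double frees:", alloc.ignored_double_frees)
```

In this sketch only the offending free() call is skipped and everything else proceeds normally, which matches the statement below that mem_safety does not affect other processing. The underlying double free is still a bug that needs to be found.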


 
On Apr 9, 2014, at 12:34 PM, Daniel-Constantin Mierla <mico...@gmail.com> wrote:

> Hello,
> 
> the logs show that the core file was generated:
> 
> Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: <core> 
> [main.c:778]: handle_sigs(): core was generated
> 
> Unless you deleted them, you should have them somewhere on the file 
> system.
> 
> mem_safety won't affect any kind of processing - it is just a protection 
> against a double call of the free() function. Actually, it is turned on with 
> the f_malloc memory manager, which was the default in the past. Now we use 
> q_malloc as default, which is kind of f_malloc with a defrag option. We should 
> turn this back on by default, but the double free() is still worth catching 
> and fixing anyhow.
> 
> Cheers,
> Daniel
> 
> On 09/04/14 17:15, Samuel Ware wrote:
>> Daniel,
>> 
>> Here is the version information.
>> 
>>      r...@tel-vc-fs03.telariscom.com> kamailio -v
>>      version: kamailio 4.1.2 (x86_64/linux) 73ea61
>>      flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, 
>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, 
>> DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, 
>> USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
>>      ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, 
>> MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
>>      poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
>>      id: 73ea61
>>      compiled on 20:41:21 Apr  1 2014 with gcc 4.4.7
>> 
>> We don’t see this happen in the lab environment.  It only happens in 
>> production, so we are a little concerned about it failing.  Unfortunately, we 
>> don’t have any core dumps of this occurring; we didn’t have that feature 
>> enabled when this happened.
>> 
>> By enabling “mem_safety”, will this affect the processing of any other 
>> signaling messages when an error occurs, or will it only affect the 
>> transaction in which the error occurred?  There isn’t much detail in the 
>> description of this flag in the documentation to lay out how it affects the 
>> overall system in an otherwise “abort” situation.  Your assistance is 
>> appreciated, and I hope that it doesn’t sound like we are being difficult, but 
>> this element is sitting in front of our entire network and a failure causes 
>> us to block all calls.  We were running close to 800 CPS through it when it 
>> failed.  We don’t see any load on the machine that would indicate that the 
>> issue isn’t software based.  If “mem_safety” will not affect the rest of 
>> the traffic, we can add this as well as enabling core dumps and try it again 
>> when we can closely monitor the system.  Please provide the additional 
>> details about “mem_safety” so we can assess the situation.  Is there any 
>> other information that I can provide to assist you, such as part of the 
>> config file?
>> 
>> 
>> 
>> Sam
>> 
>> 
>> On Apr 9, 2014, at 2:27 AM, Daniel-Constantin Mierla <mico...@gmail.com> 
>> wrote:
>> 
>>> Can you get the output of kamailio -v?
>>> 
>>> Also, send the backtrace for the core files taken with gdb - the logs show 
>>> that the core files were generated (check / or the value of the -w parameter):
>>> 
>>> gdb /usr/local/sbin/kamailio /path/to/core
>>> bt full
>>> 
>>> To protect against such cases, you can set this in the config file:
>>> 
>>> mem_safety=1
>>> 
>>> Cheers,
>>> Daniel
>>> 
>>> On 08/04/14 17:34, Samuel Ware wrote:
>>>> Here are the logs from the first crash
>>>> 
>>>>   Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 13:13:59 tel-vc-fs03 /usr/local/sbin/kamailio[7602]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feac6836310,65572,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 13:14:05 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feaace02c98,65533,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 13:14:20 tel-vc-fs03 /usr/local/sbin/kamailio[7594]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7feac6836310,65569,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 13:14:27 tel-vc-fs03 /usr/local/sbin/kamailio[7583]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 13:14:32 tel-vc-fs03 /usr/local/sbin/kamailio[7589]: : <core> 
>>>> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed 
>>>> pointer (0x7feaca6c9738), called from <core>: mem/shm_mem.c: 
>>>> sh_realloc(88), first free <core>: mem/shm_mem.c: sh_realloc(88) - aborting
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: 
>>>> <core> [main.c:775]: handle_sigs(): child process 7589 exited by a signal 6
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: ALERT: 
>>>> <core> [main.c:778]: handle_sigs(): core was generated
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7553]: INFO: 
>>>> <core> [main.c:790]: handle_sigs(): INFO: terminating due to SIGCHLD
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7604]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7597]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7579]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7603]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 13:14:55 tel-vc-fs03 /usr/local/sbin/kamailio[7601]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>> 
>>>> 
>>>>  Logs from second crash
>>>> 
>>>> Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: <core> 
>>>> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f34ec4f6ed8,65569,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 14:38:42 tel-vc-fs03 /usr/local/sbin/kamailio[3979]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f35242e5d28,65535,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 14:38:44 tel-vc-fs03 /usr/local/sbin/kamailio[3965]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f3516fb6720,65578,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm 
>>>> [t_fwd.c:1609]: t_send_branch(): ERROR: t_send_branch: sending request on 
>>>> branch 0 failed
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: sl 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: sl 
>>>> [sl_funcs.c:387]: sl_reply_error(): ERROR: sl_reply_error used: 
>>>> Unfortunately error on sending to next hop occurred (477/SL)
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d8e91088,65522,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:57 tel-vc-fs03 /usr/local/sbin/kamailio[3968]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: ERROR: 
>>>> <core> [udp_server.c:591]: udp_send(): ERROR: udp_send: 
>>>> sendto(sock,0x7f37d81fa960,65522,0,205.251.172.14:5060,16): Message too 
>>>> long(90)
>>>>    Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3961]: ERROR: tm 
>>>> [../../forward.h:199]: msg_send(): msg_send: ERROR: udp_send failed
>>>>    Apr  3 14:38:58 tel-vc-fs03 /usr/local/sbin/kamailio[3974]: : <core> 
>>>> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed 
>>>> pointer (0x7f34d7d6b720), called from <core>: mem/shm_mem.c: 
>>>> sh_realloc(88), first free <core>: mem/shm_mem.c: sh_realloc(88) - aborting
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: ALERT: 
>>>> <core> [main.c:775]: handle_sigs(): child process 3974 exited by a signal 6
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: ALERT: 
>>>> <core> [main.c:778]: handle_sigs(): core was generated
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3935]: INFO: 
>>>> <core> [main.c:790]: handle_sigs(): INFO: terminating due to SIGCHLD
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3985]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3983]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3966]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3988]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3972]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>>    Apr  3 14:39:20 tel-vc-fs03 /usr/local/sbin/kamailio[3984]: INFO: 
>>>> <core> [main.c:841]: sig_usr(): INFO: signal 15 received
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>>>> sr-users@lists.sip-router.org
>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>> -- 
>>> Daniel-Constantin Mierla - http://www.asipto.com
>>> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>>> 
> 
> -- 
> Daniel-Constantin Mierla - http://www.asipto.com
> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
> 
> 
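
A note on the "Message too long(90)" lines in the quoted logs: on Linux, errno 90 is EMSGSIZE, and a single UDP datagram over IPv4 cannot carry more than 65507 bytes of payload (the 65535-byte limit of the IPv4 total-length field, minus a 20-byte IP header and an 8-byte UDP header). The sketch below checks the sizes passed to sendto() in those logs against that limit. The function name fits_in_udp and the constant names are made up for illustration, and the check assumes IPv4 without IP options.

```python
IP_MAX_PACKET = 65535   # largest value of the 16-bit IPv4 total-length field
IPV4_HEADER = 20        # minimum IPv4 header size, assuming no IP options
UDP_HEADER = 8          # fixed UDP header size

MAX_UDP_PAYLOAD_IPV4 = IP_MAX_PACKET - IPV4_HEADER - UDP_HEADER


def fits_in_udp(size: int) -> bool:
    """Return True if a payload of `size` bytes fits in one IPv4 UDP datagram."""
    return size <= MAX_UDP_PAYLOAD_IPV4


# Sizes taken from the sendto(...) lines in the quoted logs.
logged_sizes = [65572, 65533, 65569, 65535, 65578, 65522]

if __name__ == "__main__":
    print("max UDP payload over IPv4:", MAX_UDP_PAYLOAD_IPV4)
    for size in logged_sizes:
        if fits_in_udp(size):
            print(size, "fits")
        else:
            print(size, "too long by", size - MAX_UDP_PAYLOAD_IPV4, "bytes")
```

Every size in the logs is above the limit, so each of those sends would be rejected before anything reached the wire. Why the outgoing messages grew to roughly 64 KB is a separate question that the logs alone do not answer.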

