Hi Daniel, thank you for having a look. Regard your remarks: 1) when the cloned transaction structure is passed to ts_onreply, we take the pointer to its urecord and from this last one we take the transaction list. So, the transaction list should still be coherent. We could have a problem if the urecord does not exist anymore (even if I'm wondering how this could happen since a urecord is removed only when receiving a TMCB_DESTROY event for the last transaction of a urecord). I could change this to calculate the hash_id from the ruri in the cloned transaction, search for it in the hash slot and only if found walk through the transactions.
2) I checked and it looks to me that we always lock the entry (either with lock_entry_by_ruri or with lock_entry) before performing an operation on the table (records or transactions). Did I miss a place where this is not happening (can happen looking the same code over and over again :))? I'll keep looking at it. Regards, Federico On Fri, Sep 11, 2015 at 9:10 AM, Daniel-Constantin Mierla <mico...@gmail.com > wrote: > Hi Federico, > > didn't get the time to write a more detailed report, but my quick look at > the tsilo code revealed two potential issues: > > - the ts transaction structure is cloned in shared memory to be passed as > parameter to a tm callback. The clone is stil linked to the list with prev > and next. In the callback, if I got it properly, it starts walking through > the list, but the list chould have been updated and prev/next can point to > invalid data structure now. > > - there is a hash table that has locks for each slot, but those locks are > not use and parallel operations (add/remove) can be done in the hash table > by different kamailio processes > > When I get more time I will try to check again and see if those > suppositions are valid. Meanwhile, maybe you can check as well. > > Cheers, > Daniel > > > On 10/09/15 19:33, Federico Cabiddu wrote: > > Hi Thibault, > have you tried last tsilo code from 4.3.x branch? > Recently there has been a fix ( > https://github.com/kamailio/kamailio/commit/6ce6803d57dabe287d7d6fa859e93c1df402d821) > for an issue that may be related to yours. > I'll keep investigating to see if I can spot something else. In the > meanwhile could you describe your scenario? Are you storing multiple > transactions per ruri? Did any of them got a final reply before the crash? > > Regards, > > Federico > > On Thu, Sep 10, 2015 at 3:00 PM, Daniel-Constantin Mierla < > <mico...@gmail.com>mico...@gmail.com> wrote: > >> >> >> On 10/09/15 14:36, Thibault Gueslin wrote: >> >> >> >> 2015-09-10 14:25 GMT+02:00 Daniel-Constantin Mierla <mico...@gmail.com>: >> >>> Do you have msrp enabled in configuration file >>> >> >> I don't think so >> >> >> The last frames of backtrace indicates code related to msrp, but might be >> just some code lines mismatching. >> >> The issue seems to be in tsilo. I looke over the code and I spotted some >> "unclear" mechanisms that can lead to race conditions, which may result in >> invalid access to memory, as it happens in this case, ptr becomes >> 0x8b08578b49642454 -- from my short investigation, that is likely to be due >> to following a ->next field in a freed structure. >> >> Not being the author of tsilo module, I can't do much more right now. I >> will open an issue on bug tracker explaining what I found, assigning >> Federico (cc-ed, author of the module) to analyze and see if anything is >> wrong there. >> >> Cheers, >> Daniel >> >> >> >> >>> Send the output from gdb for next commands: >>> >>> frame 0 >>> >> >> #0 ts_append (msg=0x7f883bfd5490, ruri=0x7fff65fbfe30, >> table=0x7f883bf7e390 "location") at ts_append.c:60 >> 60 in ts_append.c >> >> >>> info locals >>> >> _r = 0x7f88339bcfd0 >> >> ptr = 0x8b08578b49642454 >> >> res = 2 >> __FUNCTION__ = "ts_append" >> >> >>> p *msg >>> >> >> >> >>> p *ruri >>> p *ptr >>> p *_r >>> >>> >> | p *msg >> >> $21 = {id = 2, pid = 31171, tval = {tv_sec = 1441885042, tv_usec = >> 162339}, fwd_send_flags = {f = 0 '\000', blst_imask = 0 '\000'}, >> rpl_send_flags = {f = 0 '\000', blst_imask = 0 '\000'}, first_line = {type >> = 1, >> flags = 1, len = 46, u = {request = {method = { >> s = 0x7f88339b8910 "REGISTER sip:sip-staging.serveur.com >> SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expi"..., >> len = 8}, uri = { >> s = 0x7f88339b8919 "sip:sip-staging.serveur.com >> SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expires=900\r\n"..., >> len = 27}, version = { >> s = 0x7f88339b8935 "SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expires=900\r\nMax-Forwards: >> 69\r\nTo: <sip:t"..., len = 7}, method_value = 32}, reply = {version = { >> s = 0x7f88339b8910 "REGISTER sip:sip-staging.serveur.com >> SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expi"..., >> len = 8}, status = { >> s = 0x7f88339b8919 "sip:sip-staging.serveur.com >> SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expires=900\r\n"..., >> len = 27}, reason = { >> s = 0x7f88339b8935 "SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expires=900\r\nMax-Forwards: >> 69\r\nTo: <sip:t"..., len = 7}, statuscode = 32}}}, via1 = >> 0x7f883bfa7020, via2 = 0x0, headers = 0x7f883bfa6f30, last_header = >> 0x7f883bfac220, parsed_flag = 18446744073709551615, h_via1 = >> 0x7f883bfa6f30, >> h_via2 = 0x0, callid = 0x7f883bfac400, to = 0x7f883bfac688, cseq = >> 0x7f883bfac388, from = 0x7f883bfa6fa8, contact = 0x7f883bfac778, >> maxforwards = 0x7f883bfac700, route = 0x0, record_route = 0x0, >> content_type = 0x0, content_length = 0x7f883bfac220, authorization = >> 0x0, expires = 0x0, proxy_auth = 0x0, supported = 0x7f883bfb6300, require = >> 0x0, proxy_require = 0x0, unsupported = 0x0, >> allow = 0x7f883bfac298, event = 0x0, accept = 0x0, accept_language = >> 0x0, organization = 0x0, priority = 0x0, subject = 0x0, user_agent = >> 0x7f883bfac310, server = 0x0, content_disposition = 0x0, diversion = 0x0, >> rpid = 0x0, refer_to = 0x0, session_expires = 0x0, min_se = 0x0, >> sipifmatch = 0x0, subscription_state = 0x0, date = 0x0, identity = 0x0, >> identity_info = 0x0, pai = 0x0, ppi = 0x0, path = 0x0, privacy = 0x0, >> body = 0x0, eoh = 0x7f88339b8b2b "\r\n", unparsed = 0x7f88339b8b2b >> "\r\n", rcv = {src_ip = {af = 2, len = 4, u = {addrl = {3334267998, 0}, >> addr32 = {3334267998, 0, 0, 0}, addr16 = {58462, 50876, 0, 0, 0, 0, 0, >> 0}, addr = "^\344\274\306", '\000' <repeats 11 times>}}, dst_ip >> = {af = 2, len = 4, u = {addrl = {2667915013, 0}, addr32 = {2667915013, 0, >> 0, 0}, addr16 = {9989, 40709, 0, 0, 0, 0, 0, 0}, >> addr = "\005'\005\237", '\000' <repeats 11 times>}}, src_port = >> 52243, dst_port = 5060, proto_reserved1 = 3, proto_reserved2 = 0, src_su = >> {s = {sa_family = 2, >> sa_data = "\314\023^\344\274\306\000\000\000\000\000\000\000"}, >> sin = {sin_family = 2, sin_port = 5068, sin_addr = {s_addr = 3334267998}, >> sin_zero = "\000\000\000\000\000\000\000"}, sin6 = { >> sin6_family = 2, sin6_port = 5068, sin6_flowinfo = 3334267998, >> sin6_addr = {__in6_u = {__u6_addr8 = "\000\000\000\000\000\000\000\000 >> \206\233\063\210\177\000", __u6_addr16 = {0, 0, 0, 0, 34336, 13211, >> 32648, 0}, __u6_addr32 = {0, 0, 865830432, 32648}}}, >> sin6_scope_id = 865796800}}, bind_address = 0x7f883bfb6bd0, proto = 2 >> '\002'}, >> buf = 0x7f88339b8910 "REGISTER sip:sip-staging.serveur.com >> SIP/2.0\r\nVia: SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expi"..., >> len = 541, new_uri = {s = 0x0, len = 0}, dst_uri = {s = 0x0, len = 0}, >> parsed_uri_ok = 1, parsed_uri = {user = {s = 0x0, len = 0}, passwd = {s = >> 0x0, len = 0}, host = { >> s = 0x7f88339b891d "sip-staging.serveur.com SIP/2.0\r\nVia: >> SIP/2.0/TCP >> 172.16.224.222:52242;branch=z9hG4bK20965be26f6a6324;rport\r\nContact: >> <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:52242;transport=tcp>;expires=900\r\nMax-"..., >> len = 23}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, >> sip_params = {s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, >> proto = 0, type = SIP_URI_T, >> flags = (unknown: 0), transport = {s = 0x0, len = 0}, ttl = {s = 0x0, >> len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, >> method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, >> len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 0x0, len = >> 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, >> maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, >> lr_val = {s = 0x0, len = 0}, r2_val = {s = 0x0, len = 0}, gr_val = {s >> = 0x0, len = 0}}, parsed_orig_ruri_ok = 0, parsed_orig_ruri = {user = {s = >> 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, >> len = 0}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, >> sip_params = {s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, >> proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = { >> s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, >> len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s >> = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, >> len = 0}, transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, >> len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = >> 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, >> r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, add_rm = >> 0x0, body_lumps = 0x0, reply_lump = 0x7f883bfa3878, add_to_branch_s = >> '\000' <repeats 57 times>, add_to_branch_len = 0, hash_index = 39024, >> msg_flags = 129, flags = 32, set_global_address = {s = 0x0, len = 0}, >> set_global_port = {s = 0x0, len = 0}, force_send_socket = 0x0, path_vec = >> {s = 0x0, len = 0}, instance = {s = 0x0, len = 0}, reg_id = 0, >> ruid = {s = 0x0, len = 0}, location_ua = {s = 0x0, len = 0}, ldv = >> {flow = {decoded = 0, rcv = {src_ip = {af = 0, len = 0, u = {addrl = {0, >> 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, >> addr = '\000' <repeats 15 times>}}, dst_ip = {af = 0, len = >> 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, >> 0, 0}, addr = '\000' <repeats 15 times>}}, src_port = 0, >> dst_port = 0, proto_reserved1 = 0, proto_reserved2 = 0, src_su = >> {s = {sa_family = 0, sa_data = '\000' <repeats 13 times>}, sin = >> {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, >> sin_zero = "\000\000\000\000\000\000\000"}, sin6 = >> {sin6_family = 0, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = >> {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, >> 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = >> 0}}, bind_address = 0x0, proto = 0 '\000'}}}} >> >> >> >> (gdb) p *ruri >> $22 = {s = 0x7f883bf317e8 "toto4.toto.com.Ipod_tgu", len = 23} >> >> (gdb) p *ptr >> Cannot access memory at address 0x8b08578b49642454 >> >> (gdb) p *_r >> $23 = {ruri = {s = 0x7f88339bd040 "toto4.toto.com.Ipod_tgu", len = 23}, >> rurihash = 164669906, entry = 0x7f883398af08, transactions = >> 0x7f88339bd090, next = 0x0, prev = 0x0} >> >> >>> >>> Cheers, >>> Daniel >>> >>> >>> On 10/09/15 14:07, Thibault Gueslin wrote: >>> >>> [Thread debugging using libthread_db enabled] >>> >>> Using host libthread_db library >>> "/lib/x86_64-linux-gnu/libthread_db.so.1". >>> >>> Core was generated by `/usr/sbin/kamailio -f /etc/kamailio/kamailio.cfg >>> -P /var/run/kamailio/kamailio.'. >>> >>> Program terminated with signal SIGSEGV, Segmentation fault. >>> >>> #0 ts_append (msg=0x7f883bfd5490, ruri=0x7fff65fbfe30, >>> table=0x7f883bf7e390 "location") at ts_append.c:60 >>> >>> 60 ts_append.c: No such file or directory. >>> >>> (gdb) bt >>> >>> #0 ts_append (msg=0x7f883bfd5490, ruri=0x7fff65fbfe30, >>> table=0x7f883bf7e390 "location") at ts_append.c:60 >>> >>> #1 0x00007f8837f638fc in w_ts_append (_msg=0x7f883bfd5490, >>> _table=0x7f883bf7e390 "location", _ruri=0x7f88339aa634 "\002") at >>> tsilo.c:225 >>> >>> #2 0x0000000000534db0 in do_action (h=0x7fff65fc1a40, a=0x7f883bf7d278, >>> msg=0x7f883bfd5490) at action.c:1059 >>> >>> #3 0x00000000005339e8 in run_actions (h=0x7f8837f65540, a=0x5, >>> msg=0x7f88339aa634) at action.c:1548 >>> >>> #4 0x00000000005352e6 in do_action (h=0x7fff65fc1a40, a=0x7f883bf7dc50, >>> msg=0x7f883bfd5490) at action.c:1048 >>> >>> #5 0x00000000005339e8 in run_actions (h=0x7f8837f65540, a=0x5, >>> msg=0x7f88339aa634) at action.c:1548 >>> >>> #6 0x00000000005352e6 in do_action (h=0x7fff65fc1a40, a=0x7f883bf7df48, >>> msg=0x7f883bfd5490) at action.c:1048 >>> >>> #7 0x00000000005339e8 in run_actions (h=0x7f8837f65540, a=0x5, >>> msg=0x7f88339aa634) at action.c:1548 >>> >>> #8 0x00000000005354a3 in do_action (h=0x7fff65fc1a40, a=0x7f883bfa8ab8, >>> msg=0x7f883bfd5490) at action.c:677 >>> >>> #9 0x00000000005339e8 in run_actions (h=0x7f8837f65540, a=0x5, >>> msg=0x7f88339aa634) at action.c:1548 >>> >>> #10 0x00000000005354a3 in do_action (h=0x7fff65fc1a40, a=0x7f883bfa8ab8, >>> msg=0x7f883bfd5490) at action.c:677 >>> >>> #11 0x00000000005339e8 in run_actions (h=0x7f8837f65540, a=0x5, >>> msg=0x7f88339aa634) at action.c:1548 >>> >>> #12 0x00000000005352e6 in do_action (h=0x7fff65fc1a40, a=0x7f883bf56068, >>> msg=0x7f883bfd5490) at action.c:1048 >>> >>> #13 0x00000000005339e8 in run_actions (h=0x7f8837f65540, >>> h@entry=0x7fff65fc1a40, >>> a=0x5, a@entry=0x7f883bf4f958, msg=0x7f88339aa634, msg@entry=0x7f883bfd5490) >>> at action.c:1548 >>> >>> #14 0x000000000053f885 in run_top_route (a=0x7f883bf4f958, >>> msg=0x7f883bfd5490, c=<optimized out>) at action.c:1634 >>> >>> #15 0x00000000005407e6 in receive_msg (buf=0x0, len=5, >>> rcv_info=0x7f88339b8638) at receive.c:196 >>> >>> #16 0x00000000005b806b in tcp_read_req (con=0x7f88339b8620, >>> bytes_read=0x7fff65fc1d50, read_flags=0x7fff65fc1d54) at tcp_read.c:1382 >>> >>> #17 0x00000000005bb4e1 in handle_io (fm=0xb, events=5, idx=865773108) at >>> tcp_read.c:1568 >>> >>> #18 0x00000000005c191a in io_wait_loop_epoll (h=<optimized out>, >>> t=<optimized out>, repeat=<optimized out>) at io_wait.h:1061 >>> >>> #19 tcp_receive_loop (unix_sock=938890560) at tcp_read.c:1733 >>> >>> #20 0x00000000004d3447 in tcp_init_children () at tcp_main.c:4787 >>> >>> #21 0x0000000000506863 in main_loop () at main.c:1658 >>> >>> #22 0x000000000041b944 in main (argc=0, argv=0x0) at main.c:2533 >>> >>> >>> It seems to be on reception of REGISTER from client. >>> >>> >>> >>> >>> -- >>> Daniel-Constantin Mierlahttp://twitter.com/#!/miconda - >>> http://www.linkedin.com/in/miconda >>> Book: SIP Routing With Kamailio - http://www.asipto.com >>> Kamailio Advanced Training, Sep 28-30, 2015, in Berlin - >>> http://asipto.com/u/kat >>> >>> >> >> -- >> Daniel-Constantin Mierlahttp://twitter.com/#!/miconda - >> http://www.linkedin.com/in/miconda >> Book: SIP Routing With Kamailio - http://www.asipto.com >> Kamailio Advanced Training, Sep 28-30, 2015, in Berlin - >> http://asipto.com/u/kat >> >> > > -- > Daniel-Constantin Mierlahttp://twitter.com/#!/miconda - > http://www.linkedin.com/in/miconda > Book: SIP Routing With Kamailio - http://www.asipto.com > Kamailio Advanced Training, Sep 28-30, 2015, in Berlin - > http://asipto.com/u/kat > >
_______________________________________________ SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list sr-users@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users