mtryfoss left a comment (kamailio/kamailio#4191)

First of all, thanks a lot for the comprehensive feedback!

> Seems like there is a dead-lock between the lock on a peer and the one on 
> sessions. There is maybe also a 3rd lock involved, but I don't see that in 
> your changes. And it might be just a side-effect, which would clear if the 
> root cause is fixed.

Yes, there's a lock on the peer list as well. So 3 processes involved, but 
according to my analysis having this order changed should avoid the deadlock - 
but as you say, it might introduce a segfault instead.

> I'm actually surprised that it doesn't crash&burn much more often

Indeed, but this seems to be very load and timing sensitive - combined with 
some specific behaviour of the ims_charging module. For example, the interim 
timer needs to be active. All in all, a nightmare to debug and/or confirm that 
the fix actually worked 😅

> Anyway, long-story short, this seems like a deeper change might be required. 
> If you wish, I'd offer a call (hopefully time-boxed) to help you navigate cdp 
> or maybe to discuss a deeper change for peers and sessions.

I really appreciate this, but I'm probably not the correct person for a 
bigger/deeper change in this and other related modules. 

I'll take look again on the code based on your feedback, and see if it could be 
improved a bit for a short term fix.

--

We're currently running a custom build with this change below on one of the 
clusters, as an immediate work around. In my case, I'd take a segfault over a 
deadlock any day, since the service would just be restarted and continue 
operation.

https://github.com/mtryfoss/kamailio/commit/a573e107b81b3975daf78103b3dc2a58d615c53a

That's just a variant where we copy the hack used for sessions with a sticky 
peer, for all requests. It's been doing fine with a fair amount of traffic for 
several days, for our specific use case.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/pull/4191#issuecomment-2765538934
You are receiving this because you are subscribed to this thread.

Message ID: <kamailio/kamailio/pull/4191/c2765538...@github.com>
_______________________________________________
Kamailio - Development Mailing List -- sr-dev@lists.kamailio.org
To unsubscribe send an email to sr-dev-le...@lists.kamailio.org
Important: keep the mailing list in the recipients, do not reply only to the 
sender!

Reply via email to