schneuwlym created an issue (kamailio/kamailio#4185)
### Description
This issue is a follow up of the discussion in the [mailing
list](https://lists.kamailio.org/mailman3/hyperkitty/list/sr-us...@lists.kamailio.org/thread/IO63U64NPN2NB3YGJBCOSZTIERTB54CD/)
We are using Kamailio 5.8.4 in a yocto scarthgap based image. We noticed on
multiple nodes, that suddenly phones lost their registrations and the log
showed messages like `send_fd(): sendmsg failed sending 55 on 52: Too many
references: cannot splice (109)`.
After debugging we guess that the `send_fd` messages are just a symptom, not
the cause. Generating a backtrace using `kamctl trap` shows that all TCP
receiver are waiting for a `pthread_rwlock_...` lock.
Our deployments, where we randomly see the issue, are very small; 10-30
registrations.
I've tested kamailio version 5.8.4, 5.8.5, 5.8.6 and 6.0.1 and I was able to
reproduce the issue with all versions.
I've also tested version 5.5.4, but with an older yocto (dunfell) which used
openssl 1.1.1w, **which did not block!**
Yocto Scarthgap uses OpenSSL 3.2.3 and we also updated to version 3.4.1, which
didn't show any difference.
### Troubleshooting
#### Reproduction
Meanwhile we found a way to reproduce the issue within seconds. It requires a
bit of brute force, but it locks kamailio reliable.
We are generating 3 parallel SIP REGISTER streams with sipexer to 3 different
accounts. Following an example of one stream:
```
watch -n 0.1 "./sipexer -register -vl 3 -co -com -ex 3600 -fu "2" -au "2" -ap
"test" -tk <path to key>.key -tc <path to certificate>.cer -ti -sd -ru
sip:<domain> -cu sip:2@<domain> tls:<fqdn>:5061;sleep 0.1;./sipexer -register
-vl 3 -co -com -ex 0 -fu "2" -au "2" -ap "test" -tk <path to key>.key -tc
<path to cert>.cer -ti -sd -ru sip:<domain> -cu sip:2@<domain> tls:<fqdn>:5061"
```
Our dialplan is quite complex, but I was also able to reproduce the issue with
the default
[dialplan](https://raw.githubusercontent.com/kamailio/kamailio/refs/heads/5.8/etc/kamailio.cfg)
with the following changes:
```
169a170,171
> #!define WITH_TLS
>
205a208,219
> alias=udp:<domain>:5060
> alias=udp:127.0.0.1:5060
> alias=udp:10.0.6.1:5060
> alias=tls:<domain>
> alias=tls:10.0.6.1:5061
> alias=tls:10.165.42.76:5061
> alias=tls:<domain>:5061
> alias=udp:<domain>
> listen=udp:127.0.0.1:5060
> listen=tls:10.0.6.1:5061
> listen=udp:10.0.6.1:5060
> listen=tls:10.165.42.76:5061
498c512
< modparam("tls", "config", "/usr/local/etc/kamailio/tls.cfg")
---
> modparam("tls", "config", "/etc/kamailio/kamailio-tls.cfg")
```
```
# cat /etc/kamailio/kamailio-tls.cfg
[server:default]
method = TLSv1+
verify_certificate = yes
require_certificate = yes
private_key = "<path to key>.key"
certificate = "<path to cert>.crt"
ca_list = "<path to ca>.pem"
[client:default]
method = TLSv1+
verify_certificate = yes
require_certificate = yes
private_key = "<path to key>.key"
certificate = "<path to cert>.crt"
ca_list = "<path to ca>.pem"
```
#### Debugging Data
Following the backtrace of one tcp listener process. The full backtrace can be
found
[here](https://github.com/user-attachments/files/19430313/gdb_kamailio_20250321_154253.txt)
```
---start 4608 -----------------------------------------------------
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
__futex_abstimed_wait_common (futex_word=futex_word@entry=0x7fc855758a0c,
expected=expected@entry=3, clockid=clockid@entry=0, abstime=abstime@entry=0x0,
private=<optimized out>, cancel=cancel@entry=false) at futex-internal.c:103
warning: 103 futex-internal.c: No such file or directory
#0 __futex_abstimed_wait_common (futex_word=futex_word@entry=0x7fc855758a0c,
expected=expected@entry=3, clockid=clockid@entry=0, abstime=abstime@entry=0x0,
private=<optimized out>, cancel=cancel@entry=false) at futex-internal.c:103
err = -512
clockbit = 256
op = <optimized out>
#1 0x00007fc85a59429c in __GI___futex_abstimed_wait64
(futex_word=futex_word@entry=0x7fc855758a0c, expected=expected@entry=3,
clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=<optimized out>) at
futex-internal.c:128
No locals.
#2 0x00007fc85a59d830 in __pthread_rwlock_wrlock_full64 (abstime=0x0,
clockid=0, rwlock=0x7fc855758a00) at
/usr/src/debug/glibc/2.39+git/nptl/pthread_rwlock_common.c:730
private = <optimized out>
wf = <optimized out>
err = <optimized out>
prefer_writer = false
may_share_futex_used_flag = true
wpf = <optimized out>
ready = <optimized out>
r = <optimized out>
may_share_futex_used_flag = <optimized out>
r = <optimized out>
done = <optimized out>
wpf = <optimized out>
ready = <optimized out>
__value = <optimized out>
prefer_writer = <optimized out>
private = <optimized out>
wf = <optimized out>
err = <optimized out>
w = <optimized out>
w = <optimized out>
private = <optimized out>
err = <optimized out>
w = <optimized out>
wf = <optimized out>
wf = <optimized out>
__value = <optimized out>
#3 ___pthread_rwlock_wrlock (rwlock=0x7fc855758a00) at
pthread_rwlock_wrlock.c:26
result = <optimized out>
#4 0x00007fc859820b49 in CRYPTO_THREAD_write_lock (lock=<optimized out>) at
/usr/src/debug/openssl/3.2.3/crypto/threads_pthread.c:108
No locals.
#5 0x00007fc8599bd5da in ossl_prov_drbg_generate (drbg=0x7fc855710040,
out=0x7fc856471980 "", outlen=48, strength=256, prediction_resistance=0,
adin=0x7ffeec8b67e0 "@", adinlen=8) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:645
fork_id = <optimized out>
reseed_required = 0
ret = 0
__func__ = "ossl_prov_drbg_generate"
err = <optimized out>
now = <optimized out>
#6 0x00007fc8599bd9ca in ossl_drbg_get_seed (vdrbg=<optimized out>,
pout=0x7ffeec8b6880, entropy=<optimized out>, min_len=<optimized out>,
max_len=<optimized out>, prediction_resistance=0, adin=0x7ffeec8b6838
"\220'\037V\310\177", adin_len=8) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:175
drbg = 0x7fc855710040
bytes_needed = 48
buffer = 0x7fc856471980 ""
__func__ = "ossl_drbg_get_seed"
#7 0x00007fc8599bc907 in get_entropy (drbg=<optimized out>,
drbg@entry=0x7fc8561f2790, pout=pout@entry=0x7ffeec8b6880, entropy=<optimized
out>, min_len=48, max_len=48,
prediction_resistance=prediction_resistance@entry=0) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:245
bytes = <optimized out>
p_str = 256
__func__ = "get_entropy"
#8 0x00007fc8599bd26c in ossl_prov_drbg_reseed_unlocked
(drbg=drbg@entry=0x7fc8561f2790,
prediction_resistance=prediction_resistance@entry=0, ent=ent@entry=0x0,
ent_len=ent_len@entry=0, adin=adin@entry=0x0, adinlen=adinlen@entry=0) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:573
entropy = 0x0
entropylen = 0
__func__ = "ossl_prov_drbg_reseed_unlocked"
#9 0x00007fc8599bd68e in ossl_prov_drbg_generate (adinlen=0, adin=0x0,
prediction_resistance=0, strength=<optimized out>, outlen=32,
out=0x7fc8564717c8 "", drbg=0x7fc8561f2790) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:697
fork_id = <optimized out>
reseed_required = <optimized out>
ret = <optimized out>
fork_id = <optimized out>
reseed_required = <optimized out>
ret = <optimized out>
__func__ = "ossl_prov_drbg_generate"
err = <optimized out>
now = <optimized out>
#10 ossl_prov_drbg_generate (drbg=0x7fc8561f2790, out=0x7fc8564717c8 "",
outlen=32, strength=<optimized out>, prediction_resistance=0, adin=0x0,
adinlen=0) at
/usr/src/debug/openssl/3.2.3/providers/implementations/rands/drbg.c:634
fork_id = <optimized out>
reseed_required = 0
ret = 0
__func__ = "ossl_prov_drbg_generate"
err = <optimized out>
now = <optimized out>
#11 0x00007fc8597e75c6 in evp_rand_generate_locked
(ctx=ctx@entry=0x7fc8561f26f0, out=out@entry=0x7fc8564717c8 "",
outlen=outlen@entry=32, strength=strength@entry=0,
prediction_resistance=<optimized out>, prediction_resistance@entry=0,
addin=0x0, addin_len=0) at
/usr/src/debug/openssl/3.2.3/crypto/evp/evp_rand.c:570
chunk = 32
max_request = 65536
params = {{key = 0x7fc8599e9521 "max_request", data_type = 2, data =
0x7ffeec8b6968, data_size = 8, return_size = 8}, {key = 0x0, data_type = 0,
data = 0x0, data_size = 0, return_size = 0}}
__func__ = "evp_rand_generate_locked"
#12 0x00007fc8597e840b in EVP_RAND_generate (ctx=0x7fc8561f26f0,
out=out@entry=0x7fc8564717c8 "", outlen=outlen@entry=32,
strength=strength@entry=0, prediction_resistance=prediction_resistance@entry=0,
addin=addin@entry=0x0, addin_len=0) at
/usr/src/debug/openssl/3.2.3/crypto/evp/evp_rand.c:592
res = <optimized out>
#13 0x00007fc8598f9282 in RAND_bytes_ex (ctx=warning: could not convert
'ossl_lib_ctx_st' from the host encoding (ANSI_X3.4-1968) to UTF-32.
This normally should not happen, please file a bug report.
0x0, buf=0x7fc8564717c8 "", num=32, strength=0) at
/usr/src/debug/openssl/3.2.3/crypto/rand/rand_lib.c:380
rand = <optimized out>
meth = <optimized out>
__func__ = "RAND_bytes_ex"
#14 0x00007fc859bdb62c in def_generate_session_id
(ssl=ssl@entry=0x7fc856436e50, id=id@entry=0x7fc8564717c8 "",
id_len=id_len@entry=0x7ffeec8b6ae4) at
/usr/src/debug/openssl/3.2.3/ssl/ssl_sess.c:326
retry = 0
#15 0x00007fc859bdc132 in ssl_generate_session_id (s=s@entry=0x7fc856436e50,
ss=ss@entry=0x7fc856471570) at /usr/src/debug/openssl/3.2.3/ssl/ssl_sess.c:409
tmp = 32
cb = 0x7fc859bdb600 <def_generate_session_id>
ssl = <optimized out>
__func__ = "ssl_generate_session_id"
#16 0x00007fc859bdc31e in ssl_get_new_session (s=s@entry=0x7fc856436e50,
session=session@entry=1) at /usr/src/debug/openssl/3.2.3/ssl/ssl_sess.c:464
ss = 0x7fc856471570
__func__ = "ssl_get_new_session"
#17 0x00007fc859c3f169 in tls_early_post_process_client_hello
(s=0x7fc856436e50) at /usr/src/debug/openssl/3.2.3/ssl/statem/statem_srvr.c:1898
i = <optimized out>
loop = <optimized out>
sctx = 0x7fc855d24280
j = <optimized out>
dgrd = DOWNGRADE_NONE
protverr = <optimized out>
c = <optimized out>
clienthello = 0x7fc856470a90
al = 80
id = <optimized out>
comp = 0x0
ciphers = 0x7fc856471330
scsvs = 0x7fc8564713c0
ssl = 0x7fc856436e50
j = <optimized out>
i = <optimized out>
al = <optimized out>
protverr = <optimized out>
loop = <optimized out>
id = <optimized out>
comp = <optimized out>
c = <optimized out>
ciphers = <optimized out>
scsvs = <optimized out>
clienthello = <optimized out>
dgrd = <optimized out>
sctx = <optimized out>
ssl = <optimized out>
__func__ = "tls_early_post_process_client_hello"
err = <optimized out>
cipher = <optimized out>
trc_out = <optimized out>
pos = <optimized out>
pref_cipher = <optimized out>
master_key_length = <optimized out>
m = <optimized out>
comp_id = <optimized out>
k = <optimized out>
m = <optimized out>
nn = <optimized out>
v = <optimized out>
done = <optimized out>
o = <optimized out>
#18 tls_post_process_client_hello (s=0x7fc856436e50, wst=<optimized out>) at
/usr/src/debug/openssl/3.2.3/ssl/statem/statem_srvr.c:2295
rv = <optimized out>
cipher = <optimized out>
ssl = 0x7fc856436e50
err = <optimized out>
__func__ = "tls_post_process_client_hello"
#19 0x00007fc859c2bc1b in read_state_machine (s=0x7fc856436e50) at
/usr/src/debug/openssl/3.2.3/ssl/statem/statem.c:712
mt = 1
pkt = {curr = 0x7fc856466125 "", remaining = 0}
max_message_size = 0x7fc859c3e3e0 <ossl_statem_server_max_message_size>
ret = <optimized out>
process_message = 0x7fc859c42400 <ossl_statem_server_process_message>
post_process_message = 0x7fc859c414f0
<ossl_statem_server_post_process_message>
st = 0x7fc856436ee0
len = 273
transition = 0x7fc859c3d330 <ossl_statem_server_read_transition>
cb = 0x7fc859c91d90 <sr_ssl_ctx_info_callback>
ssl = 0x7fc856436e50
st = <optimized out>
ret = <optimized out>
mt = <optimized out>
len = <optimized out>
transition = <optimized out>
pkt = <optimized out>
process_message = <optimized out>
post_process_message = <optimized out>
max_message_size = <optimized out>
cb = <optimized out>
ssl = <optimized out>
__func__ = "read_state_machine"
#20 state_machine (s=0x7fc856436e50, server=1) at
/usr/src/debug/openssl/3.2.3/ssl/statem/statem.c:478
buf = 0x0
cb = 0x7fc859c91d90 <sr_ssl_ctx_info_callback>
st = 0x7fc856436ee0
ret = -1
ssret = <optimized out>
ssl = 0x7fc856436e50
__func__ = "state_machine"
#21 0x00007fc859cbf036 in tls_accept (c=c@entry=0x7fc856461be0,
error=error@entry=0x7ffeec8b6dd8) at tls_server.c:457
ret = <optimized out>
ssl = 0x7fc856436e50
cert = <optimized out>
tls_c = 0x7fc8564171b0
tls_log = <optimized out>
__func__ = "tls_accept"
```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.8.4 (x86_64/linux) f1d83d
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE,
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC,
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_SEND_BUFFER_SIZE
262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: f1d83d
compiled on 10:25:58 Nov 12 2024 with x86_64-ruag-linux-gcc 13.3.0
```
* **Operating System**:
It is a yocto scarthgap based image (Version 5.0.6 LTS)
```
Linux ttel 6.6.54-yocto-standard #1 SMP PREEMPT_DYNAMIC Tue Oct 8 03:03:40 UTC
2024 x86_64 GNU/Linux
```
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/4185
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/4...@github.com>
_______________________________________________
Kamailio - Development Mailing List -- sr-dev@lists.kamailio.org
To unsubscribe send an email to sr-dev-le...@lists.kamailio.org
Important: keep the mailing list in the recipients, do not reply only to the
sender!