zenichev left a comment (kamailio/kamailio#4539)
@miconda
> Is the ability to do reloads without time interval limiting the main benefit?
**The main problem out there is that:**
the bigger is the trusted table, the bigger must be the reload delta (modparam)
that controls the time between two rpc triggered reloads. As it takes more time
to reload the table when each new chunk of new sources is provisioned.
As our experience showed, on the heavily loaded platforms the reload delta
should regularly be increased to not let two or even more competing RPC
processes suddenly manipulate the same linked list. Moreover the
0039c50e83916c6 commit doesn't really solve the problem, probably it slightly
improved the situation, but still somehow we still keep on crashing while doing
the insert, an example (part of the BT trace, kamailio 5.8.6):
```
#0 hash_table_insert (table=table@entry=0x7f3a655f3cb0, src_ip=<optimized
out>, proto=<optimized out>, pattern=pattern@entry=0x561a5e01b569
".*username111.*", ruri_pattern=ruri_pattern@entry=0x0,
tag=tag@entry=0x561a5e01b583 "uuid-was-here-but-is-substituted",
priority=0) at ./src/modules/permissions/hash.c:231
231 ./src/modules/permissions/hash.c: No such file or directory.
(gdb) bt full
#0 hash_table_insert (table=table@entry=0x7f3a655f3cb0, src_ip=<optimized
out>, proto=<optimized out>, pattern=pattern@entry=0x561a5e01b569
".*username111.*", ruri_pattern=ruri_pattern@entry=0x0,
tag=tag@entry=0x561a5e01b583 "uuid-was-here-but-is-substituted",
priority=0) at ./src/modules/permissions/hash.c:231
np = 0x7f3a66a09490
np0 = 0x6f5f65656c6c6163
np1 = <optimized out>
hash_val = <optimized out>
__func__ = "hash_table_insert"
#1 0x00007f3bbccfc269 in reload_trusted_table () at
./src/modules/permissions/trusted.c:147
cols = {0x7f3bbcd0fa50 <perm_source_col>, 0x7f3bbcd0fa40
<perm_proto_col>, 0x7f3bbcd0fa30 <perm_from_col>, 0x7f3bbcd0fa20
<perm_ruri_col>, 0x7f3bbcd0fa10 <perm_tag_col>, 0x7f3bbcd0fa00
<perm_priority_col>}
res = 0x7f3bc3ff4980
row = 0x7f3bc60f60c0
val = 0x7f3bc2106820
new_hash_table = 0x7f3a655f3cb0
i = 10
priority = <optimized out>
pattern = <optimized out>
ruri_pattern = <optimized out>
tag = 0x561a5e01b583 "uuid-was-here-but-is-substituted"
__func__ = "reload_trusted_table"
#2 0x00007f3bbcd018be in reload_trusted_table_cmd () at
./src/modules/permissions/trusted.c:712
__func__ = "reload_trusted_table_cmd"
#3 0x00007f3bbccf9cea in rpc_trusted_reload (rpc=0x7f3bbf9f1220 <func_param>,
c=0x7f3bbf9f1280 <ctx>) at ./src/modules/permissions/rpc.c:61
No locals.
#4 0x00007f3bbf9e7c64 in ki_dispatch_rpc (msg=<optimized out>) at
./src/modules/xmlrpc/xmlrpc.c:2693
exp = 0x7f3bc60bbe80
ret = 1
rdata = 0
__func__ = "ki_dispatch_rpc"
```
So from my PoV, the module itself has an architecture, so the code design, that
allows such races to happen.
And therefore it should be reworked, whereas the quickest and the easiest way
is to introduce locks and wrap lists into structures that really introduce real
hash table concept. Please see the way how it's designed in the rtpengine
module, where we never experienced issues when manipulating entries in hash
buckets.
**So answering your question**: shortly said, yes, the main benefit in the
short perspective is to solve the race issue of RPC reloads. But in the longer
perspective actually to limit the other issues that can appear because of the
unsecured buckets list.
> How is the reload performed, changing the buckets of the hash table?
I'm not sure I properly understood your question, but the reload is done on the
same way, with an exception that now the table gets locked and one cannot
introduce two or more processes changing the data inside of the table, while
the first process is still not done.
The algo remains the same:
- select a table
- re-init the buckets inside (clean old content)
- re-insert the data adding the newly content plus older (so updated version)
- make the current table (cursor) pointing to the freshest table
The only difference is that now it manipulates the lists via the table
structures having locks.
And also uses the updated hash API with newer safe methods (covered with locks).
> Are the changes only for the trusted table, or also for the address table?
For now only the trusted table gets the changes. We never run into the address
table issues, but probably it's just a matter of time and luck. It has the same
issues as the trusted table, totally lock-free and gives RPC processes a chance
to introduce race issues like iterator invalidation.
---
One thing that I should mention apart of everything, is that the
`hash_table_insert()` got a duplicates prevention mechanism. That wasn't in
place before, so this must be taken into account (no duplicates in the trusted
table will be allowed). I've mentioned this chance in the commit message, but
probably should be added into the doc as well.
I'm not sure if this behavior change can be an obstacle for the community. I
can make it optional via the modparam though. You can see it there:
https://github.com/kamailio/kamailio/pull/4539/commits/5c052718482ed1764e1a89ed7f2f37a20324387b
---
Also I totally agree with you, we need a good expertise here before to actually
accept such changes.
For now the only thing we can do is run tests on our side, what we've done
already.
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/pull/4539#issuecomment-3715064114
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/pull/4539/[email protected]>_______________________________________________
Kamailio - Development Mailing List -- [email protected]
To unsubscribe send an email to [email protected]
Important: keep the mailing list in the recipients, do not reply only to the
sender!