OK, I did a few more tests but have come across something that I am
not sure is intended behaviour.
When a destination is set to probing in the failure route (due to a
timeout), it still gets used in destination selection.
# ./kamailio -V
version: kamailio 3.3.0-dev0 (i386/linux) 25bedc
flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS,
USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM,
SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX,
FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR,
USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 25bedc
compiled on 09:18:41 Oct 21 2011 with gcc 4.1.2
Dispatcher module parameters (in testing) are as follows
(SBC_PING_FROM is #defined previously):
modparam("dispatcher", "flags", 2)
modparam("dispatcher", "dst_avp", "$avp(AVP_DST)")
modparam("dispatcher", "grp_avp", "$avp(AVP_GRP)")
modparam("dispatcher", "cnt_avp", "$avp(AVP_CNT)")
modparam("dispatcher", "ds_ping_method", "OPTIONS")
modparam("dispatcher", "ds_ping_from", SBC_PING_FROM)
modparam("dispatcher", "ds_ping_interval", 10)
modparam("dispatcher", "ds_probing_threshhold", 1)
modparam("dispatcher", "ds_probing_mode", 0)
The main routing logic has the following snippet to select the
destination (a hash table selects the dispatcher setid based on the
request domain):
if(!ds_select_dst("$sht(which_sbc=>$rd)", "0")) {
    sl_send_reply("500", "No destination available");
    xlog("route[MAIN] : $rm : No destinations available for $rd");
    exit;
}
The failure route has the following logic to select the next
destination on timeout/failure of a destination:
if (t_branch_timeout() && !t_branch_replied())
{
    xlog("route[TO_SBC] : $rm : timeout and no reply ($si:$sp->$Ri:$Rp->$du)\n");
    xlog("route[TO_SBC] : $rm : setting $du to probing state");
    ds_mark_dst("p");
    if(ds_next_dst())
    {
        xlog("route[TO_SBC] : $rm : next destination select ($du)\n");
        t_on_failure("TO_SBC");
        t_relay();
        exit;
    } else {
        send_reply("500", "No destination available");
        xlog("route[TO_SBC] : $rm : No destinations available for $rd");
        exit;
    }
}
According to 3.2 module docs for dispatcher, when a destination is
set into probing state, it will not be used by ds_select_dst:
----
4.6. ds_mark_dst("s")
Mark the last used address from destination set as inactive
("i"/"I"/"0"), active ("a"/"A"/"1") or probing ("p"/"P"/"2"). With
this function, an automatic detection of failed gateways can be
implemented. When an address is marked as inactive or probing, it
will be ignored by 'ds_select_dst' and 'ds_select_domain'.
possible parameters:
* "i", "I" or "0" - the last destination should be set to
  inactive and will be ignored in future requests.
* "a", "A" or "1" - the last destination should be set to active
  and the error-counter should set to "0".
* "p", "P" or "2" - the last destination will be set to probing.
  Note: You will need to call this function "threshhold"-times,
  before it will be actually set to probing.
This function can be used from REQUEST_ROUTE, FAILURE_ROUTE.
---
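To spell out how I read the note about the threshold, here is a toy
Python model. It is only my interpretation of the quoted text, not the
dispatcher module's code; the Destination class and the mark_dst helper
are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Destination:
    """Hypothetical stand-in for one dispatcher destination."""
    uri: str
    state: str = "active"   # "active", "inactive" or "probing"
    failures: int = 0       # the "error-counter" mentioned in the docs


def mark_dst(dst, mode, threshold=1):
    """Model of ds_mark_dst() as described in the quoted docs (assumed)."""
    if mode in ("a", "A", "1"):
        # Back to active, error counter reset to 0.
        dst.state = "active"
        dst.failures = 0
    elif mode in ("i", "I", "0"):
        dst.state = "inactive"
    elif mode in ("p", "P", "2"):
        # Only switch to probing after "threshold" calls.
        dst.failures += 1
        if dst.failures >= threshold:
            dst.state = "probing"
    else:
        raise ValueError("unknown mode: %r" % (mode,))
    return dst.state


gw = Destination("sip:gw1.example.com")
print(mark_dst(gw, "p", threshold=1))  # prints "probing"
```

With ds_probing_threshhold set to 1 as in my config, I would therefore
expect a single ds_mark_dst("p") call to be enough.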
What happens here, for me, is:
[1] Gateway is in Active mode (state: AX).
[2] Request comes in and times out, destination is set to
Active/Probing (state: AP)
[3] Another request comes in and it selects gateway that is in AP
mode, times out, and then selects next dst in list.
[4] Another request comes in and it selects gateway that is in AP
mode, times out, and then selects next dst in list.
.
.
NOTE: The requests selecting the AP mode gateway may not come right
after each other (the algorithm used is hash over callid), but I have
stripped the others out of the steps above. If I'm not mistaken, with 2
destinations in a set and 1 destination marked as AP, the
remaining destination should always be selected as the destination to
send to. The destination marked AP (active-probing) should not be
selected while in probing state.
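To make that expectation concrete, here is a small Python sketch of the
selection behaviour I would expect from the docs. It is a toy model
under my own assumptions (non-active destinations are filtered out
before hashing, CRC32 stands in for the real hash), not the module's
actual algorithm; select_dst and the gateway URIs are made up.

```python
import zlib


def select_dst(destinations, call_id):
    """Pick a destination by hashing the Call-ID over usable entries.

    destinations: list of (uri, state) tuples, where state is
    "active", "inactive" or "probing".
    Returns the chosen URI, or None if nothing is usable.
    """
    # Assumption: probing and inactive destinations are ignored.
    usable = [uri for uri, state in destinations if state == "active"]
    if not usable:
        return None
    index = zlib.crc32(call_id.encode("utf-8")) % len(usable)
    return usable[index]


dests = [
    ("sip:gw1.example.com", "probing"),  # marked after the timeout
    ("sip:gw2.example.com", "active"),
]
chosen = {select_dst(dests, "call-%d" % i) for i in range(100)}
print(chosen)  # prints {'sip:gw2.example.com'}
```

In this model the probing gateway is never picked, whatever the Call-ID
is, which is the opposite of what I see in steps [3] and [4].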
When the gateway is set into AP mode at step [2], then, according to
docs, any new request coming in should not have the gateway selected
as it is marked as being in probing state.
Is this the intended behaviour or am I missing something in the
documentation, or is it a bug?
Thanks
[...]