sysctl_tcp_congestion_control seems to have a bug that prevents it from actually calling the tcp_set_default_congestion_control function. This is not so apparent because it does not return an error and generally the /proc interface is used to configure the default TCP congestion control algorithm. This is present in 2.6.18 onwards and probably earlier, though I have not inspected 2.6.15--2.6.17.
sysctl_tcp_congestion_control calls sysctl_string and expects a successful return code of 0. In such a case it actually sets the congestion control algorithm with tcp_set_default_congestion_control. Otherwise, it returns the value returned by sysctl_string. This was correct in 2.6.14, as sysctl_string returned 0 on success. However, sysctl_string was updated to return 1 on success around about 2.6.15 and sysctl_tcp_congestion_control was not updated. Even though sysctl_tcp_congestion_control returns 1, do_sysctl_strategy converts this return code to '0', so the caller never notices the error. A sample program that shows this behaviour is attached. It is hardcoded to set the congestion control algorithm to reno. Running it multiple times shows that it succeeds, yet does not modify the default TCP congestion control algorithm. e.g., $ make sysctl $ sudo /tmp/sysctl; sudo /tmp/sysctl old value='bic', new value should be='reno' old value='bic', new value should be='reno' A simple patch is also attached that modifies the if statement in sysctl_tcp_congestion_control.
#include <linux/unistd.h> #include <linux/types.h> #include <linux/sysctl.h> #include <stdio.h> _syscall1(int, _sysctl, struct __sysctl_args *, args); int sysctl(int *name, int nlen, void *oldval, size_t *oldlenp, void *newval, size_t newlen) { struct __sysctl_args args={name,nlen,oldval,oldlenp,newval,newlen}; return _sysctl(&args); } #define SIZE(x) sizeof(x)/sizeof(x[0]) #ifndef NET_TCO_CONG_CONTROL #define NET_TCP_CONG_CONTROL 110 #endif int main() { char oldname[100]; size_t oldnamelen = SIZE(oldname); char newval[] = "reno"; size_t newvallen = SIZE(newval); int name[] = { CTL_NET, NET_IPV4, NET_TCP_CONG_CONTROL }; if (sysctl(name, SIZE(name), oldname, &oldnamelen, newval, newvallen)) perror("sysctl"); else printf("old value='%s', new value should be='%s'\n", oldname, newval); return 0; }
--- /tmp/net/ipv4/sysctl_net_ipv4.c 2007-10-05 11:07:25.000000000 -0700 +++ net/ipv4/sysctl_net_ipv4.c 2007-11-19 16:05:16.000000000 -0800 @@ -129,7 +129,7 @@ tcp_get_default_congestion_control(val); ret = sysctl_string(&tbl, name, nlen, oldval, oldlenp, newval, newlen, context); - if (ret == 0 && newval && newlen) + if (ret == 1 && newval && newlen) ret = tcp_set_default_congestion_control(val); return ret; }