Hello Yvan,
I have tested the suggested workaround in pfkey.c by raising the ceiling
from 128kbytes to 768kbytes.
I also raised the limit in the socketvar.h in FreeBSD 6 Stable from the
default 128kbytes to 768kbytes.
Any value higher than 768kbytes results in an integer overflow and
prevents a build world of FreeBSD 6.
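For reference, a minimal sketch of how a larger receive buffer can be
requested on a socket and the granted size read back. It assumes the
ceiling in pfkey.c ends up as the value passed to setsockopt(SO_RCVBUF),
which is my assumption and not something confirmed here; set_rcvbuf is a
hypothetical helper, not a racoon or libipsec function.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask the kernel for a receive buffer of `want` bytes on `fd` and report
 * the size it actually granted. Returns 0 on success, -1 on error with
 * errno set. Hypothetical helper for illustration only.
 */
int
set_rcvbuf(int fd, int want, int *granted)
{
	socklen_t len = sizeof(*granted);

	/* The kernel may refuse, clamp or round the request. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) == -1)
		return -1;
	/* Read the value back instead of trusting the request. */
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, granted, &len) == -1)
		return -1;
	return 0;
}
```

Reading the value back matters because a request above the kernel limit
is either rejected or silently reduced, depending on the platform.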
The critical point is around 120~130 active tunnels out of a configuration
of 390. Beyond this ceiling racoon keeps getting stuck in sbwait about
1 or 2 minutes after starting.
A good way to test this with fewer tunnels is to send reload signals to
the racoon processes. This forces a lot of pfkey traffic, which will
eventually trigger the truncated pfkey socket buffer problem.
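A rough sketch of such a test loop follows. It assumes SIGHUP is the
signal racoon treats as a reload request; send_reloads and its parameters
are hypothetical names used only for this illustration.

```c
#include <sys/types.h>
#include <signal.h>
#include <time.h>

/*
 * Send `count` SIGHUP signals to `pid`, pausing `interval_ms` between
 * them. Returns the number of signals sent successfully; stops at the
 * first failure. SIGHUP as the reload signal is an assumption.
 */
int
send_reloads(pid_t pid, int count, long interval_ms)
{
	struct timespec ts;
	int i, sent = 0;

	ts.tv_sec = interval_ms / 1000;
	ts.tv_nsec = (interval_ms % 1000) * 1000000L;

	for (i = 0; i < count; i++) {
		if (kill(pid, SIGHUP) == -1)
			break;
		sent++;
		/* Give the daemon time to start its reload. */
		nanosleep(&ts, NULL);
	}
	return sent;
}
```

Shortening the interval should increase the pfkey load per second, so it
can be tuned until the hang shows up.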
The real problem is that racoon 0.6.7 and racoon 0.7 do not correctly
handle a truncated socket. Instead of retrying the operation they either
idle in sbwait, waiting for something that will never arrive, or run
away on a buffer which does not exist anymore.
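To make "retrying the operation" more concrete, here is a minimal sketch
of a defensive datagram receive. It is not racoon's code: recv_checked is
a hypothetical helper that waits with a timeout instead of blocking
forever and reports truncation through MSG_TRUNC, so the caller can
decide to retry or re-issue its request.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <errno.h>
#include <poll.h>
#include <string.h>

/*
 * Receive one datagram from `fd` into `buf`, waiting at most
 * `timeout_ms`. Returns the number of bytes stored, or -1 with errno
 * set (EAGAIN on timeout). `*truncated` is set to 1 when the datagram
 * was larger than `len`. Hypothetical helper for illustration only.
 */
ssize_t
recv_checked(int fd, void *buf, size_t len, int timeout_ms, int *truncated)
{
	struct pollfd pfd;
	struct iovec iov;
	struct msghdr msg;
	ssize_t n;
	int ready;

	*truncated = 0;

	/* Wait with a timeout rather than sleeping in the kernel forever. */
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	ready = poll(&pfd, 1, timeout_ms);
	if (ready == -1)
		return -1;
	if (ready == 0) {
		errno = EAGAIN;
		return -1;
	}

	iov.iov_base = buf;
	iov.iov_len = len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	n = recvmsg(fd, &msg, 0);
	if (n == -1)
		return -1;
	/* The kernel flags a datagram that did not fit into the buffer. */
	if (msg.msg_flags & MSG_TRUNC)
		*truncated = 1;
	return n;
}
```

This only detects a datagram that was cut short on receive. Messages the
kernel dropped because the socket buffer was already full never show up
here, so in that case the timeout is the only hint and the caller would
have to repeat its request.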
Any further suggestions?
Kind regards,
Seth Mos
pfSense Developer
Seth Mos wrote:
Hello there,
I have problems with racoon hanging in sbwait state with ipsec-tools 0.6.7
or getting into a tailspin on ipsec-tools 0.7.
The problem is that the pfkey interface breaks down with a lot of VPN
tunnels and spd entries. The FreeBSD PR is here:
http://www.freebsd.org/cgi/query-pr.cgi?pr=115651
I have 390 discrete IPSEC VPN tunnels and endpoints, all loaded into one
config. I am using pfSense as the platform of choice, specifically
pfSense 1.2-RC2, which is based on FreeBSD 6.2 Stable p7. Note that I am
also a pfSense developer.
At the moment I have exactly 112 IPSEC tunnels active. I am using
3des-sha1 with a 3600 lifetime for phase 1 and aes128-sha1 with a 28800
lifetime for phase 2.
On ipsec-tools 0.6.7 I can go for several days before the racoon
process wedges itself into a hanging sbwait state (0% cpu). On ipsec-tools
0.7 the situation is significantly worse: it starts churning at 100% cpu
roughly every 1-4 hours.
Basically, where 0.6.7 was difficult, 0.7 has become unworkable.
The hardware in question is a Dell PE860 with 6 gigabit NICs (about 2mbps
of ipsec traffic at most), a dual-core Xeon 3050 at 2.13GHz and 1GB of RAM.
In the pfSense kernel we use this patch:
http://cvs.pfsense.com/cgi-bin/cvsweb.cgi/tools/patches/RELENG_6_2/socketvar.h.diff
It helps significantly. Without this patch the situation is the same as
I have described above, but it is reached at about 30-40 active tunnels
instead of the 112 I have now.
I really need a reliable solution to this problem.
Kind regards,
Seth Mos
pfSense Developer
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net