On Thu, 27 Oct 2005, Gunter Ohrner wrote:

> Hi!
>
> We're experiencing regular assertion failures and subsequent OpenVPN server
> crashes on one of our servers.
>
> The assertion failure is always the same:
>
> ,----
> | Assertion failed at multi.c:1561
> | Exiting
> `----
>
> The crash seems to leave openvpn's network device, tap0 in our case, in a
> state which blocks all processes subsequently trying to access the device.
> The crashes happen every few days and a restart of the server machine is
> needed.
>
> Does anyone have any quick idea of this behaviour's cause? Unfortunately
> according to Google we're the only ones on Linux 2.6 with this crash. ;)
>
> http://openvpn.net/archive/openvpn-users/2005-08/msg00011.html mentions a
> similar problem but running kernel 2.2.25 and no solution has been provided
> so far, a suggested patch did not fix the problem for the reporter.
>
> ,----[ Some details about our setup ]
> | * Debian Sarge i386
> | * Kernel 2.6.12.6 32 Bit Opteron optimized
> | * Debian's 2.0-1sarge1 openvpn package
> | * Dual Opteron 246 2,0GHz
> `----
>
> ,----[ OpenVPN configuration (excerpts) ]
> | * bind to single interface/port
> | * use udp
> | * use tap0
> | * PSK authentication
> `----
>
> The server is also routing traffic and we do traffic limiting for some
> traffic (destination dependent, to comply with a leased link policy). This
> limiting is not done on the device on which the encrypted openvpn traffic
> leaves the machine but on an IMQ device before the incoming traffic enters
> tap0, so openvpn should not see anything from it.
>
> Are there any further details needed to chase this bug, in whichever kind of
> software we're using it may be?

This assertion usually occurs when the tun/tap device locks up and
doesn't accept any write syscalls.

Can you try an earlier 2.6 kernel (or 2.4), and see if the problem goes
away?

I would lean towards thinking that this is a tun/tap driver issue, simply
because I've never heard about it on anything other than the old
unmaintained 2.2 driver, or in this case a very new kernel.

But having said that, I can't yet rule out that it's an OpenVPN bug.

I could certainly "fix" the assertion by making OpenVPN wait forever for
the tun/tap device to accept output. But then OpenVPN would simply hang,
and you would have even less information to go on.

James
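
The trade-off James describes (fail loudly when the device stops accepting writes, versus waiting forever and hanging) can be sketched outside of OpenVPN. The following is only an illustration under stated assumptions, not OpenVPN's code: a full pipe stands in for a tun/tap device that no longer accepts output, and the helper names `wait_writable` and `fill_pipe` are hypothetical.

```python
import os
import select


def wait_writable(fd, timeout_s):
    """Return True if fd becomes writable within timeout_s seconds, else False.

    Passing timeout_s=None would block indefinitely, which corresponds to the
    "wait forever" behaviour: no error is raised, the caller simply hangs.
    """
    _, writable, _ = select.select([], [fd], [], timeout_s)
    return bool(writable)


def fill_pipe(write_fd):
    """Write to a non-blocking pipe until its kernel buffer is full.

    Stand-in for a device that has stopped draining its output queue.
    Returns the number of bytes that were accepted before it filled up.
    """
    os.set_blocking(write_fd, False)
    chunk = b"\0" * 4096
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            # The kernel refused the write: nothing is draining the buffer.
            return total
```

With a bounded timeout, a caller can detect the stuck descriptor and report it (for example by logging and exiting), which leaves at least one diagnostic line behind. With an unbounded wait, the same situation shows up only as a silent hang.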