Sven Ingebrigt Ulland wrote:
[...]
Thanks to all of you who have shared your experiences with isakmpd/ipsec in OpenBSD. After running it for some time now, I've seen more of the good and bad sides of our VPN setup, and I'll share them with you.
How long have you been running openbsd isakmpd/ipsec (in production)?
It's been running for over a year now, and it's been very stable.
What problems, if any, have you had with the openbsd vpn implementations? Which of them are the most recurring? How do you usually fix them?
There are a few issues that I've seen with the implementation, or more aptly, with my lack of detailed knowledge of the IPSec specs:

1) isakmpd isn't easily debuggable. When an error occurs, or when something expected does not occur, it is hard to know which debug level to increase in isakmpd. Of course, it would help a great deal to have detailed knowledge of the IPSec specs here, but I haven't found the time to get to know them very well. In that respect, I find the man page for isakmpd to be somewhat lacking. Not knowing how to debug properly makes it hard to determine which side the error is on, or whether the fault lies in an intermediate network. This can lead to a blame game with the other side, which doesn't do anyone much good. On that note, I'm interested in hearing good tips on debugging this kind of thing. I use the normal tools like ping, {tcp,udp,icmp} traceroute, hping, tcpdump filtering on udp port 500 || proto 50, and isakmpd logging, but still fall short of determining the exact cause most of the time. Maybe I'm using the tools the wrong way.

2) A common problem is that we simply stop seeing data from one or more peers. (Our endpoint is set up as a slave for all the connections, so it is our peers that initiate connections.) What we usually do then is dump packets on the network interface to determine whether the peer is completely dead or just hung.

3) On some occasions, the peer is hung up somehow and keeps trying to send us an invalid SPI. Our IPSec rejects those, but the peer keeps sending them. What we then do is stop isakmpd and start it again. For some reason, this fixes the problem. We haven't dumped the traffic while restarting isakmpd yet, but it probably sends some cease and desist signal to all the peers. I'm wondering if it's possible to send this signal to just one peer; that would keep the other tunnels alive.
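
When staring at captures of udp port 500 and proto 50 traffic, it can help to decode the fixed headers by hand to see which side sent what. The following is only a rough sketch, assuming you have already extracted the raw UDP or ESP payload bytes from a capture by some other means (that step is not shown). The function names parse_isakmp_header and parse_esp_header are made up for this example; the field layouts follow the ISAKMP header in RFC 2408 and the ESP header in RFC 2406.

```python
import struct

# Fixed ISAKMP header (RFC 2408, section 3.1): 28 bytes, network byte order.
# cookies (8 + 8), next payload, version, exchange type, flags, message ID, length
ISAKMP_HEADER = struct.Struct("!8s8sBBBBII")

# A few exchange type values from RFC 2408 and RFC 2409 (not a complete list).
EXCHANGE_TYPES = {
    0: "NONE",
    1: "BASE",
    2: "IDENTITY_PROTECTION",  # main mode
    4: "AGGRESSIVE",
    5: "INFORMATIONAL",
    32: "QUICK_MODE",
}


def parse_isakmp_header(payload):
    """Decode the fixed ISAKMP header from a UDP port 500 payload."""
    if len(payload) < ISAKMP_HEADER.size:
        raise ValueError("payload shorter than the 28-byte ISAKMP header")
    (icookie, rcookie, next_payload, version,
     exchange, flags, message_id, length) = ISAKMP_HEADER.unpack_from(payload)
    return {
        "initiator_cookie": icookie.hex(),
        "responder_cookie": rcookie.hex(),
        "next_payload": next_payload,
        "version": (version >> 4, version & 0x0F),  # (major, minor)
        "exchange": EXCHANGE_TYPES.get(exchange, "UNKNOWN(%d)" % exchange),
        "encrypted": bool(flags & 0x01),  # E bit: payloads after the header are encrypted
        "message_id": message_id,
        "length": length,
    }


def parse_esp_header(payload):
    """Return (spi, sequence_number) from the first 8 bytes of an ESP payload."""
    if len(payload) < 8:
        raise ValueError("payload shorter than the 8-byte ESP header")
    return struct.unpack_from("!II", payload)
```

Comparing the SPI returned by parse_esp_header against the SAs the local side currently has installed is one way to confirm that a peer is still using a stale SPI, and the cookies and exchange type from parse_isakmp_header show whether the peer is attempting a new negotiation at all.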
Have you experienced any interoperability problems when establishing tunnels with peers that run other implementations (cisco, checkpoint, etc)? And if so, how do you work around those?
Our peers mostly run Cisco or Check Point equipment. In the isakmpd logs we see a *lot* of the following messages:

"dropped message from 172.29.9.43 port 500 due to notification type PAYLOAD_MALFORMED"
"dropped message from 172.29.9.43 port 500 due to notification type INVALID_PAYLOAD_TYPE"
"message_parse_payloads: invalid next payload type <Unknown 111> in payload of type 8" (the number 111 varies from ~25 to ~125)
"message_parse_payloads: reserved field non-zero: 17" (the number varies from 0x00 to 0xff)

Having a look through the IPSec specs (33 RFCs! Damn, where to start?) would probably explain some of this behaviour. I'm guessing the proprietary boxes use some in-house extensions. Tips are greatly welcome!

regards,
Sven U
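
To see which peers produce most of the "dropped message" lines quoted above, and with which notification type, the log can be tallied with a few lines of script. This is a minimal sketch, assuming the log lines contain the message text exactly as quoted; the function name tally_dropped is made up for this example, and where the isakmpd output ends up on disk depends on the local syslog setup.

```python
import re
from collections import Counter

# Matches the "dropped message" lines as quoted in the post above.
DROPPED_RE = re.compile(
    r"dropped message from (?P<peer>\S+) port (?P<port>\d+) "
    r"due to notification type (?P<kind>[A-Z_]+)"
)


def tally_dropped(lines):
    """Count dropped messages per (peer address, notification type)."""
    counts = Counter()
    for line in lines:
        match = DROPPED_RE.search(line)
        if match:
            counts[(match.group("peer"), match.group("kind"))] += 1
    return counts
```

Any iterable of strings works as input, for example an open log file. Lines that do not match, such as the message_parse_payloads ones, are skipped. A tally like this at least shows whether the errors cluster on one peer or on one vendor's equipment.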