Hans-Joerg Hoexer wrote:
Hi,

On Thu, Aug 02, 2007 at 09:23:59PM +0200, Sven Ulland wrote:
I am running OpenBSD 4.0 on amd64, and I'm seeing that isakmpd builds
up a large amount of redundant phase 1 tunnels for one of our peers.
It will only report these when prompted with 'echo r > \
isakmpd.fifo'; they are not shown in 'ipsecctl -s all'. This is causing
one of our peer VPN endpoints to run out of available tunnel resources
and drop packets. I am running two OpenBSD 4.0 VPN boxes in a
redundant setup with carp and sasyncd.

isakmpd in OpenBSD 4.0 is by default started with the -S flag, which
the manual says "will not delete SAs on shutdown by sending delete
messages to all peers", suitable for carp/sasyncd setups. What it
doesn't say, however, is that it also enables ui_daemon_passive.
According to isakmpd(8) in CURRENT: "In passive mode no packets are
sent to peers." Active/passive mode is not documented in 4.0 manpages,
but the functionality is there.

In a sasyncd/carp setup isakmpd is started in passive mode using -S.  On
the machine that is carp master, sasyncd triggers isakmpd to start
negotiations.  On the backup machine, isakmpd stays in passive mode and
does nothing.

However, this should be done by the controlling sasyncd only.  These
commands are not meant to be used by the user.  Therefore I guess we
decided not to document this in the man page...

Ah, I see. I assumed it was thought to be used from the command line,
since it was documented in isakmpd(8) in 4.1-current. It makes sense
that sasyncd takes care of the active/passive control, but it didn't
occur to me to actually check that it did at the time.

I was having recurrent problems with tunnels not being established.
Our isakmpd just sat there, not wanting to establish tunnels where our
end is set to be active in isakmpd.conf. It mostly ignored incoming
tunnel requests from peers (connection entries configured as passive
in isakmpd.conf) as well.

Is this after a fresh reboot or after restarting sasyncd/isakmpd by hand?

Typically after restarting isakmpd by hand, although I seem to recall
that the problems started unprovoked. I.e. it gradually stopped
refreshing phase 1 or 2 connections. During the troubleshooting,
isakmpd and sasyncd would be restarted/reloaded/HUPed several times,
but without any effect.

A fresh reboot sometimes helped, but not always. The details are a bit
fuzzy, as it's been some time since I forced it into active
mode. Since then, it has been running in active mode on the carp
master. It would probably mess things up severely if the master was to
lose contact with the slave.

Upon looking at the source, it was clear that 'echo M active > \
isakmpd.fifo' disables ui_daemon_passive (i.e. makes it active). This
is also mentioned in CURRENT's isakmpd(8). Enabling this caused all
our tunnels to suddenly establish and there was much rejoicing.

Now after a while, I saw that isakmpd might have become a little bit
*too* active. I should only have one phase 1 tunnel to each peer,
but around 470 (it varies; I've seen 960 at worst) have been set up
to one peer in particular. I can't remember anything
other than that it runs Cisco. I can dig up more info if it helps.

The following is gathered from /var/log/daemon after doing an 'echo \
r > isakmpd.fifo'. Excerpt:

 sa_report: 0x47b4d800 TMUK phase 1 doi 1 flags 0xb
 sa_report: icookie 1fe44ce55975a07f rcookie 876ef79120c13acc
 sa_report: msgid 00000000 refcnt 3
 sa_report: life secs 28800 kb 0
 sa_report: suite 1 proto 1
 sa_report: spi_sz[0] 0 spi[0] 0x0 spi_sz[1] 0 spi[1] 0x0
 sa_report: initiator id: 81f0402: 129.240.64.2, \
            responder id: d562735: 213.98.7.53, \
            src: 129.240.64.2 dst: 213.98.7.53

There are 470 of these right now. They all have different 0x........
identifiers and different {i,r}cookie. Other than that, they are
identical.
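
To put a number on this without counting by hand, here is a minimal sketch in Python that tallies phase 1 SAs per src/dst pair. It assumes the report lines look like the excerpt above (the 'initiator id' entry may be on one line or wrapped); the function name count_phase1 and the regular expressions are my own and not part of isakmpd.

```python
import re
from collections import Counter

# First line of each SA entry, e.g.
# "sa_report: 0x47b4d800 TMUK phase 1 doi 1 flags 0xb"
# Assumption: the SA name (here "TMUK") contains no whitespace.
SA_HEAD = re.compile(r"sa_report: (0x[0-9a-f]+) \S+ phase (\d+)")

# Address part of the entry, e.g. "src: 129.240.64.2 dst: 213.98.7.53"
SA_ADDR = re.compile(r"src: (\S+) dst: (\S+)")


def count_phase1(lines):
    """Return a Counter mapping (src, dst) to the number of phase 1 SAs."""
    counts = Counter()
    in_phase1 = False
    for line in lines:
        head = SA_HEAD.search(line)
        if head:
            # A new SA entry starts; remember whether it is phase 1.
            in_phase1 = head.group(2) == "1"
            continue
        addr = SA_ADDR.search(line)
        if addr and in_phase1:
            counts[(addr.group(1), addr.group(2))] += 1
            # Count each SA entry at most once.
            in_phase1 = False
    return counts
```

Feeding it the lines of /var/log/daemon after an 'echo r > isakmpd.fifo' should give one counter entry per src/dst pair, provided the log format matches the assumption above.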

They are also listed in the {udp_encap,transport}_report. Example:

 transport_report: transport 0x45a30200 flags 0 refcnt 1
 udp_report: fd 9 src 129.240.64.2:500 dst 213.98.7.53:500

Except for the 0x........ ID, they are identical. refcnt is always 1,
and fd is 9 on all of them.
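
The same kind of tally can be done for the transport entries. This is only a sketch under the assumption that every udp_report line has the form shown in the example above; count_udp_transports is a name I made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "udp_report: fd 9 src 129.240.64.2:500 dst 213.98.7.53:500"
UDP_LINE = re.compile(r"udp_report: fd (\d+) src (\S+) dst (\S+)")


def count_udp_transports(lines):
    """Return a Counter mapping (fd, src, dst) to the number of entries."""
    counts = Counter()
    for line in lines:
        match = UDP_LINE.search(line)
        if match:
            key = (int(match.group(1)), match.group(2), match.group(3))
            counts[key] += 1
    return counts
```

If all entries really are identical apart from the transport ID, the counter should contain a single key whose count equals the number of udp_report lines.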

Now, this leads to two questions:
1) Is there something strange or wrong with the active/passive setting
on 4.0? I mean, since isakmpd is started in passive mode by default,
and 'echo M {active,passive} > isakmpd.fifo' is not documented in
the man pages. -S is, but it doesn't mention active/passive mode
directly.

M {active, passive} is meant to be issued by sasyncd only.

Understood.

2) What could cause the massive phase 1 build-up I'm seeing? I'll be
starting the debug process now, and I'll post back if I can find
anything relevant.

Could you please try to upgrade to 4.1-stable?  If I remember correctly,
there were some issues with 4.0.

Yes, I can try that. It's not done in a heartbeat, but I will respond
back to this thread when I know more.

I'm very (that's putting it mildly) interested in the issues with 4.0
that you mention. Would you be able to shed some more light on which
issues they were, or point me to references? It would be most
interesting.

Thanks for the feedback, Hans-Joerg.

Sven
