I would like to re-title this as something like "pf and iked instability on 
recent snapshots," but I don’t know if doing so would break the mailing list 
thread, so I left the subject unchanged...

> -----Original Message----- 
> From: Theodore Wynnychenko [mailto:t...@uchicago.edu] 
> Sent: Saturday, December 08, 2018 4:03 PM 
> To: misc@openbsd.org 
> Cc: 'Rachel Roch' 
> Subject: RE: TLS suddenly not working over IKED site-to-site 
> 
> > 
. 
. 
. 
> I now find I can no longer connect to with TLS/SSL over the iked tunnel 
> (the original behavior that seemed to have corrected itself).  Also, 
> icinga continues to be unable to verify the status of the remote hosts 
> over port 5665. 
> 
> I don't have time right now to try using s_client and s_server and 
> watching enc0 to see what is happening, but I will when I can. 
> 
> If anyone has an ideas on what may be happening, please let me know. 
> 
> Thanks 
> Ted 


Hello again; 

So, I am at a complete loss to understand what is going on. 
Today, I tried using openssl s_client and s_server to make a connection through 
the iked vpn (as I described in my last post).  However, with NO changes to 
iked.conf or pf.conf, today I had several connection attempts that completed 
correctly.  I have not included any output from those sporadic, completely 
functional connections.

Rather, today, most of the connections by s_client are not even acknowledged by 
the s_server on the other side of the iked vpn.

For example: 
On the s_client machine: 

# openssl s_client -state -connect "remote.host":https 
SSL_connect:before/connect initialization 
SSL_connect:SSLv3 write client hello A 
... and nothing more ... 

But on the s_server machine today all I see is: 
# openssl s_server -state -accept https ...certificate options... 
Using auto DH parameters 
Using default temp ECDH parameters 
ACCEPT 
... and no connection attempt is ever acknowledged ... 

(Yesterday, at least this first part of the connection was received by the 
s_server: 
Using auto DH parameters 
Using default temp ECDH parameters 
ACCEPT 
SSL_accept:before/accept initialization 
... and nothing more yesterday ...) 
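
The symptom above (TCP connects, the client hello goes out, nothing comes 
back) can also be checked from a script. The following is only a minimal 
Python sketch, not a replacement for s_client: the function name probe_tls is 
made up for illustration, and certificate verification is switched off on the 
assumption that this is purely a reachability test.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Report how far a TLS connection attempt gets (hypothetical helper)."""
    try:
        # Step 1: plain TCP three-way handshake.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return f"tcp connect failed: {exc}"

    # Diagnostic only: do not verify the peer certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        # Step 2: TLS handshake; wrap_socket sends the client hello and
        # waits for the server's reply, honouring the socket timeout.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return f"handshake ok: {tls.version()}"
    except socket.timeout:
        return "tcp connected, handshake timed out"
    except (ssl.SSLError, OSError) as exc:
        return f"tcp connected, handshake failed: {exc}"
    finally:
        sock.close()


if __name__ == "__main__":
    # "remote.host" is the same placeholder used in the s_client example.
    print(probe_tls("remote.host", 443))
```

A result of "tcp connected, handshake timed out" would correspond to the 
s_client output stopping after "SSLv3 write client hello A".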


So, today, tcpdump on the outgoing interface of the s_client machine and 
the incoming interface of the "local" iked vpn endpoint shows:

16:43:05.107524 172.30.1.254.7305 > 172.30.7.205.443: S 
1751796302:1751796302(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 
6,nop,nop,timestamp 2698316052 0>

16:43:05.149146 172.30.1.254.7305 > 172.30.7.205.443: . ack 2119500805 win 256 
<nop,nop,timestamp 2698316052 3536824996>

16:43:05.149895 172.30.1.254.7305 > 172.30.7.205.443: P 0:196(196) ack 1 win 
256 <nop,nop,timestamp 2698316052 3536824996>

16:43:06.648487 172.30.1.254.7305 > 172.30.7.205.443: P 0:196(196) ack 1 win 
256 <nop,nop,timestamp 2698316055 3536824996>

16:43:09.648557 172.30.1.254.7305 > 172.30.7.205.443: P 0:196(196) ack 1 win 
256 <nop,nop,timestamp 2698316061 3536824996>

16:43:09.948433 172.30.1.254.7305 > 172.30.7.205.443: F 196:196(0) ack 1 win 
256 <nop,nop,timestamp 2698316061 3536824996>

16:43:15.648712 172.30.1.254.7305 > 172.30.7.205.443: FP 0:196(196) ack 1 win 
256 <nop,nop,timestamp 2698316073 3536825005>

And this traffic (incomplete though it may be for an ssl handshake) appears to 
be passed to enc0 intact: 

16:43:05.105044 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: S 3570513915:3570513915(0) win 16384 <mss 
1300,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 2698316052 0> (encap)

16:43:05.146122 (authentic,confidential): SPI 0xe1c30e4a: 172.30.7.205.443 > 
172.30.1.254.7305: S 1312941075:1312941075(0) ack 3570513916 win 16384 <mss 
1300,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3536824996 2698316052> 
(encap)

16:43:05.146654 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: . ack 1 win 256 <nop,nop,timestamp 2698316052 3536824996> 
(encap)

16:43:05.147365 (unprotected): SPI 0x0000ef27: 172.30.1.254.7305 > 
172.30.7.205.443: P 1:197(196) ack 1 win 256 <nop,nop,timestamp 2698316052 
3536824996> (encap)

16:43:06.645932 (unprotected): SPI 0x0000ef27: 172.30.1.254.7305 > 
172.30.7.205.443: P 1:197(196) ack 1 win 256 <nop,nop,timestamp 2698316055 
3536824996> (encap)

16:43:09.646049 (unprotected): SPI 0x0000ef27: 172.30.1.254.7305 > 
172.30.7.205.443: P 1:197(196) ack 1 win 256 <nop,nop,timestamp 2698316061 
3536824996> (encap)

16:43:09.945908 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: F 197:197(0) ack 1 win 256 <nop,nop,timestamp 2698316061 
3536824996> (encap)

16:43:09.981966 (authentic,confidential): SPI 0xe1c30e4a: 172.30.7.205.443 > 
172.30.1.254.7305: . ack 1 win 261 <nop,nop,timestamp 3536825005 
2698316052,nop,nop,sack 1 {197:197} > (encap)

16:43:15.646158 (unprotected): SPI 0x0000ef27: 172.30.1.254.7305 > 
172.30.7.205.443: FP 1:197(196) ack 1 win 256 <nop,nop,timestamp 2698316073 
3536825005> (encap)


BUT, at the other end of the VPN, on enc0, all that is seen leaving the iked 
VPN tunnel is: 

16:43:05.130558 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: S 3570513915:3570513915(0) win 16384 <mss 
1300,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 2698316052 0> (encap)

16:43:05.131049 (authentic,confidential): SPI 0xe1c30e4a: 172.30.7.205.443 > 
172.30.1.254.7305: S 1312941075:1312941075(0) ack 3570513916 win 16384 <mss 
1300,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3536824996 2698316052> 
(encap)

16:43:05.174802 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: . ack 1 win 256 <nop,nop,timestamp 2698316052 3536824996> 
(encap)

16:43:09.966420 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 
172.30.7.205.443: F 197:197(0) ack 1 win 256 <nop,nop,timestamp 2698316061 
3536824996> (encap)

16:43:09.966853 (authentic,confidential): SPI 0xe1c30e4a: 172.30.7.205.443 > 
172.30.1.254.7305: . ack 1 win 261 <nop,nop,timestamp 3536825005 
2698316052,nop,nop,sack 1 {197:197} > (encap)
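
Lining the two enc0 captures up by hand is error-prone, so here is a minimal 
Python sketch of the comparison. It assumes each tcpdump record has first been 
joined onto a single line (the records above are wrapped by the mail client). 
The names parse and missing_on_remote are made up for illustration, and the 
sample records are abridged from the captures above with the TCP options 
dropped.

```python
import re

# One record per line, abridged from the local enc0 capture above.
LOCAL = """\
16:43:05.105044 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: S 3570513915:3570513915(0) win 16384
16:43:05.146654 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: . ack 1 win 256
16:43:05.147365 (unprotected): SPI 0x0000ef27: 172.30.1.254.7305 > 172.30.7.205.443: P 1:197(196) ack 1 win 256
16:43:09.945908 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: F 197:197(0) ack 1 win 256
"""

# The same flow as seen on enc0 at the other end of the tunnel.
REMOTE = """\
16:43:05.130558 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: S 3570513915:3570513915(0) win 16384
16:43:05.174802 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: . ack 1 win 256
16:43:09.966420 (authentic,confidential): SPI 0x151333df: 172.30.1.254.7305 > 172.30.7.205.443: F 197:197(0) ack 1 win 256
"""

# timestamp, protection state, SPI, then the packet description.
RECORD = re.compile(
    r"^(?P<time>\S+) \((?P<state>[^)]*)\): SPI (?P<spi>0x[0-9a-f]+): "
    r"(?P<packet>.+)$"
)


def parse(capture):
    """Turn tcpdump enc0 lines into dicts; lines that do not match are skipped."""
    records = []
    for line in capture.splitlines():
        m = RECORD.match(line.strip())
        if m:
            records.append(m.groupdict())
    return records


def missing_on_remote(local, remote):
    """Records in the local capture with no (SPI, packet) twin in the remote one.

    Timestamps are ignored because the two hosts' clocks differ.
    """
    seen = {(r["spi"], r["packet"]) for r in parse(remote)}
    return [r for r in parse(local) if (r["spi"], r["packet"]) not in seen]


for rec in missing_on_remote(LOCAL, REMOTE):
    print(rec["state"], rec["spi"], rec["packet"])
```

With these samples the only record reported is the 196-byte data segment, 
which the local enc0 shows as (unprotected) under SPI 0x0000ef27 and which 
never appears on the remote enc0. The sketch only makes that difference easy 
to spot; it says nothing about why the segment is handled that way.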


I have no idea what this all means, or what to do with it. 
But, I am following up in case anybody has any idea of what may be happening. 

Also, yesterday I described how the local iked machine appeared to be blocking 
packets that were explicitly allowed by pf.conf.  From my post yesterday:

(   For example, in the log I see: 
Dec  8 15:50:01 ... pf: Dec 08 15:48:49.346816 rule 4/(match) block out on em0: 
172.30.7.205.22112 > 172.30.2.99.5665: R 3963276584:3963276584(0) ack 252894831 
win 0

But pfctl shows the following rules loaded: 

# pfctl -s rules 
match in all scrub (no-df random-id max-mss 1300) 
pass in quick on em1 all flags S/SA 
pass out quick on em1 all flags S/SA 
block drop in log on em0 all 
block drop out log on em0 all 
... 
pass quick inet proto tcp from 172.30.7.205 to 172.30.2.99 port = 5665 flags S/SA 
... and on.    ) 
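
One possible reading of that log line, offered as a guess rather than a 
diagnosis: the blocked packet is an RST with ACK set, and a pass rule with 
"flags S/SA" only matches packets that have SYN set and ACK clear. If no state 
entry existed for that RST, it would be evaluated against the ruleset, would 
not match the pass rule, and would fall to the last matching rule, "block drop 
out log on em0 all", which is rule 4 when counting from 0. The Python sketch 
below is a deliberately simplified model of "last matching rule wins, unless 
quick". It is not how pf is implemented, it ignores state tracking, and the 
index 5 for the pass rule is an assumption because the intervening rules are 
elided in the listing above.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    action: str             # "match", "pass" or "block"
    direction: str = ""     # "in", "out", or "" for either
    iface: str = ""         # "" means any interface
    quick: bool = False
    syn_only: bool = False  # models "flags S/SA": SYN set, ACK clear
    src: str = ""
    dst: str = ""
    dport: int = 0          # 0 means any port


@dataclass
class Packet:
    direction: str
    iface: str
    src: str
    dst: str
    dport: int
    flags: str              # letters: S=SYN, A=ACK, R=RST


def matches(rule, pkt):
    if rule.direction and rule.direction != pkt.direction:
        return False
    if rule.iface and rule.iface != pkt.iface:
        return False
    if rule.src and rule.src != pkt.src:
        return False
    if rule.dst and rule.dst != pkt.dst:
        return False
    if rule.dport and rule.dport != pkt.dport:
        return False
    if rule.syn_only and not ("S" in pkt.flags and "A" not in pkt.flags):
        return False
    return True


def evaluate(rules, pkt):
    """Return (rule index, action): last match wins unless the rule is quick."""
    decision = (None, "pass")  # with no matching rule, pf passes by default
    for i, rule in enumerate(rules):
        if rule.action == "match" or not matches(rule, pkt):
            continue  # match rules do not decide pass/block
        decision = (i, rule.action)
        if rule.quick:
            break
    return decision


RULES = [
    Rule("match", direction="in"),                                # 0
    Rule("pass", "in", "em1", quick=True, syn_only=True),         # 1
    Rule("pass", "out", "em1", quick=True, syn_only=True),        # 2
    Rule("block", "in", "em0"),                                   # 3
    Rule("block", "out", "em0"),                                  # 4
    Rule("pass", quick=True, syn_only=True,                       # 5 (assumed)
         src="172.30.7.205", dst="172.30.2.99", dport=5665),
]

rst = Packet("out", "em0", "172.30.7.205", "172.30.2.99", 5665, "RA")
syn = Packet("out", "em0", "172.30.7.205", "172.30.2.99", 5665, "S")
print(evaluate(RULES, rst))
print(evaluate(RULES, syn))
```

In this model the RST is decided by rule 4 (block), while a fresh SYN to port 
5665 reaches the quick pass rule, which would be consistent with the log entry 
without the pass rule being ignored.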

Well, whatever was happening appears to have been resolved, because at about 
midnight local time on Sunday morning, icinga2 declared that the host was back 
up.

To be clear, I have made no changes to either pf.conf or iked.conf on any of 
the machines involved in this testing from Saturday.

Also, this had all been stable for about the last 2 years, until about 
two to three weeks ago.  I did have another post, where I discussed the fact 
that the iked VPN had failed to be reestablished after an update about 3-4 
snapshots back.  I got it working again by changing the local endpoint on the 
"remote" iked machine from the internal ip associated with the internal 
interface to an internal "alias" ip address associated with the 
outgoing/external interface of that machine.

But, again, it had been working for 2 years until the recent update. 

I don't have any idea of what may be helpful in figuring out what I am doing 
wrong, or what has changed, but I am happy to provide any information that may 
be of help.

I don't believe I have the knowledge to do more on my own at this point. 

Thanks for any advice. 
Ted 



