Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-04-28 Thread Walter Doekes via Openvpn-devel
> On 24-04-2025 10:26, Arne Schwabe wrote:
> The code doesn't really do what the comment does. If you were trying to
> actually figure out if it is the same client you should rather compare
> the session ID of the tls state.

As for the session_id, I checked the following:

- m->session[TM_ACTIVE].session_id (matches)
- m->session[TM_ACTIVE].key[KS_PRIMARY].session_id_remote (matches)
- m->session[TM_ACTIVE].key[KS_LAME_DUCK].session_id_remote (is zero)

- m->session[TM_UNTRUSTED].session_id (does not match)
- m->session[TM_UNTRUSTED].key[KS_PRIMARY].session_id_remote (is zero)
- m->session[TM_UNTRUSTED].key[KS_LAME_DUCK].session_id_remote (is zero)

So, two session IDs match. I'll assume you meant the
m->session[TM_ACTIVE].session_id one then.

I can change the if-expression in my original patch to this:

  if (m1->multi_state == CAS_CONNECT_DONE
  && m2->multi_state == CAS_NOT_CONNECTED
  && m1->locked_cert_hash_set
  && !m2->locked_cert_hash_set
  && session_id_equal(&m1->session[TM_ACTIVE].session_id,
  &m2->session[TM_ACTIVE].session_id))

Or do you suggest something else?


>> The offending commit is:
>>
>>commit b364711486dc6371ad2659a5aa190941136f4f04
>>Author: Arne Schwabe 
>>Date:   Mon May 2 17:43:10 2022 +0200
>>
>>  Implement stateless HMAC-based sesssion-id three-way-handshake
>>
>> Double checked and confirmed. Apparently this alters things so the
>> source-IP-switched connection shows up as a second connection.
[...]

> Well that changes the behaviour of how new sessions are established,
> that this change is the one that is triggering the issue is even more
> perplexing.

Well. If it was expected, we wouldn't be here investigating ;)

Can I assist you with some test or extra log line?

Cheers,
Walter




___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel


Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-04-24 Thread Walter Doekes via Openvpn-devel

On 24-04-2025 10:26, Arne Schwabe wrote:
I still do not really understanding what network setup you have to 
trigger this kind of bizzare behaviour of having a VPN connection.
It would really be helpful if you can describe that more. Since the 
current explainations do not really explain for me why there is already 
an existing half-established connection.


The scenario you are fixing seem like:

  * somehow a connection attempt is done from IP A
  * a connection is fully established for IP B
  * connection for IP B tries to move to IP A


If you take into account that 2.5 works just fine with this scenario, I 
don't see why you would call this bizarre.


But I'll break it down some more:

- I have internet, exit IP: 1.1.1.10.

- I have a VPN, CorporateVPN at 2.2.2.1. I use it as my default route. 
CorporateVPN has one exit IP: 2.2.2.10.


- There is a second VPN server, SecondVPN, at 3.3.3.1, which I'd like to 
connect to.



From the perspective of SecondVPN, CorporateVPN is connecting to it:

- 2.2.2.10 -> 3.3.3.1 - openvpn UDP stuff, all works


But, in fact, this is tunneled traffic. There is also:

- 1.1.1.10 -> 2.2.2.1 - here traffic is doubly encrypted


So, I'd like the traffic to go directly instead, to lose the double 
encryption and the extra latency:


- 1.1.1.10 -> 3.3.3.1


To do that, I tell SecondVPN to push this route:

  push "route 3.3.3.1 255.255.255.255 net_gateway"


This causes my machine to create a direct route, bypassing the CorporateVPN.

- 1.1.1.10 -> 3.3.3.1 - now goes directly without the double tunnel


This seems to me like a legit use of net_gateway.

What happens is that SecondVPN sees traffic that previously came from 
2.2.2.10 now suddenly comes from 1.1.1.10. And this IP change is 
detected by openvpn as a "floating IP".


- In OpenVPN 2.5 this worked like a charm.

- In OpenVPN 2.6 it doesn't, as mentioned in the Github ticket.

I've reliably reproduced this several times, as there are a bunch of 
OpenVPN 2.5's that I've connected to that do not exhibit the problematic 
behaviour, while the OpenVPN 2.6's do (both 2.6.3 bookworm and 2.6.12 
noble).


Is this a sufficiently clear explanation?


You are also saying that there is a difference in OpenVPN 2.5 vs OPneVPN 
2.6. If that is indeed the case, a git bisect would be helpful to point 
to the commit that is breaking this.


This is a fair request. But bisecting will take me considerably more 
time, so this wasn't my first priority.


I did look into all the changes, and it indeed looked like the "floating 
IP detection" was untouched, while there are some connection setup 
speedup (shortcut?) changes. My suspicion is that the quicker connection 
setup has something to do with the changed behaviour.


In due time, I can get to bisecting. But if the information in this 
ticket and workaround is sufficient for someone else to say "A-HA", then 
that would be my preferred option.



 > msg(M_INFO, "peer %" PRIu32 " (%s) floating from %s to %s (m2 still 
setting up) state=%d/%d",


A user has no idea what m2 is. This message more confusing than helpful.


I'll look into rephrasing when making a new patch.


+    /* if the new connection is fresh and the old one is already 
connected, this

+ * might be a legitimate move to a new IP by the original client;
+ * for example when the server IP is pushed as net_gateway to 
escape from

+ * a double VPN. */
+    if (m1->multi_state == CAS_CONNECT_DONE && m2->multi_state == 
CAS_NOT_CONNECTED

+    && m1->locked_cert_hash_set && !m2->locked_cert_hash_set)


The code doesn't really do what the comment does. If you were trying to 
actually figure out if it is the same client you should rather compare 
the session ID of the tls state.


Also I think this code looks on the first glance to be broken in the way 
that an authorized client could just send a bunch of data packets with 
faked sender ip and then kill of other connections that are in the 
progress of connection.


Given your experience with openvpn, I betting you're right on these points.

I don't know where the "tls state session ID" is and if it's there. I'm 
dealing with OpenVPN 2.5 clients (no older ones, I hope).


If you can point me in the right direction, I'd be grateful. This is my 
first venture into openvpn code.



Cheers,
Walter Doekes
OSSO B.V.



___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel


Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-04-28 Thread Walter Doekes via Openvpn-devel
> On 24-04-2025 10:26, Arne Schwabe wrote:
>> You are also saying that there is a difference in OpenVPN 2.5 vs OPneVPN
>> 2.6. If that is indeed the case, a git bisect would be helpful to point
>> to the commit that is breaking this.

Well. This was my least worst bisect experience so far. So, thank you for
that! :)

  autoreconf -ivf && ./configure && echo '#define HAVE_EVP_PKEY_ID 1' >>
config.h && make -j12

was sufficient on Ubuntu/Noble and running didn't require extra
installation/path shuffling.

The offending commit is:

  commit b364711486dc6371ad2659a5aa190941136f4f04
  Author: Arne Schwabe 
  Date:   Mon May 2 17:43:10 2022 +0200

Implement stateless HMAC-based sesssion-id three-way-handshake


Double checked and confirmed. Apparently this alters things so the
source-IP-switched connection shows up as a second connection.

I haven't checked if it's something I can fix yet. But At least we have a
culprit now.


Cheers,
Walter Doekes
OSSO B.V.




___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel


[Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-05-07 Thread Walter Doekes via Openvpn-devel
From: Walter Doekes 

When you're connected to a VPN which is used as the default gateway, a
connection to a second VPN will cause a tunnel-in-tunnel traffic. If the
administrator of the second VPN wants to avoid that, by pushing its IP
as net_gateway, this means that the client's source IP switches right
after connect:

  the client source IP switches, from
  - the first-VPN-exit-IP, to the
  - regular-ISP-exit-IP

In openvpn 2.5 and below, this worked fine. Since openvpn 2.6,
specifially b364711486, this triggers the "Disallow float to an address
taken by another client" code. Since that change, the traffic from the
second source IP creates a second connection, which now needs special
handling in the check-floating-IP code.

This change allows one to switch to the new IP, if it is still in an
unconnected state. That makes the use-case mentioned above work again.

Github: closes OpenVPN/openvpn#704
Signed-off-by: Walter Doekes 
---
 src/openvpn/multi.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/openvpn/multi.c b/src/openvpn/multi.c
index a2d3fd10..51a00b71 100644
--- a/src/openvpn/multi.c
+++ b/src/openvpn/multi.c
@@ -3236,8 +3236,21 @@ multi_process_float(struct multi_context *m, struct 
multi_instance *mi,
 struct tls_multi *m1 = mi->context.c2.tls_multi;
 struct tls_multi *m2 = ex_mi->context.c2.tls_multi;
 
+/* if the new connection is fresh and the old one is already 
connected, this
+ * might be a legitimate move to a new IP by the original client;
+ * for example when the server IP is pushed as net_gateway to escape 
from
+ * a double VPN. */
+if (m1->multi_state == CAS_CONNECT_DONE
+&& m2->multi_state == CAS_NOT_CONNECTED
+&& m1->locked_cert_hash_set
+&& !m2->locked_cert_hash_set
+&& session_id_equal(&m1->session[TM_ACTIVE].session_id,
+  &m2->session[TM_ACTIVE].session_id))
+{
+/* allow this case */
+}
 /* do not float if target address is taken by client with another cert 
*/
-if (!cert_hash_compare(m1->locked_cert_hash_set, 
m2->locked_cert_hash_set))
+else if (!cert_hash_compare(m1->locked_cert_hash_set, 
m2->locked_cert_hash_set))
 {
 msg(D_MULTI_LOW, "Disallow float to an address taken by another 
client %s",
 multi_instance_string(ex_mi, false, &gc));
-- 
2.34.1



___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel


Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-05-25 Thread Walter Doekes via Openvpn-devel

Good. Your understanding of the situation is the same.

I did not yet make a reproducer config -- mostly because I don't think 
we're doing anything non-standard. But I did double check that latest 
2.6 is affected, tested both client and server.


And, I added some tracing code, and I observed the following:


else if (verdict == VERDICT_VALID_CONTROL_V1 || verdict == 
VERDICT_VALID_ACK_V1)
{   
/* ACK_V1 contains the peer id (our id) while CONTROL_V1 can but does not

 * need to contain the peer id */
struct gc_arena gc = gc_new();

bool ret = check_session_id_hmac(state, from, hmac, handwindow);

const char *peer = print_link_socket_actual(&m->top.c2.from, &gc);
if (!ret)
{   
msg(D_MULTI_MEDIUM, "Packet with invalid or missing SID from %s", peer);

}
else
{   
msg(D_MULTI_DEBUG, "Valid packet with HMAC challenge from peer (%s), "

"accepting new connection.", peer);
//ret = false; // setting false here makes all setup fail
}
gc_free(&gc);

msg(D_MULTI_MEDIUM, "do_pre_decrypt_check: verdict %d (%d)", verdict, 
ret);
return ret;
}


This is the code that has the changed behaviour.

In the before situation, we had this:


--- a/src/openvpn/mudp.c
+++ b/src/openvpn/mudp.c
@@ -55,8 +55,10 @@ do_pre_decrypt_check(struct multi_context *m)
 
 if (verdict == VERDICT_INVALID || verdict == VERDICT_VALID_CONTROL_V1)

 {
+msg(D_MULTI_MEDIUM, "do_pre_decrypt_check: verdict %d (false)", 
verdict);
 return false;
 }
+msg(D_MULTI_MEDIUM, "do_pre_decrypt_check: verdict %d (true)", verdict);
 return true;
 }
 


 do_pre_decrypt_check: verdict 1 (false)
m->pending = (nil), float = 0 // in multi_process_incoming_link
 Float requested for peer 0 to EXTIP:PORT


In the after-situation, we have this:

 do_pre_decrypt_check: verdict 2 (1)
 MULTI: multi_create_instance called
...
m->pending = 0x5a36ce132f10, float = 0 // in multi_process_incoming_link
 Float requested for peer 0 to EXTIP:PORT


verdict is VERDICT_VALID_CONTROL_V1 in both cases (which changed from 1 
to 2 after b364711486dc6371ad2659a5aa190941136f4f04), and the 
do_pre_decrypt_check now returns TRUE.


That TRUE-result results in calls to multi_create_instance, and now we 
have that "mystery connection".



When I find more time, I'll debug some more. But this might shed some 
light on things for you.


As far as I understand, the packet with VERDICT_VALID_CONTROL_V1 verdict 
is now detected as a new valid connection setup. That creates a 
connection on the new IP. And then the float gets rejected.


Walter



On 23-05-2025 02:26, Arne Schwabe wrote:

Totally fair that you don't want to apply a patch that you don't
understand. I on the other hand do not see why you're unable to reproduce.

The scenario is not at all complicated:

- Two vpn servers;
- first vpn server pushes a default gateway;
- second vpn server pushes its external IP as net_gateway (*);
- second vpn server immediately sees the client float from one IP to another.


What I understand so far:

- so you connect to vpn 1 first and that is a normal VPN with a default 
gateway and you get VPN1 IP


- Then via that VPN, you connect a 2nd VPN and you have as source the 
VPN IP, so the 2nd VPN server only see the VPN1 IP.


- after connection is established, you  do the host route directly to 
the server.


- 2nd VPN server sees a float from VPN1 IP to extern IP (EXTIP) of client

- Server refuses the float since there is already a not fully 
established connection on EXTIP


What I don't understand where the this not fully established connection 
should be coming from. That would mean that the server would have need 
to have received a valid connection attempt from EXTIP that was never 
established. And I do not understand from you explaination where that 
happens.



If you're unable to reproduce that, then:

- Either you're using a vastly different version and it has been fixed
since then (but not something that landed in debian/bookworm or
ubuntu/noble, and I _think_ I did try latest 2.6 as well);
- or you're using different settings (udp; auth/tls-auth; dev-tun;
subnet-topology);
- or there is some unknown factor involved that neither of us can think or
right now.

I will create a reproducer config so you can see the exact settings (apart
from the IP addresses).

In the mean time, can you confirm that you understand the scenario or ask
for additional clarification?


I wrote again down what you basically told me and there is still this 
mystery connection that blocks you. And there is no explaination why 
this connection exist in the first place. You are fixing the sympton of 
this ghost connection that blocks your float but from my perspective we 
have not really established why it exists in the first place.


Arne




___
Openvpn-devel mailing 

Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-05-22 Thread Walter Doekes via Openvpn-devel
> The thing is that I do not really understand your scenario and how it
> exactly breaks for you to the extend that I cannot reproduce the issue.

I thought I explained things sufficiently in:

https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31502.html

Apparently not. Please let me know what is unclear about that explanation.


> You are saying that the client switches the IP address after connect. But
> that is just a regular float from the perspective of the VPN server. I
> still do not understand where the other connection that is already on that
> IP/port is coming from. It is also not an older connection as it is not a
> fully established connection either.

Well, as far as I can tell, it _is_ just a regular float... that stopped
working after the mentioned commit.

It is indeed that recent connection. From what I gather from the earlier
workings, we should not end up in that piece of code (where I added the
fix), but for some reason we _do_ now.


> In summary I am not able to either reproduce or understand what is
> happning in your scenario. And I do not want to apply a patch that I don't
> understand.

Totally fair that you don't want to apply a patch that you don't
understand. I on the other hand do not see why you're unable to reproduce.

The scenario is not at all complicated:

- Two vpn servers;
- first vpn server pushes a default gateway;
- second vpn server pushes its external IP as net_gateway (*);
- second vpn server immediately sees the client float from one IP to another.

If you're unable to reproduce that, then:

- Either you're using a vastly different version and it has been fixed
since then (but not something that landed in debian/bookworm or
ubuntu/noble, and I _think_ I did try latest 2.6 as well);
- or you're using different settings (udp; auth/tls-auth; dev-tun;
subnet-topology);
- or there is some unknown factor involved that neither of us can think or
right now.

I will create a reproducer config so you can see the exact settings (apart
from the IP addresses).

In the mean time, can you confirm that you understand the scenario or ask
for additional clarification?

Thank you!

Walter


(*) Why? Because if it didn't, traffic from the client to VPN-two goes
through VPN-one as well. And that incurs overhead: additional latency and
cpu load, possible MTU issues.


___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel


Re: [Openvpn-devel] [PATCH] multi.c: Allow floating to a new IP right after connection setup

2025-05-22 Thread Walter Doekes via Openvpn-devel
Dear mailing list and Arne,

I saw in the IRC meeting notes that 2.7 is about to be branched.

I don't mean to push, but I personally think this bug is important enough
to get looked at and fixed, preferably before 2.7.0 is made.

Is this new patch good enough? Should I put it in Gerrit? Are there any
questions about the functionality?

I realise that I did not explain _why_ b364711486 breaks things. I'm also
curious, but I think the author of that patch, or someone else more
intimate with the codebase, can more easily explain. Regardless there is a
bug in 2.6.x right now, and I'd would like it if a fix is upstreamed; be
it my fix, or someone elses.

Thanks!
Walter


> From: Walter Doekes 
>
> When you're connected to a VPN which is used as the default gateway, a
> connection to a second VPN will cause a tunnel-in-tunnel traffic. If the
> administrator of the second VPN wants to avoid that, by pushing its IP
> as net_gateway, this means that the client's source IP switches right
> after connect:
>
>   the client source IP switches, from
>   - the first-VPN-exit-IP, to the
>   - regular-ISP-exit-IP
>
> In openvpn 2.5 and below, this worked fine. Since openvpn 2.6,
> specifially b364711486, this triggers the "Disallow float to an address
> taken by another client" code. Since that change, the traffic from the
> second source IP creates a second connection, which now needs special
> handling in the check-floating-IP code.
>
> This change allows one to switch to the new IP, if it is still in an
> unconnected state. That makes the use-case mentioned above work again.
>
> Github: closes OpenVPN/openvpn#704
> Signed-off-by: Walter Doekes 
> ---
>  src/openvpn/multi.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/openvpn/multi.c b/src/openvpn/multi.c
> index a2d3fd10..51a00b71 100644
> --- a/src/openvpn/multi.c
> +++ b/src/openvpn/multi.c
> @@ -3236,8 +3236,21 @@ multi_process_float(struct multi_context *m, struct
> multi_instance *mi,
>  struct tls_multi *m1 = mi->context.c2.tls_multi;
>  struct tls_multi *m2 = ex_mi->context.c2.tls_multi;
>
> +/* if the new connection is fresh and the old one is already
> connected, this
> + * might be a legitimate move to a new IP by the original client;
> + * for example when the server IP is pushed as net_gateway to
> escape from
> + * a double VPN. */
> +if (m1->multi_state == CAS_CONNECT_DONE
> +&& m2->multi_state == CAS_NOT_CONNECTED
> +&& m1->locked_cert_hash_set
> +&& !m2->locked_cert_hash_set
> +&& session_id_equal(&m1->session[TM_ACTIVE].session_id,
> +  &m2->session[TM_ACTIVE].session_id))
> +{
> +/* allow this case */
> +}
>  /* do not float if target address is taken by client with another
> cert */
> -if (!cert_hash_compare(m1->locked_cert_hash_set,
> m2->locked_cert_hash_set))
> +else if (!cert_hash_compare(m1->locked_cert_hash_set,
> m2->locked_cert_hash_set))
>  {
>  msg(D_MULTI_LOW, "Disallow float to an address taken by
> another client %s",
>  multi_instance_string(ex_mi, false, &gc));
> --
> 2.34.1
>
>



___
Openvpn-devel mailing list
Openvpn-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openvpn-devel