problems with outbound load-balancing (PF sticky-address for destination IPs)

Andy Lemin Fri, 02 Apr 2021 18:41:12 -0700

Hi smart people :)

The current implementation of ‘sticky-address’ relates only to a sticky source 
IP.
https://www.openbsd.org/faq/pf/pools.html


This is used for inbound server load balancing, by ensuring that all socket 
connections from the same client/user/IP on the internet go to the same 
server in your local server pool.

This works great for simplifying memory management of session artefacts 
in the application being hosted (the servers do not have to synchronise the 
user's session data, as extra sockets from that user will always connect to the 
same local server).

However, sticky-address does not have an equivalent for sticky destination IPs. 
For example, when doing outbound load balancing over multiple ISP links, every 
single socket is load balanced randomly. This causes many websites to break 
(especially cookie login and single-sign-on style enterprise services), as the 
first outbound socket will originate randomly from one of the local ISP IPs, 
and the user's login session/SSO (on the server side) will belong to that first 
random IP.

When the user then browses to or uses another part of that same website which 
requires additional sockets, the additional sockets will pass the SSO 
credentials from the first socket, but each extra socket connection will again 
be randomly load-balanced, and so the remote server will reject the connection 
because it originates from a different source IP than the one the session 
belongs to.

Therefore can I please propose a “sticky-address for destination IPs” as an 
analogue to the existing sticky-address for source IPs?
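
To make the request concrete, here is a minimal Python sketch of the 
difference. It is only a model, not PF code: the class StickyPool, its two 
modes and the gateway names are all hypothetical, and keying the new mode on 
the (source, destination) pair is my assumption about how a destination-aware 
sticky-address could behave.

```python
import random


class StickyPool:
    """Toy model of a gateway pool with sticky selection (not PF internals)."""

    def __init__(self, gateways, mode="src", seed=None):
        if mode not in ("src", "src-dst"):
            raise ValueError("mode must be 'src' or 'src-dst'")
        self.gateways = list(gateways)
        self.mode = mode              # "src" models today's sticky-address
        self.rng = random.Random(seed)
        self.table = {}               # sticky key -> chosen gateway

    def select(self, src, dst):
        # Today: one entry per source IP, so a user is pinned to one transit.
        # Proposed (assumed): one entry per (source, destination) pair.
        key = src if self.mode == "src" else (src, dst)
        if key not in self.table:
            self.table[key] = self.rng.choice(self.gateways)
        return self.table[key]


pool = StickyPool(["isp1", "isp2"], mode="src-dst")
# Every socket from this user to this site leaves via the same transit,
# while the same user's traffic to other sites is balanced independently.
print(pool.select("10.0.0.5", "203.0.113.10"))
print(pool.select("10.0.0.5", "203.0.113.10"))
```

In "src-dst" mode the remote site always sees the same source IP for a given 
user, but that user is no longer pinned to one ISP for everything.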

This is now such a problem that we have to use sticky-address even on outbound 
load-balancing connections, which causes an internal user to always use the 
same ISP for _everything_. While this does stop the breakage, it does not 
result in evenly distributed balancing of traffic, as users are locked to one 
single transit for all their web browsing for the rest of the day after being 
randomly balanced once first thing in the morning, rather than all users 
balancing over all transits throughout the day.

Another pain: using the current source-IP sticky-address for outbound 
balancing makes it hard to drain transits for maintenance. For example, without 
source sticky-address balancing, you can just remove the transit from the PF 
rule, and after some time all traffic will eventually move over to the other 
transits, allowing the first to be shut down for whatever needs. But with the 
current source-IP sticky-address, that first transit will take months to drain 
in real-world situations.

Lastly, just as a nice-to-have, how feasible would a deterministic load 
balancing algorithm be? So that balancing selection is done based on the “least 
utilised” path?
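
As a rough illustration of what I mean by “least utilised”, a minimal Python 
sketch follows. The function name, the link names and the idea of comparing 
current throughput against link capacity are all assumptions for the sake of 
the example; PF does not expose anything like this today as far as I know.

```python
def least_utilised(links):
    """Return the name of the link with the lowest utilisation ratio.

    links maps a link name to (current_bps, capacity_bps); both values are
    hypothetical measurements and capacity_bps must be greater than zero.
    """
    return min(links, key=lambda name: links[name][0] / links[name][1])


links = {
    "isp1": (80_000_000, 100_000_000),   # 80% utilised
    "isp2": (30_000_000, 100_000_000),   # 30% utilised
}
print(least_utilised(links))  # isp2
```

The same inputs always give the same answer, which is what makes the 
selection deterministic rather than random.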

Thanks for your time and consideration,
Kindest regards Andy



Sent from a teeny tiny keyboard, so please excuse typos.
