Ahhhhh, your diagram makes perfect sense now :) Thank you. So it does not have to undergo a full rehashing of all links (which breaks _lots_ of sessions when NAT is involved), but it also does not have to explicitly track anything in memory, like you say. So better than full re-hashing and cheaper than tracking.
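For anyone following along, here is a rough sketch of why the hash-threshold split disturbs fewer sessions than a simple modulo when a link drops. It is not OpenBSD's code: the 16-bit key space, the CRC32-based `flow_key` and all function names are assumptions made up for illustration. The only things taken from the thread are that the key covers the src/dst pair (no ports) and that the key space is cut into equal contiguous regions, one per nexthop.

```python
import zlib
from typing import Callable, Iterable, Sequence

KEY_SPACE = 1 << 16  # assumed 16-bit key space, illustration only


def flow_key(src_ip: str, dst_ip: str) -> int:
    """Hypothetical flow key: hash of the src/dst addresses only, no ports."""
    return zlib.crc32(f"{src_ip}>{dst_ip}".encode()) & (KEY_SPACE - 1)


def hash_threshold(key: int, nexthops: Sequence[str]) -> str:
    """Split the key space into equal contiguous regions, one per nexthop."""
    return nexthops[key * len(nexthops) // KEY_SPACE]


def modulo(key: int, nexthops: Sequence[str]) -> str:
    """Naive bucket selection, shown for comparison."""
    return nexthops[key % len(nexthops)]


def moved_fraction(
    select: Callable[[int, Sequence[str]], str],
    before: Sequence[str],
    after: Sequence[str],
    keys: Iterable[int],
) -> float:
    """Fraction of keys whose nexthop differs between the two link sets."""
    keys = list(keys)
    moved = sum(1 for k in keys if select(k, before) != select(k, after))
    return moved / len(keys)


BEFORE = ["link1", "link2", "link3", "link4"]
AFTER = ["link1", "link3", "link4"]  # link 2 has died

if __name__ == "__main__":
    all_keys = range(KEY_SPACE)
    print("hash-threshold:", moved_fraction(hash_threshold, BEFORE, AFTER, all_keys))
    print("modulo:        ", moved_fraction(modulo, BEFORE, AFTER, all_keys))
    # The same src/dst pair always lands on the same nexthop.
    print(hash_threshold(flow_key("192.0.2.10", "198.51.100.7"), BEFORE))
```

Over the whole key space, roughly a third of the keys change nexthop with the threshold split (the quarter that sat on link 2 plus the slice of link 3 that shifts to link 4), versus three quarters with the modulo.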
PS: Thank you for confirming: "It therefore routes the same src/dst pair over the same nexthop as long as there are no changes to the route". I was getting hung up on the bit in the RFC that says "hash over the packet header fields that identify a flow", so I was imagining the hashing was using a lot of entropy, including the ports. I guess I should have thought about that more and read it as "hash over the IP packet header fields that identify a flow" ;)

I shall go and experiment :)

On Wed, Sep 29, 2021 at 8:45 PM Claudio Jeker <cje...@diehard.n-r-g.com> wrote:

> On Wed, Sep 29, 2021 at 08:07:43PM +1000, Andrew Lemin wrote:
> > Hi Claudio,
> >
> > So you probably guessed I am using
> > 'route-to { GW1, GW2, GW3, GW4 } random' (and was wanting to add
> > 'sticky-address' to this) based on your reply :)
> >
> > "it will make sure that selected default routes are sticky to
> > source/dest pairs" - Are you saying that even though multipath routing
> > uses hashing to select the path (https://www.ietf.org/rfc/rfc2992.txt -
> > "The router first selects a key by performing a hash (e.g., CRC16) over
> > the packet header fields that identify a flow."), subsequent new
> > sessions to the same dest IP with different source ports will still get
> > the same path? I thought a new session with a new tuple to the same
> > dest IP would get a different hashed path with multipath?
>
> OpenBSD multipath routing implements gateway selection by Hash-Threshold
> from RFC 2992. It therefore routes the same src/dst pair over the same
> nexthop as long as there are no changes to the route. If one of your
> links drops then some sessions will move links but the goal of
> hash-threshold is to minimize the affected sessions.
>
> > "On rerouting the multipath code reshuffles the selected routes in a
> > way to minimize the affected sessions." - Are you saying, in the case
> > where one path goes down, it will migrate all the entries only for that
> > failed path onto the remaining good paths (like ecmp-fast-reroute ?)
>
> No, some sessions on good paths may also migrate to other links, this is
> how the hash-threshold algorithm works.
>
> Split with 4 nexthops, now let's assume link 2 dies and stuff gets
> reshuffled:
> +=================+=================+=================+=================+
> |     link 1      |     link 2      |     link 3      |     link 4      |
> +=================+=====+===========+===========+=====+=================+
> |        link 1         |        link 3         |        link 4         |
> +=======================================================================+
> Unaffected sessions for drop
>  ^^^^^^^^^^^^^^^^^                   ^^^^^^^^^^^       ^^^^^^^^^^^^^^^^^
> Affected sessions because of drop
>                    #################             #####
> Using other ways to split the hash into buckets (e.g. a simple modulo)
> causes more change.
>
> Btw. using route-to with 4 gw will not detect a link failure and 25% of
> your traffic will be dropped. This is another advantage of multipath
> routing.
>
> Cheers
> --
> :wq Claudio
>
> > Thanks for your time, Andy.
> >
> > On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker <cje...@diehard.n-r-g.com>
> > wrote:
> >
> > > On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > > > I see this question died on its arse! :)
> > > >
> > > > This is still an issue for outbound load-balancing over multiple
> > > > internet links.
> > > >
> > > > PF's 'sticky-address' parameter only works on source IPs (because
> > > > it was originally designed for use when hosting your own server
> > > > pools - inbound load balancing).
> > > > I.e. there is no way to configure 'sticky-address' to consider
> > > > destination IPs for outbound load balancing, so that all subsequent
> > > > outbound connections to the same target IP originate from the same
> > > > internet connection.
> > > >
> > > > The reason why this is desirable is because an increasing number of
> > > > websites use single sign on mechanisms (quite a few different
> > > > architectures expose the issue described here). After a user's
> > > > outbound connection is initially randomly load balanced onto an
> > > > internet connection, their browser is redirected into opening
> > > > multiple additional sockets towards the website's load balancers /
> > > > cloud gateways, which redirect the connections to different
> > > > internal servers for different parts of the site/page, and the SSO
> > > > authentication/cookies passed on the additional sockets must
> > > > originate from the same IP as the original socket. As a result
> > > > outbound load-balancing does not work for these sites.
> > > >
> > > > The ideal functionality would be for 'sticky-address' to consider
> > > > both source IP and destination IP after initially being load
> > > > balanced by round-robin or random.
> > >
> > > Just use multipath routing, it will make sure that selected default
> > > routes are sticky to source/dest pairs. You may want the states to be
> > > interface bound if you need to nat-to on those links.
> > >
> > > On rerouting the multipath code reshuffles the selected routes in a
> > > way to minimize the affected sessions. All this is done without any
> > > extra memory usage since the hashing function is smart.
> > >
> > > --
> > > :wq Claudio
> > >
> > > > Thanks again, Andy.
> > > >
> > > > On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin <andrew.le...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi smart people :)
> > > > >
> > > > > The current implementation of 'sticky-address' relates only to a
> > > > > sticky source IP.
> > > > > https://www.openbsd.org/faq/pf/pools.html
> > > > >
> > > > > This is used for inbound server load balancing, by ensuring that
> > > > > all socket connections from the same client/user/IP on the
> > > > > internet go to the same server on your local server pool.
> > > > >
> > > > > This works great for ensuring simplified memory management of
> > > > > session artefacts on the application being hosted (the servers do
> > > > > not have to synchronise the user's session data as extra sockets
> > > > > from that user will always connect to the same local server).
> > > > >
> > > > > However sticky-address does not have an equivalent for sticky
> > > > > destination IPs. For example when doing outbound load balancing
> > > > > over multiple ISP links, every single socket is load balanced
> > > > > randomly. This causes many websites to break (especially cookie
> > > > > login and single-sign-on style enterprise services), as the first
> > > > > outbound socket will originate randomly from one of the local ISP
> > > > > IPs, and the user's login session/SSO (on the server side) will
> > > > > belong to that first random IP.
> > > > >
> > > > > When the user then browses to or uses another part of that same
> > > > > website which requires additional sockets, the additional sockets
> > > > > will pass the SSO credentials from the first socket, but the
> > > > > extra socket connection will again be randomly load-balanced, and
> > > > > so the remote server will reject the connection as it is
> > > > > originating from the wrong source IP etc.
> > > > >
> > > > > Therefore can I please propose a 'sticky-address for destination
> > > > > IPs' as an analogue to the existing sticky-address for source
> > > > > IPs?
> > > > >
> > > > > This is now such a problem that we have to use sticky-address
> > > > > even on outbound load-balancing connections, which causes
> > > > > internal user1 to always use the same ISP for _everything_ etc.
> > > > > While this does stop the breakage, it does not result in evenly
> > > > > distributed balancing of traffic, as users are locked to one
> > > > > single transit for all their web browsing for the rest of the day
> > > > > after being randomly balanced once first thing in the morning,
> > > > > rather than all users balancing over all transits throughout the
> > > > > day.
> > > > >
> > > > > Another pain: using the current source-ip sticky-address for
> > > > > outbound balancing makes it hard to drain transits for
> > > > > maintenance. For example without source sticky-address balancing,
> > > > > you can just remove the transit from the PF rule, and after some
> > > > > time, all traffic will eventually move over to the other
> > > > > transits, allowing the first to be shut down for whatever needs.
> > > > > But with the current source-ip sticky-address, that first transit
> > > > > will take months to drain in a real-world situation.
> > > > >
> > > > > Lastly, just as a nice-to-have, how feasible would a
> > > > > deterministic load balancing algorithm be? So that balancing
> > > > > selection is done based on the 'least utilised' path?
> > > > >
> > > > > Thanks for your time and consideration,
> > > > > Kindest regards Andy
> > > > >
> > > > > Sent from a teeny tiny keyboard, so please excuse typos.