On 03/06/08 13:17, Max Laier wrote:
On Thu, 6.03.2008, 09:36, Attila Nagy wrote:
Hello,

I've just upgraded some of our 6-STABLE servers to 7-STABLE and noticed
that pf reply-to for directly connected IPs seems to be broken.

I have the following relevant rule in pf.conf:
pass in on $ext_if reply-to ( $ext_if csmvip ) proto tcp from any to any
port 25 label "mxtraffic-tcp" keep state

which routes incoming SMTP connections (to be exact, the replies to
them) to the csmvip host, which is a load balancer. This is needed
because the LB doesn't do source NAT (it does destination NAT however to
direct traffic addressed to its virtual IP to the real servers' IPs),
and the servers have a different default route than the LB. This way the
servers reply to the LB, so it can rewrite the replies' source address
to its virtual IP, so the client will see the correct IP (the LB's
virtual IP) in the address, instead of the host's real address.

It seems that this still works in 7-STABLE for internet (not directly
connected) hosts, but not for directly connected hosts, for example
those in the same subnet as my servers.
To work around this, I've had to add static ARP entries on the servers
that map the clients' IP addresses to the hardware address of the load
balancer, but it would be better if the previous behaviour (as in
6-STABLE) could be restored.

Could anybody help to resolve this?

Might be the lack of sleep and coffee, but I can't quite figure out the
network layout you are talking about.  Could you draw up a small example
setup so I can follow?  Or at least (pseudo-)IP addresses for client,
load-balancer, pf-box and servers?
Of course, see: http://people.fsn.hu/~bra/freebsd/route-to-20080306/lb.png

10.0.0.1 is a normal router, which connects the servers to the internet (they can be reached at their real IPs through it and they can reach the internet via it). This is the default GW for the hosts. 10.0.0.2 is a load balancer, which has another address too (10.1.1.2), which is what I've called the virtual IP. We provide the service on this virtual IP, so the clients on the internet connect to it.
10.0.0.3 and 10.0.0.4 are the real servers behind the virtual IP.

The load balancer does this:
- a client (say 10.2.2.2 on the internet) connects to the load balancer's virtual IP, 10.1.1.2, for example with TCP/25
- the load balancer then takes a machine out of its configured pool for this service, for example 10.0.0.3
- to reach 10.0.0.3:25, the load balancer changes the packet's destination address from 10.1.1.2 to 10.0.0.3 and sends it out on its 10.0.0.2 interface
- the host 10.0.0.3 gets this and sees that the source address is not directly connected, so it tries to send the reply via the default GW (10.0.0.1), which is of course not good, so we have a reply-to rule, which forces these replies back to 10.0.0.2
- the load balancer gets the reply, then changes its source IP to 10.1.1.2 (the virtual IP for this service) and sends it out to the internet
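
To make the address rewriting in the steps above concrete, here is a minimal Python sketch of it. It is only a model of the behaviour described in this mail, not the load balancer's actual implementation; the Packet class, the function names and the state dictionary are made up for illustration.

```python
from dataclasses import dataclass, replace

VIP = "10.1.1.2"          # load balancer's virtual IP (from the description)
REAL_SERVER = "10.0.0.3"  # one pool member (from the description)


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model: only source and destination addresses."""
    src: str
    dst: str


def lb_inbound(pkt, state):
    """Destination NAT: rewrite VIP -> real server and remember the flow."""
    if pkt.dst != VIP:
        raise ValueError("not addressed to the virtual IP")
    out = replace(pkt, dst=REAL_SERVER)
    state[(out.src, out.dst)] = VIP  # flow key: (client, real server)
    return out


def server_reply(pkt):
    """The real server answers by swapping source and destination."""
    return Packet(src=pkt.dst, dst=pkt.src)


def lb_outbound(pkt, state):
    """Rewrite the reply's source back to the VIP, if the flow is known."""
    vip = state.get((pkt.dst, pkt.src))
    if vip is None:
        # Without a matching state entry the reply is not let through.
        raise LookupError("no state for this flow")
    return replace(pkt, src=vip)


state = {}
request = Packet(src="10.2.2.2", dst=VIP)
to_server = lb_inbound(request, state)   # dst becomes 10.0.0.3
reply = server_reply(to_server)          # src 10.0.0.3, dst 10.2.2.2
to_client = lb_outbound(reply, state)    # src becomes 10.1.1.2
print(to_server, reply, to_client)
```

The last step only happens if the reply actually passes through the load balancer, which is what the reply-to rule is there to ensure.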

We can't send the reply to 10.0.0.1 (the default GW), because it won't do the source NAT, and the load balancer would then see only one direction of the traffic and won't let it through (it has a state table, so for example a TCP session must be established through it and it has to follow the states). We also can't use the load balancer as the default GW, because it's not a router: it's pointless to route normal outgoing traffic (originated at the hosts) through it, and the load balancer wouldn't allow this anyway, because it only handles configured services and everything must be in its state table.

This part works OK with both FreeBSD 6 and 7: the servers' replies go to the load balancer.

What does not work is the scenario where 10.0.0.4 tries to connect to 10.1.1.2. It is easy to see why this is a problem: 10.0.0.4 sends the first packet through 10.0.0.1, which in turn forwards it to 10.1.1.2 (the virtual IP of the load balancer). The load balancer then changes the destination IP from 10.1.1.2 to 10.0.0.3 (I really should have drawn three or more boxes, but please assume that 10.0.0.4 won't connect to a service whose pool it is a member of, so the source and the destination are never the same machine) and sends it out through its 10.0.0.2 interface. The host 10.0.0.3 gets the packet and sees that the source is 10.0.0.4, which is directly connected to it, so it sends the answer directly, and 10.0.0.4 is left confused (it sent a request to 10.1.1.2 but got the reply from 10.0.0.3). This is the second reason the above reply-to rule is there: the machine must send the reply to 10.0.0.2, not to 10.0.0.4. The load balancer will then change the source address to 10.1.1.2 and send the reply to 10.0.0.4, so everything will be OK.
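
The next-hop decision I expect from the server can be sketched in a few lines of Python. This is only an illustration of the intended behaviour, not of how pf or the kernel routing code is implemented; the /24 netmask is an assumption (I did not state the netmask above), and next_hop is a made-up helper.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("10.0.0.0/24")  # assumed netmask
DEFAULT_GW = "10.0.0.1"      # normal router
LOAD_BALANCER = "10.0.0.2"   # target of the reply-to rule


def next_hop(dst, reply_to=None):
    """Return the address the server hands a reply for `dst` to.

    reply_to models a matching pf reply-to rule: when set, it overrides
    the ordinary routing decision, even for on-link destinations.
    """
    if reply_to is not None:
        return reply_to
    if ipaddress.ip_address(dst) in LOCAL_NET:
        return dst           # directly connected: deliver on-link
    return DEFAULT_GW        # everything else goes to the default GW


# Same-subnet client without reply-to: the reply bypasses the LB.
print(next_hop("10.0.0.4"))
# Same-subnet client with reply-to: the reply goes to the LB.
print(next_hop("10.0.0.4", reply_to=LOAD_BALANCER))
# Internet client without and with reply-to.
print(next_hop("10.2.2.2"))
print(next_hop("10.2.2.2", reply_to=LOAD_BALANCER))
```

On 7-STABLE the observed behaviour for 10.0.0.4 corresponds to the first call, while 6-STABLE behaved like the second.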

This is what works in 6.x, but not in 7.

I hope I have made this clear enough; if not, please ask.

Thanks,

--
Attila Nagy                                   e-mail: [EMAIL PROTECTED]
Free Software Network (FSN.HU)                 phone: +3630 306 6758
http://www.fsn.hu/

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
