> On Jul 7, 2016, at 10:36 PM, Mickey Spiegel <emspi...@us.ibm.com> wrote:
>
> -----Guru Shetty <g...@ovn.org> wrote: -----
>
>> To: Mickey Spiegel/San Jose/IBM@IBMUS
>> From: Guru Shetty <g...@ovn.org>
>> Date: 07/07/2016 09:34PM
>> Cc: ovs dev <dev@openvswitch.org>
>> Subject: Re: [ovs-dev] [PATCH 1/2] ovn-northd: Ability to loop-back
>> in a router.
>>
>> On 7 July 2016 at 21:28, Guru Shetty <g...@ovn.org> wrote:
>>
>>> On 7 July 2016 at 20:30, Mickey Spiegel <emspi...@us.ibm.com> wrote:
>>> To: dev@openvswitch.org
>>> From: Gurucharan Shetty
>>> Sent by: "dev"
>>> Date: 07/05/2016 11:15AM
>>> Subject: [ovs-dev] [PATCH 1/2] ovn-northd: Ability to loop-back in a router.
>>>
>>> Currently, when a client looks at a load balancer VIP,
>>> it notices that it is in a different subnet than itself
>>> and sends the packet to its connected router port's
>>> MAC address. The load balancer intercepts it.
>>>
>>> If the load balancer VIP translates to an endpoint IP in a
>>> different subnet (than the one client has), then the
>>> load balancing works fine because the router will send
>>> the packet to the correct destination.
>>>
>>> But if one of the endpoints that VIP translated into
>>> was in the same subnet as the client, the OVN router
>>> fails to send the packet back via the same interface.
>>
>> So the load balancer is translating the destination IP,
>> but leaving the MAC address unchanged?
>> Based on the MAC address, the packet is forwarded to
>> the router patch port?
>>
>> Yes. This does look like a common behavior. At least, the default
>> Kubernetes load balancers (or any iptables based load-balancers) seem
>> to do that.
>
> This does not seem clean. I still wonder whether it would make
> more sense to start over on a separate logical switch for the load
> balancer, leading to a different patch port into the logical router.

I feel right now that it complicates the topology for not a lot of
useful benefits.
>
>> --snip...
>>
>> I am concerned about two aspects of this proposal:
>> 1. It applies to all traffic to directly connected subnets, not just
>> for load balancer traffic. That is a significant change in behavior.
>>
>> Agreed. (Having said that, some Physical routers seem to do the same
>> thing. i.e. have the capability to send back the traffic. I am not
>> sure whether all Physical routers are capable of doing it.)
>
> A quick search told me that one of the major router vendors allowed
> that 8 or 9 years ago. Not sure if they allow it now.
>
> Their firewalls do not allow it by default, but have a configuration knob.
>
>> 2. It is removing the inport early on in the router ingress pipeline,
>> which scares me and seems like it will make debugging difficult.
>> You could narrow it down quite a bit by matching on inport, but
>> that still leaves the behavior that concerns me for some traffic.
>> Looking at my design for NAT in a distributed router, removing
>> the inport would break it. I suspect there might be other
>> future features that might act on inport, such as RPF.
>>
>> This is only true when the destination IP address is in the same
>> subnet as the router port. For other cases, inport is available. Do
>> you also need to send back traffic? I guess what I am getting at is,
>> why do you think this will hurt other features which won't loop-back?
>
> This is not about loopback. It is about the mechanism that you chose
> to achieve your goal, zeroing out the inport very early in the router
> ingress pipeline. Other lookups later in the router ingress pipeline
> may need to have the inport available for match conditions. For the
> NAT design that I am working on, I want to match on the router
> gateway address (SNAT) and inport == gateway port, together. For
> RPF, it could be any router port.
>
>> Looks like my patch does it for every router port in that router.
>> That is clearly wrong and was not my intention.
>> If I limit it to only the port which has that subnet, would that
>> satisfy your concern?
>
> No. That is what I mentioned above by "narrow it down quite a bit
> by matching on inport". You would still be zeroing out the inport
> in some cases, which may affect later pipeline stages that want to
> match on inport.
>
> Once you put this change in, in what cases are you still precluding
> inport == outport?
> Only when the dest IP matches a default, static or dynamic route
> rather than a connected subnet.
> Does the inport == outport check still have any significant value
> once you do that?
> I would argue not much. The simplest solution in that case would
> be to turn off the check for router datapaths, though I would still
> think it should be protected by a configuration knob of some sort.
> If you turn off the check for router datapaths, the change would be
> in physical.c for table 34, and would not affect the logical flows
> constructed by northd.

That is one way to look at it and makes sense. Let me think over this
and talk to people for more ideas.

>
> Mickey
>
>> (For cases like that, a workaround would be to store inport in a
>> register for later use?)
>>
>>> /* NAT in Gateway routers. */
>>> --
>>> 1.9.1

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev