On Mon, Oct 3, 2016 at 2:21 PM, Darrell Ball <dlu...@gmail.com> wrote:
> On Mon, Oct 3, 2016 at 10:54 AM, Han Zhou <zhou...@gmail.com> wrote:
>
> > On Sun, Oct 2, 2016 at 2:14 PM, Darrell Ball <dlu...@gmail.com> wrote:
> > >
> > > On Sun, Oct 2, 2016 at 11:27 AM, Han Zhou <zhou...@gmail.com> wrote:
> > >>
> > >> On Sat, Oct 1, 2016 at 4:34 PM, Darrell Ball <dlu...@gmail.com> wrote:
> > >> >
> > >> > Do not install any potential logical switch "router type"
> > >> > port arp responders. Logical router port arp responders
> > >> > should be sufficient in this respect.
> > >> > It seems a little weird for a logical switch not proxying
> > >> > for a remote VIF to be responding to arp requests and we
> > >> > are not functionally using this capability in ovn.
> > >> >
> > >> Hi Darrell,
> > >>
> > >> The arp responder for patch port is useful e.g. when a VM pings the
> > >> default gateway IP. Would removing the flow cause the arp request
> > >> to get flooded? And what's the benefit of removing it here?
>

I agree with Han that removing the flow would cause the ARP request to
get flooded in more cases, so there would be some performance impact.

> > >
> > > 1) Modeling: I would expect the L3 gateway arp responder to be
> > > associated with the L3 gateway router datapath, at the very least.
> > > That way, the modeling is correct and we don't have a situation
> > > where, for example, a phantom gateway router is never even
> > > downloaded to a HV, but is "responding" or rather appearing to
> > > respond to arp requests.
> > >
> >
> > Ok, I see your concern. To achieve this expectation, it may be done
> > in a way that is similar to the regular LS ports: reply ARP only if
> > Logical_Switch_Port.up = true. When gateway router is bound to a
> > chassis we can set the LS patch port up to true. And for distributed
> > routers we can set patch port up directly. This way we can avoid
> > responding ARP before gateway router is bound.
> >
>
> I think you missed the main aspect.
> There is a layering violation in doing this and also a modeling issue.
> The key idea can be summarized as "A logical router should respond to
> arps to itself" rather than some logical switch proxying that.
>

I don't think this is a layering violation. You are objecting to an ARP
response being generated by the switch on a port far away from the
actual destination port. Why is it so different whether the endpoint is
a router or a VIF? Both routers and VIFs can and do generate their own
ARP responses, but that does not rule out the optimization that responds
to ARPs immediately at the source switch port.

> This has implications for cases where an IP address is shared by
> several gateways and then the binding is used to designate the gateway
> used.
>

If there are cases where an IP address can appear on the same network
with different MAC addresses, then you have a problem. We would need to
know more about this use case.

Note that you pretty much do have a knob already to control this
behavior. If the addresses specified on the switch's "router" type port
are only ethernet addresses, then the switch will not generate any ARP
replies. If the addresses specified on the switch's "router" type port
include both ethernet and IP addresses, then the switch will generate
ARP replies for each specified ethernet/IP address combination.

The only other place where it looks like a switch port IP address is
used is for IPAM and DHCP. I did not look into this in any more detail,
so I am not sure of all the implications of leaving out the IP addresses
from the switch "router" type port.

> >
> > However, I wonder even this change may not be needed. For my
> > understanding ARP is just to resolve address. Do you see any real
> > problem of replying even if the gateway router is not yet bound? I
> > don't think this is a problem of modeling. It might look weird just
> > because it behaves slightly different from traditional view. I would
> > prefer keep the simplicity.
> >
>
> Having both logical switch and logical router arp responders for the
> same gateway router is not simpler; it is more complicated.
> I suggest having a single arp responder built by the associated
> logical router.
>
> > >
> > > 2) We install an arp responder for the logical routers, including
> > > L3 gateway(s) today (see below).
> > > We check for inport in this rule and this inport is only associated
> > > with the L3 gateway HV.
> > > So only the L3 gateway HV should respond. Meaning, if there is a
> > > response, the L3 gateway datapath is really there.
> >
> > But the L2 flooding would still happen, right?
> >
>
> Of course;
> Since a L3 gateway resides on a remote HV only, the packets need to
> traverse the network to confirm reachability and binding of that L3
> gateway.
>
> > >
> > > 3) Usually, there are a limited number of L3 gateways and therefore
> > > associated bindings.
> > > Also, for VMs participating in south<->north traffic, the bindings
> > > are less likely to time out since there are multiple uses of the L3
> > > gateway for each VM.
> > >
> >
> > With a big L2, even a small percent of VM doing ARP will cause
> > annoying flooding. Moreover, considering containers come and go
> > frequently this would be more common. So I think it is still better
> > to suppress ARP for south-north if there is no real problem.
> >
>
> I don't buy it.
>
> Today, we skip using arp responders for packets arriving on localnet
> and vtep ports, meaning the arp requests go to all VMs.
>

We should be clear why we are skipping ARP responders for packets
arriving on "localnet" and "vtep" ports. It has nothing to do with
performance. If the switch port type is such that there might be a
traditional L2 network behind it, then there are two potentially serious
problems:

1. Due to flooding of ARP requests in the traditional L2 network, it is
   possible to receive multiple copies of the same ARP request on
   different hypervisors. If they all reply to the ARP request, then the
   source of the ARP request will receive multiple replies.
2. If the ARP replies have the same MAC address from different
   attachment points to the traditional L2 network, then they can mess
   up L2 learning.

For switch port types other than "localnet", "vtep" and "l2gateway", it
seems like the switch ARP response is replying right at the source of
the ARP request. If there is no flooding of the ARP request, and no L2
learning implication, then how does the switch ARP responder cause any
problems?

For "l2gateway", I guess the current supported scenarios support only
one "l2gateway" port to each attached traditional L2 network? If that is
the case, then even in this case the ARP response would not cause any
problems.

> This would be a much more serious issue since external abuse is
> possible.
>
> This L3 gateway case is more limited and other approaches are possible
> to mitigate this.
> We discussed this internally and we are otherwise thinking to have a
> user visible configuration for arp responders in general.
>

Do you need a new knob, or would it be good enough to leave the IP
addresses out of the switch "router" type port's addresses, specifying
only the ethernet addresses, as described above?

Mickey

> If we really cannot tolerate a few containers coming and going then we
> have a serious problem that already exists for localnet and vtep cases
> as well as pure L2 forwarding decisions.
>
> > >>
> > >> Han
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
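
For illustration, the two switch-side rules discussed in this message
(skipping the ARP responder for requests arriving on "localnet" and
"vtep" ports, and replying only for ethernet/IP combinations listed on a
port) can be sketched roughly as below. This is a simplified Python
model, not OVN code. The names `arp_bindings`, `local_arp_reply` and
`SKIP_ARP_RESPONDER_INPORT_TYPES` are hypothetical, and the sketch
assumes each address entry is a string holding an ethernet address
optionally followed by IP addresses, and that a plain VIF port has an
empty type string. "l2gateway" is left out of the skip set because the
thread only speculates about it.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumption: inport types for which the thread says the switch ARP
# responder is skipped today.
SKIP_ARP_RESPONDER_INPORT_TYPES = {"localnet", "vtep"}

_MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def arp_bindings(addresses: Iterable[str]) -> List[Tuple[str, str]]:
    """Return the (ip, mac) pairs the switch could answer ARP for.

    An entry with only an ethernet address yields no pairs, which is
    the existing "knob" described in the message.
    """
    bindings = []
    for entry in addresses:
        tokens = entry.split()
        if not tokens or not _MAC_RE.match(tokens[0]):
            continue  # not "ethernet [ip ...]"; ignored in this sketch
        mac = tokens[0].lower()
        for ip in tokens[1:]:
            bindings.append((ip, mac))
    return bindings


def local_arp_reply(inport_type: str, target_ip: str,
                    addresses: Iterable[str]) -> Optional[str]:
    """Return the MAC to reply with at the source port, or None.

    None means the switch does not answer and the request is flooded.
    """
    if inport_type in SKIP_ARP_RESPONDER_INPORT_TYPES:
        return None
    for ip, mac in arp_bindings(addresses):
        if ip == target_ip:
            return mac
    return None
```

With a hypothetical router-type port address of
"00:00:00:00:01:01 192.168.1.1", a request from a VIF for 192.168.1.1
is answered locally, the same request arriving on a "localnet" port is
not, and dropping the IP from the address entry disables the reply in
both cases.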