On 01/07/16 at 06:50pm, Hannes Frederic Sowa wrote:
> On 07.01.2016 18:21, Thomas Graf wrote:
> >On 01/07/16 at 08:35am, Jesse Gross wrote:
> >>On Thu, Jan 7, 2016 at 3:49 AM, Thomas Graf <tg...@suug.ch> wrote:
> >>>A simple start could be to add a new return code for > MTU drops in
> >>>the dev_queue_xmit() path and check for NET_XMIT_DROP_MTU in
> >>>ovs_vport_send() and emit proper ICMPs.
> >>
> >>That could be interesting. The problem in the past was making sure
> >>that ICMPs that are generated fit in the virtual network appropriately
> >>- right addresses, etc. This requires either spoofing addresses or
> >>some additional knowledge about the topology that we don't currently
> >>have in the kernel.
> >
> >Are you worried about emitting an ICMP with a source which is not
> >a local host address?
>
> We have uRPF enabled for IPv4 by default on all kernels. Thus if we generate
> an IPv4 ICMP packet back with an error message it must have a source address
> which the receiving kernel considers valid. Valid means that sending to the
> source address would have used the same outgoing interface the ICMP error
> came in from.

Agreed. I think this is given, though, as we would reverse the addresses
as icmp_send() already does:

    saddr = iph->daddr;

> >Can't we just use icmp_send() in the context of the inner header and
> >feed it to the flow table to send it back? It should be the same as
> >for ip_forward().
>
> The bridge's ip address often has no valid path as seen from the end host
> system receiving the icmp error, because the openvswitch is not really part
> of the L3 forwarding chain.

I don't think the IP of the bridge ever comes into play. It shouldn't.
I'm not even sure what could be considered the address of the bridge ;-)

> Faking the address from the packet (e.g. using the destination address of
> the original packet) will make traceroute go nuts.

I think you are worried about an ICMP error from a hop which does not
decrement TTL. That's a good point, and I think we should only send an
ICMP error if the TTL is decremented in the action list of the flow for
which we have seen an MTU-based drop (or TTL=0).

I don't really see a difference between ip_forward(), some sophisticated
tc action, or OVS. As soon as they decrement TTL and perform L3
forwarding, they should send out ICMP errors to allow for proper PMTU.

> Normally ethernet devices don't return icmp error messages. E.g. broken
> jumbo frame configuration just leads to silent packet loss because the
> packet is discarded before a router can handle it. Thus it would be best in
> case of local ovs installation if the error is already transported back to
> the client application via the network call stack. This might be very
> difficult in case we enqueue the packet to a backlog queue and reschedule
> softirqs. Probably we need some way of faking source addresses from bridges
> now.... :/

I think the major complication comes from the assumption that OVS is a
bridge. This is not necessarily the case, as stated above. If a flow is
doing L3 forwarding, we should send ICMPs as expected from a router.
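
To make the rule above concrete, here is a rough userspace sketch of the
decision I have in mind. It is not kernel or OVS code: struct pkt_hdr,
struct flow_result, struct icmp_reply, and both functions are hypothetical
names made up for illustration. It only models "emit an ICMP error if the
flow decremented TTL and we saw an MTU drop or TTL=0", plus the address
reversal quoted from icmp_send() above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model. None of these types or functions exist
 * in the kernel or in OVS; they only illustrate the proposed rule. */

struct pkt_hdr {
    uint32_t saddr; /* inner IPv4 source address */
    uint32_t daddr; /* inner IPv4 destination address */
    uint8_t  ttl;   /* TTL after the flow's actions have run */
};

struct flow_result {
    bool ttl_decremented; /* action list contained a TTL decrement */
    bool mtu_drop;        /* transmit failed because the packet exceeded the MTU */
};

struct icmp_reply {
    uint32_t saddr;
    uint32_t daddr;
};

/* A flow that did not decrement TTL behaved like a bridge and stays
 * silent. A flow that did decrement TTL behaved like a router and
 * reports MTU drops and expired TTLs. */
static bool should_send_icmp(const struct pkt_hdr *h,
                             const struct flow_result *r)
{
    assert(h != NULL && r != NULL);
    if (!r->ttl_decremented)
        return false;
    return r->mtu_drop || h->ttl == 0;
}

/* Fill in the reply addresses by reversing the inner header, as in the
 * "saddr = iph->daddr" line quoted above. Returns false if no ICMP
 * error should be generated. */
static bool build_icmp_reply(const struct pkt_hdr *h,
                             const struct flow_result *r,
                             struct icmp_reply *out)
{
    assert(out != NULL);
    if (!should_send_icmp(h, r))
        return false;
    out->saddr = h->daddr;
    out->daddr = h->saddr;
    return true;
}
```

With this model, a flow that only switches at L2 never produces an ICMP
error even on an MTU drop, while a flow that decrements TTL replies with
the original destination as source. Whether that source passes uRPF on
the receiver still depends on the topology, which the sketch does not
model.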

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev