On Thu, Mar 28, 2013 at 5:41 AM, Rajahalme, Jarno (NSN - FI/Espoo)
<jarno.rajaha...@nsn.com> wrote:
>
> On Mar 27, 2013, at 1:47 , ext Jesse Gross wrote:
>
>> On Mon, Mar 25, 2013 at 12:03 PM, Jarno Rajahalme
>> <jarno.rajaha...@nsn.com> wrote:
>>> Changes the default tunnel dont_fragment from "true" (don't
>>> fragment) to "false" (allow fragmentation). Tunnel outer headers
>>> will not have the DF bit set by default, and if "df=true" option is
>>> given for a tunnel, also local fragmentation will be disabled.
>>> The name of the option is changed from "df_default" to "df" to be in
>>> line with the rest of the tunneling code.
>>>
>>> Signed-off-by: Jarno Rajahalme <jarno.rajaha...@nsn.com>
>>
>> I can see the desire to make these two settings consistent, although
>> it really seems preferable to me to have DF on in most situations to
>> avoid possible repeated fragmentation. I also don't know that there's
>> much benefit to turning local_df off since the alternative is to
>> simply drop the packet (it will also generate an ICMP message but in
>> the case of tunnels, the sender will never get it).
>
> I have no need to insist, but it seems to me that DF should be used (only)
> when doing path MTU discovery, i.e., you are prepared to receive the
> associated ICMP messages and decrease your message size accordingly. Since
> OVS no longer does that for tunnels, maybe DF use should also be retired.
> Relating to this, I'd think most implementations fragment only as the last
> resort to avoid dropping packets. So, by setting DF and not doing PMTUD we
> are essentially saying that it is OK to drop the tunneled packets if they
> don't happen fit to the MTU on a link somewhere down the path.

There actually still is some path MTU discovery taking place to the
local IP stack. If we need to fragment a packet locally and then a
downstream router needs to fragment the packet, it will send us an
ICMP message. We will then adjust our local fragmentation size and
not continue to drop packets.

> So, if we retire the use of DF by default, we can drop the local_df (read as:
> "local_do_fragment" :-) setting and let the tunnel config to choose between
> "OK to fragment" and "DO NOT fragment" for the whole path, including the
> local stack.

It is nice to have the setting unified but I think the reasons that
people might want to change those settings are somewhat different.
For the DF bit, it's usually because there might be firewalls dropping
ICMP messages creating a blackhole for large packets. local_df
doesn't have this problem so it's really only to avoid burning CPU to
do fragmentation.

> One strategy to avoid unnecessary fragmentation would be to not use the
> maximum segment size when you must fragment. For example, if you have 1600
> byte IP packet to transmit over a link with MTU of 1500 bytes, it would be
> better to fragment it to 812 and 808 byte packets, than, say 1500 byte and
> 120 byte packets. That way the risk for further fragmentation (e.g., due to
> yet another layer of tunneling) would be smaller.

That is true, although since we don't implement IP fragmentation
ourselves it would need to be a change to the Linux IP stack.

> Finally, the option name "df_default" seems like a remnant from the time we
> had the "df_inherit" option. IMO that should be fixed regardless.

It is leftover and I agree that it's no longer the best name.
However, we should avoid breaking compatibility unnecessarily. I
suppose it depends on how many people are using it.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
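
For reference, the 812/808 versus 1500/120 figures quoted in the thread can be reproduced with a small sketch. It assumes a 20-byte IPv4 header without options and the IPv4 rule that every fragment except the last carries a payload that is a multiple of 8 bytes. The function names are hypothetical and this is not code from OVS or the Linux IP stack, only an illustration of the arithmetic.

```python
IP_HEADER = 20  # assumption: IPv4 header with no options


def _split(payload, share, header):
    """Cut `payload` bytes into fragments carrying at most `share` bytes each."""
    sizes = []
    while payload > share:
        sizes.append(header + share)
        payload -= share
    sizes.append(header + payload)
    return sizes


def fill_to_mtu_split(packet_len, mtu, header=IP_HEADER):
    """Fragment so that each fragment except the last is as large as the MTU allows."""
    # Payload of a non-final fragment must be a multiple of 8 bytes.
    max_share = (mtu - header) // 8 * 8
    return _split(packet_len - header, max_share, header)


def balanced_split(packet_len, mtu, header=IP_HEADER):
    """Fragment into the same number of pieces, but of roughly equal size."""
    payload = packet_len - header
    max_share = (mtu - header) // 8 * 8
    count = max(1, -(-payload // max_share))  # ceiling division
    share = -(-payload // count)              # even share, rounded up
    share = -(-share // 8) * 8                # then rounded up to a multiple of 8
    return _split(payload, share, header)


print(fill_to_mtu_split(1600, 1500))  # [1500, 120]
print(balanced_split(1600, 1500))     # [812, 808]
```

With the balanced split, both fragments leave several hundred bytes of headroom, so an additional encapsulation layer further down the path is less likely to force a second round of fragmentation.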