On Thu, Mar 28, 2013 at 5:41 AM, Rajahalme, Jarno (NSN - FI/Espoo)
<jarno.rajaha...@nsn.com> wrote:
>
> On Mar 27, 2013, at 1:47 , ext Jesse Gross wrote:
>
>> On Mon, Mar 25, 2013 at 12:03 PM, Jarno Rajahalme
>> <jarno.rajaha...@nsn.com> wrote:
>>> Changes the default tunnel dont_fragment from "true" (don't
>>> fragment) to "false" (allow fragmentation).  Tunnel outer headers
>>> will not have the DF bit set by default, and if the "df=true" option is
>>> given for a tunnel, local fragmentation will also be disabled.
>>> The name of the option is changed from "df_default" to "df" to be in
>>> line with the rest of the tunneling code.
>>>
>>> Signed-off-by: Jarno Rajahalme <jarno.rajaha...@nsn.com>
>>
>> I can see the desire to make these two settings consistent, although
>> it really seems preferable to me to have DF on in most situations to
>> avoid possible repeated fragmentation.  I also don't know that there's
>> much benefit to turning local_df off since the alternative is to
>> simply drop the packet (it will also generate an ICMP message but in
>> the case of tunnels, the sender will never get it).
>
> I have no need to insist, but it seems to me that DF should be used (only) 
> when doing path MTU discovery, i.e., you are prepared to receive the 
> associated ICMP messages and decrease your message size accordingly. Since 
> OVS no longer does that for tunnels, maybe DF use should also be retired. 
> Relating to this, I'd think most implementations fragment only as the last 
> resort to avoid dropping packets. So, by setting DF and not doing PMTUD we 
> are essentially saying that it is OK to drop the tunneled packets if they 
> don't happen to fit the MTU of a link somewhere down the path.

There is actually still some path MTU discovery taking place in the
local IP stack.  If we need to fragment a packet locally and then a
downstream router needs to fragment the packet, it will send us an
ICMP message.  We will then adjust our local fragmentation size and
not continue to drop packets.
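
To make the adjustment concrete, here is a toy model of how a sender's
path MTU estimate shrinks when an ICMP "fragmentation needed" message
arrives. This is only a sketch and not the Linux IP stack's code: the
class and method names are hypothetical, and the 68-byte floor is the
IPv4 minimum from RFC 1191 (Linux applies its own configurable minimum).

```python
class PathMtuEstimate:
    """Toy model of a per-destination path MTU estimate (hypothetical names)."""

    MIN_IPV4_MTU = 68  # RFC 1191: never reduce the estimate below 68 bytes

    def __init__(self, link_mtu):
        # Start from the MTU of the local outgoing link.
        self.mtu = link_mtu

    def on_frag_needed(self, next_hop_mtu):
        """Handle an ICMP 'fragmentation needed' report from a router."""
        # Only ever lower the estimate; a larger reported value is ignored.
        if next_hop_mtu < self.mtu:
            self.mtu = max(next_hop_mtu, self.MIN_IPV4_MTU)
        return self.mtu


estimate = PathMtuEstimate(link_mtu=1500)
estimate.on_frag_needed(1400)  # later local fragments are sized for 1400
```

Once the estimate drops, local fragmentation uses the smaller size, so
packets stop being dropped at the downstream router.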

> So, if we retire the use of DF by default, we can drop the local_df (read as: 
> "local_do_fragment" :-) setting and let the tunnel config choose between 
> "OK to fragment" and "DO NOT fragment" for the whole path, including the 
> local stack.

It is nice to have the setting unified but I think the reasons that
people might want to change those settings are somewhat different.
For the DF bit, it's usually because there might be firewalls dropping
ICMP messages, creating a black hole for large packets.  local_df
doesn't have this problem so it's really only to avoid burning CPU to
do fragmentation.

> One strategy to avoid unnecessary fragmentation would be to not use the 
> maximum segment size when you must fragment. For example, if you have a 
> 1600-byte IP packet to transmit over a link with an MTU of 1500 bytes, it 
> would be better to fragment it into 812-byte and 808-byte packets than 
> into, say, 1500-byte and 120-byte packets. That way the risk of further 
> fragmentation (e.g., due to yet another layer of tunneling) would be smaller.

That is true, although since we don't implement IP fragmentation
ourselves it would need to be a change to the Linux IP stack.
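
For reference, the arithmetic behind the 1600-byte example quoted above
can be sketched as follows. This is an illustration only, not code from
OVS or the Linux IP stack: the function name is hypothetical, and it
assumes a 20-byte IPv4 header with no options. Every fragment except the
last must carry a payload that is a multiple of 8 bytes.

```python
IPV4_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment_sizes(total_len, mtu, balanced=False, header_len=IPV4_HEADER_LEN):
    """Return the on-wire size of each fragment of a total_len-byte packet."""
    if total_len <= mtu:
        return [total_len]  # fits as is, no fragmentation needed

    payload = total_len - header_len
    # Largest payload per fragment, rounded down to a multiple of 8.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")

    if balanced:
        # Use the same fragment count as the greedy split, but spread the
        # payload evenly, rounding each chunk up to a multiple of 8.
        count = -(-payload // max_chunk)           # ceiling division
        chunk = -(-payload // (count * 8)) * 8
    else:
        chunk = max_chunk                          # greedy: fill each fragment

    sizes = []
    remaining = payload
    while remaining > 0:
        part = min(chunk, remaining)
        sizes.append(part + header_len)
        remaining -= part
    return sizes


print(fragment_sizes(1600, 1500))                 # [1500, 120]
print(fragment_sizes(1600, 1500, balanced=True))  # [812, 808]
```

The balanced split leaves both fragments well under 1500 bytes, so
another layer of encapsulation is less likely to fragment them again.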

> Finally, the option name "df_default" seems like a remnant from the time we 
> had the "df_inherit" option. IMO that should be fixed regardless.

It is leftover and I agree that it's no longer the best name.
However, we should avoid breaking compatibility unnecessarily.  I
suppose it depends on how many people are using it.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
