On 12.07.2012 16:55, George Neville-Neil wrote:

On Jul 11, 2012, at 17:57 , Navdeep Parhar wrote:

On 07/11/12 14:30, g...@freebsd.org wrote:
Howdy,

Does anyone know the reason for this particular check in
ip_output.c?

        if (rte != NULL && (rte->rt_flags & (RTF_UP|RTF_HOST))) {
                /*
                 * This case can happen if the user changed the MTU
                 * of an interface after enabling IP on it.  Because
                 * most netifs don't keep track of routes pointing to
                 * them, there is no way for one to update all its
                 * routes when the MTU is changed.
                 */
                if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
                        rte->rt_rmx.rmx_mtu = ifp->if_mtu;
                mtu = rte->rt_rmx.rmx_mtu;
        } else {
                mtu = ifp->if_mtu;
        }

To my mind the > ought to be != so that any change, up or down, of the
interface MTU is eventually reflected in the route.  Also, this code
does not check whether the route is both a HOST route and UP, only
whether it is one or the other.  Don't be fooled by that: the check
fires for any route we have that is up.
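
The flag test described above can be sketched in isolation.  This is a
stand-alone illustration, not kernel code: the flag values are local
copies assumed to match net/route.h, and both helper names are
hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the route flags; values assumed to match net/route.h. */
#define RTF_UP   0x1
#define RTF_HOST 0x4

/* The test as written in the quoted code: true if EITHER bit is set. */
static bool
matches_as_written(int rt_flags)
{
	return (rt_flags & (RTF_UP | RTF_HOST)) != 0;
}

/* What a reader might expect: true only if BOTH bits are set. */
static bool
matches_up_and_host(int rt_flags)
{
	return (rt_flags & (RTF_UP | RTF_HOST)) == (RTF_UP | RTF_HOST);
}
```

A route with only RTF_UP set passes the first test but not the second,
which is the mismatch being pointed out.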

I believe rmx_mtu could be low due to some intermediate node between this host 
and the final destination.  An increase in the MTU of the local interface 
should not increase the path MTU if the limit was due to someone else along the 
route.

Yes, it turns out to be complex.  We have several places that store the
MTU: the interface, which knows the MTU of the directly connected link,
a route, and the host cache.  All three of these are used to determine
the maximum segment size (MSS) of a TCP packet.  The route and the
interface determine the largest MTU the MSS can be derived from, but if
there is an entry in the host cache, it is preferred over either of the
first two.  See tcp_update_mss() in tcp_input.c to see what I'm talking
about.

We have three sources of the MTU for TCP to choose from (sorted in
priority order):

 1. Hostcache, which holds a previously discovered value (PMTUD).

 2. Most specific route, which can be manually set when it is known that
    a lower MTU exists along that path.

 3. Interface MTU.

The third one isn't really being used because the routes inherit the MTU
from the interface.  Number 3 only becomes relevant once we no longer
store the MTU with the route unless it is manually set.
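
The priority order listed above can be sketched as a small selection
function.  This is only an illustration of the ordering: the helper name
is hypothetical, 0 is assumed to mean "no value known", and the real
logic in tcp_input.c is more involved.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the MTU that TCP would start from.
 * A value of 0 means "no value known" for the hostcache and the route.
 */
static uint32_t
pick_mtu(uint32_t hostcache_mtu, uint32_t route_mtu, uint32_t if_mtu)
{
	if (hostcache_mtu != 0)
		return hostcache_mtu;	/* 1. previously discovered path MTU */
	if (route_mtu != 0)
		return route_mtu;	/* 2. most specific route */
	return if_mtu;			/* 3. interface MTU */
}
```

With routes always inheriting the interface MTU, route_mtu is never 0,
so the third branch is effectively unreachable, matching the remark that
number 3 isn't really being used.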

I believe that the quoted code above has been wrong from the day it was
written, in that what it really says is "if the route is up" and not
"if the route is up and is a host route", which is how I believe people
read it.  If the belief is that this code is really only there for host
routes, then the proper fix is to make the sense of the first if match
that belief and, again, to change the > to != so that when the
administrator of the box bumps the MTU in either direction, the route
reflects this.  It is not possible for PMTU on a single link to a host
route to bump the number down if the interface says it's not to be
bumped.  And, even so, any host cache entry will override and avoid
this code.

The cited code is wrong in that it doesn't only test for host routes.
It is correct though that it only works one way by reducing the route
MTU to the interface MTU.  Doing an "!=" would break manual setting
of MTU on a route.
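
The difference between the two comparisons can be sketched with plain
integers.  Both function names are hypothetical and the code only models
the arithmetic, not the kernel data structures.

```c
#include <stdint.h>

/* One-way clamp, as in the quoted code: only ever lowers the route MTU. */
static uint32_t
clamp_route_mtu(uint32_t route_mtu, uint32_t if_mtu)
{
	return (route_mtu > if_mtu) ? if_mtu : route_mtu;
}

/* The proposed "!=" variant: the route MTU always follows the interface. */
static uint32_t
sync_route_mtu(uint32_t route_mtu, uint32_t if_mtu)
{
	return (route_mtu != if_mtu) ? if_mtu : route_mtu;
}
```

For a route manually set to 1400 on an interface with MTU 1500, the
clamp leaves the route at 1400, while the "!=" variant overwrites it
with 1500 and so loses the manual setting.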

IIRC this test comes from the day when we had a host route for every
inpcb and changes to the interface didn't reflect back on all those
host routes.

It can be fixed by either testing just for (rte != NULL) or by doing
away with the bogus RTF_HOST bit.  Passing an inactive route to ip_output()
isn't exactly useful and may lead to some later bogosity.
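
The first option, testing just for (rte != NULL), might look roughly
like the sketch below.  The two structs are minimal stand-ins invented
for this example, with only the MTU fields, and output_mtu() is a
hypothetical name; this is not the actual ip_output() code.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel structures; only the MTU fields. */
struct sketch_route { uint32_t rmx_mtu; };
struct sketch_ifnet { uint32_t if_mtu; };

/* Choose the MTU for output, lowering a stale route MTU if needed. */
static uint32_t
output_mtu(struct sketch_route *rte, const struct sketch_ifnet *ifp)
{
	if (rte != NULL) {
		/* One-way clamp: never raise a manually lowered route MTU. */
		if (rte->rmx_mtu > ifp->if_mtu)
			rte->rmx_mtu = ifp->if_mtu;
		return rte->rmx_mtu;
	}
	return ifp->if_mtu;
}
```

The flag test is gone entirely, on the assumption that callers never
pass an inactive route in the first place.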

--
Andre
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
