Julian Elischer wrote:
An interface may, however, be present in entries from multiple FIBs,
in which case the INCOMING packets on that interface need to
be disambiguated with respect to which FIB they belong to.

Yes, there is no way the forwarding code alone can do this.

It should not be expected to, and it's important to maintain a clean functional separation there; otherwise one ends up in the same quagmire that has plagued a lot of QoS research projects over the years ("Where do I put this bit of the system?").


This is a job for an outside entity (outside the FIBs, that is).
In this case a packet classifier such as pf or ipfw is ideal
for the job, providing an external mechanism for implementing
whatever policy the admin wants to set up.

Absolutely. This has been the intent from the beginning.

There is no "one size fits all" approach here. We could put a packet classifier into the kernel which works just fine for DOCSIS consumer distribution networks, but has absolutely no relevance to an ATM backbone (these are the two main flavours of access for folk in the UK).


I find it convenient to envision each routing FIB as a routing
plane, in a stack of such planes. Each plane may know about the same
interfaces or different interfaces. When a packet enters a routing
plane it is routed according to the internal rules of that plane,
irrespective of how other planes may act. Each plane can only route
a packet to interfaces that are known on that plane.
Incoming packets on an interface don't know what plane to go to
and must be told which to use by the external mechanism. It
IS possible that an interface might in the future have a default
plane, but I haven't implemented this.

This limitation seems fine for now.

Users can't be expected to configure the defaults "by default" if they aren't supported. So if, overall, the VRF-like feature defaults to off, and there are big flashing bold letters saying "You must fully configure the forwarding plane mappings if you wish to use multiple FIBs", then that's fine by me.
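
To pin down the semantics of the routing-plane picture above, here's a rough sketch. The structure and names are mine, not the patch's; the only point being made is that a lookup is confined to a single plane, with no fallback to the others:

#include <sys/param.h>
#include <sys/socket.h>
#include <net/radix.h>
#include <net/route.h>

#define NUM_PLANES      4       /* assumed number of planes/FIBs for this sketch */

/* Each plane is an independent radix tree; planes never consult each other. */
static struct radix_node_head *planes[NUM_PLANES];

static struct rtentry *
plane_lookup(u_int plane, struct sockaddr *dst)
{
        struct radix_node_head *rnh = planes[plane];
        struct radix_node *rn;

        if (rnh == NULL)
                return (NULL);
        /* A miss in this plane is simply a miss; other planes are not tried. */
        rn = rnh->rnh_matchaddr(dst, rnh);
        return ((struct rtentry *)rn);
}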


If you have several alias addresses on an interface, it is possible
that some FIBs know about some of them and others know about other
addresses. When a new address is added, it is added to each FIB, and
whatever is adding it should remove it from the FIBs that don't
need it. This may change, but it fits in with how the current code
works and keeps the diff to a manageable size.
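
Just so we're talking about the same thing, the behaviour described above amounts to roughly the following. The loop bound and the helpers are stand-ins of mine, sketched purely to show the add-everywhere-then-prune model:

#include <sys/param.h>

struct ifaddr;                          /* opaque for this sketch */

#define NUM_FIBS        4               /* assumed FIB count */

/* Hypothetical stand-ins for the real per-FIB route add/delete paths. */
int fib_add_ifaddr_route(u_int fib, struct ifaddr *ifa);
int fib_del_ifaddr_route(u_int fib, struct ifaddr *ifa);

/* When a new address is configured, its route goes into every FIB... */
static void
address_added(struct ifaddr *ifa)
{
        for (u_int fib = 0; fib < NUM_FIBS; fib++)
                (void)fib_add_ifaddr_route(fib, ifa);
}

/*
 * ...and whatever entity manages policy (e.g. a route manager daemon)
 * prunes it from the FIBs that shouldn't know about it.
 */
static void
policy_prune(struct ifaddr *ifa, u_int keep_fib)
{
        for (u_int fib = 0; fib < NUM_FIBS; fib++)
                if (fib != keep_fib)
                        (void)fib_del_ifaddr_route(fib, ifa);
}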

In any event, for plain old IP forwarding, a node's endpoint addresses are used only as convenient ways of referring to physical links.

To back up and give this some detailed background:

For example, 192.0.2.1/24 might be configured on fxp0, and we receive a packet on another interface for 192.0.2.2. When resolving a route, the forwarding code needs to do a lookup to see where 192.0.2.2 is reachable from before the next hop is resolved in the table. With the patches applied, that lookup happens on a per-FIB basis; however, tagging input with the FIB it belongs to is the classifier's job.
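
In code terms, the consumer side of that tag looks something like the sketch below: the forwarding path reads the FIB number the classifier left on the packet and scopes its lookup accordingly. The names here only approximate what the patches add; treat the exact identifiers and signatures as assumptions:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <net/route.h>
#include <netinet/in.h>

/*
 * Sketch of the forwarding-side lookup for the 192.0.2.2 example above.
 * The FIB number comes from the classifier's tag, never from guesswork.
 */
static struct rtentry *
lookup_next_hop(struct mbuf *m, struct in_addr dst)
{
        struct route ro;
        struct sockaddr_in *sin;

        bzero(&ro, sizeof(ro));
        sin = (struct sockaddr_in *)&ro.ro_dst;
        sin->sin_family = AF_INET;
        sin->sin_len = sizeof(*sin);
        sin->sin_addr = dst;                    /* e.g. 192.0.2.2 */

        /* Per-FIB lookup, scoped by the tag the classifier attached. */
        rtalloc_ign_fib(&ro, 0, M_GETFIB(m));
        return (ro.ro_rt);                      /* NULL if unreachable in this FIB */
}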

The problems with the above approach begin when an input interface resides in multiple virtual FIBs (no 1:1 mapping), or when you can't refer to it by an address (it has no address -- unnumbered point-to-point link, or addresses do not apply), or when you attempt to implement encapsulation (e.g. GRE, IPIP) in the forwarding layer.

Then you're reliant on each individual FIB having resolved next-hops correctly. The existing forwarding code already does some of this by forcing the ifp to be set for any route added to the table; this is done implicitly for routes which transit point-to-point interfaces. BSD has had some weaknesses in this area, and they make implementing things like VRRP particularly difficult, which is why the ifnet approach to CARP was used (the forwarding table gets to see a single ifp): it eliminates a level of possible recursion from that layer of the routing stack.

With multicast, for example, next-hops can't be identified by IPv4 addresses alone. Every forwarding decision potentially has more than one result, next-hops are referred to by physical link (this could be an ifp, an interface index, a name, whatever), and where messages are forwarded is determined using a link-scope protocol such as IGMP.

There, it's reasonable to expect that the user has partitioned off the multicast forwarding planes into separate virtual FIBs and configured the appropriate rules in the classifier.

For SSM, the key (S,G) match has to happen in the input classifier if flows are to be routed correctly using the multiple-FIB feature. The multicast routing daemons have to be aware of it, because you can't run a separate instance of PIM for every set of flows: PIM is greedy per-link, the non-1:1 mapping problem exists, and PIM has no way of telling separate instances apart (there is no hierarchy in the form of, say, OSPF areas, and even OSPF won't let you put a link in more than one area; virtual links don't count!).
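
Purely to illustrate the shape of that decision, here's a hypothetical (S,G)-to-FIB classification step. Nothing below corresponds to existing MROUTING code; the table and names are made up:

#include <sys/types.h>
#include <netinet/in.h>

/* Hypothetical (S,G) -> FIB mapping entry, provisioned by the admin. */
struct sg_rule {
        struct in_addr  sg_source;      /* S */
        struct in_addr  sg_group;       /* G */
        u_int           sg_fib;         /* the plane this flow belongs to */
};

/*
 * Return the FIB for an SSM flow, or default_fib if no rule matches.
 * A real classifier would use a hash or radix structure, not a scan.
 */
static u_int
classify_ssm(const struct sg_rule *rules, size_t nrules,
    struct in_addr src, struct in_addr grp, u_int default_fib)
{
        for (size_t i = 0; i < nrules; i++) {
                if (rules[i].sg_source.s_addr == src.s_addr &&
                    rules[i].sg_group.s_addr == grp.s_addr)
                        return (rules[i].sg_fib);
        }
        return (default_fib);
}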

This is so much whizzing in the wind without a new MROUTING implementation, though, and hierarchical multicast routing is a project in and of itself.

To summarize:
For now, the limitations of the system should be documented so that users don't inadvertently configure local forwarding loops, even for unicast traffic; with multicast, the amplification effect of misconfiguration is inherently more damaging to a network.

The IPv4 address of an interface can't be used as an identifier for source routing: there is no way of knowing that it was the next hop used by the last hop, because the information just isn't there. So if you have the same input interfaces in multiple virtual FIBs, you need to double-check that the appropriate match rules are in place for the flows to go where you want them to go.

(And it suits what I need for work, where a route manager daemon
knows to do this.)

This is another reason why I maintain that RIB and FIB should have functional separation.

It's unreasonable to expect the kernel to perform next-hop resolution on every route presented to it, beyond that which is required by the link layer (i.e. ARP, and that should be functionally separated too). Recursive resolution also demands stack space, and this is a scarce kernel resource.

Of course, well-behaved routers are engineered such that the recursion takes place at RIB level, where limits and policy can be more easily applied, and before the route is plumbed into the hardware TCAM (or software FIB). Don't try to make the kernel do your dirty laundry.
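
As a sketch of what "recursion at RIB level" means in practice (all names below are made up; the point is only that the loop, the depth limit, and the policy live in the RIB, and the FIB only ever receives fully resolved next hops):

#include <stdbool.h>

#define MAX_RECURSION   8       /* assumed policy limit, applied in the RIB */

/* Hypothetical RIB-side view of a route. */
struct rib_route {
        bool                    rr_connected;   /* directly connected? */
        struct rib_route        *rr_via;        /* covering route for the next hop */
        int                     rr_ifindex;     /* valid once resolved */
};

/*
 * Resolve a route down to a directly connected next hop before it is
 * ever handed to the FIB (kernel table or hardware TCAM).  Returns the
 * interface index, or -1 if the route is unresolvable or nested too deep.
 */
static int
rib_resolve(const struct rib_route *rr)
{
        int depth;

        for (depth = 0; rr != NULL && depth < MAX_RECURSION; depth++) {
                if (rr->rr_connected)
                        return (rr->rr_ifindex);
                rr = rr->rr_via;
        }
        return (-1);    /* keep it out of the FIB; don't punt to the kernel */
}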

cheers
BMS

P.S. I see you tweaked verify_path() to do the lookup in the numbered FIB. Cool.

I should point out that for ad-hoc networks, the ability to turn off RPF/uRPF for multicast is needed, as the routing domain is often NOT fully converged; the RPF checks normally present may therefore discard legitimate traffic which hasn't been forwarded yet. An encapsulation is typically used to maintain forwarding state which is relevant to the particular topology in use.
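
For reference, the check being discussed is roughly the following, sketched with a hypothetical knob for turning it off; this doesn't correspond to any existing sysctl or MROUTING code:

#include <stdbool.h>

/* Hypothetical toggle; in an ad-hoc deployment this would be switched off. */
static bool rpf_check_enabled = true;

/*
 * Classic multicast RPF: accept the packet only if it arrived on the
 * interface that our unicast route back to the source points at.  In a
 * routing domain that hasn't converged, that route may be missing or
 * stale, which is why ad-hoc deployments need the check to be optional.
 */
static bool
rpf_accept(int input_ifindex, int route_back_to_source_ifindex)
{
        if (!rpf_check_enabled)
                return (true);
        if (route_back_to_source_ifindex < 0)   /* no route back yet */
                return (false);
        return (input_ifindex == route_back_to_source_ifindex);
}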