Lucy,

1. Constraining the distribution of Leaf A-D routes.

If you look at sections 9.2.3.2.1 and 9.2.3.4.1 of RFC 6514, you'll see that there are some rules that enable you to avoid sending a Leaf A-D route on an EBGP session unless a corresponding I/S-PMSI A-D route was received over that session. There are similar rules in RFC 6514 governing the distribution of C-multicast routes. These rules are intended to prevent the Leaf A-D routes and C-multicast routes from being distributed more widely than necessary. Whether these rules always work is questionable; they tend to have hidden assumptions about the deployment.

But if you want to investigate ways to optimize the distribution of the Leaf A-D routes, that's a good place to start.

One might try the following rule. If R1 receives a Leaf A-D route, and if R1 is not identified in the route's RT, and if the Leaf A-D route has a route key that is the NLRI of an S-PMSI A-D route that R1 has installed, then only distribute the Leaf A-D route on a BGP session that leads to the BGP speaker that is the next hop of the S-PMSI A-D route. Whether this rule actually works in various deployment scenarios would require further investigation.
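
To make the candidate rule concrete, here is a minimal Python sketch. It is only an illustration, not anything specified in RFC 6514. The names `LeafADRoute`, `constrained_sessions`, `installed_spmsi`, and `session_toward` are hypothetical, and the RT is simplified to the single address it identifies. When the rule does not apply, the function returns None, meaning the ordinary BGP procedures decide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LeafADRoute:
    route_key: str   # NLRI of the S-PMSI A-D route this Leaf A-D route answers
    rt_target: str   # address of the router identified by the route's RT (simplified)


def constrained_sessions(
    local_addr: str,
    route: LeafADRoute,
    installed_spmsi: Dict[str, str],   # S-PMSI NLRI -> BGP next hop (hypothetical table)
    session_toward: Dict[str, str],    # next hop -> BGP session leading to it (hypothetical)
) -> Optional[List[str]]:
    """Return the only session to use, or None if the rule does not apply."""
    if route.rt_target == local_addr:
        return None                    # R1 is identified in the RT
    next_hop = installed_spmsi.get(route.route_key)
    if next_hop is None:
        return None                    # no matching S-PMSI A-D route installed
    session = session_toward.get(next_hop)
    if session is None:
        return None                    # assumption: fall back to normal distribution
    return [session]
```

For example, with an installed S-PMSI A-D route "spmsi-1" whose next hop is 10.0.0.5, reachable through session "peer-A", a Leaf A-D route keyed on "spmsi-1" and targeted at some other router would be sent on "peer-A" only.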

[Lucy] To suppress unnecessary redistribution, a P-tunnel BGP node tracks P-tunnel neighbor state. A BGP next hop is one of: P-tunnel downstream neighbor, upstream neighbor, or N/A. The policy is: if the BGP next hop of the UPDATE of the Leaf A-D route is the downstream neighbor, redistribute the route; if not, do not redistribute it.

I don't understand this proposal; I don't see how you can tell by examining the next hop of the Leaf A-D route whether you need to redistribute the route. A rule based on the next hop of the corresponding I/S-PMSI A-D route sounds more promising.

Another approach would be to use Constrained Route Distribution. This would ensure that the Leaf A-D route reaches its target, and would prevent the route from traveling over "unnecessary" alternate paths. In certain deployment scenarios, ORF is also available as a way to prevent routes from being distributed unnecessarily. Both these methods are forms of RT-based filtering, and both are independent of MVPN.
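
The common idea behind both methods can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the actual Constrained Route Distribution or ORF machinery: `peer_interest` is a hypothetical table of the RTs each peer has said it wants, and `peers_to_advertise` is a made-up helper name.

```python
from typing import Dict, List, Set


def peers_to_advertise(route_rts: Set[str], peer_interest: Dict[str, Set[str]]) -> List[str]:
    """Send a route only to peers that have expressed interest in one of its RTs."""
    return sorted(peer for peer, wanted in peer_interest.items() if route_rts & wanted)


# Hypothetical example: only peer "B" has asked for the RT carried by the route.
interest = {"A": {"rt:parent-1"}, "B": {"rt:parent-2"}, "C": set()}
selected = peers_to_advertise({"rt:parent-2"}, interest)
```

Nothing in this filter knows about MVPN; it looks only at the RTs.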

Of course, one also has to worry about creating a robustness problem if route distribution is constrained so that routes follow only one path.

Since the topic of this thread is "comment on draft-ietf-bess-ir", and since that draft is in WGLC, I'll just point out again that this issue is not specific to ingress replication.

[Lucy] IMO: this mechanism for membership announcement raises a BIG concern on the scalability and performance. Why is it not a concern for you?

I wouldn't say it's not a concern, but it's important not to focus exclusively on the worst case. Typical deployment scenarios don't come close to the worst case, and there are various tools and filtering policies that can be used to constrain the distribution of updates based on the RTs.

2. Changing your parent on an IR tree

I think we have a disconnect here, having to do with the layering between the MVPN application and BGP.

MVPN can create a route and give it to BGP. MVPN can set and modify attributes of the route. MVPN can withdraw the route. But the distribution of the route is controlled by BGP.

MVPN cannot tell BGP "send an update for NLRI X with attribute A1 on BGP session S1, but send an update for NLRI X with attribute A2 on BGP session S2". MVPN cannot tell BGP "send an update for NLRI X on session S1 but send a withdraw for NLRI X on session S2." And MVPN cannot control the timing of BGP's route distribution procedures.

In short, MVPN does not create and send the update messages.

[Lucy] To change the parent, a child sends out the UPDATE of Leaf A-D route with new parent address in RT.

MVPN can tell BGP to change the RT and the PMSI Tunnel attribute on a given Leaf A-D route. Suppose MVPN replaces the RT so that the RT now identifies the new parent rather than the old one.

If Constrained Route Distribution is being used, this will cause an explicit withdraw to be sent to the old parent. There is no way for the MVPN process in the child node to control the timing of this BGP message.

If Constrained Route Distribution is not being used, changing the RT will cause BGP to send a new update to the old parent as well as to the new parent. The old parent will treat this as a replacement route, and will consider the old route to have been (implicitly) withdrawn. This behavior is mandated by section 3.1 of RFC 4271. Since the old parent is not identified in the RT, the action it must take is the action specified for the withdrawal of a Leaf A-D route.

Now suppose MVPN doesn't simply replace the RT on the Leaf A-D route, but adds a second RT, identifying the new parent. MVPN would also have to replace the PMSI Tunnel attribute, to specify a new label for the new parent to use. The old parent would see this route as a replacement route. The route still identifies the old parent, but has a new label in its PMSI Tunnel attribute. So the old parent will continue sending traffic to the child, but will use the new label. Now both old and new parents are using the same label, and the result will be data duplication.
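
The two cases (replacing the RT versus adding a second RT) can be compared with a small Python model. It is a sketch under simplifying assumptions: no Constrained Route Distribution, so both parents receive every update; the `Parent` class, the labels 100 and 200, and the node names are all hypothetical; and the entry is dropped immediately on withdrawal, leaving out any timed removal.

```python
from typing import Dict, Set


class Parent:
    """Hypothetical model of a parent node on an IR tree."""

    def __init__(self, addr: str) -> None:
        self.addr = addr
        self.sending: Dict[str, int] = {}   # Leaf A-D NLRI -> label used toward the child

    def receive_update(self, nlri: str, rts: Set[str], label: int) -> None:
        # An update for an NLRI already held replaces the earlier route
        # (implicit withdraw, RFC 4271 section 3.1).
        if self.addr in rts:
            self.sending[nlri] = label
        else:
            # Not identified in the RT: act as for a withdrawn Leaf A-D route.
            self.sending.pop(nlri, None)


def switch_parent(new_rts: Set[str]) -> Dict[str, Dict[str, int]]:
    old, new = Parent("P-old"), Parent("P-new")
    old.receive_update("leaf-1", {"P-old"}, label=100)   # initial state
    for parent in (old, new):                            # same update reaches both
        parent.receive_update("leaf-1", new_rts, label=200)
    return {"old": old.sending, "new": new.sending}


replaced = switch_parent({"P-new"})            # RT replaced
added = switch_parent({"P-old", "P-new"})      # second RT added
```

In the `replaced` case only the new parent ends up sending to the child. In the `added` case both parents send with label 200, which is the duplication described above.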

I don't think there is any feasible way to switch parents in a "make before break" fashion without requiring the old parent to have some explicit knowledge that the switch is taking place. The procedure we chose is to have the old parent time out the data plane entry for the child. While this is not the only possible procedure, I don't think there is anything both (a) simpler and (b) compatible with BGP's route distribution procedures and with the layering between MVPN and BGP.
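
As a rough illustration of what "time out the data plane entry" means, here is a Python sketch. The class name `DataPlane`, the hold time, and the explicit `now` argument (used instead of real timers so the example is deterministic) are assumptions made for the example; the sketch does not reproduce the draft's actual procedure or timer values.

```python
from typing import Dict, List, Optional, Tuple


class DataPlane:
    """Hypothetical per-tree forwarding state kept by a parent node."""

    def __init__(self, hold_seconds: float) -> None:
        self.hold = hold_seconds
        # child -> (label, expiry time, or None while the route is still present)
        self.entries: Dict[str, Tuple[int, Optional[float]]] = {}

    def install(self, child: str, label: int) -> None:
        self.entries[child] = (label, None)

    def on_withdraw(self, child: str, now: float) -> None:
        # Keep forwarding for a while instead of deleting the entry at once.
        if child in self.entries:
            label, _ = self.entries[child]
            self.entries[child] = (label, now + self.hold)

    def active_children(self, now: float) -> List[str]:
        # Purge entries whose hold time has run out, then list the rest.
        expired = [c for c, (_, exp) in self.entries.items()
                   if exp is not None and now >= exp]
        for child in expired:
            del self.entries[child]
        return sorted(self.entries)


dp = DataPlane(hold_seconds=5.0)
dp.install("child-1", label=100)
dp.on_withdraw("child-1", now=0.0)
```

The old parent keeps sending to "child-1" until the hold time runs out, which gives the new parent time to start sending, and the old parent needs no knowledge that a switch is in progress.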

Of course, one could modify the MVPN/BGP layering by building more MVPN-specific knowledge into BGP, or one could even decide that section 3.1 of RFC 4271 shouldn't apply to Leaf A-D routes. Certainly there are some cases where BGP knows that certain MCAST-VPN routes have to be handled in a special way. But that's a lot more complicated than having the parent node simply run a timer to time out the data plane states when a route is withdrawn.

Eric




