Hi Eric,


Please see my replies inline below, marked with [Lucy1] and in green text.



-----Original Message-----
From: Eric C Rosen [mailto:ero...@juniper.net]
Sent: Wednesday, September 02, 2015 9:59 AM
To: Lucy yong; Jeffrey (Zhaohui) Zhang; draft-ietf-bess...@tools.ietf.org
Cc: bess@ietf.org
Subject: Re: [bess] comment on draft-ietf-bess-ir



Hi Lucy,



It is certainly true that the BGP route distribution mechanism is not optimal 
for multicast signaling.  The advantages and disadvantages of using BGP for 
multicast signaling were discussed extensively in the WG when RFCs 6513 and 
6514 were being written.  But the entire mechanism is built around the standard 
BGP route distribution procedures, and one has to be very cautious about making 
changes.

[Lucy1] I was not working on this subject at that time and am not aware of
those discussions. With these RFCs already published, changes need to be made
cautiously. But if there is an issue in using the mechanism, an enhancement is
necessary.



> [Lucy] What you say is that, in BGP distribution mechanism, BGP (child
> here) chooses the bestpath (i.e. parent here) for a particular NLRI, then
> distributes the NLRI to all BGP peers (including the parent node).



Right.  But ...



One thing you have to remember is that the child might not have a BGP session 
directly with the parent.  For instance, one can have a deployment where 
ingress replication is used in a non-segmented manner, such that the ingress PE 
makes a copy for each egress PE.  In this case, when an egress PE originates a 
Leaf A-D route, the RT will identify the ingress PE.  But ingress and egress 
PEs generally don't have BGP sessions to each other.  In many deployments, the 
ingress and egress PEs are clients of a route reflector; the egress PE sends 
the Leaf A-D route to the RR, and the RR redistributes it to the ingress PE.  
In other deployments, the Leaf A-D route from a given egress PE may have to 
travel over several BGP sessions before it reaches the ingress PE.
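
To illustrate the route reflector case described above, here is a minimal
sketch in Python. It is a toy model, not a BGP implementation: the class names,
the PE names, and the use of a plain string as the RT are all assumptions made
for illustration, and it assumes the RR reflects to every client with no RT
filtering applied.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LeafADRoute:
    originator: str    # egress PE that originated the route (hypothetical name)
    route_target: str  # toy RT: identifies the intended parent (ingress PE)


@dataclass
class PE:
    name: str
    imported: list = field(default_factory=list)

    def receive(self, route):
        # A PE imports the Leaf A-D route only if the RT identifies it.
        if route.route_target == self.name:
            self.imported.append(route)


@dataclass
class RouteReflector:
    clients: dict = field(default_factory=dict)

    def add_client(self, pe):
        self.clients[pe.name] = pe

    def reflect(self, route):
        # Assumption: the RR reflects to every client except the originator.
        for name, pe in self.clients.items():
            if name != route.originator:
                pe.receive(route)


def demo():
    rr = RouteReflector()
    pes = [PE("PE-ingress"), PE("PE-egress1"), PE("PE-egress2")]
    for pe in pes:
        rr.add_client(pe)
    # The egress PE has no session to the ingress PE; it sends to the RR only.
    rr.reflect(LeafADRoute(originator="PE-egress1", route_target="PE-ingress"))
    return {pe.name: len(pe.imported) for pe in pes}
```

In this sketch, `demo()` shows that only the PE named by the RT imports the
route, even though the route was sent to the RR rather than to that PE directly.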

[Lucy1] In the RR case, the RR will redistribute it to all PEs, including the
ingress PE, right? The RR can be an exception, i.e. an RR always redistributes
Leaf A-D routes. For the other deployments, it seems to be the case where IR
applies, i.e. there is a distribution tree built by IR.



> Since the NLRI is only stored by the parent node and may be removed by the
> old parent node, such a distribution mechanism has no advantage for this
> purpose and causes a scaling issue. To reduce this distribution, it should
> explicitly require that, if a node receiving a Leaf A-D route is not the
> parent node (including the old parent node), the node should not redistribute
> the Leaf A-D route; in other words, only the parent node is allowed to
> readvertise the Leaf A-D route.



This would break the deployment scenarios described above.

[Lucy1] This mechanism raises a big scaling concern. If we have a separate rule
for RRs, my suggestion above may still work. An RR has a different role.

BTW: I still think that REFRESH is better than UPDATE for this purpose.



> One question, if one ASBR or ABR node that is the parent for a set of
> downstream neighbors fails, what is the procedure for the downstream
> neighbors to select a new parent?



If a node fails, its BGP sessions go down, and the routes it originated are 
considered to have been withdrawn.  If parent and child are separated by a RR, 
failure of the parent will cause its session to the RR to go down.  The RR will 
then send withdraws to the child for all of the routes that were originated by 
the parent and redistributed to the child.



If there are two potential parents for a given flow, each will have originated 
an S-PMSI A-D route with an NLRI identifying the flow.  One of these S-PMSI A-D 
routes is the bestpath for that NLRI, and the next hop (or P2MP Inter-Area 
Segmented Next Hop Extended Community) of that route determines the parent that 
is chosen by a given child.  If that parent fails, the route from the other 
potential parent becomes the best path, and the child will then rehome itself 
by changing the RT on the Leaf A-D route.
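
The rehoming behavior described above can be sketched as follows. This is a
simplified model under stated assumptions, not the actual procedure from the
draft: the `preference` field is a hypothetical stand-in for the full BGP
decision process, and the flow and node names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPMSIRoute:
    nlri: str        # identifies the flow
    next_hop: str    # identifies the potential parent
    preference: int  # hypothetical stand-in for bestpath selection; higher wins


class Child:
    def __init__(self, name):
        self.name = name
        self.candidates = {}  # nlri -> {next_hop: SPMSIRoute}
        self.leaf_rt = {}     # nlri -> RT carried on the child's Leaf A-D route

    def _reselect(self, nlri):
        routes = self.candidates.get(nlri, {}).values()
        best = max(routes, key=lambda r: r.preference, default=None)
        if best is None:
            self.leaf_rt.pop(nlri, None)  # no potential parent remains
        else:
            # Rehome by pointing the Leaf A-D route's RT at the new parent.
            self.leaf_rt[nlri] = best.next_hop
        return self.leaf_rt.get(nlri)

    def receive(self, route):
        self.candidates.setdefault(route.nlri, {})[route.next_hop] = route
        return self._reselect(route.nlri)

    def withdraw(self, nlri, next_hop):
        # A failed parent's routes are treated as withdrawn.
        self.candidates.get(nlri, {}).pop(next_hop, None)
        return self._reselect(nlri)
```

For example, a child that receives routes for `"flow1"` from `"ABR1"`
(preference 200) and `"ABR2"` (preference 100) selects `"ABR1"`; after
`withdraw("flow1", "ABR1")` it selects `"ABR2"`. Note that the child holds both
candidate routes the whole time.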

[Lucy1] This means that each child node maintains information about all
potential parents for a given flow, although it only selects one parent. This
is a resource scaling concern.



> If a child fails, the parent should update multicast state as if the child
> is withdrawn.



If a child fails, its Leaf A-D route appears to the parent to have been 
withdrawn, and this causes the change in multicast state.



Note that the procedures for dealing with withdrawn S-PMSI or Leaf A-D routes 
are not specific to Ingress Replication.

[Lucy1] Yes, this one is obvious.



Thanks,

Lucy



Eric
_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess
