Hi Huaimo,

> > > 1) There is no concrete procedure/method for fault tolerance
> > > to multiple failures. When multiple failures happen and split the
> > > flooding topology, the convergence time will be increased
> > > significantly without fault tolerance. The longer the convergence
> > > time, the more traffic is lost.
> > 
> > there is a solution for multiple failures - see section 6.7.11.
> > 
>  
> Section 6.7.11 just briefly mentions that the edges of split parts will 
> determine and repair the split after the split of the flooding topology 
> happens. However, there are no details or description of how to determine 
> or repair the split. This is not useful for implementers.


I’m sorry that you don’t find it useful. Determining the split is trivial: when 
you receive an IIH, it carries the system ID of the other system. If that 
other system is not currently part of the flooding topology, then it is quite 
clear that it is disconnected from the flooding topology. Repairing the split 
is done by enabling temporary flooding on the new link.
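
As a rough illustration of that check, here is a minimal sketch in Python. It is 
not text from the draft: the names on_iih_received, flooding_topology and 
temp_flooding_links are hypothetical, and it assumes the node holds the current 
flooding topology locally as a set of system IDs.

```python
def on_iih_received(neighbor_id, link, flooding_topology, temp_flooding_links):
    """Handle an IIH received on `link` from `neighbor_id`.

    flooding_topology: set of system IDs currently on the flooding topology
        (assumed to be known locally).
    temp_flooding_links: set of links with temporary flooding enabled;
        updated in place.
    Returns True if temporary flooding was enabled on the link.
    """
    if neighbor_id in flooding_topology:
        # The neighbor is already reachable over the flooding topology.
        return False
    # The neighbor is disconnected from the flooding topology:
    # repair the split by flooding over this link temporarily.
    temp_flooding_links.add(link)
    return True
```

This sketch enables temporary flooding unconditionally; it does not model any 
limit on how many links are added.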

There is an issue here that we have not yet resolved, which is the rate at 
which new links should be temporarily added to the flooding topology.  Some believe 
that adding any new link is the correct thing to do as it minimizes the 
recovery time. Others feel that enabling too many links could cause a flooding 
collapse, so link addition should be highly constrained. We are still 
discussing this and invite the WG’s opinions.


> > > 2) The extensions to Hello protocols for enabling “temporary
> > > flooding” over a new link are not needed.
> > 
> > not if you do flooding on every link that comes up. If you want to be 
> > smarter, then you need to selectively enable flooding only under specific 
> > conditions and that must be done from both sides of the new link.
>  
> There are only a limited number of conditions (or cases).  In each 
> condition/case, it is deterministic whether we need to enable “temporary 
> flooding” for a new link when it is up.  Thus there is no need for any 
> extensions to Hello protocols for enabling “temporary flooding” on a new link.


We know of only two cases: (1) the neighbor is not part of the flooding 
topology and we feel that we can add more temporary flooding. (2) The neighbor 
is not part of the flooding topology and we cannot add more temporary flooding.

Obviously, in the case where we want to add temporary flooding, that TLV is 
needed in the IIH.
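
The two cases can be sketched as a single decision function. This is only an 
illustration: the cap max_temp_links is a hypothetical stand-in for whatever 
constraint is eventually agreed (the rate question is still open), and the 
function name is not from the draft.

```python
import math


def should_request_temp_flooding(neighbor_on_topology, active_temp_links,
                                 max_temp_links=math.inf):
    """Decide whether to put the temporary flooding request in our IIH.

    neighbor_on_topology: True if the neighbor is on the flooding topology.
    active_temp_links: number of links already flooding temporarily.
    max_temp_links: hypothetical cap; the default (no cap) corresponds to
        the "add any new link" position.
    """
    if neighbor_on_topology:
        # No split to repair over this link.
        return False
    # Case (1): we still have room, so request temporary flooding.
    # Case (2): we are at the cap, so do not request it.
    return active_temp_links < max_temp_links
```

With the default, every disconnected neighbor gets a request; a small cap models 
the highly constrained position.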

 
> For example, suppose that we have a current flooding topology containing all 
> live nodes in an area, when a new link comes up, we may just have two 
> conditions/cases. One condition/case is that the new link is attached to a 
> new node not on the current flooding topology. In this condition/case, the 
> new link needs to be enabled for “temporary flooding” after it is up.


Agreed, which is why we need the TLV.


> The other condition/case is that the new link is attached to nodes on the 
> current flooding topology. In this condition/case, there is no need to enable 
> “temporary flooding” on the link.


Agreed.

Note that there are some additional corner cases.  Since the two neighbors may 
not have the exact same information, one may consider the other to be on the 
flooding topology when in fact it is not.  This might happen in the case of a 
node reboot. The IIH TLV gives us an explicit way of signaling, rather than 
simply guessing and sometimes getting it wrong.


> > > 3) The extensions to Hello protocols for requesting/signaling
> > > “temporary flooding” for a connection do not work.
> > 
> > sorry, but if you see a problem, please provide details, saying above is
> > simply unproductive.
>  
> “The nodes … will try to repair the flooding topology locally by enabling 
> temporary flooding towards the nodes that they consider disconnected from the 
> flooding topology ...”
>  
> The above quoted text is from draft-li-lsr-dynamic-flooding-02, where 
> “enabling temporary flooding towards the nodes” is to request/signal 
> “temporary flooding” for a connection to connect partitioned/disconnected 
> flooding topology into one through the extensions to Hello protocols 
> described in draft-li-lsr-dynamic-flooding-02. Right?
>  
> The extensions to Hello protocols for requesting/signaling “temporary 
> flooding” for a connection to connect partitioned/disconnected flooding 
> topology into one do not work since the connection may have two or more 
> hops and a Hello packet may get lost.


All adjacencies are a single hop in both IS-IS and OSPF.  Yes, Hello packets 
may be lost. Fortunately, they are periodically transmitted, thus the next 
transmission will also contain the TLV.  If IIHs are getting lost at a 
significant rate, then the adjacency will not (and should not) come up.  Thus, 
the request for temporary flooding will propagate to the neighbor in all cases 
that matter.
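
To put a number on this, here is a small sketch. It assumes each Hello is lost 
independently with the same probability, which is a simplification of real loss 
behavior; the function name is hypothetical.

```python
def delivery_probability(loss_rate, hellos_sent):
    """Probability that at least one of `hellos_sent` Hellos arrives.

    Assumes independent loss with probability `loss_rate` per Hello
    (a simplifying assumption, not a claim about real links).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if hellos_sent < 0:
        raise ValueError("hellos_sent must be non-negative")
    # All Hellos are lost with probability loss_rate ** hellos_sent.
    return 1.0 - loss_rate ** hellos_sent
```

Under that assumption, even a 10% loss rate leaves a 0.999 chance that the TLV 
has been received after three Hellos.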


> It is not convenient for a user/operator to configure on an area leader since 
> the leader is dynamically selected. How do you address this?


No configuration is required.  The election algorithm selects the area leader.  
The rules are in the draft.  An implementation may have a default priority and 
a default algorithm setting, so no configuration is mandatory.  If the operator 
desires a specific node to become area leader, then configuration may be 
required to adjust the priority.  FWIW, we have this already working in our 
implementation.  It Just Works.


> After the user/operator does some configurations on the (designated) leader, 
> will the backup leader take over the configurations after the designated 
> leader is down?


There is no need for a backup leader.  If the area leader is partitioned from 
the topology, then leader election is repeated, resulting in a new leader.  
Again, no configuration is required.
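
A minimal sketch of why no backup is needed follows. It assumes the election 
picks the highest advertised priority and breaks ties on the highest system ID; 
the draft is the authority on the actual rules, and the function and data layout 
here are hypothetical.

```python
def elect_area_leader(candidates):
    """Pick the area leader among the nodes currently reachable.

    candidates: dict mapping system ID (fixed-width string) to the
        advertised priority.
    Assumed rule: highest priority wins, ties go to the highest system ID.
    Returns the winning system ID, or None if there are no candidates.
    """
    if not candidates:
        return None
    # Compare (priority, system ID) pairs; fixed-width IDs of the same
    # case compare correctly as strings.
    return max(candidates, key=lambda sid: (candidates[sid], sid))
```

When the current leader is partitioned away, it simply drops out of the 
candidate set and running the same function again yields the new leader.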

Tony



_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
