Re: [Lsr] WG Adoption Call for draft-li-lsr-dynamic-flooding-02 + IPR poll.

Huaimo Chen Tue, 05 Mar 2019 21:05:28 -0800

Hi Tony,

> From: Tony Li [mailto:[email protected]]
> Sent: Wednesday, February 27, 2019 2:07 AM
> To: Huaimo Chen <[email protected]>
> Cc: Peter Psenak <[email protected]>; Christian Hopps <[email protected]>; 
> [email protected];
> lsr- [email protected]; [email protected]
> Subject: Re: [Lsr] WG Adoption Call for draft-li-lsr-dynamic-flooding-02 + 
> IPR poll.
>
>
> Hi Huaimo,
>
> > > 1) There is no concrete procedure/method for fault tolerance
> > > to multiple failures. When multiple failures happen and split the
> > > flooding topology, the convergence time will be increased
> > > significantly without fault tolerance. The longer the convergence
> > > time, the more traffic is lost.
> >
> > there is a solution for multiple failures - see section 6.7.11.
> >


> > Section 6.7.11 just briefly mentions that the edges of split parts will
> > determine and repair the split after the split of the flooding topology
> > happens. However, there are no details or description of how to determine
> > or repair the split. This is not useful for implementers.


> I’m sorry that you don’t find it useful. Determining the split is trivial:
> when you receive an IIH, it has the system ID of the other system in it.
> If that other system is not currently part of the flooding topology, then
> it is quite clear that it is disconnected from the flooding topology.
> Repairing the split is done by enabling temporary flooding on the new link.

When an adjacency between two nodes is up, the Hello packets exchanged between
them do not change the node/system IDs they carry.
How do you determine that the other system is not currently part of the
flooding topology?


> There is an issue here that we have not yet resolved, which is the rate at
> which new links should be temporarily added to the flooding topology. Some
> believe that adding any new link is the correct thing to do as it minimizes
> the recovery time. Others feel that enabling too many links could cause a
> flooding collapse, so link addition should be highly constrained. We are
> still discussing this and invite the WG’s opinions.

The issue is resolved by the solutions in draft-cc-lsr-flooding-reduction.
One solution is quoted below, where the given distance can be
adjusted/configured. If we want every node to flood on all its links, we set
the given distance to a large number. If we want the nodes within 2 hops of a
failure to flood on all their links, we set the given distance to 2.
   “In one way, when two or more failures on the current flooding
   topology occur almost in the same time, each of the nodes within a
   given distance (such as 3 hops) to a failure point, floods the link
   state (LS) that it receives to all the links (except for the one from
   which the LS is received) until a new flooding topology is built.”
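
As a rough illustration of the distance-based rule quoted above, the set of
nodes that would flood on all their links can be found with a multi-source
breadth-first search from the failure points. This is only a sketch: the
function name, the adjacency-map representation, and the choice of the two
endpoints of a failed link as the "failure points" are assumptions made for
the example, not text from either draft.

```python
from collections import deque


def nodes_within(adjacency, failure_points, max_hops):
    """Return the nodes at most max_hops away from any failure point.

    adjacency: dict mapping node -> iterable of neighbors (hypothetical
    representation of the live topology after the failures).
    """
    dist = {node: 0 for node in failure_points}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the configured distance
        for nbr in adjacency.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


# Chain A-B-C-D-E-F in which link C-D has failed.
adjacency = {
    "A": ["B"], "B": ["A", "C"], "C": ["B"],
    "D": ["E"], "E": ["D", "F"], "F": ["E"],
}
print(sorted(nodes_within(adjacency, ["C", "D"], 1)))  # ['B', 'C', 'D', 'E']
```

With a large max_hops the result is every reachable node, which corresponds to
flooding on all links everywhere.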

Another solution is to temporarily add just a minimum number of links to the
flooding topology to repair the split, until a new flooding topology is
built.
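
One possible way to pick such a minimum set of links is sketched below. It
groups the nodes into the connected parts of the surviving flooding topology
with a union-find structure, then walks the live links in a fixed order and
keeps only those that join two different parts. The function and parameter
names are hypothetical, and the sorted link order is an assumption made so
that every node running the same computation picks the same links; neither
draft specifies this procedure.

```python
def repair_links(nodes, flooding_links, live_links):
    """Return a minimal list of live links that reconnects the flooding topology."""
    parent = {node: node for node in nodes}

    def find(node):
        # Find the representative of node's part, halving paths on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # already in the same part
        parent[root_a] = root_b
        return True

    # Links still on the flooding topology define the current parts.
    for a, b in flooding_links:
        union(a, b)

    # Add a live link only if it joins two parts that are still separate.
    added = []
    for a, b in sorted(live_links):
        if union(a, b):
            added.append((a, b))
    return added


nodes = ["A", "B", "C", "D"]
flooding_links = [("A", "B"), ("C", "D")]  # split into {A, B} and {C, D}
live_links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(repair_links(nodes, flooding_links, live_links))  # [('A', 'C')]
```

If the live topology is connected, the number of links added is the number of
split parts minus one.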

> > > 2) The extensions to Hello protocols for enabling “temporary
> > > flooding” over a new link are not needed.
> >
> > not if you do flooding on every link that comes up. If you want to be
> > smarter, then you need to selectively enable flooding only under specific
> > conditions and that must be done from both sides of the new link.

> > There are only a limited number of conditions (or cases). In each
> > condition/case, it is deterministic whether we need to enable “temporary
> > flooding” for a new link when it is up. Thus there is no need for any
> > extensions to Hello protocols for enabling “temporary flooding” on a new
> > link.


> We know of only two cases: (1) the neighbor is not part of the flooding
> topology and we feel that we can add more temporary flooding. (2) The
> neighbor is not part of the flooding topology and we cannot add more
> temporary flooding.

> Obviously, in the case where we want to add temporary flooding, that TLV is
> needed in the IIH.


> > For example, suppose that we have a current flooding topology containing
> > all live nodes in an area. When a new link comes up, we may just have two
> > conditions/cases. One condition/case is that the new link is attached to a
> > new node not on the current flooding topology. In this condition/case, the
> > new link needs to be enabled for “temporary flooding” after it is up.


> Agreed, which is why we need the TLV.

The link can be enabled for “temporary flooding” by the node without using any 
TLV or Hello with the TLV.

> > The other condition/case is that the new link is attached to nodes on the
> > current flooding topology. In this condition/case, there is no need to
> > enable “temporary flooding” on the link.


>Agreed.

> Note that there are some additional corner cases. Since the two neighbors
> may not have the exact same information, one may consider the other to be
> on the flooding topology when in fact it is not. This might happen in the
> case of a node reboot. The IIH TLV gives us an explicit way of signaling,
> rather than simply guessing and sometimes getting it wrong.

The TLV in the Hello packet just requests adding “temporary flooding” on the
link. The other information is accessed by the node locally. The TLV in the
Hello packet does not help with the corner cases. In the case where a node is
rebooted, the case of a new link attached to a new node may apply.


>> > 3)           The extensions to Hello protocols for requesting/signaling
>> > “temporary flooding” for a connection does not work.
>>>
>>> sorry, but if you see a problem, please provide details, saying above is
>>> simply unproductive.

>> “The nodes … will try to repair the flooding topology locally by enabling
>>temporary flooding towards the nodes that they consider disconnected from the
>>flooding topology ...”

>>The above quoted text is from draft-li-lsr-dynamic-flooding-02, where
>> “enabling temporary flooding towards the nodes” is to request/signal
>> “temporary flooding” for a connection to connect partitioned/disconnected
>>flooding topology into one through the extensions to Hello protocols described
>>in draft-li-lsr-dynamic-flooding-02. Right?

>>The extensions to Hello protocols for requesting/signaling “temporary
>>flooding” for a connection to connect partitioned/disconnected flooding
>>topology into one do not work since the connection may have two or more
>>hops and a Hello packet may get lost.


>All adjacencies are a single hop in both IS-IS and OSPF. Yes, Hello packets
>may be lost. Fortunately, they are periodically transmitted, thus the next
>transmission will also contain the TLV. If IIHs are getting lost at a
>significant rate, then the adjacency will not (and should not) come up.
>Thus, the request for temporary flooding will propagate to the neighbor in
>all cases that matter.

It takes too long when a Hello packet is lost. Repairing a split flooding
topology needs to be fast.

>>It is not convenient for a user/operator to configure on an area leader
>>since the leader is dynamically selected. How do you address this?


>No configuration is required. The election algorithm selects the area
>leader. The rules are in the draft. An implementation may have a default
>priority and a default algorithm setting, so no configuration is mandatory.
>If the operator desires a specific node to become area leader, then
>configuration may be required to adjust the priority. FWIW, we have this
>already working in our implementation. It Just Works.
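
For readers following the thread, a priority-based election of the kind
described in the quoted text might look like the sketch below. It assumes the
leader is the reachable candidate with the numerically highest priority, with
ties broken by the highest system ID; the thread itself only says that "the
rules are in the draft", so treat the exact rule, the function name, and the
data layout as assumptions.

```python
def elect_area_leader(candidates):
    """Pick a leader from a dict mapping system ID -> advertised priority.

    Assumed rule: highest priority wins, ties go to the highest system ID.
    System IDs are fixed-width lowercase hex strings here, so plain string
    comparison orders them numerically.
    """
    return max(candidates, key=lambda system_id: (candidates[system_id], system_id))


candidates = {
    "0000.0000.0001": 10,
    "0000.0000.0002": 10,
    "0000.0000.0003": 5,
}
leader = elect_area_leader(candidates)
print(leader)  # 0000.0000.0002

# If the leader goes down or is partitioned away, the election is simply
# repeated over the remaining candidates.
del candidates[leader]
print(elect_area_leader(candidates))  # 0000.0000.0001
```

Nothing in this sketch carries configuration from the old leader to the new
one; the election only decides which node is the leader.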

My question is not about a user/operator configuring or selecting an area
leader. It is about a user/operator configuring other things on the area
leader, such as indicating an algorithm or selecting the centralized mode.

>>After the user/operator does some configurations on the (designated)
>>leader, will the backup leader take over the configurations after the
>>designated leader is down?


>There is no need for a backup leader. If the area leader is partitioned from
>the topology, then leader election is repeated, resulting in a new leader.
>Again, no configuration is required.

My question is not about a topology split, but about the leader going down.
Suppose a user/operator has configured some things on the leader, the leader
has received them and distributed them in some form, and some time later the
leader goes down and a new leader is selected. In this case, will the new
leader take over and maintain the configurations, or the information derived
from the configurations, done on the old leader?

Best Regards,
Huaimo

>Tony

_______________________________________________
Lsr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/lsr
