[Lsr] Re: Another counter-example

Tony Przygienda Mon, 09 Dec 2024 06:15:44 -0800

On Mon, Dec 9, 2024 at 12:26 PM Henk Smit <[email protected]> wrote:


> A few points regarding flooding reduction algorithms.
>
>
> Leaderless.
> ===
> I prefer a distributed flooding algorithm over a centralized flooding
> algorithm.
> But I do not prefer the leaderless way. It will be hard already to make 2
> different
> flooding algorithms in the same network cooperate. (legacy and a new one).
> Having multiple algorithms, and not knowing what the other routers do,
> will be even harder.
>
> Having a leader is very much like having routers advertise the FAD in
> flex-algo.
> (This analogy comes from Peter.)
> It ensures everyone runs the same FA algorithm. And if a router doesn't
> understand the FAD,
> it falls back to default behaviour (and just doesn't participate in this
> flex-algo).
> Having something similar for the flooding algorithm too will make things
> simpler.
> And thus more robust.
>

Those are basically beliefs, and the fact that FAD was solved that way does
not mean it is better in any way.
First, synchronizing a distributed computation where everyone must run the
same algorithm to populate the same RIB is by nature scoped to the whole
domain, and hence everyone has to agree on the algorithm used. Agreeing in a
distributed fashion would lead either to the same solution as suggested here
(chopping an area into "components", possibly) or otherwise to a Paxos
problem, so a centralized leader is in a sense skirting the problem by paying
with blast radius. Blast radius may be simpler to understand and specify
until you get caught by it; enough interesting incidents recently, although
not necessarily IGP-related ones, made it into the press. And again, a
distributed computation over the whole domain, where everyone must agree on
the same algorithm, is a different problem from reduction, where each
component can run a different algorithm as long as it is properly signalled
and stitched by full flooding. In fact, the analogy would be multiple
"computation domains" where each "domain" agrees on the algorithm; seamless
MPLS, for example, has done that since about forever, stitching on the edges.
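
To make the "different algorithm per component, stitched by full flooding"
point concrete, here is a minimal sketch. It is not taken from any draft:
the function names, the FULL_FLOODING marker, and the spanning-forest pruning
are all hypothetical, and a real reduction algorithm would keep more
redundancy than a tree. The only rule illustrated is that a link is pruned
only when both ends signal the same algorithm; every other link (mixed
algorithms, or a legacy node) is fully flooded.

```python
from collections import defaultdict

FULL_FLOODING = "full"  # hypothetical marker: node signals no reduction algorithm


def spanning_forest(component_links):
    """Toy stand-in for a reduction algorithm: keep a spanning forest."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for a, b in sorted(component_links):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            kept.append((a, b))
    return kept


def links_to_flood(links, algo_of, reduce_with):
    """Return the set of links that carry flooding.

    links:       iterable of (node_a, node_b) pairs
    algo_of:     node -> signalled algorithm id (absent means legacy)
    reduce_with: algorithm id -> function(links) -> links to keep
    """
    flood = set()
    internal = defaultdict(list)
    for a, b in links:
        algo_a = algo_of.get(a, FULL_FLOODING)
        algo_b = algo_of.get(b, FULL_FLOODING)
        if algo_a == algo_b and algo_a != FULL_FLOODING:
            internal[algo_a].append((a, b))  # both ends agree: may be pruned
        else:
            flood.add((a, b))  # boundary or legacy link: full flooding
    for algo, algo_links in internal.items():
        flood |= set(reduce_with[algo](algo_links))
    return flood
```

With nodes A, B, C signalling algorithm 1 and D, E, F signalling nothing, the
A-B-C triangle loses one link while every link touching D, E or F stays in
the flooding set.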

And no one here, AFAIS, is gunning for this as something operationally
desirable, except in the case of algorithm upgrade/migration, where this
capability guarantees an operational procedure with minimum blast
radius/risk.


>
>
> Flooding topology repair.
> ===
> The hard part of optimized flooding is not coming up with different
> algorithms to prune links.
> The hard part is to repair the flooding topology when it breaks.
> If I understand correctly, the only real mechanism we have for that is
> sending CSNPs on p2p links?
> CSNPs are not reliable. So once you use them, you need to keep sending
> periodic CSNPs forever.
> Imho when you have to do that at scale, that solution is worse than the
> original problem we were trying to solve.
>

Please (re-)read the disttopo draft.
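
For readers following the CSNP point in the quoted text, this is roughly
what the CSNP-based check amounts to on a p2p link. It is a simplified
sketch of the base IS-IS comparison only, not of anything in the disttopo
draft: the function name and the dict-based LSPDB are hypothetical, and
remaining lifetime and checksum are ignored.

```python
def process_csnp(lspdb, csnp_entries, start_id, end_id):
    """Compare a received CSNP against the local LSPDB.

    lspdb:        local LSP id -> sequence number
    csnp_entries: LSP id -> sequence number, as summarised in the CSNP
    start_id, end_id: LSP id range the CSNP claims to cover
    Returns (to_send, to_request) as two sets of LSP ids.
    """
    to_send, to_request = set(), set()
    for lsp_id, remote_seq in csnp_entries.items():
        local_seq = lspdb.get(lsp_id)
        if local_seq is None or local_seq < remote_seq:
            to_request.add(lsp_id)  # neighbor has something newer or unknown
        elif local_seq > remote_seq:
            to_send.add(lsp_id)  # our copy is newer
    for lsp_id in lspdb:
        # LSPs in the covered range that the neighbor did not list at all
        if start_id <= lsp_id <= end_id and lsp_id not in csnp_entries:
            to_send.add(lsp_id)
    return to_send, to_request
```

Since a lost CSNP simply means this comparison never runs, the exchange only
converges if CSNPs are repeated, which is the cost the quoted text objects
to.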


>
>
> Flooding reduction when an IS-IS process (re)starts.
> ===
> Imho, the biggest scalability issue with flooding is when an IS-IS
> process, in a large network, (re)starts.
> That is when potentially all LSPs are sent over all adjacencies.
> No other event will cause this much (local) flooding.
> But unfortunately, when IS-IS starts, it has an empty LSPDB. And thus it
> has no idea about the topology yet.
> So it can not really participate in any flooding reduction algorithm
> (yet).
> Only when it has received all or most LSPs, it can participate in any
> flooding reduction algorithm.
> But by then, the need for it has already diminished.
>

During restart, the problem is incast and not flood reduction. For incast,
please look at RIFT section 6.3.5 (RAIN); this is independent of
any flood reduction. Otherwise, other nodes already have the computation for
flood reduction cached in place (modulo the originator's links coming up;
one possibility here is to tighten the disttopo clause that says "do not
flood back to the source").

--- tony


>
>
> So in any proposed flooding reduction algorithm, I would like to see these
> two points discussed in detail.
> 1) How to repair a broken flooding topology ? (Preferably without periodic
> CSNPs). And
> 2) How beneficial is the algorithm during process (re)start?
>
>
> henk
>
> _______________________________________________
> Lsr mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
