On 09/07/2018 20:53, Ahmed Bashandy wrote:
Thanks for the comments
See the reply inline at #Ahmed

Ahmed

On 5/29/18 3:35 AM, Alexander Vainshtein wrote:

Robert, Chris and all,

I agree with Robert that it is up to the authors of an individual submission what they consider in or out of scope of the draft.

However, I agree with Chris that the authors of an individual draft asking for its adoption by a specific WG should do their best to address the comments they have received from the WG members.

#Ahmed: Thanks a lot

SB> I am not sure what that reply means.

From my POV this did not happen in the case of the draft in question – for the following reasons:

1. In his early RTG-DIR review <https://datatracker.ietf.org/doc/review-bashandy-rtgwg-segment-routing-ti-lfa-00-rtgdir-early-bryant-2017-05-31/> of the draft, Stewart pointed to the following issues with the -00 version of the draft (needless to say, I defer to Stewart regarding resolution of these issues):

a. *No IPR disclosures for draft-bashandy* in spite of 3 IPR disclosures for its predecessor draft-francois. I have not seen any attempts to address this issue; at least, a search of IPR disclosures for draft-bashandy did not yield any results today.

#Ahmed: Since draft-bashandy inherits draft-francois, then the IPR of the latter applies to the former. But if there is a spec that requires re-attaching the IPR of an inherited draft to the inheriting draft, it would be great to point it out or point out some other draft where that occurred so that I can follow the exact procedure.

SB> There is (at least now) an IPR disclosure. Hopefully this covers all of the applicable IPR. We should move on.

b. *Selecting the post-convergence path* (inherited from draft-francois) does not provide any benefit for traffic that will not pass via the PLR */after convergence/*.

i. The authors claim to have addressed this issue by stating that “Protection applies to traffic which traverses the Point of Local Repair (PLR). Traffic which does NOT traverse the PLR remains unaffected.”


SB> It is not as simple as that, and I think that the draft needs to provide greater clarity.

I think there will be better examples, but consider

              12
      +--------------+
      |              |
A-----B-----C---//---D----E
        10  |        |
            F--------G

Traffic injected at C will initially go C-D-E at cost 2, will be repaired C-F-G-D-E at cost 4 and will remain on that path post convergence. This congruence of path is what TI-LFA claims.

However, a long-standing concern about traffic starting further back in the network needs to be addressed more clearly in the draft to demonstrate the scope of applicability.

For traffic starting at A, before failure the path is A-B-C-D-E cost 13

TI-LFA will repair to make the path A-B-C-F-G-D-E cost 15 because TI-LFA optimises based on local repairs computed at C.

After convergence the path will be A-B-D-E cost 14.

So the draft needs to make it clear to the reader that TI-LFA only provides benefit to traffic which traverses the PLR before and after failure. Traffic which does not pass through the PLR after the failure will need to be traffic engineered separately from traffic that passes through the PLR in both cases.
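
The three path costs quoted above can be checked mechanically. What follows is a minimal sketch, not taken from the draft: it assumes unit cost on every link that is not labelled in the diagram (A-B, C-D, D-E, C-F, F-G, G-D), which is what the quoted costs imply, and the helper names `build_graph` and `shortest_path` are hypothetical. The repair path is modelled as the pre-failure path as far as the PLR (C) followed by C's own post-convergence path.

```python
import heapq

# Only B-C (10) and B-D (12) are labelled in the diagram. The unit costs on
# the other links are an assumption inferred from the path costs in the text.
LINKS = {
    ("A", "B"): 1, ("B", "C"): 10, ("B", "D"): 12, ("C", "D"): 1,
    ("D", "E"): 1, ("C", "F"): 1, ("F", "G"): 1, ("G", "D"): 1,
}

def build_graph(links, failed=None):
    """Return an adjacency map, leaving out the failed link if one is given."""
    graph = {}
    for (u, v), cost in links.items():
        if failed is not None and {u, v} == set(failed):
            continue
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph

def shortest_path(graph, src, dst):
    """Plain Dijkstra; returns (cost, path as a list of node names)."""
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, link_cost in graph[node].items():
            if nbr not in visited:
                heapq.heappush(queue, (cost + link_cost, nbr, path + [nbr]))
    raise ValueError(f"no path from {src} to {dst}")

before = build_graph(LINKS)
after = build_graph(LINKS, failed=("C", "D"))

# Pre-failure path from A to E.
pre_cost, pre_path = shortest_path(before, "A", "E")

# Repair: upstream routers have not converged, so traffic still reaches the
# PLR (C) on the old path; C then uses its own post-convergence path to E.
to_plr_cost, to_plr_path = shortest_path(before, "A", "C")
plr_cost, plr_path = shortest_path(after, "C", "E")
repair_cost = to_plr_cost + plr_cost
repair_path = to_plr_path + plr_path[1:]

# Post-convergence path from A to E, computed by A on the new topology.
post_cost, post_path = shortest_path(after, "A", "E")

for label, path, cost in [
    ("pre-failure", pre_path, pre_cost),
    ("repair", repair_path, repair_cost),
    ("post-convergence", post_path, post_cost),
]:
    print(f"{label:17s} {'-'.join(path)} cost {cost}")
```

Under these assumed costs the repair path (via C) and the post-convergence path (via B-D) differ for traffic entering at A, while for traffic injected at C the two coincide.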


ii. From my POV this is at best a misleading statement because it does not really address Stewart’s comment, which was about traffic that */traversed the PLR before convergence/* but */would not traverse it after convergence/*.

iii. This is not a fine distinction: actually it indicates that selecting the post-convergence path for repair is more or less useless (unless the traffic originates at the PLR).

#Ahmed: Thanks for pointing out this *additional benefit* of providing a post-convergence backup path. If a flow starts to use the PLR after a failure, then the presence of a post-convergence backup path on the PLR extends the benefits of using the post-convergence path to flows that did not use the PLR prior to the topology change. I will modify the statement in the introduction to indicate that :)

SB> Sorry, I don't understand that point, please could you provide a network fragment that illustrates the advantage?


c. *Selecting the post-convergence path is detrimental to scalability of the solution*. Please note that in RFC 7490 <https://tools.ietf.org/html/rfc7490> “the Q-space of E with respect to link S-E is used as a proxy for the Q-space of each destination” in order to provide a scalable solution, but this clearly is not the case for draft-bashandy if post-convergence paths are used. To the best of my understanding, the authors have not, so far, done anything to address this comment.

#Ahmed: I fail to see why there is a scalability problem. draft-bashandy-rtgwg-segment-routing-ti-lfa just prefers particular node(s) in the Q space if that node(s) is (are) along the post convergence path. But if I am missing something, it would be great to point out exactly what aspect of scalability you are concerned about so that we can address it

SB> The point is that traffic is constrained to a subset of Q-space on the false assumption that it needs to traverse that part of Q-space post convergence. The network fragment above clearly illustrates the point.

<snip>

>
        > The work has four basic components, the concept of resolving the

        > problem of P and Q being non-adjacent, the use of SR to solve the

        > non-adjacency, the use of the post convergence path following failure

        > and the applicability of these techniques to an SR network. The first

        > and second points seem of utility in non-SR networks, and so I am

        > surprised that they are not called out as such, in the first case

        > perhaps with consideration to strategically placed RSVP tunnels, or

        > binding segments.

        The draft already mentions that the work builds on top of existing FRR 
work. For example

        the second statement of the abstract already says

           builds on proven IP-FRR concepts being

           LFAs, remote LFAs (RLFA), and remote LFAs with directed forwarding

           (DLFA).

        The statement about the possibility of using RSVP is clearly outside the scope of the document, as mentioned in the first paragraph of the introduction.

SB> In the introduction you set out the context of the work so I don't think that RSVP should be dismissed so lightly since what you are doing is pretty much what an RSVP tunnel repair would do, including taking into account the traffic volumes. Setting out why the complexity of this repair technique outweighs the complexity of RSVP is part of the deployment consideration process, and you ought to provide a balanced view so that
the reader can evaluate the issues for themselves.

>
        > The issue of mapping the repair path to the post-convergence path is

        > something that has always concerned me in this concept. It is true

        > that traffic that always passes through the PLR will experience the

        > properties the authors describe, but not all traffic will pass through

        > the PLR post convergence. The post failure path will be topology

        > dependent, and may take a different path from the point of ingress.

        #Ahmed

        The fourth paragraph in the introduction clearly mentions that we are 
protecting the traffic passing through the PLR.

SB> See above. It only applies to a subset of that traffic.

>
        > I am also concerned that the authors do not discuss the need for loop

        > free convergence, since although traffic going through the repair path

        > will be loop-free, traffic arriving at the PLR might not be. Consider

        > for example a topology fragment that looks like a clock with a router

        > at each minute. Traffic enters at 9 o'clock, leaves at 3 o'clock and

        > goes via 12 o'clock and 12 o'clock fails.  The routers 9..12 will

        > re-converge at different times and this may give rise to the

        > micro-looping of traffic trying to get to the PLR. A summary of the

        > problem and a pointer to the companion draft may be sufficient.

        #Ahmed

        The last statement in the first paragraph in the introduction refers the reader to the uloop avoidance draft which handles non-local failures.


SB> The question I have had for a long time is whether traffic patterns are such that deployment of this makes any sense without co-deployment of the uloop draft. The implication has always been that it does, but that puts the onus on the operator to validate that position against all first and probably all second failures for each evolution of their topology.
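
The clock example can be turned into a toy model to show where the micro-loop appears. This is a minimal sketch under stated assumptions: one router per minute mark, the pre-failure path from 9 o'clock runs clockwise through 12 o'clock as the example states, and a router that has re-converged forwards anticlockwise instead. Only the arc from minute 45 to minute 59 is modelled, and `next_hop` and `find_micro_loops` are hypothetical helper names, not anything defined in either draft.

```python
# Toy model of the clock topology: one router per minute mark.
# Minute 0 (12 o'clock) has failed; minute 59 is its neighbour (the PLR).
ENTRY, PLR = 45, 59

def next_hop(router, converged):
    """Next hop towards 3 o'clock for a router on the 45..59 arc.

    A router that has not yet re-converged still forwards clockwise,
    towards 12 o'clock. A converged router forwards anticlockwise,
    away from the failure.
    """
    return router - 1 if router in converged else router + 1

def find_micro_loops(converged):
    """Return adjacent router pairs that forward the traffic to each other."""
    loops = []
    for router in range(ENTRY, PLR):
        neighbour = router + 1
        sends_forward = next_hop(router, converged) == neighbour
        sent_back = next_hop(neighbour, converged) == router
        if sends_forward and sent_back:
            loops.append((router, neighbour))
    return loops

# Nobody has converged yet: traffic still reaches the PLR and is repaired.
print(find_micro_loops(set()))

# The two routers nearest the failure converge first: 57 still forwards to
# 58, which now sends the traffic straight back.
print(find_micro_loops({58, 59}))

# Routers converge starting from the ingress side: no loop in this model.
print(find_micro_loops(set(range(45, 53))))
```

In this model a loop forms whenever a router that has not converged sits immediately upstream of one that has, so whether traffic ever reaches the PLR's repair depends on the order in which the routers between 9 and 12 update.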

>
        > There is no discussion of multiple failures, nor as far as I can see

        > of failures that are worse than anticipated. This is an important

        > point that needs to be established early. Some methods, (MRT)

        > intrinsically address multiple failures, others (NV) intrinsically

        > exclude them. Simple LFA needs a supervisor to quickly abandon all

        > hope when they occur.

        #Ahmed

        As specified in the 3rd paragraph of the introduction the scope of the 
document is limited to single link, single node, and single local SRLG failure.

SB> Given that there *will* be multiple failures from time to time, the text needs to provide advice on the matter.

>
        > In an SR network the paths used are not the shortest paths, they are a

        > collection of shortest paths, so there needs to be some discussion on

        > the interaction between the SR paths and repair paths to consider

        > whether it is unconditionally safe against forwarding loops. It would

        > presumably be so if the authors borrowed the concept of repair

        > addresses rather than normal forwarding addresses from not-via, but I

        > don't think they have done this.

        #Ahmed

        Again the second statement of the 1st paragraph of the Introduction says

           By relying on segment routing this document provides

                  a local repair mechanism for standard IGP shortest path

        So the scope of the document is quite clear


SB> Reflecting on this, I cannot think of a case where you would generate a forwarding loop, provided you ensure that no path between SR nodes crosses the failure, including ECMP paths.

SB> Please can you confirm that ECMP through the failure is prohibited by the algorithm?
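
The ECMP condition can be tested with a standard distance check: a link (u, v) lies on some equal-cost shortest path from X to Y exactly when dist(X, u) + cost(u, v) + dist(v, Y) equals dist(X, Y), in either direction. The sketch below applies that to the network fragment earlier in this thread. It is an illustration only, not the draft's algorithm: it assumes symmetric link costs, unit cost on the unlabelled links, and that segments are evaluated on the pre-failure topology because the nodes forwarding them may not yet have converged. `distances` and `segment_may_cross` are hypothetical names.

```python
import heapq

# Same fragment as earlier in the thread; unit costs on unlabelled links
# are an assumption inferred from the quoted path costs.
LINKS = {
    ("A", "B"): 1, ("B", "C"): 10, ("B", "D"): 12, ("C", "D"): 1,
    ("D", "E"): 1, ("C", "F"): 1, ("F", "G"): 1, ("G", "D"): 1,
}

GRAPH = {}
for (u, v), cost in LINKS.items():
    GRAPH.setdefault(u, {})[v] = cost
    GRAPH.setdefault(v, {})[u] = cost

def distances(graph, src):
    """Dijkstra: shortest distance from src to every reachable node."""
    dist = {src: 0}
    queue = [(0, src)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return dist

def segment_may_cross(graph, seg_from, seg_to, link):
    """True if any equal-cost shortest path seg_from -> seg_to uses link."""
    u, v = link
    from_src = distances(graph, seg_from)
    to_dst = distances(graph, seg_to)  # valid because costs are symmetric
    total = from_src[seg_to]
    cost = graph[u][v]
    return (from_src[u] + cost + to_dst[v] == total
            or from_src[v] + cost + to_dst[u] == total)

failed = ("C", "D")
# A single node segment from C to G: C-F-G and C-D-G tie at cost 2.
print(segment_may_cross(GRAPH, "C", "G", failed))
# Two segments instead: C to F, then F to G.
print(segment_may_cross(GRAPH, "C", "F", failed))
print(segment_may_cross(GRAPH, "F", "G", failed))
```

With these assumed costs, a single node segment from C to G has an equal-cost path through the failed link, whereas the segments C to F and F to G do not, which is the kind of case the question about ECMP is asking the algorithm to exclude.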

>
        > There should also be some discussion on the original path constraints

        > that are applicable to the repair. Presumably the ingress node

        > constrained the traffic to go through failed node F for a reason. If

        > the repair is unconstrained that reason could be violated, but this is

        > not discussed in the text.

        #Ahmed

        Same response as the response to the previous comment. The scope is 
standard IGP shortest paths

SB> So this can never be used to repair an SR path?

I think you either need to state that as a constraint, or specify how you fulfil the requirements of the node creating the original SR path, or state that when TI-LFA is being used to repair an SR path the path constraints are not in general respected.

> >
        > In the Security section you say:

>
        >     The behavior described in this document is internal functionality

        >     to a router that result in the ability to guarantee an upper bound

        >     on the time taken to restore traffic flow upon the failure of a

        >     directly connected link or node. As such no additional security

        >     risk is introduced by using the mechanisms proposed in this

        >     document.

> >
        > SB> I am not sure that the above is correct. There may be a security

        > reason

        > SB> why a packet was steered along a path which breaks when you use

        > this

        > SB> technique.

        #Ahmed

        The security consideration section has been modified to indicate that

        the traffic is being steered over the post convergence path and hence 
there

        is no security risk because this is the path that the operator intended 
to use

        after the failure through the metrics configured on the links.

SB> I think we have a counter example.

        In fact by expediting

        rerouting the traffic over the intended post convergence path without 
waiting

        for IGP reconvergence, we have introduced a minor security enhancement 
by reducing

        misforwarding and/or traffic drop

>
        > In the conclusion you say:

>
        >     The

        >     mechanism is able to calculate the backup path irrespective of the

        >     topology as long as the topology is sufficiently redundant.

> >
        > SB> That is certainly true in classic. I am not sure this is

        > universally

        > SB> true under SR which includes the use of non-shortest path and

        > SB> binding segments.

        #Ahmed

        Again the document is restricted to IGP shortest path as mentioned in 
the introduction


SB> The security section needs to state that if this is used to repair an SR path additional
security considerations apply.

It sounds like it needs to state that in the event of multiple failures the result is
undetermined.


        > Minor issues:

>
        >     For each destination in the network, TI-LFA prepares a data-plane

        >     switch-over to be activated upon detection of the failure of a

        >     link used to reach the destination.

>
        > SB> To make the scaling clearer to the reader, I think you need

        > SB> to make it clear that for each protected link, you determine

        > SB> the repair needed to reach every destination reachable over that

        > SB> link. You sort of say that, but it's a bit hidden.

        #Ahmed

        I do not understand the difference between the text in the draft and the

        text that you are proposing. We think that our text is quite clear

        >     We provide the TI-LFA approach that achieves guaranteed coverage

        >     against link, node, and local SRLG failure, in any IGP network,

        >     relying on the flexibility of SR.

>
        > SB> Should that be any SINGLE link.... failure?

        #Ahmed

        As mentioned a few times above, the introduction clearly mentions *single*

SB> Yes, but the point is so important that you cannot over emphasise it.

        > In the text (and the text that follows)

>
        >     To do so, S applies a "NEXT" operation on Adj(S-F) and then two

        >     consecutive "PUSH" operations: first it pushes a node segment for

        > F,

        >     and then it pushes a protection list allowing to reach F while

        >     bypassing S-F.

>
        > You need to reference the SR operations.

        #Ahmed

        This paragraph is in Section 5.2.1. The latest version refers to the SR 
draft


SB> I think the point remains.

>
        > Also you are considering Adj segments, and presumably they were there

        > for a reason, but you do not discuss that.

        #Ahmed

        Section 5.2 discusses protecting adjacency segments

>
        > In 5.3.1 and 5.3.2 you have a list of conditions, but do not make it

        > clear whether any or all must be true.

>
        #Ahmed

        The intention is for all of the conditions to be true. I will make it 
clear in the next version

SB> Thank you.

        > Nits

>
        > 1. Introduction

>
        >     Segment Routing aims at supporting services with tight SLA

        >     guarantees [1]. This document provides a local repair mechanism

        >     relying on SR-capable of restoring end-to-end connectivity in the

        >     case of a sudden failure of a network component.

>
        > SB> Grammar needs a little work in the last sentence.

        #Ahmed

        Addressed in the latest version of the document

SB> Thank you.

        > In Fig 1, I assume that the blobs are network fragments.

>
        > In the conclusion you say:

        >     This document proposes a mechanism that is able to pre-calculate a

        >     backup path for every primary path so as to be able to protect

        >     against the failure of a directly connected link or node.

        > SB> you need to add SRLG

        #Ahmed

        Addressed in the latest version of the draft

SB> Thank you.

- Stewart


