On 09/07/2018 20:53, Ahmed Bashandy wrote:
Thanks for the comments
See the reply inline at #Ahmed
Ahmed
On 5/29/18 3:35 AM, Alexander Vainshtein wrote:
Robert, Chris and all,
I agree with Robert that it is up to the authors of an individual
submission what they consider in or out of scope of the draft.
However, I agree with Chris that the authors of an individual draft
asking for its adoption by a specific WG should do their best to
address the comments they have received from the WG members.
#Ahmed: Thanks a lot
SB> I am not sure what that reply means.
From my POV this did not happen in the case of the draft in question
– for the following reasons:
1. In his early RTG-DIR review
<https://datatracker.ietf.org/doc/review-bashandy-rtgwg-segment-routing-ti-lfa-00-rtgdir-early-bryant-2017-05-31/>
of the draft Stewart has pointed to the following issues with the
-00 version of the draft (needless to say, I defer to Stewart
regarding resolution of these issues):
a. *No IPR disclosures for draft-bashandy* in spite of 3 IPR
disclosures for its predecessor draft-francois. I have not seen any
attempts to address this issue – at least, search of IPR disclosures
for draft-bashandy did not yield any results today
#Ahmed: Since draft-bashandy inherits draft-francois, then the IPR of
the latter applies to the former. But if there is a spec that requires
re-attaching the IPR of an inherited draft to the inheriting draft, it
would be great to point it out or point out some other draft where
that occurred so that I can follow the exact procedure.
SB> There is (at least now) an IPR disclosure. Hopefully this covers all
of the applicable IPR. We should move on.
b. *Selecting the post-convergence path* (inheritance from
draft-francois) does not provide for any benefits for traffic that
will not pass via the PLR */after convergence/*.
i. The authors claim to have addressed this issue by stating that
“Protection applies to traffic which traverses the Point of Local
Repair (PLR). Traffic which does NOT traverse the PLR remains
unaffected.”
SB> It is not as simple as that, and I think that the draft needs to
provide greater clarity.
I think there will be better examples, but consider
             12
      +--------------+
      |              |
A-----B-----C---//---D----E
         10 |        |
            F--------G
Traffic injected at C will initially go C-D-E at cost 2, will be
repaired C-F-G-D-E at cost 4 and will remain on that path post
convergence. This congruence of path is what TI-LFA claims.
However, a long standing concern about traffic starting further back in
the network needs to be more clearly addressed in the draft to clearly
demonstrate the scope of applicability.
For traffic starting at A, before failure the path is A-B-C-D-E cost 13
TI-LFA will repair to make the path A-B-C-F-G-D-E cost 15 because TI-LFA
optimises based on local repairs computed at C.
After convergence the path will be A-B-D-E cost 14.
So the draft needs to make it clear to the reader that TI-LFA only
provides benefit to traffic which traverses the PLR before and after
failure. Traffic which does not pass through the PLR after the failure
will need to be traffic engineered separately from traffic that passes
through the PLR in both cases.
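The path costs quoted in the example above can be checked with a short shortest-path computation. This is only a minimal sketch: the individual link costs (A-B 1, B-C 10, B-D 12, all other links 1) are inferred from the quoted path costs and are an assumption, and the repair is modelled simply as "pre-failure path to the PLR, then the PLR's own post-convergence path", which is how the text describes TI-LFA at C.

```python
import heapq

def shortest_path(graph, src, dst):
    """Plain Dijkstra. Returns (cost, path) or (inf, []) if unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, weight in graph[node].items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + weight, nbr, path + [nbr]))
    return float("inf"), []

def build(links):
    """Build an undirected weighted graph from (a, b, cost) tuples."""
    graph = {}
    for a, b, weight in links:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph

# Assumed link costs, inferred from the path costs quoted in the text.
LINKS = [("A", "B", 1), ("B", "C", 10), ("C", "D", 1), ("D", "E", 1),
         ("B", "D", 12), ("C", "F", 1), ("F", "G", 1), ("G", "D", 1)]
FAILED_LINK = {"C", "D"}
PLR = "C"

before = build(LINKS)
after = build([l for l in LINKS if {l[0], l[1]} != FAILED_LINK])

# Pre-failure path from A to E.
pre_cost, pre_path = shortest_path(before, "A", "E")

# Repaired path: unchanged up to the PLR, then the PLR's
# post-convergence path to the destination.
head_cost, head_path = shortest_path(before, "A", PLR)
tail_cost, tail_path = shortest_path(after, PLR, "E")
repair_cost = head_cost + tail_cost
repair_path = head_path + tail_path[1:]

# Post-convergence path as computed by the ingress A itself.
post_cost, post_path = shortest_path(after, "A", "E")

print("before failure :", "-".join(pre_path), pre_cost)        # A-B-C-D-E 13
print("during repair  :", "-".join(repair_path), repair_cost)  # A-B-C-F-G-D-E 15
print("after converge :", "-".join(post_path), post_cost)      # A-B-D-E 14
```

Under these assumed costs the repaired path and the post-convergence path coincide only for traffic injected at C; for traffic injected at A they differ (15 versus 14), which is the distinction the comment is asking the draft to spell out.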
ii. From my POV this is at best a misleading statement because it does
not really address Stewart’s comment which was about traffic that
*/traversed the PLR before convergence/* but */would not traverse it
after convergence/*.
iii. This is not a fine distinction: actually it indicates that
selecting post-convergence path for repair is more or less useless
(unless the traffic originates at the PLR).
#Ahmed: Thanks for pointing out this *additional benefit* of providing
a post-convergence backup path. If a flow starts to use the PLR after a
failure, then the presence of a post convergence backup path on the
PLR extends the benefits of using the post-convergence path to flows
that did not use the PLR prior to the topology change. I will modify the
statement in the introduction to indicate that :)
SB> Sorry, I don't understand that point, please could you provide a
network fragment that illustrates the advantage?
c. *Selecting the post-convergence path is detrimental to scalability
of the solution*. Please note that in RFC 7490
<https://tools.ietf.org/html/rfc7490> “the Q-space of E with respect
to link S-E is used as a proxy for the Q-space of each destination”
in order to provide a scalable solution – but this clearly is not
the case of draft-bashandy if post-convergence paths are used. To the
best of my understanding, the authors did not, so far, do anything to
address this comment.
#Ahmed: I fail to see why there is a scalability problem.
draft-bashandy-rtgwg-segment-routing-ti-lfa just prefers particular
node(s) in the Q space if that node(s) is (are) along the post
convergence path.
But if I am missing something, it would be great to point out exactly
what aspect of scalability you are concerned about so that we can
address it
SB> The point is that traffic is constrained to a subset of Q-space on
the false assumption that it needs to traverse that part of Q-space post
convergence. The network fragment above clearly illustrates the point.
<snip>
>
> The work has four basic components, the concept of resolving the
> problem of P and Q being non-adjacent, the use of SR to solve the
> non-adjacency, the use of the post convergence path following failure
> and the applicability of these techniques to an SR network. The first
> and second points seem of utility in non-SR networks, and so I am
> surprised that they are not called out as such, in the first case
> perhaps with consideration to strategically placed RSVP tunnels, or
> binding segments.
The draft already mentions that the work builds on top of existing FRR
work. For example
the second statement of the abstract already says
builds on proven IP-FRR concepts being
LFAs, remote LFAs (RLFA), and remote LFAs with directed forwarding
(DLFA).
The statement about the possibility of using RSVP is clearly outside
the scope of document as mentioned in first paragraph of the introduction.
SB> In the introduction you set out the context of the work so I don't
think that RSVP should be dismissed so lightly since what you are doing
is pretty much what an RSVP tunnel repair would do, including taking
into account the traffic volumes. Setting out why the complexity of this
repair technique outweighs the complexity of RSVP is part of the
deployment consideration process, and you ought to provide a balanced
view so that the reader can evaluate the issues for themselves.
>
> The issue of mapping repair path to the post convergence path is
> something that has always concerned me in this concept. It is true
> that traffic that always passes through the PLR will experience the
> properties the authors describe, but not all traffic will pass through
> the PLR post convergence. The post failure path will be topology
> dependent, and may take a different path from the point of ingress.
#Ahmed
The fourth paragraph in the introduction clearly mentions that we are
protecting the traffic passing through the PLR.
.... See above. It only applies to a subset of that traffic.
>
> I am also concerned that the authors do not discuss the need for loop
> free convergence, since although traffic going through the repair path
> will be loop-free, traffic arriving at the PLR might not be. Consider
> for example a topology fragment that looks like a clock with a router
> at each minute. Traffic enters at 9 o'clock, leaves at 3 o'clock and
> goes via 12 o'clock and 12 o'clock fails. The routers 9..12 will
> re-converge at different times and this may give rise to the
> micro-looping of traffic trying to get to the PLR. A summary of the
> problem and a pointer to the companion draft may be sufficient.
#Ahmed
The last statement in the first paragraph in the introduction refers
the reader to the uloop avoidance draft which handles non-local failures
SB> The question I have had for a long time is whether traffic patterns
are such that deployment of this makes any sense without co-deployment
of the uloop draft. The implication has always been that it does, but
that puts the onus on the operator to validate that position against all
first and probably all second failures for each evolution of their
topology.
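The clock example quoted above can be made concrete with a small forwarding trace. This is a rough sketch under several assumptions that are not in the original text: the ring is reduced to one router per hour instead of one per minute, the equal-cost tie at 9 o'clock is broken via 12 before the failure, and a packet repaired by the PLR (11 o'clock) is carried around the other side of the ring without being re-evaluated against stale forwarding state.

```python
DEST = 3                   # traffic leaves the ring at 3 o'clock
FAILED = 12                # the router at 12 o'clock has failed
AFFECTED = {9, 10, 11}     # routers whose next hop toward DEST changes

def trace(ingress, converged, max_hops=24):
    """Follow one packet from `ingress` toward DEST.

    `converged` is the set of affected routers that have already
    installed their post-failure next hop. Returns the list of routers
    visited; if the last entry is not DEST the packet was caught in a
    micro-loop and the hop limit was reached.
    """
    path, node, repaired = [ingress], ingress, False
    while node != DEST and len(path) <= max_hops:
        if repaired or node in converged or node not in AFFECTED:
            nxt = node - 1         # counter-clockwise, away from 12
        elif node + 1 == FAILED:
            repaired = True        # PLR detects the failure and repairs
            nxt = node - 1
        else:
            nxt = node + 1         # stale state: still heading for 12
        path.append(nxt)
        node = nxt
    return path

# Nobody has converged: the packet reaches the PLR and is repaired.
print(trace(9, set()))             # [9, 10, 11, 10, 9, 8, 7, 6, 5, 4, 3]

# Everybody has converged: the packet takes the new path directly.
print(trace(9, {9, 10, 11}))       # [9, 8, 7, 6, 5, 4, 3]

# Router 10 has converged but router 9 has not: 9 and 10 bounce the
# packet between them and it never reaches the PLR at all.
looped = trace(9, {10})
print(looped[:6], "delivered:", looped[-1] == DEST)
```

The third case is the one the comment is about: the repair at the PLR is loop-free, but traffic can micro-loop upstream of the PLR while routers 9 to 11 converge in an unfavourable order.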
>
> There is no discussion of multiple failures, nor as far as I can see
> of failures that are worse than anticipated. This is an important
> point that needs to be established early. Some methods, (MRT)
> intrinsically address multiple failures, others (NV) intrinsically
> exclude them. Simple LFA needs a supervisor to quickly abandon all
> hope when they occur.
#Ahmed
As specified in the 3rd paragraph of the introduction the scope of the
document is limited to single link, single node, and single local SRLG failure.
SB> Given that there *will* be multiple failures from time to time, the
text needs to provide advice on the matter.
>
> In an SR network the paths used are not the shortest paths, they are a
> collection of shortest paths, so there needs to be some discussion on
> the interaction between the SR paths and repair paths to consider
> whether it is unconditionally safe against forwarding loops. It would
> presumably be so if the authors borrowed the concept of repair
> addresses rather than normal forwarding addresses from not-via, but I
> don't think they have done this.
#Ahmed
Again the second statement of the 1st paragraph of the Introduction says
By relying on segment routing this document provides
a local repair mechanism for standard IGP shortest path
So the scope of the document is quite clear
SB> Reflecting on this, I cannot think of a case where you would
generate a forwarding loop, provided you ensure that no path between SR
nodes crosses the failure, including ECMP paths.
SB> Please can you confirm that ECMP through the failure is prohibited
by the algorithm?
>
> There should also be some discussion on the original path constraints
> that are applicable to the repair. Presumably the ingress node
> constrained the traffic to go through failed node F for a reason. If
> the repair is unconstrained that reason could be violated, but this is
> not discussed in the text.
#Ahmed
Same response as the response to the previous comment. The scope is
standard IGP shortest paths
SB> So this can never be used to repair an SR path?
I think you either need to state that as a constraint, or specify how
you fulfil the requirements of the node creating the original SR path,
or you need to state that when TI-LFA is being used to repair an SR path
the path constraints are not in general respected.
>
>
> In the Security section you say:
>
> The behavior described in this document is internal functionality
> to a router that result in the ability to guarantee an upper bound
> on the time taken to restore traffic flow upon the failure of a
> directly connected link or node. As such no additional security
> risk is introduced by using the mechanisms proposed in this
> document.
>
>
> SB> I am not sure that the above is correct. There may be a security
> SB> reason why a packet was steered along a path which breaks when you
> SB> use this technique.
#Ahmed
The security consideration section has been modified to indicate that
the traffic is being steered over the post convergence path and hence
there is no security risk because this is the path that the operator
intended to use after the failure through the metrics configured on the
links.
SB> I think we have a counter example.
In fact by expediting rerouting the traffic over the intended post
convergence path without waiting for IGP reconvergence, we have
introduced a minor security enhancement by reducing misforwarding and/or
traffic drop
>
> In the conclusion you say:
>
> The
> mechanism is able to calculate the backup path irrespective of the
> topology as long as the topology is sufficiently redundant.
>
>
> SB> That is certainly true in classic. I am not sure this is
> SB> universally true under SR which includes the use of non-shortest
> SB> path and binding segments.
#Ahmed
Again the document is restricted to IGP shortest path as mentioned in
the introduction
SB> The security section needs to state that if this is used to repair
an SR path additional security considerations apply.
It sounds like it needs to state that in the event of multiple failures
the result is undetermined.
> Minor issues:
>
> For each destination in the network, TI-LFA prepares a data-plane
> switch-over to be activated upon detection of the failure of a
> link used to reach the destination.
>
> SB> To make the scaling clearer to the reader, I think you need
> SB> to make it clear that for each protected link, you determine
> SB> the repair needed to reach every destination reachable over that
> SB> link. You sort of say that, but it's a bit hidden.
#Ahmed
I do not understand the difference between the text in the draft and the
text that you are proposing. We think that our text is quite clear
> We provide the TI-LFA approach that achieves guaranteed coverage
> against link, node, and local SRLG failure, in any IGP network,
> relying on the flexibility of SR.
>
> SB> Should that be any SINGLE link.... failure?
#Ahmed
As mentioned a few times above, the introduction clearly mentions
*single*
SB> Yes, but the point is so important that you cannot over emphasise it.
> In the text (and the text that follows)
>
> To do so, S applies a "NEXT" operation on Adj(S-F) and then two
> consecutive "PUSH" operations: first it pushes a node segment for
> F,
> and then it pushes a protection list allowing to reach F while
> bypassing S-F.
>
> You need to reference the SR operations.
#Ahmed
This paragraph is in Section 5.2.1. The latest version refers to the SR
draft
SB> I think the point remains.
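For readers who do not have the SR operations to hand, the quoted sequence (one "NEXT", then two "PUSH" operations) can be pictured as manipulation of a segment list whose first entry is the active segment. This is only an illustrative sketch: the function names, the destination D, and the repair nodes R1 and R2 in the protection list are hypothetical and do not come from the draft.

```python
def next_op(segments):
    """NEXT: the active (first) segment is completed and removed."""
    return segments[1:]

def push(segments, *new_segments):
    """PUSH: insert segments on top; the first one pushed in a single
    call becomes the new active segment."""
    return list(new_segments) + list(segments)

# Packet arriving at S with an adjacency segment for link S-F active.
incoming = ["Adj(S-F)", "Node(D)"]

# Link S-F has failed, so S cannot honour Adj(S-F) directly.
repaired = next_op(incoming)            # drop Adj(S-F)
repaired = push(repaired, "Node(F)")    # still deliver the packet to F
# Hypothetical protection list that reaches F while bypassing S-F.
repaired = push(repaired, "Node(R1)", "Adj(R1-R2)")

print(repaired)
# ['Node(R1)', 'Adj(R1-R2)', 'Node(F)', 'Node(D)']
```

The order matters: the node segment for F is pushed first so that it sits underneath the protection list, and the packet follows the repair segments before resuming toward F and then D.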
>
> Also you are considering Adj segments, and presumably they were there
> for a reason, but you do not discuss that.
#Ahmed
Section 5.2 discusses protecting adjacency segments
>
> In 5.3.1 and 5.3.2 you have a list of conditions, but do not make it
> clear whether any or all must be true.
>
#Ahmed
The intention is for all of the conditions to be true. I will make it
clear in the next version
SB> Thank you.
> Nits
>
> 1. Introduction
>
> Segment Routing aims at supporting services with tight SLA
> guarantees [1]. This document provides a local repair mechanism
> relying on SR-capable of restoring end-to-end connectivity in the
> case of a sudden failure of a network component.
>
> SB> Grammar needs a little work in the last sentence.
#Ahmed
Addressed in the latest version of the document
SB> Thank you.
> In Fig 1, I assume that the blobs are network fragments.
>
> In the conclusion you say:
> This document proposes a mechanism that is able to pre-calculate a
> backup path for every primary path so as to be able to protect
> against the failure of a directly connected link or node.
> SB> you need to add SRLG
#Ahmed
Addressed in the latest version of the draft
SB> Thank you.
- Stewart