Hi Yingzhen/John/Ahmed/All, I have some text suggestions for your consideration.
a) Abstract

v17
A key aspect of TI-LFA is the FRR path selection approach establishing protection over the expected post-convergence paths from the point of local repair, reducing the operational need to control the tie-breaks among various FRR options.

v18
Although not a TI-LFA requirement or constraint, TI-LFA also brings the benefit of the ability to provide a backup path that follows the expected post-convergence path, reducing the operational need to control the tie-breaks among various FRR options.

NEW
An *important* aspect of TI-LFA is the FRR path selection approach establishing protection over the expected post-convergence paths from the point of local repair, reducing the operational need to control the tie-breaks among various FRR options.

b) sec 6.1

v18
When a direct neighbor is in P(S,X) and Q(D,x) and the link to that direct neighbor is on the post-convergence path, the outgoing interface is set to that neighbor and the repair segment list SHOULD be empty.

NEW
When a direct neighbor is in P(S,X) and Q(D,x) and the link to that direct neighbor is on the post-convergence path, the outgoing interface is set to that neighbor and the repair segment list *is* empty.

c) sec 6.2

v17
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list SHOULD be made of a single node segment to R and the outgoing interface SHOULD be set to the outgoing interface used to reach R.

v18
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list can be made of a single node segment to R and the outgoing interface set to the outgoing interface used to reach R, thereby minimizing the size of the repair-list while keeping the repair path on the post-convergence path.

NEW
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list is made of a single node segment to R and the outgoing interface *is* set to the outgoing interface used to reach R.

d) sec 6.3

v17
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list SHOULD be made of two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

v18
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list size can be minimized while keeping the repair path on the post-convergence path by constructing it from two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

NEW
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list *is* made of two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

e) sec 9

v17
An implementation MAY support TI-LFA to protect Node-SIDs associated to a FlexAlgo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation SHOULD use the constrained SPF algorithm bound to the FlexAlgo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. An implementation MUST only use Node-SIDs bound to the FlexAlgo and/or Adj-SIDs that are unprotected or, in case of SRv6, adj-SIDs that are bound to the FlexAlgo to build the repair list.

v18
An implementation MAY support TI-LFA to protect Node-SIDs associated to a FlexAlgo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation MAY use the constrained SPF algorithm bound to the FlexAlgo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. If an implementation uses the constrained SPF algorithm bound to the FlexAlgo, then the implementation MUST only use Node-SIDs bound to the FlexAlgo and/or Adj-SIDs that are unprotected or, in case of SRv6, adj-SIDs that are bound to the FlexAlgo to build the repair list.

NEW
An implementation MAY support TI-LFA to protect Node-SIDs associated *with* a Flex Algo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation *SHOULD* use the constrained SPF algorithm bound to the Flex Algo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. *Furthermore, the implementation SHOULD only use Node-SIDs/Adj-SIDs bound to the Flex Algo and/or unprotected Adj-SIDs of the regular SPF to build the repair list. The use of regular Dijkstra for the TI-LFA computation or building of the repair path using SIDs other than those recommended does not ensure that the traffic going over TI-LFA repair path during the fast-reroute period is honoring the Flex Algo constraints.*

In addition to the above text change suggestions, I would note that strictly following the post-convergence path is not guaranteed by TI-LFA, as it depends on the protection scheme selected. The following text in Appendix A explains this scenario.

Readers should be aware that FRR protection is pre-computing a backup path to protect against a particular type of failure (link, node, SRLG). When using the post-convergence path as FRR backup path, the computed post-convergence path is the one considering the failure we are protecting against. This means that FRR is using an expected post-convergence path, and this expected post-convergence path may be actually different from the post-convergence path used if the failure that happened is different from the failure FRR was protecting against.

As an example, if the operator has implemented a protection against a node failure, the expected post-convergence path used during FRR will be the one considering that the node has failed. However, even if a single link is failing or a set of links is failing (instead of the full node), the node-protecting post-convergence path will be used. The consequence is that the path used during FRR is not optimal with respect to the failure that has actually occurred.

I hope this helps get us closer to the resolution of the open issues with this document.

Thanks,
Ketan

On Fri, Nov 15, 2024 at 8:09 AM Yingzhen Qu <yingzhen.i...@gmail.com> wrote:

> Speaking as WG member, I agree with John's comments and what Stewart and Sasha said at the mic, the removal of the requirement to follow post-convergence path is a big change. If it's not mandatory anymore, we need to document under what situation, post-convergence path is recommended and why? and the situations why it's not necessary to follow post-convergence path.
>
> As WG co-chair, this change should be clearly communicated with the WG. We need to poll the WG for consensus. If it helps, we can have an interim meeting to discuss and review the document.
>
> Thanks,
> Yingzhen
>
> On Thu, Nov 14, 2024 at 2:16 PM John Scudder <j...@juniper.net> wrote:
>
>> Hi Ahmed,
>>
>> Thanks for the update. I read the diff, and I listened to the recording of your rtgwg presentation.
>>
>> I've written a long message. For convenience, the bottom line (TL;DR as it were) is that I think the conversation that was started with Stewart and Sasha at the mic line at IETF-121 needs to be worked through. Once the RTGWG chairs and AD are satisfied, I'll abide by that.
>>
>> Now the long version:
>>
>> On Nov 13, 2024, at 3:01 PM, Ahmed Bashandy <abashandy.i...@gmail.com> wrote:
>>
>> I uploaded version 18 of the ti-lfa draft to address the two DISCUSS items in
>>
>> https://datatracker.ietf.org/doc/draft-ietf-rtgwg-segment-routing-ti-lfa/ballot/
>>
>> - To address John Scudder's Discuss, I made the modifications to remove the word "key" from the abstract as suggested by Sasha at
>>
>> https://mailarchive.ietf.org/arch/msg/rtgwg/nWR4uYaT3T30XRiyRdAoIqO22AM/
>> and Pierre at
>>
>> https://mailarchive.ietf.org/arch/msg/rtgwg/zHP2qvP2Ew1oWl5G7Gq8niu8vy8/
>>
>> - To address Murray's Discuss (as well as comments from others) I removed the word "SHOULD" from sections 6.2, 6.3, and 9 as I suggested during my presentation during the rtgwg meeting last Tuesday Nov/5/24.
>> The entire recording of the RTGWG meeting can be found in
>>
>> https://meetecho-player.ietf.org/playout/?session=IETF121-RTGWG-20241105-0930
>>
>> The slides that I presented in PDF format can be found in
>>
>> https://datatracker.ietf.org/meeting/121/materials/slides-121-rtgwg-02-tilfa-bgppic-00.pdf
>>
>>
>> Please take a look and see if the modifications are good to address the two DISCUSS Items
>>
>>
>> In your update you've gotten rid of "key". That's fine as far as it goes, and I agree it resolves the inconsistency between the abstract and body. But that was just an editorial issue, the canary in the coal mine as it were, that illuminated the more general point. Perhaps I expressed myself poorly in the DISCUSS and that's what led us down this rabbit hole of focusing on the word "key". I apologize for that. My larger concern was expressed very ably by Stewart in the Q&A of your presentation. Rather than try to paraphrase him, I've taken the liberty of starting with the transcript [1] and cleaning it up, appended below. Stewart nails it. (I kept most of it as close to verbatim as I could but did remove a little bit of procedural "keep it quick" stuff from the chair. This is of course not an official transcript anyway.)
>>
>> To elaborate a bit, though: as far as I can tell, the contribution (and it is a big contribution!) of the spec is to show how to use post-convergence paths for restoration. If you remove that (which I can because it's optional), it seems as though there is nothing left that wasn't already specified before (for example in RFC 7490, and others).
>>
>> You mentioned in your comments at the meeting that post-convergence was made optional "because some platforms cannot do it". Normally, when we have a platform that can't do a specification, that's fine, the platform simply wouldn't claim conformance to that specification. If you have, say, a platform that can only forward based on the IPv4 or IPv6 header but not on the MPLS header, you don't change the MPLS specification to say forwarding on the MPLS header is not mandatory. You just don't claim conformance with MPLS. (I chose an extreme case, of course, in hopes of clearly illustrating the point.)
>>
>> If I were confident that the WG consensus is yes, absolutely the WG wants to publish this document in its current "post-convergence is explicitly optional" state, I would move from DISCUSS to ABSTAIN. I would choose ABSTAIN rather than NOOBJ because of the observation above, that as far as I can tell once you remove post-convergence there's nothing left that hasn't been done before. (Note that ABSTAIN is a non-blocking, though also non-supporting, ballot position.)
>>
>> However, it is not clear to me that this is, indeed, a solid WG consensus. In addition to Stewart and Sasha's comments, you also mentioned that you've gotten private emails raising the same concern. Calling consensus for RTGWG isn't my job, I would defer to the chairs and AD (Jim) on that point, but it sounded to me from the RTGWG meeting like this was the next action.
>>
>> One last point, right at the end of the discussion of the draft you say, "I avoid shoulds because of the pushback that I get. But in my opinion it should be a should. [...] Either you guys want me to put it back as a mandatory or say why it's not mandatory. I have a reason why it is not mandatory and I just mentioned it and I can put that."
>>
>> Interestingly, this coincides closely with Murray's DISCUSS ballot, about SHOULD.
>> I get it that you have different views on the use of SHOULD, but per my reading of RFC 2119 the case under discussion here is exactly the kind of situation where it becomes useful. To remind us of what 2119 says:
>>
>> ```
>> 3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
>> may exist valid reasons in particular circumstances to ignore a
>> particular item, but the full implications must be understood and
>> carefully weighed before choosing a different course.
>> ```
>>
>> As far as I can tell, that is what you are saying: an implementation SHOULD use the post-convergence path unless (conditions you will name, e.g., "length of the SID stack is long enough, hardware cannot support it"), in which case that implementation MUST fall back to (whatever the right fallback posture is, RFC 7490 perhaps).
>>
>> I don't insist you use that language or even that approach, nor am I sure it would satisfy the WG -- I just offer it as a point to consider.
>>
>> Thanks,
>>
>> --John
>>
>> My edited transcript:
>>
>> Stewart (17:12)
>>
>> So, Ahmed when this piece of work started, many of us have tracked this piece of work since the first day it was presented at the IETF. When it was presented, the word "key" was important because it was a fundamental concept of the design that the repair path had to follow the post-convergence path and the document kind of has that sort of subtly written in, in various places, except in the places where it doesn't.
>>
>> So I think what is...
>> what the authors need to do is to be quite clear to the working group if it is no longer key, if it is no longer a mandat- a requirement to follow the post-convergence path, then there needs to be an explanation as to why this position has changed and then the text body needs to reflect the consensus position of the working group on whether it is important that it follows the "post-convergence path" or it's not important or there are times when it is and times when it is not, and in which case those circumstances should be documented in the text.
>>
>> Ahmed (18:21)
>>
>> So the document really says that it is not mandatory and it is important and it explains why it is important like I can read part of the document and I'll point them out, actually, I'll reply to your email, but the point here is that we don't really try to put justifications because then I will go into the details of the implementation. I just put the spec there and say, you know what? It is important, but it's not mandatory. You don't have to follow it. Your implementation doesn't have to follow it. If you want to follow it, I have paragraphs that says how you follow it in certain scenarios like that is...
>>
>> (cross-talk)
>>
>> Stewart (18:54)
>>
>> I think you're skipping the important point. The original thesis was that this was a required congruence. That has been dropped, the least you need to do is to explain to the working group why the requirement for congruence has been changed. And then we need to decide what text needs to go in the document to reflect that change of positions. But absolutely, this was a fundamental of the original design and it seems to have been quietly and subtly changed without explanation.
>>
>> Ahmed (19:28)
>>
>> Okay so I thought it's uh yeah I can add a statement that's why it is I thought it's obvious basically because some platforms cannot do it. It's as simple as that.
>> I'll put the sentence if this is why it has been dropped. This was basically a feedback that we got I can try and dig the emails it has been a long while that some hardware simply cannot support it or some software cannot support it if the number if the length of the SID stack is long enough, hardware cannot support it so we can still do topology independent which means you can still get your backup up but it will not be over the post-convergence path. That is the only reason really.
>>
>> Stewart (20:08)
>>
>> I think this probably needs a longer conversation than we can have in this working group and I think uh John I mean Jim probably needs to convene a group of experts.
>>
>> Ahmed (20:19)
>>
>> [elided]
>>
>> Sasha (20:34)
>>
>> I just wanted to second... to say exactly what Stewart has said. I have nothing to add. [Garbled] ... something is called the key aspect of a feature and then called non-mandatory is not... creates a confusion to put it mildly. This has to be resolved one way or another with explanations because there is a loss of history behind this change of requirements. I actually... I second what Stewart has said.
>>
>> Ahmed (21:13)
>>
>> Okay sure, okay, I think I got the point. So I'm open to discussions I have no problem really. Okay sure.
>>
>> JeffT (21:23)
>>
>> Ahmed, do you feel we need another discussion on this? Is it clear what working group is expecting from you in terms of changes and clarifications?
>>
>> Ahmed (21:31)
>>
>> Yeah, my understanding, and again, I'm talking about Stewart and Sasha's comments that the original draft was... I'll have to dig it out to be honest, it's been a long while... that to be TI-LFA the repair path has to be post-convergence. This has been dropped from must-have to important, and I avoid shoulds because of the pushback that I get. But in my opinion it should be a should. But it seems like Sasha and Stewart want it back.
>> And not only Sasha and Stewart, there are other but also other [garbled] exchange email privately, but because it's private I'm not going to divulge their names that also think that it should be put back to mandatory and I'm open to either way. Either you guys want me to put it back as a mandatory or say why it's not mandatory. I have a reason why it is not mandatory and I just mentioned it and I can put that. I'll discuss it with the co-authors and see what they want, but I understand Stewart and Sasha's comments.
>>
>> [1] https://meetecho-player.ietf.org/playout/?session=IETF121-RTGWG-20241105-0930
>>
> _______________________________________________
> rtgwg mailing list -- rtgwg@ietf.org
> To unsubscribe send an email to rtgwg-le...@ietf.org
>