As for the editorial suggestions, I agree with all of them. I have incorporated these changes and attached the diffs from the latest version (version -18) to this email.
Ahmed

On 11/15/24 2:44 AM, Ketan Talaulikar wrote:
Hi Yingzhen/John/Ahmed/All,

I have some text suggestions for your consideration.

a) Abstract

v17
A key aspect of TI-LFA is the FRR path selection approach establishing protection over the expected post-convergence paths from the point of local repair, reducing the operational need to control the tie-breaks among various FRR options.

v18
Although not a TI-LFA requirement or constraint, TI-LFA also brings the benefit of the ability to provide a backup path that follows the expected post-convergence path, reducing the operational need to control the tie-breaks among various FRR options.

NEW
An *_important_* aspect of TI-LFA is the FRR path selection approach establishing protection over the expected post-convergence paths from the point of local repair, reducing the operational need to control the tie-breaks among various FRR options.

b) sec 6.1

v18
When a direct neighbor is in P(S,X) and Q(D,x) and the link to that direct neighbor is on the post-convergence path, the outgoing interface is set to that neighbor and the repair segment list SHOULD be empty.

NEW
When a direct neighbor is in P(S,X) and Q(D,x) and the link to that direct neighbor is on the post-convergence path, the outgoing interface is set to that neighbor and the repair segment list *_is_* empty.

c) sec 6.2

v17
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list SHOULD be made of a single node segment to R and the outgoing interface SHOULD be set to the outgoing interface used to reach R.

v18
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list can be made of a single node segment to R and the outgoing interface set to the outgoing interface used to reach R, thereby minimizing the size of the repair-list while keeping the repair path on the post-convergence path.

NEW
When a remote node R is in P(S,X) and Q(D,x) and on the post-convergence path, the repair list is made of a single node segment to R and the outgoing interface *_is_* set to the outgoing interface used to reach R.

d) sec 6.3

v17
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list SHOULD be made of two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

v18
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list size can be minimized while keeping the repair path on the post-convergence path by constructing it from two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

NEW
When a node P is in P(S,X) and a node Q is in Q(D,x) and both are on the post-convergence path and both are adjacent to each other, the repair list *_is_* made of two segments: A node segment to P (to be processed first), followed by an adjacency segment from P to Q.

e) sec 9

v17
An implementation MAY support TI-LFA to protect Node-SIDs associated to a FlexAlgo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation SHOULD use the constrained SPF algorithm bound to the FlexAlgo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. An implementation MUST only use Node-SIDs bound to the FlexAlgo and/or Adj-SIDs that are unprotected or, in case of SRv6, adj-SIDs that are bound to the FlexAlgo to build the repair list.

v18
An implementation MAY support TI-LFA to protect Node-SIDs associated to a FlexAlgo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation MAY use the constrained SPF algorithm bound to the FlexAlgo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. If an implementation uses the constrained SPF algorithm bound to the FlexAlgo, then the implementation MUST only use Node-SIDs bound to the FlexAlgo and/or Adj-SIDs that are unprotected or, in case of SRv6, adj-SIDs that are bound to the FlexAlgo to build the repair list.

NEW
An implementation MAY support TI-LFA to protect Node-SIDs associated *_with_* a Flex Algo. In such a case, rather than computing the expected post-convergence path based on the regular SPF, an implementation *_SHOULD_* use the constrained SPF algorithm bound to the Flex Algo (using the Flex Algo Definition) instead of the regular Dijkstra in all the SPF/rSPF computations that are occurring during the TI-LFA computation. This includes the computation of the P-Space and Q-Space as well as the post-convergence path. *_Furthermore, the implementation SHOULD only use Node-SIDs/Adj-SIDs bound to the Flex Algo and/or unprotected Adj-SIDs of the regular SPF to build the repair list. The use of regular Dijkstra for the TI-LFA computation or building of the repair path using SIDs other than those recommended does not ensure that the traffic going over TI-LFA repair path during the fast-reroute period is honoring the Flex Algo constraints._*

In addition to the above text change suggestions, I would remind that strict following of post-convergence is not guaranteed by TI-LFA as it depends on the protection scheme selected.
There is the following text that explains this scenario in Appendix A.

Readers should be aware that FRR protection is pre-computing a backup path to protect against a particular type of failure (link, node, SRLG). When using the post-convergence path as FRR backup path, the computed post-convergence path is the one considering the failure we are protecting against. This means that FRR is using an expected post-convergence path, and this expected post-convergence path may be actually different from the post-convergence path used if the failure that happened is different from the failure FRR was protecting against. As an example, if the operator has implemented a protection against a node failure, the expected post-convergence path used during FRR will be the one considering that the node has failed. However, even if a single link is failing or a set of links is failing (instead of the full node), the node-protecting post-convergence path will be used. The consequence is that the path used during FRR is not optimal with respect to the failure that has actually occurred.

I hope this helps get us closer to the resolution of the open issues with this document.

Thanks,
Ketan

On Fri, Nov 15, 2024 at 8:09 AM Yingzhen Qu <yingzhen.i...@gmail.com> wrote:

Speaking as WG member, I agree with John's comments and with what Stewart and Sasha said at the mic: the removal of the requirement to follow the post-convergence path is a big change. If it's not mandatory anymore, we need to document the situations in which following the post-convergence path is recommended and why, and the situations in which it's not necessary to follow the post-convergence path.

As WG co-chair, this change should be clearly communicated with the WG. We need to poll the WG for consensus. If it helps, we can have an interim meeting to discuss and review the document.

Thanks,
Yingzhen

On Thu, Nov 14, 2024 at 2:16 PM John Scudder <j...@juniper.net> wrote:

Hi Ahmed,

Thanks for the update.
I read the diff, and I listened to the recording of your rtgwg presentation. I've written a long message. For convenience, the bottom line (TL;DR as it were) is that I think the conversation that was started with Stewart and Sasha at the mic line at IETF-121 needs to be worked through. Once the RTGWG chairs and AD are satisfied, I'll abide by that.

Now the long version:

On Nov 13, 2024, at 3:01 PM, Ahmed Bashandy <abashandy.i...@gmail.com> wrote:

I uploaded version 18 of the ti-lfa draft to address the two DISCUSS items in https://datatracker.ietf.org/doc/draft-ietf-rtgwg-segment-routing-ti-lfa/ballot/

- To address John Scudder's Discuss, I made the modifications to remove the word "key" from the abstract as suggested by Sasha at https://mailarchive.ietf.org/arch/msg/rtgwg/nWR4uYaT3T30XRiyRdAoIqO22AM/ and Pierre at https://mailarchive.ietf.org/arch/msg/rtgwg/zHP2qvP2Ew1oWl5G7Gq8niu8vy8/

- To address Murray's Discuss (as well as comments from others) I removed the word "SHOULD" from sections 6.2, 6.3, and 9 as I suggested during my presentation at the rtgwg meeting last Tuesday Nov/5/24.
The entire recording of the RTGWG meeting can be found in https://meetecho-player.ietf.org/playout/?session=IETF121-RTGWG-20241105-0930

The slides that I presented in PDF format can be found in https://datatracker.ietf.org/meeting/121/materials/slides-121-rtgwg-02-tilfa-bgppic-00.pdf

Please take a look and see if the modifications are good to address the two DISCUSS Items

In your update you've gotten rid of "key". That's fine as far as it goes, and I agree it resolves the inconsistency between the abstract and body. But that was just an editorial issue, the canary in the coal mine as it were, that illuminated the more general point. Perhaps I expressed myself poorly in the DISCUSS and that's what led us down this rabbit hole of focusing on the word "key". I apologize for that.

My larger concern was expressed very ably by Stewart in the Q&A of your presentation. Rather than try to paraphrase him, I've taken the liberty of starting with the transcript [1] and cleaning it up, appended below. Stewart nails it. (I kept most of it as close to verbatim as I could but did remove a little bit of procedural "keep it quick" stuff from the chair. This is of course not an official transcript anyway.)

To elaborate a bit, though: as far as I can tell, the contribution (and it is a big contribution!) of the spec is to show how to use post-convergence paths for restoration. If you remove that (which I can because it's optional), it seems as though there is nothing left that wasn't already specified before (for example in RFC 7490, and others).

You mentioned in your comments at the meeting that post-convergence was made optional "because some platforms cannot do it".
Normally, when we have a platform that can't do a specification, that's fine, the platform simply wouldn't claim conformance to that specification. If you have, say, a platform that can only forward based on the IPv4 or IPv6 header but not on the MPLS header, you don't change the MPLS specification to say forwarding on the MPLS header is not mandatory. You just don't claim conformance with MPLS. (I chose an extreme case, of course, in hopes of clearly illustrating the point.)

If I were confident that the WG consensus is yes, absolutely the WG wants to publish this document in its current "post-convergence is explicitly optional" state, I would move from DISCUSS to ABSTAIN. I would choose ABSTAIN rather than NOOBJ because of the observation above, that as far as I can tell once you remove post-convergence there's nothing left that hasn't been done before. (Note that ABSTAIN is a non-blocking, though also non-supporting, ballot position.)

However, it is not clear to me that this is, indeed, a solid WG consensus. In addition to Stewart and Sasha's comments, you also mentioned that you've gotten private emails raising the same concern. Calling consensus for RTGWG isn't my job, I would defer to the chairs and AD (Jim) on that point, but it sounded to me from the RTGWG meeting like this was the next action.

One last point, right at the end of the discussion of the draft you say, "I avoid shoulds because of the pushback that I get. But in my opinion it should be a should. [...] Either you guys want me to put it back as a mandatory or say why it's not mandatory. I have a reason why it is not mandatory and I just mentioned it and I can put that." Interestingly, this coincides closely with Murray's DISCUSS ballot, about SHOULD. I get it that you have different views on the use of SHOULD, but per my reading of RFC 2119 the case under discussion here is exactly the kind of situation where it becomes useful. To remind us of what 2119 says:

```
3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.
```

As far as I can tell, that is what you are saying: an implementation SHOULD use the post-convergence path unless (conditions you will name, e.g., "length of the SID stack is long enough, hardware cannot support it"), in which case that implementation MUST fall back to (whatever the right fallback posture is, RFC 7490 perhaps). I don't insist you use that language or even that approach, nor am I sure it would satisfy the WG -- I just offer it as a point to consider.

Thanks,

--John

My edited transcript:

Stewart (17:12)
So, Ahmed when this piece of work started, many of us have tracked this piece of work since the first day it was presented at the IETF. When it was presented, the word "key" was important because it was a fundamental concept of the design that the repair path had to follow the post-convergence path and the document kind of has that sort of subtly written in, in various places, except in the places where it doesn't. So I think what is... what the authors need to do is to be quite clear to the working group if it is no longer key, if it is no longer a mandat- a requirement to follow the post-convergence path, then there needs to be an explanation as to why this position has changed and then the text body needs to reflect the consensus position of the working group on whether it is important that it follows the "post-convergence path" or it's not important or there are times when it is and times when it is not, and in which case those circumstances should be documented in the text.

Ahmed (18:21)
So the document really says that it is not mandatory and it is important and it explains why it is important like I can read part of the document and I'll point them out, actually, I'll reply to your email, but the point here is that we don't really try to put justifications because then I will go into the details of the implementation. I just put the spec there and say, you know what? It is important, but it's not mandatory. You don't have to follow it. Your implementation doesn't have to follow it. If you want to follow it, I have paragraphs that says how you follow it in certain scenarios like that is... (cross-talk)

Stewart (18:54)
I think you're skipping the important point. The original thesis was that this was a required congruence. That has been dropped, the least you need to do is to explain to the working group why the requirement for congruence has been changed. And then we need to decide what text needs to go in the document to reflect that change of positions. But absolutely, this was a fundamental of the original design and it seems to have been quietly and subtly changed without explanation.

Ahmed (19:28)
Okay so I thought it's uh yeah I can add a statement that's why it is I thought it's obvious basically because some platforms cannot do it. It's as simple as that. I'll put the sentence if this is why it has been dropped. This was basically a feedback that we got I can try and dig the emails it has been a long while that some hardware simply cannot support it or some software cannot support it if the number if the length of the SID stack is long enough, hardware cannot support it so we can still do topology independent which means you can still get your backup up but it will not be over the post-convergence path. That is the only reason really.

Stewart (20:08)
I think this probably needs a longer conversation than we can have in this working group and I think uh John I mean Jim probably needs to convene a group of experts.

Ahmed (20:19)
[elided]

Sasha (20:34)
I just wanted to second... to say exactly what Stewart has said. I have nothing to add. [Garbled] ... something is called the key aspect of a feature and then called non-mandatory is not... creates a confusion to put it mildly. This has to be resolved one way or another with explanations because there is a loss of history behind this change of requirements. I actually... I second what Stewart has said.

Ahmed (21:13)
Okay sure, okay, I think I got the point. So I'm open to discussions I have no problem really. Okay sure.

JeffT (21:23)
Ahmed, do you feel we need another discussion on this? Is it clear what working group is expecting from you in terms of changes and clarifications?

Ahmed (21:31)
Yeah, my understanding, and again, I'm talking about Stewart and Sasha's comments that the original draft was... I'll have to dig it out to be honest, it's been a long while... that to be TI-LFA the repair path has to be post-convergence. This has been dropped from must-have to important, and I avoid shoulds because of the pushback that I get. But in my opinion it should be a should. But it seems like Sasha and Stewart want it back. And not only Sasha and Stewart, there are other but also other [garbled] exchange email privately, but because it's private I'm not going to divulge their names that also think that it should be put back to mandatory and I'm open to either way. Either you guys want me to put it back as a mandatory or say why it's not mandatory. I have a reason why it is not mandatory and I just mentioned it and I can put that. I'll discuss it with the co-authors and see what they want, but I understand Stewart and Sasha's comments.

[1] https://meetecho-player.ietf.org/playout/?session=IETF121-RTGWG-20241105-0930

_______________________________________________
rtgwg mailing list -- rtgwg@ietf.org
To unsubscribe send an email to rtgwg-le...@ietf.org
<<< text/html; charset=UTF-8; name="draft-ietf-rtgwg-segment-routing-ti-lfa-19.diff.html": Unrecognized >>>