Hi Adrian, All, Thank you for the detailed review and comments. I am taking over the pen from Cheng as requested, and have just posted the -08 to avoid expiration. During this update we fixed some easy comments but more discussions are needed before we make further revision. Please check my point inline.
Best wishes, Haomian -----邮件原件----- 发件人: Adrian Farrel <adr...@olddog.co.uk> 发送时间: 2024年6月29日 4:00 收件人: draft-ietf-pce-state-s...@ietf.org 抄送: pce@ietf.org 主题: A review of draft-ietf-pce-state-sync-07 Hi, As this document approaches being ready for working group last call, I thought it might be helpful if I did a review. Cheers, Adrian === The document title could do with some clean-up. - Remove the full stop - Perhaps make it more straight-forward. For example, Procedures for Communication between Stateful Path Computation Elements --- Abstract para 2 s/an LSP/LSP/ --- Abstract para 3 s/a stateful/stateful/ --- 1. para 2 s/paths,/paths)/ --- 1. para 4 s/an LSP state/LSP state/ s/of a LSP/of an LSP/ s/to only a single PCE/to a single PCE/ --- 1. s/to allow a stateful/to allow stateful/ --- 1. OLD Further, the examples in this section are for illustrative purpose to showcase the need for inter-PCE stateful PCEP sessions. NEW This section contains illustrative examples to showcase the need for inter-PCE stateful PCEP sessions. END <Haomian> All above fixed. However, I find the examples running through sections 1.2, 1.3, and 1.4 to be quite unnecessary. There is a feeling of "trying too hard" to prove that there are uses for the protocol extensions described in the body of the document. I would have been very happy with just one paragraph listing out a few of the possible use cases. Further, section 4 (which seems to rework the examples in sections 1.2, 1.3, and 1.4) is not really an explanation of how the protocol extension works, but more of a set of use cases. Again, this feels like it is trying to prove the value of the extension. Do we really need it? It is such a simple protocol extension. <Haomian> I would agree on this point. By reading through I understand how to use the state sync, but it’s too much to be included in the document. We need to agree on ‘how many percentage of the current text we should keep here?’ My personal suggestion is around 30%. --- 1.2 para 1 s/an LSP state/LSP state/ s/grants the control/grants control/ <Haomian> All above fixed. --- 1.2 para 2 This is really to read. In a multi PCE deployment (redundancy, loadbalancing...), with the current specification defined in [RFC8231], when a PCE makes an update, it is the PCC that is in charge of reporting the LSP status to all PCEs with LSP parameter change which brings additional hops and delays in notifying the overall network of the LSP parameter change. a) s/current specification/specification/ b) s/with LSP parameter change/with any LSP parameter changes/ c) I can't tell what the final part of the paragraph means. Is it the reporting that brings additional hops and delays? How does the reporting cause this? <Haomian> I interpreted the content as, and we can simplify it. To update once confirmed. OLD: when a PCE makes an update, it is the PCC that is in charge of reporting the LSP status to all PCEs with any LSP parameter changes which brings additional hops and delays in notifying the overall network of the LSP parameter change. NEW: when a PCE makes an update, it is the PCC that is in charge of reporting the LSP status to all PCEs with such LSP parameter updates. --- 1.2 para 4 s/As stateful PCE make/As a stateful PCE makes/ s/immediately to/to/ --- All the figures in the document need numbers and titles. The text should refer to them using <xref> rather than "the figure above" etc. <Haomian> leave it to later version. --- 1.2 OLD PCE1 is responsible to compute paths for PCC1 and PCE2 is responsible to compute paths for PCC2. NEW PCE1 is responsible for computing paths for PCC1, and PCE2 is responsible for computing paths for PCC2. END OLD PCE2 will so be notified of the change only after receiving the PCRpt message from PCC1. NEW So PCE2 will be notified of the change only after receiving the PCRpt message from PCC1. END <Haomian> Fixed. --- I'm confused by the example given in 1.2. LSP1 is under the control of PCE1, so any changes that are made are made with the direct instruction from PCE1. Therefore, it is not necessary to wait for the change to be reported by PCC1 - PCE2 can be notified of the intended change. Surely that would be more efficient. (Yes, that would only be possible if there is a session between the PCEs.) Additionally, when the change is made in the network, it seems unlikely to me that PCC1 would be aware that it is LSP2 that has had its resources reduced because PCC1 does not know about the existence of LSP2. However, the network nodes servicing LSP2 would know about this and so it would be reported to PCC2 which can report it direct to PCE2. (Note that this does not take anything away from the proposed protocol extension. It is just a confusing example.) <Haomian> I don’t see anything wrong about the current text �C the confusion comes from too many alternative way to do the sync and people may have different intuition. I would propose to remove the descriptions about LSP2, just saying PCC1 reporting the update of LSP1 to PCE2 is slower than sync it from PCE1 to PCE2, does it work better? --- 1.3 s/failure, PCC/failure, the PCC/ s/sending new/sending a new/ Expand 'ERO' on first use. <Haomian> Fixed. --- 1.3 When the failed PCE or PCEP session comes back online, it will be up to the implementation to do preemption. Doing preemption may lead to some disruption on the existing path if path results from both PCEs are not exactly the same. The term 'preemption' may cause some confusion. I don't think you are referring to the type of preemption of resources mentioned in the example in the previous section. Perhaps... When the failed PCE or PCEP session comes back online, it will be up to the implementation whether to revert back to the original primary PCE. Reverting may lead to some disruption on the existing path if computation results from both PCEs are not exactly the same. <Haomian> Fixed. --- 1.3 By considering a network with multiple PCCs and implementing multiple stateful PCEs for redundancy purpose, there is no guarantee that at any time all the PCCs delegate their LSPs to the same PCE. There is something not quite right about this sentence. Is there a 'not' missing, such as... By considering a network with multiple PCCs and implementing multiple stateful PCEs for redundancy purpose, there is no guarantee that at any time all the PCCs will not delegate their LSPs to the same PCE. The word 'preemption' is used later in the section, as well. <Haomian> I am having different understanding, perhaps the following. (BTW I am not a fan of double negative) NEW: By considering a network with multiple PCCs and implementing multiple stateful PCEs for redundancy purpose, it is not likely that all PCCs delegate their LSPs to the same PCE. --- 1.3 OLD The set of LSPs that are dependent to each other may start from a different head-end. NEW The set of LSPs that are dependent on each other may start from different head-ends. END <Haomian> Fixed. --- 1.3 OLD In the topology, all links cost metric is set to 1 NEW In the topology, all link cost metrics are set to 1 END <Haomian> Fixed. --- 1.3. The figures on page 7, it is slightly odd that you have named the destination/egress of the LSPs as PCCs. It is true that they might be PCCs for other LSPs, but is that relevant? <Haomian> How about change the arrows from uni-directional to bi-directional? --- It might help if the scenarios in section 1.3 were given their own subsections (1.3.1, etc.). <Haomian> Fixed. --- 1.3 The figures for scenarios 2, 3, 4 show D=0 and D=1 next to the links. But I don't see anything that explains what this means. I assume that this is the "delegate" flag. <Haomian> Yes you are right. --- 1.3 scenario 3 s/sequence of event/sequence of events/ <Haomian> Fixed. --- 1.3 scenario 6 This scenario makes me very uneasy. Why do you have two domains if you are sharing all of the topology information? A substantial reason for domains is scalability. Another reason is confidentiality. Why do you show the PCEs as belonging to the separate domains if they share all of the information? Surely they should be shown as outside the domains? BGP-LS was intended as a "north-bound" policy-based export of information. It does not follow that a BGP-LS speaker in one domain would entertain a session with a node outside the domain, nor that it would share a full set of information. The question of multi-domain PCE and BGP-LS has been discussed in a large set of previous RFCs, and this approach seems to go against all of that prior work. But looking at this scenario, I wonder why you even assume that the PCEs need to be able to see topology in both domains. It just seems unnecessary. <Haomian> I agree with the comment, Scenario 6 is overcomplicated to me from optical perspective, it’s not necessary and we may remove it. However I would also listen to packet experts. --- 2.1 OLD This document specify a mechanism to set-up a PCEP session between the stateful PCEs. Creating such a session is already authorized by multiple scenarios like the one described in [RFC4655] (multiple PCEs that are handling part of the path computation) and [RFC6805] (hierarchical PCE) but was only focused on the stateless PCEP sessions. As stateful PCE brings additional features (LSP state synchronization, path update, delegation, ...), thus some new behaviors need to be defined. This inter-PCE PCEP session will allow the exchange of LSP states between PCEs that would help some scenarios where PCEP sessions are lost between PCC and PCE. This inter-PCE PCEP session is henceforth called a state-sync session. NEW This document specify a mechanism to set up a PCEP session between the stateful PCEs. Creating a PCEP session between PCEs is already enabled for multiple scenarios like the ones described in [RFC4655] (multiple PCEs that are handling part of a path computation) and [RFC6805] (hierarchical PCE). But that earlier work focused only on the sessions between stateless PCEs. Stateful PCE brings additional features to PCEP (LSP state synchronization, path update, delegation, ...). Thus some new behaviors need to be defined on the inter-PCE PCEP session. This inter-PCE PCEP session allows the exchange of LSP states between PCEs that can help in some scenarios where PCEP sessions are lost between PCCs and PCEs. This inter-PCE PCEP session is called a "state-sync session" in this document. END <Haomian> Fixed. --- 2.1 s/session will allow for a PCE/session will allow a PCE/ <Haomian> Fixed. --- 2.2 To provide the best efficiency, an LSP association constraint-based computation requires that a single PCE performs the path computation for all LSPs in the association group. I don't agree with this as stated. I do agree that it can be more optimal, and that there are some algorithms that compute two paths at the same time, but it is also possible to construct "split brain" solutions that work fine, if a little more slowly and with more information exchange. Further, not all LSP associations need knowledge of the path of one LSP to establish the other LSPs. <Haomian>Following changes proposed: OLD: To provide the best efficiency, an LSP association constraint-based computation requires that a single PCE performs the path computation for all LSPs in the association group. NEW: To achieve better efficiency, an LSP association constraint-based computation MAY require that a single PCE performs the path computation for all LSPs in the association group. --- 2.2 s/This document specify/This document specifies/ <Haomian> Fixed. --- 2.2 The priority could be set per association, per PCC, or for all LSPs. Is this really... The priority could be set per association, per PCC, or for all PCEs. <Haomian> Fixed. --- 2.2 s/shortest path at/shortest path as/ <Haomian> Fixed. --- In 2.2 I am not clear whether PCE2 is transferring delegation to PCE1 or just asking PCE1 to perform a computation. I think that PCE2 is allowed to ask anyone for help performing a computation, but the issue of delegation could be sensitive - the PCC has delegated to PCE2: does that mean that the PCC is giving permission for the delegation to be passed on? Could this be sensitive because PCE2 might be in a domain that the PCC doesn't trust? Or do you assume that, because the PCC has a session with both PCEs, it trusts them equally? I did find 3.5 about "sub-delegation" and I think this is relevant. <Haomian> We need some consensus on this comment in section 2.2 (primary-secondary relationship): I assume that PCC has a session with both primary/secondary PCEs and delegate the LSP to primary PCE, and such delegation would be migrated to secondary PCE in case there is a need (may because of the failure of primary PCE or just a policy). We can tweak the text after we are on the same page, proposal are welcome. --- 3.1.1 A PCE indicates its support of state-sync procedures during the PCEP Initialization phase [RFC5440]. The OPEN object in the Open message MUST contains the "Stateful PCE Capability" TLV defined in [RFC8231]. A new P (INTER-PCE-CAPABILITY) flag is introduced to indicate the support of state-sync. There is some history of flags being given letters, although not many of these are tracked in the IANA registry (I note that you also don't ask for this in section 11.3). RFC 8623 defines the P2MP-LSP-INSTANTIATION-CAPABILITY flag and calls it the P-flag. So you have a clash. <Haomian> So we need to check what is already used, it’s not clear at https://www.iana.org/assignments/pcep/pcep.xhtml#open-object-flag-field, and tentatively I can propose to use ‘Y’ flag for sync (S is also used), I am not sure if it is a clash again. We also need to request in the IANA section 11.3. --- 3.1.1 * P (INTER-PCE-CAPABILITY - 1 bit - TBD4): If set to 1 by a PCEP Speaker, the PCEP speaker indicates that the session MUST follow the state-sync procedures as described in this document. The P bit MUST be set by both speakers: if a PCEP Speaker receives a STATEFUL-PCE-CAPABILITY TLV with P=0 while it advertised P=1 or if both set P flag to 0, the session SHOULD be set-up but the state- sync procedures MUST NOT be applied on this session. There's a contradiction here. Initially, you say that if a PCE sets P=1 the session MUST follow the procedures. Then you say that if the other PCE sets P=0 the procedures MUST NOT be followed. I think it is clear what you intend, but it needs tidying. How about... * P (INTER-PCE-CAPABILITY - 1 bit - TBD4): If set to 1 by a PCEP speaker, the PCEP speaker indicates that it wants to use the state-sync procedures as described in this document. If the P bit is set by both speakers, the procedures MUST be used. If a PCEP speaker receives a STATEFUL-PCE-CAPABILITY TLV with P=0 while it advertised P=1 or if both set P flag to 0, the session SHOULD be set-up but the state-sync procedures MUST NOT be applied on this session. A PCE MAY decide to close a session if the received setting of the P flag is not acceptable. <Haomian> Fixed. --- In 3.2 I wonder how a PCE (acting as a PCE) can tell the difference between information received from a PCC and a PCE acting as a PCC. This could be made a bit clearer (because it is pretty important to stop the PCEs reporting LSPs back to each other. This gets even more complicated if there are more than 2 PCEs. I suspect this shows up in 3.3 with the ORIGINAL-LSP-DB-VERSION <Haomian> I agree on the point, PCE cannot differentiate whether the PCC is a pure PCC or a comprehensive “PCC+PCE”. The illustration on ‘all PCEs’ or ‘each PCE’ is complicating the problem, and it is suggested to make it clear for 2 PCEs first. We will make the revision after we have consensus, proposals are welcome. --- 3.3 When propagating LSP state changes from a PCE to other PCEs, it is mandatory to ensure that a PCE always uses the freshest state coming from the PCC. I know why you say that. "Mandatory" sounds like a BCP 14 sort of word. I wonder if you want to use "MUST". But also, I wonder whether this can be reworded simply as what the PCE does. <Haomian> Fixed. --- 3.3 s/and log such an event/and SHOULD log such an event/ <Haomian> Fixed. --- 3.4 When a PCE receives a PCRpt on a state-sync session, it stores the LSP information into the original PCC address context (as the LSP belongs to the PCC). I'm not sure what "into the original PCC address context" means. Is this simply that the LSP information is stored in the context of the PCC that originally reported the LSP? <Haomian> I think so, the text tries to capture ‘the LSP information are stored together with the PCC address’. If true I think we can simplify as ‘When a PCE receives a PCRpt on a state-sync session, it stores the LSP information’. --- 3.4 A PCE SHOULD maintain a single state for a particular LSP and SHOULD maintain the list of sources it learned a particular state from. You have two cases of "SHOULD". They are fine, but you need to explain what the alternatives are ("MAY"), and how/why an implementation makes the choice. In fact, can you check through the whole document and look at the uses of "SHOULD" to make sure the alternatives are properly covered. <Haomian> I would propose to simplify into one SHOULD first and then explain for the alternatives, and check the global usage of SHOULD. --- 3.5 s/it loose control/it loses control/ <Haomian> Fixed. --- 3.5 If the highest priority PCE is failing or if the state-sync session between the local PCE and the highest priority PCE failed, the local PCE MAY decide to delegate the LSP to the next highest priority PCE or to take back control of the LSP. It is a local policy decision. What does "is failing" mean? How can one PCE know that another is failing? <Haomian> I don’t think the PCE knows the failure automatically, it’s more likely to be manually configured. --- 3.5 In the case of sub-delegation, is there a requirement that the PCE that is sub-delegated to has a PCEP session to the PCC that is headend for the LSP? Suppose... PCE2--PCE1 | PCC If PCE1 sub-delegates to PCE2, how does PCE2 control the LSP without a session to the PCC? But how does PCE1 know about the existence of PCE2's sessions? Do we rely on one of: - All PCCs MUST have sessions to all PCEs - A PCE MUST reject sub-delegation if it doesn't have a session to the PCC You have text that says... In the case of sub-delegation, the computing PCE will send the PCUpd only to all state-sync sessions (as it has no direct delegation from a PCC). ...and... When a PCE receives a valid PCUpd on a state-sync session, it SHOULD forward the PCUpd to the appropriate PCC (identified based on the SPEAKER-ENTITY-ID TLV value) that delegated the LSP originally This implies that PCE2 would send the PCUpd to PCE1, but PCE1 could have been failing, or the session between PCE1 and the PCC might be down. <Haomian> Again, too much generalization is making the problem complicated. As I said in section 2.2, we may focus on ‘two PCEs’ scenario first, and complete the primary-secondary delegation first. And then we may stop. How to prioritize and determine the delegation order among multiple PCEs are dependent on the algorithm which should not be standardized. --- 3.5.1 A PCE SHOULD NOT compute a path using an association-group constraint if it has delegation for only a subset of LSPs in the association-group It is unclear to me that a PCE can know this. In some cases, the association is for a known number of LSPs (e.g., bidirectional), but in other cases there can be a large number of LSPs in the group (e.g., VN), and the group can be added to at any time. <Haomian>We need to check the motivation on section 3.5.1. I would reword this sentence as “A PCE SHOULD ONLY compute a path using an association-group constraint if it has delegation for all of LSPs in the association-group”. The PCE know this by checking the LSP identifier to see if they are correctly delegated. --- 3.7 s/a LSP/an LSP/ s/to other PCE/to another PCE/ <Haomian> Fixed. --- 6. s/among PCEP speaker/among PCEP speakers/ <Haomian> Fixed. --- 6. OLD * ID Length: defines the length of the Speaker identity actual field (non-padded). NEW * ID Length: defines the length of the Speaker Entity identity field not counting any padding. END <Haomian> Fixed. --- 6. If a PCEP speaker receives a message with PCEP-PATH-VECTOR TLV and finds its speaker information already present in the PCEP-PATH-VECTOR TLV, it MUST ignore the PCEP message and SHOULD log it as an error. Might be nice to say why... ...because this represents a message loop <Haomian> Fixed. --- 6. The list of speakers within the PCEP-PATH-VECTOR TLV MUST be ordered. Ah, but what order? When sending a PCEP message (PCRpt, PCUpd, or PCInitiate), a PCEP Speaker MAY add the PCEP-PATH-VECTOR TLV with a PCEP-SPEAKER- INFORMATION containing its own information. Addition is presumably in a specific order. "add to the end of the list"? You do say "append" at the bottom of the paragraph, so I think it is clear what you intend: you just need to bring it out more clearly. <Haomian> I am confused. Does it mean “the PCEP speaker MAY append a new PCEP-SPEAKER-INFORMATION containing its own information at the end of the TLV”? --- 7. This is a reasonable section. I'm interested in the transfer of information outside the control of the PCC. The PCC is in a trust relationship with the PCE, but this document allows the PCE to share the information with other PCEs. While there is an implied trust relationship between PCEs, there could be a long chain and the PCC is not aware of the chain, I think. This could be handled if the complete Path Vector TLV was returned back through the PCEs to the PCC. Then, at least, it would know who had seen its information. <Haomian> That’s also why I indicated to focus on no more than 2 PCEs in earlier comments, trying to manage the length of the chain, not too long. --- Thanks for section 8. Of course, it would be better if someone would write some code because that shows there is an actual need for this feature. <Haomian> it’s not there yet, let’s try to find if any. --- Thanks, also, for section 9. --- 9.1 s/for a inter-PCE session/for an inter-PCE session/ s/They MUST allow configuration of/They MUST be allowed to configure/ s/MAY also allow configuration of /MAY also be allowed to configure/ x2 <Haomian> Fixed.
_______________________________________________ Pce mailing list -- pce@ietf.org To unsubscribe send an email to pce-le...@ietf.org