Hi Ben, et al., I've uploaded version -14 of the draft, which includes all the updates we've discussed. Please let me know whether this version satisfactorily addresses your DISCUSS and COMMENTS.
Regards,
Greg

A new version of I-D, draft-ietf-bess-mvpn-fast-failover-14.txt has been successfully submitted by Greg Mirsky and posted to the IETF repository.

Name: draft-ietf-bess-mvpn-fast-failover
Revision: 14
Title: Multicast VPN Fast Upstream Failover
Document date: 2020-12-21
Group: bess
Pages: 24
URL: https://www.ietf.org/archive/id/draft-ietf-bess-mvpn-fast-failover-14.txt
Status: https://datatracker.ietf.org/doc/draft-ietf-bess-mvpn-fast-failover/
Htmlized: https://datatracker.ietf.org/doc/html/draft-ietf-bess-mvpn-fast-failover
Htmlized: https://tools.ietf.org/html/draft-ietf-bess-mvpn-fast-failover-14
Diff: https://www.ietf.org/rfcdiff?url2=draft-ietf-bess-mvpn-fast-failover-14

Abstract:
This document defines Multicast Virtual Private Network (VPN) extensions and procedures that allow fast failover for upstream failures by allowing downstream Provider Edges (PEs) to consider the status of Provider-Tunnels (P-tunnels) when selecting the upstream PE for a VPN multicast flow. The fast failover is enabled by using RFC 8562 Bidirectional Forwarding Detection (BFD) for Multipoint Networks and the new BGP Attribute - BFD Discriminator. Also, the document introduces a new BGP Community, Standby PE, extending BGP Multicast VPN routing so that a C-multicast route can be advertised toward a Standby Upstream PE.

Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org.

The IETF Secretariat

On Fri, Dec 18, 2020 at 6:34 PM Greg Mirsky <[email protected]> wrote:

> Hi Ben,
> many thanks for your kind considerations of the proposed updates. Please
> find my follow-up notes in-lined under the GIM2>> tag below.
> Attached is the diff that highlights all the updates including the ones
> that address comments by Martin, Barry, Roman, and Murray.
> I will upload the new version if you agree to the latest proposed updates.
>
> Regards,
> Greg
>
> On Thu, Dec 17, 2020 at 6:02 PM Benjamin Kaduk <[email protected]> wrote:
>
>> Hi Greg,
>>
>> Also inline (though there's not much left to say)
>>
>> On Wed, Dec 16, 2020 at 12:17:55PM -0800, Greg Mirsky wrote:
>> > Hi Ben,
>> > thank you for the review, and your detailed comments and direct questions.
>> > Please find my answers and proposed updates in-lined below under the GIM>>
>> > tag.
>> > Attached, please find, the diff highlighting the updates and the new
>> > working version of the draft.
>> >
>> > Regards,
>> > Greg
>> >
>> > On Mon, Dec 14, 2020 at 4:51 PM Benjamin Kaduk via Datatracker <
>> > [email protected]> wrote:
>> >
>> > > Benjamin Kaduk has entered the following ballot position for
>> > > draft-ietf-bess-mvpn-fast-failover-13: Discuss
>> > >
>> > > When responding, please keep the subject line intact and reply to all
>> > > email addresses included in the To and CC lines. (Feel free to cut this
>> > > introductory paragraph, however.)
>> > >
>> > >
>> > > Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
>> > > for more information about IESG DISCUSS and COMMENT positions.
>> > >
>> > >
>> > > The document, along with other ballot positions, can be found here:
>> > > https://datatracker.ietf.org/doc/draft-ietf-bess-mvpn-fast-failover/
>> > >
>> > >
>> > >
>> > > ----------------------------------------------------------------------
>> > > DISCUSS:
>> > > ----------------------------------------------------------------------
>> > >
>> > > Let's talk about what the requirements are for consistency across PEs in
>> > > the algorithm for selecting the Primary Upstream PE. Section 4 notes
>> > > that "all the PEs of that MVPN [are required] to follow the same UMH
>> > > selection procedure", but leaves the option of non-revertive behavior as
>> > > something that "MAY also be supported by an implementation", without
>> > > requirement for consistency across all PEs.
>> > > It seems to me that if some
>> > > PEs use non-revertive behavior and others do not, then they will
>> > > disagree as to which PE is the Primary (or active) PE in some cases,
>> > > which seems to conflict with the initial guidance that all PEs needed to
>> > > pick the same one. Is it perhaps that the PEs need to agree on which PE
>> > > is to be advertised as Primary but not necessarily to actually be using
>> > > that one for traffic? Or am I missing something?
>> > >
>> > GIM>> Thank you for pointing out this inconsistency. I agree that the text
>> > needs some tightening. Below is the proposed update:
>> > OLD TEXT:
>> >    Such behavior is referred to as
>> >    "revertive" behavior and MUST be supported. Non-revertive behavior
>> >    refers to the behavior of continuing to select the backup PE as the
>> >    UMH even after the Primary has come up. This non-revertive behavior
>> >    MAY also be supported by an implementation and would be enabled
>> >    through some configuration.
>> > NEW TEXT:
>> >    Such behavior is referred to as
>> >    "revertive" behavior and MUST be supported. Non-revertive behavior
>> >    refers to the behavior of continuing to select the backup PE as the
>> >    UMH even after the Primary has come up. This non-revertive behavior
>> >    MAY also be supported by an implementation and would be enabled
>> >    through some configuration. Selection of the behavior, revertive or
>> >    non-revertive, is an operational issue, but it MUST be consistent on
>> >    all PEs in the given MVPN.
>>
>> Looks good; I'm glad this was just a simple change.
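To make the difference between the two behaviors concrete, here is a minimal Python sketch of how a downstream PE might re-evaluate its UMH when tunnel status changes. It is only an illustration under stated assumptions: the `UmhState` class, the `reselect_umh` function, and the PE names are hypothetical, and the fall-back to the Primary when the backup itself goes down in non-revertive mode is an assumption, not something the draft text quoted above specifies.

```python
from dataclasses import dataclass


@dataclass
class UmhState:
    """Hypothetical per-flow UMH selection state on a downstream PE."""
    primary: str
    backup: str
    selected: str


def reselect_umh(state: UmhState, primary_up: bool, backup_up: bool,
                 revertive: bool) -> str:
    """Re-evaluate the selected UMH after a status change.

    revertive=True  : return to the Primary as soon as it is Up again.
    revertive=False : keep using the backup even after the Primary is Up
                      (assumption: fall back to the Primary only if the
                      backup itself is no longer Up).
    """
    if state.selected == state.primary:
        # Fail over only when the Primary is down and the backup is usable.
        if not primary_up and backup_up:
            state.selected = state.backup
    else:
        # Currently on the backup; decide whether to return to the Primary.
        if primary_up and (revertive or not backup_up):
            state.selected = state.primary
    return state.selected


# Two PEs configured differently diverge once the Primary recovers,
# which is the inconsistency raised in the DISCUSS.
pe_a = UmhState("PE1", "PE2", "PE1")
pe_b = UmhState("PE1", "PE2", "PE1")
for pe, mode in ((pe_a, True), (pe_b, False)):
    reselect_umh(pe, primary_up=False, backup_up=True, revertive=mode)
    reselect_umh(pe, primary_up=True, backup_up=True, revertive=mode)
```

With the same sequence of events, `pe_a` (revertive) ends up on PE1 while `pe_b` (non-revertive) stays on PE2, so requiring one mode across all PEs of the MVPN removes the disagreement.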
>>
>> > >
>> > >
>> > > ----------------------------------------------------------------------
>> > > COMMENT:
>> > > ----------------------------------------------------------------------
>> > >
>> > > Section 1
>> > >
>> > > Section 3 describes local procedures allowing an egress PE (a PE
>> > > connected to a receiver site) to take into account the status of
>> > > P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
>> > > (C-S, C-G). [...]
>> > >
>> > > Does it also apply to (C-*, C-G)? (I'll just mention it once, but the
>> > > handling seems to be somewhat inconsistent throughout the document, with
>> > > (C-*,C-G) getting mentioned sometimes but not always, and no pattern
>> > > obvious to me for when it is or is not included. I think I see some
>> > > instances where (C-*, C-G) does not make sense, so it would probably not
>> > > be a universal replacement.)
>> > >
>> > GIM>> Yes, it cannot be used interchangeably. We've followed the notation
>> > as defined in the last paragraph of Section 3.1 RFC 6513:
>> >    ... C-group address would be a group address in a
>> >    VPN's address space. A C-tree is a multicast distribution tree
>> >    constructed and maintained by the PIM C-instances. A C-flow is a
>> >    stream of multicast packets with a common C-source address and a
>> >    common C-group address. We will use the notation "(C-S,C-G)" to
>> >    identify specific C-flows. If a particular C-tree is a shared tree
>> >    (whether unidirectional or bidirectional) rather than a source-specific
>> >    tree, we will sometimes speak of the entire set of flows
>> >    traveling that tree, identifying the set as "(C-*,C-G)".
>> > It is my understanding, that the one reference to (C-*,C-G) in the draft is
>> > in Section 3 bullet B:
>> >    The S-PMSI can be advertised only after the
>> >    Upstream PE receives a C-multicast route for (C-S, C-G)/(C-*,
>> >    C-G) to be carried over the advertised S-PMSI.
>> > The reference is not an introduced text but an informational summary of
>> > Section 5.1 RFC 6513. The text preceding that reference is intended to
>> > clarify its status:
>> >    There are three options specified in Section 5.1 of [RFC6513] for a
>> >    downstream PE to select an Upstream PE.
>> > Would you suggest an additional text?
>>
>> I don't see a need for additional text. Thanks for walking me through it
>> -- I'm pretty sure I had lost track of the fact that the §3 (and only)
>> occurrence of "(C-*, C-G)" was in a part that was supposed to be
>> summarizing RFC 6513 by the time I stumbled upon it. With the extra
>> context it all seems clear.
>>
>> > >
>> > > Section 5 describes a "hot leaf standby" mechanism that can be used
>> > > to improve failover time in MVPN. The approach combines mechanisms
>> > > defined in Section 3 and Section 4 has similarities with the solution
>> > > described in [RFC7431] to improve failover times when PIM routing is
>> > > used in a network given some topology and metric constraints.
>> > >
>> > > nit: grammar issue around "has similarities with" (maybe needs a leading
>> > > "and"?)
>> > >
>> > GIM>> Thank you. Updated to:
>> > NEW TEXT:
>> >    Section 5 describes a "hot leaf standby" mechanism that can be used
>> >    to improve failover time in MVPN. The approach combines mechanisms
>> >    defined in Section 3 and Section 4, and has similarities with the
>> >    solution described in [RFC7431] to improve failover times when PIM
>> >    routing is used in a network given some topology and metric
>> >    constraints.
>> > >
>> > >
>> > > VPNs. An operator would enable these mechanisms using a method
>> > > discussed in Section 3 in combination with the redundancy provided by
>> > > a standby PE connected to the source of the multicast flow, and it is
>> > > assumed that all PEs in the network would support these mechanisms
>> > > for the procedures to work.
>> > > In the case that a BGP implementation
>> > >
>> > > Is it a matter of "the procedure will not work at all unless all PEs in
>> > > the network support it", or "only the PEs that support it will get the
>> > > benefits of it"? [The next sentence suggests an answer...]
>> > >
>> > GIM>> The sentence might be too long. Yes, only PEs that support the new
>> > Standby PE community and use any of the UMH monitoring methods would converge
>> > faster than PEs that don't support both features. Would the re-wording
>> > below make the text clear:
>> > NEW TEXT:
>> >    An operator would enable these mechanisms using a method
>> >    discussed in Section 3 combined with the redundancy provided by a
>> >    standby PE connected to the multicast flow source. PEs that support
>> >    these mechanisms would converge faster and thus provide a more stable
>> >    multicast service.
>>
>> Yes, that looks pretty crisp -- thanks!
>>
>> > >
>> > > Section 3
>> > >
>> > > Section 9.1.1 of [RFC6513] are applicable when using I-PMSI
>> > > P-tunnels. That document is a foundation for this document, and its
>> > > processes all apply here. Section 9.1.1 mandates the use of specific
>> > > procedures for sending intra-AS I-PMSI A-D Routes.
>> > >
>> > > (nit) the second "Section 9.1.1" is also referring to RFC 6513, not this
>> > > document, which would be the default interpretation of a bare section
>> > > reference.
>> >
>> >
>> > > (not-nit) The referenced procedure seems to be about processing, not
>> > > sending, intra-AS I-PMSI A-D routes. Am I misreading something?
>> > >
>> > GIM>> You are right. The sentence mischaracterizes Section 9.1.1 and has no
>> > informational value. Removed it altogether.
>> >
>> > >
>> > > Section 3.1
>> > >
>> > > Different factors can be considered to determine the "status" of a
>> > > P-tunnel and are described in the following sub-sections.
>> > > The
>> > > optional procedures described in this section also handle the case
>> > > the downstream PEs do not all apply the same rules to define what the
>> > > status of a P-tunnel is (please see Section 6), and some of them will
>> > > produce a result that may be different for different downstream PEs.
>> > >
>> > > nit: I think it's better to put a word like "where" in "the case the
>> > > downstream PEs".
>> > >
>> > GIM>> I've tried it like the following:
>> > NEW TEXT:
>> >    The
>> >    optional procedures described in this section also handle the case
>> >    when the downstream PEs do not all apply the same rules to define
>> >    what the status of a P-tunnel is (please see Section 6), and some of
>> >    them will produce a result that may be different for different
>> >    downstream PEs.
>> >
>> > >
>> > > Section 3.1.3
>> > >
>> > > corresponding P-tunnel MUST be re-evaluated. If the P-tunnel
>> > > transitions from Up to Down state, the Upstream PE that is the
>> > > ingress of the P-tunnel MUST NOT be considered a valid UMH.
>> > >
>> > > (nit?) I'm not sure how much precedent there is for using "valid" in
>> > > this context -- IIUC the previous discussion of this process referred
>> > > only to whether a PE is a candidate for being the UMH.
>> > >
>> > GIM>> I agree, "candidate" is missing here. Proposed update:
>> > NEW TEXT:
>> >    If the P-tunnel
>> >    transitions from Up to Down state, the Upstream PE that is the
>> >    ingress of the P-tunnel MUST NOT be considered as a valid candidate
>> >    UMH.
>> >
>> > >
>> > > Section 3.1.5
>> > >
>> > > When such a procedure is used, in the context where fast restoration
>> > > mechanisms are used for the P-tunnels, a configurable timer MUST be
>> > > set on the downstream PE to wait before updating the UMH, to let the
>> > > P-tunnel restoration mechanism to execute its actions. An
>> > > implementation SHOULD use three seconds as the default value for this
>> > > timer.
>> > >
>> > > How does this interact with the value of the maximum inter-packet time?
>> > > Suppose that I know to expect at least one packet every ten seconds. Do
>> > > I wait ten seconds after receiving the last packet and then another
>> > > three seconds, before engaging in an UMH change?
>> > >
>> > GIM>> This scenario is similar to the use of an active OAM detecting a
>> > network failure. The role of the timer is to trigger an action if a certain
>> > number of packets have not arrived. Since the maximum inter-packet time is
>> > known, a downstream PE has an expectation of receiving a packet within a
>> > time interval larger than the maximum inter-packet interval. In practice,
>> > the timer could be set to three times the maximum inter-packet interval, so
>> > that it expires if three consecutive packets were not received. In the case
>> > you've described, I think that the timer must be set larger than 10
>> > seconds, probably 30+ seconds. Would you suggest any additional text here?
>>
>> I think my uncertainty here is whether the 3 seconds default is intended to
>> be the only waiting period or an additional waiting period after
>> determining that the tunnel is "probably down" but before updating the UMH.
>> (Looking at it again now, the latter is a bit of a strained
>> interpretation.) That said, your description here suggests that the
>> operative mechanism is to determine that the tunnel is probably down by
>> passing a threshold number of packets that were expected but did not
>> arrive, with some accommodation for jitter in the network if packets are
>> supposed to be arriving very quickly so that "just wait for 3 missed
>> packets" isn't long enough to account for (e.g.) interrupt handling on a
>> forwarder.
>> So the intended sentiment seems to be something like
>> "Determining that a tunnel is probably down by waiting for enough packets
>> to fail to arrive as expected is a heuristic and operational matter that
>> depends on the maximum inter-packet time. A timeout of three seconds is a
>> generally suitable default waiting period to ascertain that the tunnel is
>> down, though other values would be needed for atypical conditions."
>
>
>> I would not complain if you kept the original text, though.
>>
> GIM2>> I much appreciate the consideration you gave to the document and
> am glad to use the suggested text.
>
>>
>> > >
>> > > In cases where this mechanism is used in conjunction with the method
>> > > described in Section 5, no prior knowledge of the rate of the
>> > > multicast streams is required; downstream PEs can compare reception
>> > > on the two P-tunnels to determine when one of them is down.
>> > >
>> > > This feels a little underspecified; is there a reference or more
>> > > guidance that we could give about turning a stream of received packets
>> > > on one tunnel into a maximum inter-packet time on another tunnel,
>> > > supposedly carrying the same traffic?
>> > >
>> > GIM>> I think that this text refers to 1+1 protection, i.e., Active-Active
>> > P-tunnels. In that scenario, the determination of the P-tunnel's state can
>> > be done by comparing it to the reception state of the other P-tunnel in the
>> > redundancy group. But there might be corner cases, like a significant delay
>> > in one of the P-tunnels, that may need consideration before recommending this
>> > method. I don't think that it is in the scope of the document. Would
>> > appending the paragraph with "The detailed specification of this mechanism
>> > is outside the scope of this document" be acceptable?
>>
>> I think my confusion stems from the first paragraph of the section seeming
>> to emphasize the dependence on the "maximum inter-packet time", but the
>> best algorithm to use in the 1+1 case seems likely to be different (i.e.,
>> one that makes use of the full knowledge of the incoming packet
>> distribution). So my suggestion would be something more along the lines of
>> "no prior knowledge of the rate or maximum inter-packet time on the
>> multicast streams is required; downstream PEs can compare actual packet
>> reception statistics on the two P-tunnels to determine when one of them is
>> down". (Adding a "details are out of scope of the document" to that would
>> be fine, of course.)
>>
> GIM2>> Thanks again for the suggested text.
>
>>
>> > >
>> > > Section 3.1.6
>> > >
>> > > * one octet-long field of TLV's Type value (Section 7.3)
>> > >
>> > > * one octet-long field of the length of the Value field in octets
>> > >
>> > > * variable length Value field.
>> > >
>> > > The length of a TLV MUST be multiple of four octets.
>> > >
>> > > I assume this is the total length, not the value in the length field?
>> > >
>> > GIM>> Correct. Would the following update make it clearer?
>> > NEW TEXT:
>> >    Figure 2 presents the Optional TLV format that
>> >    consists of:
>> >
>> >    * Type - a one-octet-long field that characterizes the
>> >      interpretation of the Value field (Section 7.3)
>> >
>> >    * Length - a one-octet-long field equal to the length of the
>> >      Value field in octets
>> >
>> >    * Value - a variable-length field.
>> >
>> >    The length of a TLV as a whole MUST be multiple of four octets.
>>
>> Yes, that's more clear. (But given Jeff and Alvaro's comments maybe the
>> whole thing will change anyway.)
>>
>> > >
>> > > The BFD Discriminator attribute MUST be considered malformed if its
>> > > length is not a non-zero multiple of four.
>> > > If the attribute
>> > > considered malformed, the UPDATE message SHALL be handled using the
>> > > approach of Attribute Discard per [RFC7606].
>> > >
>> > > nit: s/attribute considered/attribute is considered/
>> > >
>> > GIM>> Thank you! Updated text to:
>> > NEW TEXT:
>> >    The BFD Discriminator attribute MUST be considered malformed if its
>> >    length is not a non-zero multiple of four. If the attribute is
>> >    deemed to be malformed, the UPDATE message SHALL be handled using the
>> >    approach of Attribute Discard per [RFC7606].
>> >
>> > >
>> > > Section 3.1.6.1
>> > >
>> > > o MUST periodically transmit BFD Control packets over the x-PMSI
>> > >   P-tunnel after the P-tunnel is considered established. Note that
>> > >   the methods to declare a P-tunnel has been established are outside
>> > >   the scope of this specification.
>> > >
>> > > Is there a good reference for how to choose the period of transmission?
>> > >
>> > GIM>> Not really. There are many factors that an operator should consider
>> > when configuring the frequency of BFD packets on the MultipointHead system.
>> > One of the aspects to keep in mind is that unlike p2p BFD, there's no
>> > interval negotiation phase in p2mp BFD. As a result, a tail has no
>> > influence over the interval at which the head of the p2mp BFD session
>> > transmits BFD Control messages.
>>
>> Okay. (Yes, I do remember there was a fair bit of contention in the
>> reviews of the multipoint BFD document relating to the transmission
>> frequency and the potential to overload the network in the absence of
>> feedback ... luckily that was a case where I was able to sit back and
>> watch, and not have to be taking a position.)
>>
>> > >
>> > > If the tracking of the P-tunnel by using a P2MP BFD session is
>> > > enabled after the x-PMSI A-D Route has been already advertised, the
>> > > x-PMSI A-D Route MUST be re-sent with precisely the same attributes
>> > > as before and the BFD Discriminator attribute included.
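The two length rules discussed for Section 3.1.6 (the attribute length being a non-zero multiple of four, and a TLV as a whole being a multiple of four octets) can be sketched in Python as follows. This is only an illustration: the function names and the `TLV_HEADER_OCTETS` constant are hypothetical, and the sketch assumes, per the clarified text in this thread, that a TLV consists of a one-octet Type, a one-octet Length equal to the size of the Value field, and the Value itself.

```python
# One octet of Type plus one octet of Length, as described in the thread.
TLV_HEADER_OCTETS = 2


def attribute_length_ok(attr_length: int) -> bool:
    """True if the attribute length is a non-zero multiple of four.

    A False result corresponds to the "malformed" case, which the quoted
    text says is handled with Attribute Discard per RFC 7606.
    """
    return attr_length > 0 and attr_length % 4 == 0


def tlv_total_length(value_length: int) -> int:
    """Total TLV size in octets: Type + Length + Value."""
    return TLV_HEADER_OCTETS + value_length


def tlv_length_ok(value_length: int) -> bool:
    """True if the TLV as a whole is a multiple of four octets."""
    return tlv_total_length(value_length) % 4 == 0
```

Under these assumptions a Value field of 2 or 6 octets gives a well-formed TLV (4 or 8 octets in total), while a Value field of 4 octets does not, which is the distinction between total length and the Length field that the question above was probing.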
>> > >
>> > > Pedantically, it seems like "precisely the same attributes as before"
>> > > is incompatible with adding the BFD Discriminator attribute. Phrasing
>> > > that discusses "the only change between the previous advertisement and
>> > > the new advertisement" would not suffer from such a potential issue.
>> > > (And similarly for when the BFD Discriminator attribute is to be
>> > > removed, a couple paragraphs later.)
>> > >
>> > GIM>> Great, thank you. Applied in both cases:
>> > NEW TEXT:
>> >    If the tracking of the P-tunnel by using a P2MP BFD session is
>> >    enabled after the x-PMSI A-D Route has been already advertised, the
>> >    x-PMSI A-D Route MUST be re-sent with the only change between the
>> >    previous advertisement and the new advertisement to be the inclusion
>> >    of the BFD Discriminator attribute.
>> > and
>> >    o x-PMSI A-D Route MUST be re-sent with the only change between the
>> >      previous advertisement and the new advertisement be the exclusion
>> >      of the BFD Discriminator attribute;
>> >
>> > >
>> > > Section 3.1.6.2
>> > >
>> > > o MUST use the source IP address of the BFD Control packet, the
>> > >   value of the BFD Discriminator field, and the x-PMSI Tunnel
>> > >   Identifier [RFC6514] the BFD Control packet was received to
>> > >   properly demultiplex BFD sessions.
>> > >
>> > > nit: missing word around "the BFD Control packet was received" (maybe
>> > > "received on/in"?).
>> > >
>> > GIM>> "on" seems the better option. Updated accordingly.
>> >
>> > >
>> > > According to [RFC8562], if the downstream PE receives Down or
>> > > AdminDown in the State field of the BFD Control packet or associated
>> > > with the BFD session Detection Timer expires, the BFD session is
>> > >
>> > > nit: "the BFD Detection Timer associated with the BFD session expires"
>> > >
>> > GIM>> Thank you for the helpful suggestion. Updated.
>> >
>> > >
>> > > PE, while others are considered as Standby Upstream PEs.
>> > > In such a
>> > > scenario, when the P-tunnel is considered down, the downstream PE MAY
>> > > initiate a switchover of the traffic from the Primary Upstream PE to
>> > > the Standby Upstream PE only if the Standby Upstream PE is deemed
>> > > available.
>> > >
>> > > I'm not sure that we've defined what it means for an Upstream PE to be
>> > > deemed "available", yet. I guess it's possible that there is not an
>> > > established P-Tunnel between the (selected) Standby Upstream PE and the
>> > > downstream PE, so just using the Up/Down/not-known-to-be-Down status of
>> > > that P-tunnel is not an option...
>> > >
>> > GIM>> The wording is sloppy, agree. I think that the intention was to say
>> > "deemed in the Up state". That can be determined using the p2mp BFD session
>> > with Standby Upstream PE acting as its MultipointHead. The proposed update
>> > is as follows:
>> > NEW TEXT:
>> >    In such a scenario, when the P-tunnel is considered
>> >    down, the downstream PE MAY initiate a switchover of the traffic from
>> >    the Primary Upstream PE to the Standby Upstream PE only if the
>> >    Standby Upstream PE is deemed to be in the Up state. That MAY be
>> >    determined from the state of a P2MP BFD session with the Standby
>> >    Upstream PE as the MultipointHead.
>> >
>> > >
>> > > If the downstream PE's P-tunnel is already established when the
>> > > downstream PE receives the new x-PMSI A-D Route with BFD
>> > > Discriminator attribute, the downstream PE MUST associate the value
>> > > of BFD Discriminator field with the P-tunnel and follow procedures
>> > > listed above in this section if and only if the x-PMSI A-D Route was
>> > > properly processed as per [RFC6514], and the BFD Discriminator
>> > > attribute was validated.
>> > >
>> > > We did not discuss any validation of the BFD Discriminator attribute in
>> > > §3.1.6; what procedures would this process entail?
>> > >
>> > GIM>> There's, so far, only one validation condition:
>> >
>> >    The length of a TLV as a whole MUST be multiple of four octets.
>> >
>> >
>> > > Section 4
>> > >
>> > > The procedures described below are limited to the case where the site
>> > > that contains C-S is connected to two or more PEs, though, to
>> > > simplify the description, the case of dual-homing is described. The
>> > >
>> > > I suggest giving at least some considerations to how to choose between
>> > > multiple standby Upstream PEs when there are more than one available.
>> > >
>> > GIM>> I understand your idea but that might be a whole new specification
>> > similar to the selection of UMH from the list of candidates. Perhaps
>> > stating that the selection might use known methods but the specifics are
>> > outside the scope of this document would be acceptable? For example (sorry for a
>> > longer quote):
>> > NEW TEXT:
>> >    The procedures described below are limited to the case where the site
>> >    that contains C-S is connected to two or more PEs, though, to
>> >    simplify the description, the case of dual-homing is described. In
>> >    the case where more than two PEs are connected to the C-S site,
>> >    selection of the Standby PE can be performed using one of the methods
>> >    of selecting a UMH. Details of the selection are outside the scope
>> >    of this document. The procedures require all the PEs of that MVPN to
>> >    follow the same UMH selection procedure, as specified in [RFC6513],
>> >    whether the PE selected based on its IP address, hashing algorithm
>> >    described in section 5.1.3 of [RFC6513], or Installed UMH Route. The
>> >    procedures assume that if a site of a given MVPN that contains C-S is
>> >    dual-homed to two PEs, then all the other sites of that MVPN would
>> >    have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with
>> >    its RD.
>>
>> I can understand that selecting from a list of (more than two) candidates
>> might end up being complicated; I think your text here is sufficient to
>> address my concern, by giving some indication of how the extension to the
>> general case could be performed while acknowledging that it is not fully
>> specified yet.
>>
>> > >
>> > > procedures require all the PEs of that MVPN to follow the same UMH
>> > > selection procedure, as specified in [RFC6513], whether the PE
>> > > selected based on its IP address, hashing algorithm described in
>> > > section 5.1.3 of [RFC6513], or Installed UMH Route. The procedures
>> > >
>> > > I assume that how the PEs agree on which procedure is in use does not
>> > > involve something being advertised in-band, and is out of scope for this
>> > > document. But please say so!
>> > >
>> > GIM>> You are right, that is an operational issue and the management
>> > plane's responsibility.
>> > NEW TEXT:
>> >    The procedures require all the PEs of that MVPN to
>> >    follow the same UMH selection procedure, as specified in [RFC6513],
>> >    whether the PE selected based on its IP address, the hashing
>> >    algorithm described in section 5.1.3 of [RFC6513], or Installed UMH
>> >    Route. The consistency of the UMH selection method used among all
>> >    PEs is expected to be provided by the management plane.
>> >
>> > >
>> > > assume that if a site of a given MVPN that contains C-S is dual-homed
>> > > to two PEs, then all the other sites of that MVPN would have two
>> > > unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with its RD.
>> > >
>> > > nit: s/its RD/its own RD/
>> > >
>> > GIM>> Ack
>> >
>> > > Also, please confirm that the unicast routes are *to* C-S, vs *from* it.
>> > >
>> > GIM>> Though it might be somewhat counterintuitive, in the context of MVPN
>> > "to" is correct.
>>
>> Thanks.
>>
>> > >
>> > > Section 4.1
>> > >
>> > > o the NLRI is constructed as the C-multicast route with an RT that
>> > >   identifies the Primary Upstream PE, except that the RD is the same
>> > >   as if the C-multicast route was built using the Standby Upstream
>> > >   PE as the UMH (it will carry the RD associated to the unicast VPN
>> > >   route advertised by the Standby Upstream PE for S and a Route
>> > >   Target derived from the Standby Upstream PE's UMH route's VRF RT
>> > >   Import EC);
>> > >
>> > > This part is a bit confusing to me, since the first part says that the
>> > > RT identifies the Primary Upstream PE, but the second part says that the
>> > > RT is derived from the Standby Upstream PE's [stuff]. But I'm happy to
>> > > trust you that the [stuff] makes it correct!
>> > >
>> > GIM>> Thank you for putting your trust in our collective thinking. AFAIK,
>> > it works.
>> >
>> > >
>> > > Section 4.2
>> > >
>> > > when the PE determines (the use of the particular method to detect
>> > > the failure is outside the scope of this document) that C-S is not
>> > > reachable through some other PE, the PE SHOULD install VRF PIM
>> > >
>> > > It seems like a forward reference to §4.3 might be helpful.
>> > >
>> > GIM>> Thank you for your suggestion, the reference is added in the working
>> > version.
>> >
>> > >
>> > > Section 9.3.2 of [RFC6514], describes the procedures of sending a
>> > > Source-Active A-D Route as a result of receiving the C-multicast
>> > > route. These procedures MUST be followed for both the normal and
>> > > Standby C-multicast routes.
>> > >
>> > > There is no section 9.3.2 in RFC 6514. There is a 9.2.3 that looks
>> > > perhaps plausible, though the string "Source-Active" does not appear in
>> > > it.
>> > >
>> > GIM>> Great catch, thank you! I believe that the correct section is in RFC
>> > 6513, not RFC 6514.
>> > The former opens with:
>> >    The issue described in Section 9.3.1 is resolved through the use of
>> >    Source Active A-D routes. In the remainder this section, we provide
>> >    an example of how this works, along with an informal description of
>> >    the procedures.
>> > Would you agree RFC 6513 makes sense?
>>
>> That does look to make a lot more sense than 6514 did!
>>
>> > >
>> > > Section 4.4.2
>> > >
>> > > Source AS carried in the C-multicast route. If the match is found,
>> > > and the C-multicast route carries the Standby PE BGP Community, then
>> > > the ASBR MUST perform as follows:
>> > >
>> > > (I assume that there is room for local policy to modify this "MUST",
>> > > e.g., if needed to protect against some form of attack ... perhaps it
>> > > even goes without saying.)
>> > >
>> > GIM>> Indeed. Perhaps the following change makes it more accurate:
>> > NEW TEXT:
>> >    If the match is found,
>> >    and the C-multicast route carries the Standby PE BGP Community, then
>> >    the ASBR implementation that supports this specification MUST be
>> >    configurable to perform as follows:
>> >
>> >    o if the route was received over iBGP and its LOCAL_PREF attribute
>> >      is set to zero, then it MUST be re-advertised in eBGP with a MED
>> >      attribute (MULTI_EXIT_DISC) set to the highest possible value
>> >      (0xffff)
>> >
>> >    o if the route was received over eBGP and its MED attribute set to
>> >      0xffff, then it MUST be re-advertised in iBGP with a LOCAL_PREF
>> >      attribute set to zero
>> >
>> >    Other ASBR procedures are applied without modification and, when
>> >    applied, MAY modify the above-listed behavior.
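The two ASBR rules in the proposed text above can be read as a symmetric mapping between LOCAL_PREF and MED. The following Python sketch illustrates that reading only; the function name, the string labels for session types, and the returned dictionary are hypothetical, and the `0xFFFF` constant is simply the value quoted in the proposed text. Routes that match neither rule return `None` here, standing in for the "other ASBR procedures" that the text leaves unmodified.

```python
from typing import Optional

# MED value quoted in the proposed text for Standby C-multicast routes.
STANDBY_MED = 0xFFFF


def readvertise_standby_route(received_over: str,
                              local_pref: Optional[int],
                              med: Optional[int]) -> Optional[dict]:
    """Map the Standby marking across an AS boundary.

    received_over is "ibgp" or "ebgp" (hypothetical labels). The caller is
    assumed to have already matched the Source AS and found the Standby PE
    BGP Community on the C-multicast route.
    """
    if received_over == "ibgp" and local_pref == 0:
        # iBGP with LOCAL_PREF 0 becomes eBGP with the quoted MED value.
        return {"advertise_over": "ebgp", "med": STANDBY_MED}
    if received_over == "ebgp" and med == STANDBY_MED:
        # eBGP with the quoted MED value becomes iBGP with LOCAL_PREF 0.
        return {"advertise_over": "ibgp", "local_pref": 0}
    # Neither rule applies; regular ASBR processing is out of this sketch.
    return None
```

Applying the function at the egress ASBR and again at the ingress ASBR of the neighboring AS restores LOCAL_PREF zero, which is how the Standby marking survives the inter-AS hop under these assumptions.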
>>
>> Works for me :)
>>
>> > >
>> > > Section 5
>> > >
>> > >    o  Upstream PEs use the "hot standby" optional behavior and thus will
>> > >       forward traffic for a given multicast state as soon as they have
>> > >       whether a (primary) BGP C-multicast route or a Standby BGP
>> > >       C-multicast route for that state (or both)
>> > >
>> > > nit: the grammar is a bit weird here, after "as soon as they have"; I'm
>> > > not confident that I could make an accurate suggestion for a fix.
>> > >
>> > GIM>> Would it all read better with a minor update?
>> > NEW TEXT:
>> >    Upstream PEs use the "hot standby" optional behavior and thus will
>> >    start forwarding traffic for a given multicast state after they
>> >    have whether a (primary) BGP C-multicast route or a Standby BGP
>> >    C-multicast route for that state (or both)
>>
>> I think that "whether" is not needed (though "either" might work in its
>> stead).
>>
> GIM2>> Yes, it was not the best choice. Below is the newest version:
>    o  Upstream PEs use the "hot standby" optional behavior and thus will
>       start forwarding traffic for a given multicast state after they
>       have a (primary) BGP C-multicast route or a Standby BGP
>       C-multicast route for that state (or both)
>>
>> > >
>> > > Section 6
>> > >
>> > > I could almost see the discussion of duplicate packets as being a
>> > > subsection of the security considerations, though I don't mind leaving
>> > > it as-is.
>> > >
>> > GIM>> Thank you for agreeing.
>> >
>> > >
>> > > Section 8
>> > >
>> > > We could perhaps make some pro forma note that the BFD Discriminator
>> > > attribute, like all BGP attributes, typically does not benefit from
>> > > cryptographic integrity protection and thus could be spoofed so as to be
>> > > different than what is actually used by the multipoint BFD head. That
>> > > said, I'm willing to let this fall under the incorporated-by-reference
>> > > BGP security considerations.
>> > >
>> > GIM>> Thank you.
>> >
>> > >
>> > > Is it worth noting that operating in "hot" standby mode will increase
>> > > the general level of traffic on the VPN and thus susceptibility to DoS?
>> > >
>> > GIM>> We use hot standby in the control plane only. That would add some BGP
>> > traffic, but not as much as 1+1 protection in the data plane would. I think
>> > that the additional load in the VPN with the "hot standby" defined in the
>> > draft is unlikely to make PEs more vulnerable to DoS. What do you think?
>>
>> I think I was assuming this was "hot" in the data plane as well as the
>> control plane, when I wrote the comment. For just the control plane, your
>> assessment seems reasonable.
>>
>> > >
>> > >    This document uses P2MP BFD, as defined in [RFC8562], which, in turn,
>> > >    is based on [RFC5880]. Security considerations relevant to each
>> > >    protocol are discussed in the respective protocol specifications. An
>> > >    implementation that supports this specification MUST use a mechanism
>> > >    to control the maximum number of P2MP BFD sessions that can be active
>> > >    at the same time.
>> > >
>> > > What is the objective that this control is designed to achieve? I can
>> > > "control the maximum number of sessions" by asserting the maximum number
>> > > to be an absurdly large value, but I don't think that would meet the
>> > > spirit of this requirement (it does meet the letter of the requirement).
>> > >
>> > GIM>> Though this recommendation may look too vague, I think it is
>> > helpful to a developer. I imagine, as we've discussed in regard to the
>> > selection of the interval between BFD Control packets, an operator will
>> > consider the overall load of BFD Control packets across all active BFD
>> > sessions. Do you think that a sentence that connects the number of p2mp BFD
>> > sessions and the rate of BFD Control packets would be helpful in this
>> > context?
>>
>> Or even another clause, maybe something like "to limit the overall amount
>> of capacity used by the BFD traffic". (I think part of what triggered my
>> comment is that this is "MUST use", not "MUST provide" -- the goal of a
>> "MUST provide" is fairly obvious but "MUST use" with no specific bound
>> could be seen as make-work.)
>>
> GIM2>> More thanks for this suggestion. I think that your clause is more
> general and better expresses our concern in regard to the potential impact
> of using p2mp BFD. Below is the new text I propose:
> NEW TEXT:
>    This document uses P2MP BFD, as defined in [RFC8562], which, in turn,
>    is based on [RFC5880]. Security considerations relevant to each
>    protocol are discussed in the respective protocol specifications. An
>    implementation that supports this specification MUST provide a
>    mechanism to limit the overall amount of capacity used by the BFD
>    traffic (as the combination of the number of active P2MP BFD sessions
>    and the rate of BFD Control packets to process).
>
>>
>> > >
>> > >    The methods described in Section 3.1 may produce false-negative state
>> > >    changes that can be the trigger for an unnecessary convergence in the
>> > >    control plane, ultimately negatively impacting the multicast service
>> > >    provided by the VPN. An operator is expected to consider the network
>> > >    environment and use available controls of the mechanism used to
>> > >    determine the status of a P-tunnel.
>> > >
>> > > We mentioned earlier (e.g., in §3.1) that similar negative effects can
>> > > occur when resiliency mechanisms at different layers interact; that
>> > > might be worth repeating here.
>> > >
>> > GIM>> One such reference is in Section 3.1.2:
>> >    In many cases, it is not practical to use both protection
>> >    methods at the same time because uncorrelated timers might cause
>> >    unnecessary switchovers and destabilize the network.
>> > Thus we referred to Section 3.1 as the encompassing reference to all
>> > possible scenarios. Would you agree with that?
>>
>> Wow, that looks like a real short-term memory failure on my part (§3.1 was
>> mentioned just in the previous sentence!). Sorry for the noise; this is
>> fine as-is.
>>
>> Thanks again,
>>
>> Ben
>>
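
For readers following the Section 4.4.2 exchange above, the two ASBR re-advertisement bullets in the proposed NEW TEXT can be sketched roughly as follows. This is only an illustration under stated assumptions, not text or code from the draft: the `CMulticastRoute` class, its field names, and the `standby_readvertisement` function are hypothetical, the Source AS match is assumed to have already succeeded, and the 0xffff constant is simply the value quoted in the thread.

```python
from dataclasses import dataclass
from typing import Optional

# Value quoted in the proposed Section 4.4.2 text; taken as-is from the thread.
MED_STANDBY = 0xFFFF


@dataclass(frozen=True)
class CMulticastRoute:
    """Hypothetical, heavily simplified view of a C-multicast route at an ASBR."""
    received_over: str                 # "ibgp" or "ebgp"
    standby_pe_community: bool         # carries the Standby PE BGP Community?
    local_pref: Optional[int] = None   # LOCAL_PREF, if present
    med: Optional[int] = None          # MULTI_EXIT_DISC, if present


def standby_readvertisement(route: CMulticastRoute) -> Optional[dict]:
    """Return the attributes to set when re-advertising, or None if neither bullet applies.

    Assumes the Source AS match has already been found. Local policy and other
    ASBR procedures (which, per the NEW TEXT, MAY modify this behavior) are
    deliberately not modeled.
    """
    if not route.standby_pe_community:
        return None
    # First bullet: iBGP route with LOCAL_PREF 0 goes out over eBGP with MED 0xffff.
    if route.received_over == "ibgp" and route.local_pref == 0:
        return {"advertise_over": "ebgp", "med": MED_STANDBY}
    # Second bullet: eBGP route with MED 0xffff goes out over iBGP with LOCAL_PREF 0.
    if route.received_over == "ebgp" and route.med == MED_STANDBY:
        return {"advertise_over": "ibgp", "local_pref": 0}
    return None
```

The sketch makes the symmetry of the two bullets visible: a LOCAL_PREF of zero inside an AS and a MED of 0xffff between ASes each mark the standby route as least preferred, and the ASBR translates one marking into the other at the AS boundary.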
