Dear authors,

Please find my comments on draft-ietf-rtgwg-net2cloud-problem-statement below. I have included the line numbers from idnits to help identify where in the document each comment applies.
Please update the references below.

== Outdated reference: A later version (-13) exists of draft-ietf-idr-sdwan-edge-discovery-12
== Outdated reference: A later version (-12) exists of draft-ietf-opsawg-ntw-attachment-circuit-08
== Outdated reference: A later version (-23) exists of draft-ietf-idr-5g-edge-service-metadata-16
== Outdated reference: A later version (-15) exists of draft-ietf-opsawg-teas-attachment-circuit-10
== Outdated reference: A later version (-14) exists of draft-ietf-add-split-horizon-authority-07

109   Cloud services are generally exposed, on-demand services that claim
110   to be scalable, highly available, and have usage-based billing. Most

Jim> The above sentence is difficult to parse. Do you mean “Cloud services are generally exposed as on-demand…” rather than “Cloud services are generally exposed,…”?

115   hosts services to many customers.

Jim> s/to/too

137   "edge" locations. <https://cloud.google.com/learn/what-
138   is-hybrid-cloud>.

Jim> Please remove the in-text reference and replace it with a [] reference, listed as either normative or informative.

144   https://en.wikipedia.org/wiki/Internet_exchange_point.

Jim> Please remove the in-text reference and replace it with a [] reference, listed as either normative or informative.

186   - If a Cloud Gateway (GW), a BGP speaker, receives from its BGP
187   peer a capability that it does not itself support or recognize,
188   it need to ignore that capability, and the BGP session need not

Jim> As per RFC5492, it MUST ignore that capability and the BGP session MUST NOT be terminated. See section 3 of RFC5492 and correct the above text.

189   be terminated per [RFC5492]. When receiving a BGP UPDATE with a
190   malformed attribute, the revised BGP error handling procedure
191   in [RFC7606] should be followed instead of session resetting.

Jim> The above paragraph seems confused. The first sentence is about BGP OPEN and how to handle capabilities; the second is about BGP UPDATE messages that have malformed attributes. These are two completely different things, so I am struggling to understand why they are in the same paragraph and what they have to do with each other in the context of a Cloud Gateway. Everything referenced is existing behavior, nothing new, so why is it here and what are the authors trying to convey? If the point is simply that a Cloud Gateway should adhere to the procedures specified in RFCs 5492 and 7606, then why not say that? If the authors wish to keep the text, I would suggest a rewrite as follows:

   - If a Cloud Gateway (GW), a BGP speaker, receives from its BGP
     peer a BGP OPEN with a capability that it does not support or
     recognize, it MUST ignore that capability, and the BGP session
     MUST NOT be terminated, as per [RFC5492].

   - When receiving a BGP UPDATE with a malformed attribute, the
     revised BGP error handling procedures in [RFC7606] should be
     followed instead of resetting the BGP session.

196   - When a Cloud DC eBGP session supports a limited number of
197   routes from external entities, the on-premises DCs need to set
198   up default routes and filter as many routes as practical
199   replacing them with a default in the eBGP advertisement to
200   minimize the number of routes to be exchanged with the Cloud DC
201   eBGP peers.

Jim> I do not understand the above paragraph. Is a Cloud DC different from an on-premises DC? Who is advertising a default route to whom? The scenario you are trying to convey is non-obvious, at least to me, so please clarify.

202   - When a Cloud GW receives inbound routes exceeding the maximum
203   routes threshold for a peer, the currently common practice is
204   generating out-of-band alerts (e.g., Syslog entries) via the
205   management system or terminating the BGP session (with cease
206   notification messages [RFC4486] being sent). Although out of
207   the scope of this document, more discussion is needed in the
208   IETF Inter-Domain Routing (IDR) Working Group for potential in-
209   band or autonomous notification directly to the peers when the
210   inbound routes exceed the maximum routes threshold.

Jim> More explanation is needed here, including a reference to section 4 of RFC4486, which describes the procedure for terminating a peering with a NOTIFICATION message that carries the Cease error code and a subcode giving the reason, e.g. “Maximum Number of Prefixes Reached”.

222   Failures within a Cloud site, which can be a building, a floor, a
223   pod, or a server rack, include capacity degradation or complete out-
224   of-service failure. Here are some events that can trigger a site
225   failure: a) fiber cut for links connecting to the site or among pods
226   within the site; b) cooling failures; c) insufficient backup power
227   during a power failure; d) cyber threat attacks; e) too many changes
228   outside of the maintenance window; etc. A fiber-cut is not uncommon
229   in a Cloud site or between sites.

Jim> I would suggest stating above that the listed events are examples and not an exhaustive list.

244   [RFC7432] specifies a mass withdrawal mechanism for EVPN to signal a
245   large number of routes being changed to remote PE nodes as quickly
246   as possible.

Jim> I am not sure that RFC 7432 is relevant here, or why EVPN is mentioned at all. Is there a reason to mention it, or should the text simply be removed?

597   premesis CPEs to a Cloud DC via a private VPN requires the private

Jim> s/premesis/premises

691   necessary. Alternative encapsulations, like SRH (Segment Routing

Jim> Please provide a reference to RFC 8754 (SRH).

695   6. Requirements for Networks Connecting Cloud Data Centers

Jim> Why are there requirements in a problem statement document? Did the WG discuss splitting these out into a separate document?

Thanks!

Jim
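Illustration for the comments on lines 186-188 and 202-210: the two existing behaviors referred to there can be sketched in a few lines of Python. This is only a minimal sketch and is not taken from the draft or from any BGP implementation. The function names, the set of locally supported capabilities, and the prefix limit used in the usage lines are assumptions made for the example. The numeric codes are the registered ones (Cease is error code 6; subcode 1 is "Maximum Number of Prefixes Reached").

```python
# Hypothetical set of capability codes this speaker supports (an assumption
# for the example): 1 = Multiprotocol Extensions, 2 = Route Refresh,
# 65 = 4-octet AS number.
SUPPORTED_CAPABILITIES = {1, 2, 65}

# BGP NOTIFICATION error code 6 is Cease; RFC 4486 defines subcode 1 as
# "Maximum Number of Prefixes Reached".
CEASE = 6
MAX_PREFIXES_REACHED = 1


def usable_capabilities(peer_capabilities):
    """Return the peer capabilities that are also supported locally.

    Capabilities that are not supported or recognized are silently ignored;
    they never cause an error or a session teardown (RFC 5492, section 3).
    """
    return {code for code in peer_capabilities if code in SUPPORTED_CAPABILITIES}


def prefix_limit_notification(received_prefixes, limit):
    """Return the (error code, subcode) pair to send if the speaker decides
    to terminate the peering because the limit is exceeded, otherwise None."""
    if received_prefixes > limit:
        return (CEASE, MAX_PREFIXES_REACHED)
    return None


# Usage: capability code 200 is unknown here, so it is dropped from the result.
print(usable_capabilities([1, 2, 200]))
print(prefix_limit_notification(1001, 1000))
```

Whether the gateway terminates the session or only raises an out-of-band alert when the limit is exceeded remains a local policy decision; the sketch covers only the termination case that RFC4486 describes.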