Hi Greg, Thanks for the fast action. Appreciated.
Please see my follow-up on the remaining open items, tagged GV2>

Once there is an updated draft I’ll spin through the document again and see where we land.

Thanks again for the fast action Greg.

Be well,
G/

From: Greg Mirsky <gregimir...@gmail.com>
Sent: Tuesday, November 26, 2024 6:53 PM
To: Gunter van de Velde (Nokia) <gunter.van_de_ve...@nokia.com>
Cc: draft-ietf-nvo3-geneve-...@ietf.org; nvo3-cha...@ietf.org; nvo3@ietf.org
Subject: [nvo3] Re: [Shepherding AD review] review of draft-ietf-nvo3-geneve-oam-12

Hi Gunter,
thank you for your thorough review and helpful suggestions. Please find my notes below tagged GIM>>. I hope that I've not missed anything. I attached the diff that highlights updates in the working version of the draft and the working version of the draft.

Regards,
Greg

On Mon, Nov 25, 2024 at 8:52 AM Gunter van de Velde (Nokia) <gunter.van_de_ve...@nokia.com> wrote:

# Gunter Van de Velde, RTG AD, comments for draft-ietf-nvo3-geneve-oam-12

# the referenced line numbers are derived from the idnits tool: https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-nvo3-geneve-oam-12.txt

# Many thanks for this write-up. Having proper OAM tools for Geneve is useful and operationally highly important. Thank you for this work.

# Many thanks to the shepherd write-up from Matthew Bocci and the directorate reviews from Stig Venaas, Paul Kyzivat and Himanshu Shah

#GENERIC COMMENTS
#================

## The document is about "active OAM", while the title of the draft just mentions "OAM". Maybe the draft title can be corrected?

GIM>> Makes sense to me.
Would the following updates of long and short titles be helpful:

OLD TEXT:
OAM for use in GENEVE
NEW TEXT:
Active OAM for use in GENEVE

OLD TEXT:
OAM in GENEVE
NEW TEXT:
Active OAM in GENEVE

Also, since the draft's full title is still relatively short, could it be used as the short version?

GV2> I suspect that would be fine for the Datatracker tools

## When reading the document I got confused about the asserted difference between active OAM for Geneve and STAMP or BFD. What are the differences, and what is the reason for the existence of the current document? Could a small section be inserted to provide some guidelines on where the specific value of Active OAM lies, and compare it with other mechanisms?

GIM>> Now I am a bit confused. STAMP and BFD are examples of active OAM methods for observing networks. Could you help me by pointing out the text that left that impression? I think some editorial touches are needed to clarify that STAMP and BFD, echo request/reply, a.k.a. ping, are examples of active OAM methods.

GV2> I think I found my source of confusion, leading to your confusion 😊. I am not very familiar with the exact OAM science. Without a legacy of years and years of OAM experience as you have, I did not know what exactly active OAM represents. I was indeed missing the understanding that STAMP and BFD, echo request/reply, a.k.a. ping, are examples of active OAM methods. When I was reading the text, my assumption was that there is Active OAM, and separately there is BFD, and separately there is STAMP, which led me to believe that active OAM was merely echo request/reply. If this is clarified somewhere at the start of the document, it will prevent readers from making this assumption.

## Is the active OAM a novel way to say ICMP ping (and ICMP trace)? Is there anything additional? Maybe this needs to be explicitly mentioned somewhere early in this draft, with references added to the technologies asserted to be utilized for active OAM. I also found that it is at the very end (i.e.
section 3) where ICMP and ICMPv6 are discussed for their impact with Geneve. This seems rather late in the document. If the Active OAM is solely about ICMP(v6) then maybe a helicopter perspective earlier in the document helps document readers to understand the objectives better?

GIM>> RFC 7799 defined active measurement methods as

   o Active Methods generate packet streams. Commonly, the packet stream of interest is generated as the basis of measurement.

The common understanding of Active OAM is as follows:

   Active OAM uses specifically constructed packets to detect, troubleshoot, localize defects, and measure performance.

I've re-arranged sections in the following manner:

GV2> I believe my comment was based upon an unclear understanding of what active OAM is; I assumed that BFD and STAMP were something different and that only ICMP echo request/reply was active OAM. When that is clarified as discussed earlier, then that resolves the issue I think.

   2. The Applicability of Active OAM Protocols in Geneve Networks
      2.1. Requirements for Active OAM Protocols in Geneve Networks
      2.2. Defect Detection and Troubleshooting in Geneve Network with Active OAM
         2.2.1. Echo Request and Echo Reply in Geneve Tunnel
      2.3. OAM Encapsulation in Geneve

Does that make it clearer?

GV2> difficult to say at this stage, but it seems on the right track

## As I read the document for the first time, I got confused about whether some sections are formal procedures or informational sections. This could have been due to my lesser experience with OAM protocols, so judge accordingly. Maybe making it more obvious for generalists like me could help make the document easier to process?

#DETAILED COMMENTS
#=================

15 Abstract
16
17 This document lists a set of general requirements for active OAM
18 protocols in the Geneve overlay network.
Based on these
19 requirements, the IP encapsulation of active Operations,
20 Administration, and Maintenance protocols in the Geneve protocol is
21 defined. Considerations for using ICMP and UDP-based protocols are
22 discussed.

GV> What about the following alternative abstract:

"
Geneve (Generic Network Virtualization Encapsulation) is a flexible and extensible network virtualization overlay protocol designed to encapsulate network packets for transport across underlying physical networks. This document specifies the requirements and provides a framework for Operations, Administration, and Maintenance (OAM) in Geneve networks. It outlines the OAM functions necessary to monitor, diagnose, and troubleshoot Geneve overlay networks to ensure their proper operation and performance. The document aims to guide the implementation of OAM mechanisms within the Geneve protocol to support network operators in maintaining reliable and efficient virtualized network environments.
"

GIM>> Many thanks for the suggested text.

75 1. Introduction
76
77 Geneve [RFC8926] is intended to support various scenarios of network
78 virtualization. In addition to carrying multiple protocols, e.g.,
79 Ethernet, IPv4/IPv6, the Geneve message includes metadata.
80 Operations, Administration, and Maintenance (OAM) protocols support
81 fault management and performance monitoring functions necessary for
82 comprehensive network operation. Active OAM protocols, as defined in
83 [RFC7799], use specially constructed packets that are injected into
84 the network. To ensure that a performance metric or a detected
85 failure are related to a particular Geneve flow, it is critical that
86 these OAM test packets share fate with overlay data packets for that
87 flow when traversing the underlay network.
88
89 A set of general requirements for active OAM protocols in the Geneve
90 overlay network is listed in Section 2.
IP encapsulation conforms to
91 these requirements and is a suitable encapsulation of active OAM
92 protocols in a Geneve overlay network. Active OAM in a Geneve
93 overlay network are exchanged between two Geneve tunnel endpoints,
94 which may be an NVE (Network Virtualization Edge) or another device
95 acting as a Geneve tunnel endpoint. For simplicity, an NVE is used
96 to represent the Geneve tunnel endpoint. Please refer to [RFC7365]
97 and [RFC8014] for detailed definitions and descriptions of an NVE.
98 The IP encapsulation of Geneve OAM defined in this document applies
99 to an overlay service by introducing a Management Virtual Network
100 Identifier (VNI) that could be used in combination with various
101 values of the Protocol Type field in the Geneve header, i.e.,
102 Ethertypes for IPv4 or IPv6. The analysis and definition of other
103 types of OAM encapsulation in Geneve are outside the scope of this
104 document.

GV> What is unclear to me is how this aligns with ICMP, assuming active OAM is ICMP?

GIM>> ICMP packets are specially constructed to detect network defects and localize them. Hence, ICMP is an example of active OAM. Would you agree?

GV2> yes, agree. What I was missing is what was mentioned earlier: that ICMP, BFD and STAMP are examples of active OAM.

GV> idnits rewrite:

"
Geneve [RFC8926] is designed to support various scenarios of network virtualization. It encapsulates multiple protocols, such as Ethernet and IPv4/IPv6, and includes metadata within the Geneve message. Operations, Administration, and Maintenance (OAM) protocols provide fault management and performance monitoring functions necessary for comprehensive network operation. Active OAM protocols, as defined in [RFC7799], utilize specially constructed packets injected into the network.
To ensure that performance metrics or detected failures are accurately related to a particular Geneve flow, it is critical that these OAM test packets share fate with the overlay data packets of that flow when traversing the underlay network.

Section 2 of this document lists the general requirements for active OAM protocols in the Geneve overlay network. IP encapsulation meets these requirements and is suitable for encapsulating active OAM protocols within a Geneve overlay network. Active OAM messages in a Geneve overlay network are exchanged between two Geneve tunnel endpoints, which may be a Network Virtualization Edge (NVE) or another device acting as a Geneve tunnel endpoint. For simplicity, this document uses an NVE to represent the Geneve tunnel endpoint. Refer to [RFC7365] and [RFC8014] for detailed definitions and descriptions of an NVE. The IP encapsulation of Geneve OAM defined in this document applies to an overlay service by introducing a Management Virtual Network Identifier (VNI), which can be used in combination with various values of the Protocol Type field in the Geneve header, such as Ethertypes for IPv4 or IPv6. The analysis and definition of other types of OAM encapsulation in Geneve are outside the scope of this document.
"

GIM>> Excellent, thank you!

110 * In-band OAM is an active OAM or hybrid OAM method ([RFC7799]) that
111 traverses the same set of links and interfaces receiving the same
112 QoS treatment as the monitored object, i.e., a Geneve tunnel as a
113 whole or a particular tenant flow within given Geneve tunnel.

GV> In a later section (section 2) this paragraph is used as a reference for the in-band need in REQUIREMENT#1. It is a bit unusual that a terminology section acts as a reference for a formal requirement. Maybe that should be documented in a different place.

GIM>> The intention was to establish the interpretation of the "in-band" term. Would removing the reference to the Terminology section (Section 1.1.1) help?
GV> From a readability perspective, fixing some idnits:

"
In-band OAM is an active or hybrid OAM method, as defined in [RFC7799], that traverses the same set of links and interfaces and receives the same Quality of Service (QoS) treatment as the monitored object. In this context, the monitored object refers to either the Geneve tunnel as a whole or a specific tenant flow within a given Geneve tunnel.
"

GIM>> I've noticed that that definition of "in-band" is the only place in the document where the QoS abbreviation is used. Hence, I don't use it in the update.

GV2> ok

123 1.1.3. Acronyms
124
125 Geneve: Generic Network Virtualization Encapsulation
126
127 NVO3: Network Virtualization Overlays
128
129 OAM: Operations, Administration, and Maintenance
130
131 VNI: Virtual Network Identifier

GV> potentially missing acronyms: VNE, ICMP, ICMPv6

GIM>> I couldn't find VNE being used in the draft. Have I missed it?

GV2> My mistake. It was a fat-finger typo. It should have read NVE (instead of the wrong VNE)

GIM>> ICMP seems to be on the RFC Editor's list of well-known abbreviations <https://www.rfc-editor.org/rpc/wiki/doku.php?id=abbrev_list>.

GV2> ok. I did not check that list. If it is on there, then all is good

GV> more accurate NVO3: Network Virtualization over Layer 3

GIM>> Thank you, I fixed it.

133 2. Active OAM Protocols in Geneve Networks

GV> Would it be justified to rename this header to "Requirements for Active OAM Protocols in Geneve Networks"

GIM>> I agree. Renamed.

GV> What seems missing from this discussion is why only "Active" is discussed and not "Passive". The draft title is "OAM for use in GENEVE" and that potentially leaves room for both active and passive OAM. Is it possible to expand a little bit on this?

GIM>> According to the definition in RFC 7799 that we use in this document, passive OAM methods observe a network flow without altering or impacting that flow.
SNMP query and NETCONF notifications reporting, for example, counter values, are examples of passive OAM methods. I cannot find any requirement specific to using a passive OAM method in Geneve other than "Don't impact the monitored Geneve tunnel".

GV2> as discussed earlier. I was confused about the fact that ICMP, BFD and STAMP are examples of active OAM. I will need to read the draft again after the updates with that aspect in mind.

146 Requirement#1: Geneve OAM test packets MUST share the fate with
147 data traffic of the monitored Geneve tunnel, i.e., be in-band
148 (Section 1.1.1) with the monitored traffic, follow the same
149 overlay and transport path as packets with data payload, in the
150 forward direction, i.e. from ingress toward egress endpoint(s) of
151 the OAM test.

GV> rewrite correcting idnits:

"
Requirement 1: Geneve OAM test packets MUST share the same fate as the data traffic of the monitored Geneve tunnel. Specifically, the OAM test packets MUST be in-band (see Section 1.1.1) with the monitored traffic and follow the same overlay and transport path as packets carrying data payloads in the forward direction, from the ingress toward the egress endpoint(s) of the OAM test.
"

GV2> I assume that the (see Section 1.1.1) will be removed here also?

153 An OAM protocol MAY be used to monitor the particular Geneve tunnel
154 as a whole. In that case, test packets could be in-band relative to
155 a sub-set of tenant flows transported over the Geneve tunnel. If the
156 goal is to monitor the condition experienced by the flow of a
157 particular tenant, the test packets MUST be in-band with that
158 specific flow in the Geneve tunnel. Both scenarios are discussed in
159 detail in Section 2.1.

GV> What does "Geneve tunnel as a whole" exactly mean? Can this be accurately described or referenced? Would the following describe what was intended to be outlined:

"
An OAM protocol MAY be employed to monitor an entire Geneve tunnel.
In this case, test packets could be in-band relative to a subset of tenant flows transported over the Geneve tunnel. If the goal is to monitor the conditions experienced by the flow of a particular tenant, the test packets MUST be in-band with that specific flow within the Geneve tunnel. Both scenarios are discussed in detail in Section 2.1.
"

GIM>> Thank you for your thoughtful consideration of this paragraph. The proposed text is much clearer and perfectly conveys our intention.

161 Requirement#2: The encapsulation of OAM control messages and data
162 packets in the underlay network MUST be indistinguishable from
163 each other from an underlay network IP forwarding point of view.
164
165 Requirement#3: The presence of an OAM control message in the
166 Geneve packet MUST be unambiguously identifiable to Geneve
167 functionality, e.g., at endpoints of Geneve tunnels.
168
169 Requirement#4: OAM test packets MUST NOT be forwarded to a tenant
170 system.

GV> Fixing small idnits:

"
Requirement 2: The encapsulation of OAM control messages and data packets in the underlay network MUST be indistinguishable from each other from the underlay network IP forwarding point of view.

Requirement 3: The presence of an OAM control message in a Geneve packet MUST be unambiguously identifiable to Geneve functionality, such as at endpoints of Geneve tunnels.

Requirement 4: OAM test packets MUST NOT be forwarded to a tenant system.
"

GIM>> Many thanks for these suggestions; updated accordingly.

172 A test packet generated by an active OAM protocol, either for a
173 defect detection or performance measurement, according to
174 Requirement#1, MUST be in-band (Section 1.1.1) with the tunnel or
175 data flow being monitored. In an environment where multiple paths
176 through the domain are available, underlay transport nodes can be
177 programmed to use characteristic information to balance the load
178 across known paths.
It is essential that test packets follow the
179 same route, i.e., traverses the same set of nodes and links, as a
180 data packet of the monitored flow. Thus, the following requirement
181 to support OAM packet fate-sharing with the data flow:
182
183 Requirement#5: It MUST be possible to express entropy for underlay
184 Equal Cost Multipath in the Geneve encapsulation of OAM packets.

GV> I am not sure why (Section 1.1.1) is also referenced here. This section is a terminology section. It looks unusual for it to be a justification for a formal procedure.

GIM>> As noted earlier, the reference to the Terminology section is unnecessary. I agree with removing it.

GV> idnits fixing:

"
A test packet generated by an active OAM protocol, whether for defect detection or performance measurement, MUST be in-band with the tunnel or data flow being monitored, as specified in Requirement 1. In environments where multiple paths through the domain are available, underlay transport nodes can be programmed to use characteristic information to balance the load across known paths. It is essential that test packets follow the same route, that is, traverse the same set of nodes and links, as a data packet of the monitored flow. Therefore, the following requirement supports OAM packet fate-sharing with the data flow:

Requirement 5: It MUST be possible to express entropy for underlay Equal-Cost Multipath in the Geneve encapsulation of OAM packets.
"

186 2.1. Defect Detection and Troubleshooting in Geneve Network with Active
187 OAM
188
189 This section considers two scenarios where active OAM is used to
190 detect and localize defects in a Geneve network. Figure 1 presents
191 an example of a Geneve domain.
192
193 +--------+                                             +--------+
194 | Tenant +--+                                     +----| Tenant |
195 | VNI 28 |  |                                     |    | VNI 35 |
196 +--------+  |          ................           |    +--------+
197             |  +----+  .              .  +----+   |
198             |  | NVE|--.              .--| NVE|   |
199             +--| A  |  .              .  | B  |---+
200                +----+  .              .  +----+
201                /       .              .
202               /        .    Geneve    .
203 +--------+   /         .    Network   .
204 | Tenant +--+          .              .
205 | VNI 35 |             .              .
206 +--------+             ................
207                               |
208                             +----+
209                             | NVE|
210                             | C  |
211                             +----+
212                               |
213                               |
214                     =====================
215                       |               |
216                   +--------+      +--------+
217                   | Tenant |      | Tenant |
218                   | VNI 28 |      | VNI 35 |
219                   +--------+      +--------+
220
221 Figure 1: An example of a Geneve domain
222
223 In the first case, consider when a communication problem between
224 Network Virtualization Edge (NVE) device A and NVE C exists. Upon
225 the investigation, the operator discovers that the forwarding in the
226 IP underlay network is working accordingly. Still, the Geneve
227 connection is unstable for all NVE A and NVE C tenants. Detection,
228 troubleshooting, and localization of the problem can be done
229 regardless of the VNI value.
230
231 In the second case, traffic on VNI 35 between NVE A and NVE B has no
232 problems, as on VNI 28 between NVE A and NVE C. But traffic on VNI
233 35 between NVE A and NVE C experiences problems, for example,
234 excessive packet loss.
235
236 The first case can be detected and investigated using any VNI value,
237 whether it connects tenant systems or not; however, to conform to
238 Requirement#4 (Section 2) OAM test packets SHOULD be transmitted on a
239 VNI that doesn't have any tenants. Such a Geneve tunnel is dedicated
240 to carrying only control and management data between the tunnel
241 endpoints, hence it is referred to as a Geneve control channel and
242 that VNI is referred to as the Management VNI. A configured VNI MAY
243 be used to identify the control channel, but it is RECOMMENDED that
244 the default value 1 be used as the Management VNI. Encapsulation of
245 test packets using the Management VNI is discussed in Section 2.2.
246
247 The control channel of a Geneve tunnel MUST NOT carry tenant data.
248 As no tenants are connected using the control channel, a system that
249 supports this specification, MUST NOT forward a packet received over
250 the control channel to any tenant. A packet received over the
251 control channel MUST be forwarded if and only if it is sent onto the
252 control channel of the concatenated Geneve tunnel. Else, it MUST be
253 terminated locally. The Management VNI SHOULD be terminated on the
254 tenant-facing side of the Geneve encapsulation/decapsulation
255 functionality, not the DC-network-facing side (per definitions in
256 Section 4 of [RFC8014]) so that Geneve encap/decap functionality is
257 included in its scope. This approach causes an active OAM packet,
258 e.g., an ICMP echo request, to be decapsulated in the same fashion as
259 any other received Geneve packet. In this example, the resulting
260 ICMP packet is handed to NVE's local management functionality for the
261 processing which generates an ICMP echo reply. The ICMP echo reply
262 is encapsulated in Geneve as specified in Section 2.2. for forwarding
263 back to the NVE that sent the echo request. One advantage of this
264 approach is that a repeated ping test could detect an intermittent
265 problem in Geneve encap/decap hardware, which would not be tested if
266 the Management VNI were handled as a "special case" at the DC-
267 network-facing interface.
268
269 The second case is when a test packet is transmitted using the VNI
270 value associated with the monitored service flow. By doing that, the
271 test packet experiences network treatment as the tenant's packets.
272 Details of that use case are outside the scope of this specification
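
For readers following the control-channel text quoted above, here is a minimal Python sketch of how the Management VNI might appear in the 8-byte fixed Geneve header (field layout per RFC 8926) and how a receiving NVE might classify a packet. It is an illustration under stated assumptions, not text from the draft: the function names, the `onward_control_channel` flag, and the returned labels are hypothetical, and the O and C bits are left clear only because the quoted text does not say how they are set.

```python
import struct

MANAGEMENT_VNI = 1        # default value RECOMMENDED in the quoted draft text
ETHERTYPE_IPV4 = 0x0800   # Protocol Type values for IP encapsulation
ETHERTYPE_IPV6 = 0x86DD


def pack_geneve_header(vni, protocol_type, o_bit=False, c_bit=False):
    """Pack the fixed Geneve header (RFC 8926 layout) with no options.

    Hypothetical helper: the O/C defaults are an assumption of this sketch.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    version, opt_len = 0, 0                  # Ver (2 bits) | Opt Len (6 bits)
    byte0 = (version << 6) | opt_len
    byte1 = (int(o_bit) << 7) | (int(c_bit) << 6)   # O | C | 6 reserved bits
    # The last 32-bit word is VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!BBHI", byte0, byte1, protocol_type, vni << 8)


def unpack_vni(header):
    """Return the 24-bit VNI from a fixed Geneve header."""
    _, _, _, word = struct.unpack("!BBHI", header[:8])
    return word >> 8


def disposition(vni, onward_control_channel=False,
                management_vni=MANAGEMENT_VNI):
    """Classify a received packet by VNI (labels are illustrative only).

    Control-channel packets are never handed to a tenant: they are either
    sent onto the control channel of a concatenated tunnel or terminated
    locally.
    """
    if vni != management_vni:
        return "tenant-processing"
    if onward_control_channel:
        return "forward-on-control-channel"
    return "terminate-locally"


hdr = pack_geneve_header(MANAGEMENT_VNI, ETHERTYPE_IPV4)
print(hdr.hex(), disposition(unpack_vni(hdr)))
```

The classification is keyed only on the VNI, which mirrors the statement that a configured VNI MAY identify the control channel; how an implementation decides that a packet is bound for a concatenated tunnel is left as an input flag here because the quoted text does not specify it.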