# Gunter Van de Velde, RTG AD, comments for draft-ietf-nvo3-geneve-oam-12
# The referenced line numbers are derived from the idnits tool: https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-nvo3-geneve-oam-12.txt
# Many thanks for this write-up. Having proper OAM tools for Geneve is useful and operationally highly important. Thank you for this work.
# Many thanks for the shepherd write-up from Matthew Bocci and the directorate reviews from Stig Venaas, Paul Kyzivat and Himanshu Shah.

#GENERIC COMMENTS
#================

## The document is about "active OAM", while the title of the draft just mentions "OAM". Maybe the draft title can be corrected?

## When reading the document I got confused about the asserted difference between active OAM for Geneve and STAMP or BFD. What are the differences, and what is the reason for the existence of the current document? Could a small section be inserted to provide some guidelines on the specific value of active OAM, and to compare it with other mechanisms?

## Is active OAM a novel way to say ICMP ping (and ICMP trace)? Is there anything additional? Maybe this needs to be explicitly mentioned somewhere early in this draft, with references added to the technologies asserted to be utilized for active OAM. I also found that it is only at the very end (i.e., Section 3) that ICMP and ICMPv6 are discussed with respect to their impact on Geneve. This seems rather late in the document. If active OAM is solely about ICMP(v6), then maybe a helicopter perspective earlier in the document would help readers understand the objectives better?

## As I read the document for the first time, I was unsure whether some sections were formal procedures or informational sections. This could have been due to my lesser experience with OAM protocols, so please judge accordingly. Making this more obvious for generalists like me could make the document easier to process.

#DETAILED COMMENTS
#=================

15 Abstract 16 17 This document lists a set of general requirements for active OAM 18 protocols in the Geneve overlay network. 
Based on these 19 requirements, the IP encapsulation of active Operations, 20 Administration, and Maintenance protocols in the Geneve protocol is 21 defined. Considerations for using ICMP and UDP-based protocols are 22 discussed. GV> What about the following alternative abstract: " Geneve (Generic Network Virtualization Encapsulation) is a flexible and extensible network virtualization overlay protocol designed to encapsulate network packets for transport across underlying physical networks. This document specifies the requirements and provides a framework for Operations, Administration, and Maintenance (OAM) in Geneve networks. It outlines the OAM functions necessary to monitor, diagnose, and troubleshoot Geneve overlay networks to ensure their proper operation and performance. The document aims to guide the implementation of OAM mechanisms within the Geneve protocol to support network operators in maintaining reliable and efficient virtualized network environments. " 75 1. Introduction 76 77 Geneve [RFC8926] is intended to support various scenarios of network 78 virtualization. In addition to carrying multiple protocols, e.g., 79 Ethernet, IPv4/IPv6, the Geneve message includes metadata. 80 Operations, Administration, and Maintenance (OAM) protocols support 81 fault management and performance monitoring functions necessary for 82 comprehensive network operation. Active OAM protocols, as defined in 83 [RFC7799], use specially constructed packets that are injected into 84 the network. To ensure that a performance metric or a detected 85 failure are related to a particular Geneve flow, it is critical that 86 these OAM test packets share fate with overlay data packets for that 87 flow when traversing the underlay network. 88 89 A set of general requirements for active OAM protocols in the Geneve 90 overlay network is listed in Section 2. 
IP encapsulation conforms to 91 these requirements and is a suitable encapsulation of active OAM 92 protocols in a Geneve overlay network. Active OAM in a Geneve 93 overlay network are exchanged between two Geneve tunnel endpoints, 94 which may be an NVE (Network Virtualization Edge) or another device 95 acting as a Geneve tunnel endpoint. For simplicity, an NVE is used 96 to represent the Geneve tunnel endpoint. Please refer to [RFC7365] 97 and [RFC8014] for detailed definitions and descriptions of an NVE. 98 The IP encapsulation of Geneve OAM defined in this document applies 99 to an overlay service by introducing a Management Virtual Network 100 Identifier (VNI) that could be used in combination with various 101 values of the Protocol Type field in the Geneve header, i.e., 102 Ethertypes for IPv4 or IPv6. The analysis and definition of other 103 types of OAM encapsulation in Geneve are outside the scope of this 104 document. GV> What is unclear to me is how this aligns with ICMP, assuming active OAM is ICMP? GV> idnits rewrite: " Geneve [RFC8926] is designed to support various scenarios of network virtualization. It encapsulates multiple protocols, such as Ethernet and IPv4/IPv6, and includes metadata within the Geneve message. Operations, Administration, and Maintenance (OAM) protocols provide fault management and performance monitoring functions necessary for comprehensive network operation. Active OAM protocols, as defined in [RFC7799], utilize specially constructed packets injected into the network. To ensure that performance metrics or detected failures are accurately related to a particular Geneve flow, it is critical that these OAM test packets share fate with the overlay data packets of that flow when traversing the underlay network. Section 2 of this document lists the general requirements for active OAM protocols in the Geneve overlay network. 
IP encapsulation meets these requirements and is suitable for encapsulating active OAM protocols within a Geneve overlay network. Active OAM messages in a Geneve overlay network are exchanged between two Geneve tunnel endpoints, which may be a Network Virtualization Edge (NVE) or another device acting as a Geneve tunnel endpoint. For simplicity, this document uses an NVE to represent the Geneve tunnel endpoint. Refer to [RFC7365] and [RFC8014] for detailed definitions and descriptions of an NVE. The IP encapsulation of Geneve OAM defined in this document applies to an overlay service by introducing a Management Virtual Network Identifier (VNI), which can be used in combination with various values of the Protocol Type field in the Geneve header, such as Ethertypes for IPv4 or IPv6. The analysis and definition of other types of OAM encapsulation in Geneve are outside the scope of this document. "

110 * In-band OAM is an active OAM or hybrid OAM method ([RFC7799]) that 111 traverses the same set of links and interfaces receiving the same 112 QoS treatment as the monitored object, i.e., a Geneve tunnel as a 113 whole or a particular tenant flow within given Geneve tunnel.

GV> In a later section (Section 2) this paragraph is used as the reference for the in-band need in Requirement#1. It is a bit unusual for a terminology section to act as a reference for a formal requirement. Maybe that should be documented in a different place.

GV> From a readability perspective, fixing some idnits:
" In-band OAM is an active or hybrid OAM method, as defined in [RFC7799], that traverses the same set of links and interfaces and receives the same Quality of Service (QoS) treatment as the monitored object. In this context, the monitored object refers to either the Geneve tunnel as a whole or a specific tenant flow within a given Geneve tunnel. "

123 1.1.3. 
Acronyms 124 125 Geneve: Generic Network Virtualization Encapsulation 126 127 NVO3: Network Virtualization Overlays 128 129 OAM: Operations, Administration, and Maintenance 130 131 VNI: Virtual Network Identifier

GV> Potentially missing acronyms: NVE, ICMP, ICMPv6

GV> More accurate expansion of NVO3: Network Virtualization over Layer 3

133 2. Active OAM Protocols in Geneve Networks

GV> Would it be justified to rename this header to "Requirements for Active OAM Protocols in Geneve Networks"?

GV> What seems missing from this discussion is why only "Active" is discussed and not "Passive". The draft title is "OAM for use in GENEVE", and that potentially leaves room for both active and passive OAM. Is it possible to expand a little bit on this?

146 Requirement#1: Geneve OAM test packets MUST share the fate with 147 data traffic of the monitored Geneve tunnel, i.e., be in-band 148 (Section 1.1.1) with the monitored traffic, follow the same 149 overlay and transport path as packets with data payload, in the 150 forward direction, i.e. from ingress toward egress endpoint(s) of 151 the OAM test.

GV> Rewrite correcting idnits:
" Requirement 1: Geneve OAM test packets MUST share the same fate as the data traffic of the monitored Geneve tunnel. Specifically, the OAM test packets MUST be in-band (see Section 1.1.1) with the monitored traffic and follow the same overlay and transport path as packets carrying data payloads in the forward direction-from the ingress toward the egress endpoint(s) of the OAM test. "

153 An OAM protocol MAY be used to monitor the particular Geneve tunnel 154 as a whole. In that case, test packets could be in-band relative to 155 a sub-set of tenant flows transported over the Geneve tunnel. If the 156 goal is to monitor the condition experienced by the flow of a 157 particular tenant, the test packets MUST be in-band with that 158 specific flow in the Geneve tunnel. Both scenarios are discussed in 159 detail in Section 2.1. 

GV> What does "Geneve tunnel as a whole" mean exactly? Can this be accurately described or referenced? Would the following describe what was intended:
" An OAM protocol MAY be employed to monitor an entire Geneve tunnel. In this case, test packets could be in-band relative to a subset of tenant flows transported over the Geneve tunnel. If the goal is to monitor the conditions experienced by the flow of a particular tenant, the test packets MUST be in-band with that specific flow within the Geneve tunnel. Both scenarios are discussed in detail in Section 2.1. "

161 Requirement#2: The encapsulation of OAM control messages and data 162 packets in the underlay network MUST be indistinguishable from 163 each other from an underlay network IP forwarding point of view. 164 165 Requirement#3: The presence of an OAM control message in the 166 Geneve packet MUST be unambiguously identifiable to Geneve 167 functionality, e.g., at endpoints of Geneve tunnels. 168 169 Requirement#4: OAM test packets MUST NOT be forwarded to a tenant 170 system.

GV> Fixing small idnits:
" Requirement 2: The encapsulation of OAM control messages and data packets in the underlay network MUST be indistinguishable from each other from the underlay network IP forwarding point of view.
Requirement 3: The presence of an OAM control message in a Geneve packet MUST be unambiguously identifiable to Geneve functionality, such as at endpoints of Geneve tunnels.
Requirement 4: OAM test packets MUST NOT be forwarded to a tenant system. "

172 A test packet generated by an active OAM protocol, either for a 173 defect detection or performance measurement, according to 174 Requirement#1, MUST be in-band (Section 1.1.1) with the tunnel or 175 data flow being monitored. In an environment where multiple paths 176 through the domain are available, underlay transport nodes can be 177 programmed to use characteristic information to balance the load 178 across known paths. 
It is essential that test packets follow the 179 same route, i.e., traverses the same set of nodes and links, as a 180 data packet of the monitored flow. Thus, the following requirement 181 to support OAM packet fate-sharing with the data flow: 182 183 Requirement#5: It MUST be possible to express entropy for underlay 184 Equal Cost Multipath in the Geneve encapsulation of OAM packets. GV> I am not sure why here also the (Section 1.1.1) is referenced. This section is a terminology section, It looks unusual to be a justification for a formal procedure. GV> idnits fixing: " A test packet generated by an active OAM protocol, whether for defect detection or performance measurement, MUST be in-band with the tunnel or data flow being monitored, as specified in Requirement 1. In environments where multiple paths through the domain are available, underlay transport nodes can be programmed to use characteristic information to balance the load across known paths. It is essential that test packets follow the same route-that is, traverse the same set of nodes and links-as a data packet of the monitored flow. Therefore, the following requirement supports OAM packet fate-sharing with the data flow: Requirement 5: It MUST be possible to express entropy for underlay Equal-Cost Multipath in the Geneve encapsulation of OAM packets. " 186 2.1. Defect Detection and Troubleshooting in Geneve Network with Active 187 OAM 188 189 This section considers two scenarios where active OAM is used to 190 detect and localize defects in a Geneve network. Figure 1 presents 191 an example of a Geneve domain. 192 193 +--------+ +--------+ 194 | Tenant +--+ +----| Tenant | 195 | VNI 28 | | | | VNI 35 | 196 +--------+ | ................ | +--------+ 197 | +----+ . . +----+ | 198 | | NVE|--. .--| NVE| | 199 +--| A | . . | B |---+ 200 +----+ . . +----+ 201 / . . 202 / . Geneve . 203 +--------+ / . Network . 204 | Tenant +--+ . . 205 | VNI 35 | . . 206 +--------+ ................ 
207 | 208 +----+ 209 | NVE| 210 | C | 211 +----+ 212 | 213 | 214 ===================== 215 | | 216 +--------+ +--------+ 217 | Tenant | | Tenant | 218 | VNI 28 | | VNI 35 | 219 +--------+ +--------+ 220 221 Figure 1: An example of a Geneve domain 222 223 In the first case, consider when a communication problem between 224 Network Virtualization Edge (NVE) device A and NVE C exists. Upon 225 the investigation, the operator discovers that the forwarding in the 226 IP underlay network is working accordingly. Still, the Geneve 227 connection is unstable for all NVE A and NVE C tenants. Detection, 228 troubleshooting, and localization of the problem can be done 229 regardless of the VNI value. 230 231 In the second case, traffic on VNI 35 between NVE A and NVE B has no 232 problems, as on VNI 28 between NVE A and NVE C. But traffic on VNI 233 35 between NVE A and NVE C experiences problems, for example, 234 excessive packet loss. 235 236 The first case can be detected and investigated using any VNI value, 237 whether it connects tenant systems or not; however, to conform to 238 Requirement#4 (Section 2) OAM test packets SHOULD be transmitted on a 239 VNI that doesn't have any tenants. Such a Geneve tunnel is dedicated 240 to carrying only control and management data between the tunnel 241 endpoints, hence it is referred to as a Geneve control channel and 242 that VNI is referred to as the Management VNI. A configured VNI MAY 243 be used to identify the control channel, but it is RECOMMENDED that 244 the default value 1 be used as the Management VNI. Encapsulation of 245 test packets using the Management VNI is discussed in Section 2.2. 246 247 The control channel of a Geneve tunnel MUST NOT carry tenant data. 248 As no tenants are connected using the control channel, a system that 249 supports this specification, MUST NOT forward a packet received over 250 the control channel to any tenant. 
A packet received over the 251 control channel MUST be forwarded if and only if it is sent onto the 252 control channel of the concatenated Geneve tunnel. Else, it MUST be 253 terminated locally. The Management VNI SHOULD be terminated on the 254 tenant-facing side of the Geneve encapsulation/decapsulation 255 functionality, not the DC-network-facing side (per definitions in 256 Section 4 of [RFC8014]) so that Geneve encap/decap functionality is 257 included in its scope. This approach causes an active OAM packet, 258 e.g., an ICMP echo request, to be decapsulated in the same fashion as 259 any other received Geneve packet. In this example, the resulting 260 ICMP packet is handed to NVE's local management functionality for the 261 processing which generates an ICMP echo reply. The ICMP echo reply 262 is encapsulated in Geneve as specified in Section 2.2. for forwarding 263 back to the NVE that sent the echo request. One advantage of this 264 approach is that a repeated ping test could detect an intermittent 265 problem in Geneve encap/decap hardware, which would not be tested if 266 the Management VNI were handled as a "special case" at the DC- 267 network-facing interface. 268 269 The second case is when a test packet is transmitted using the VNI 270 value associated with the monitored service flow. By doing that, the 271 test packet experiences network treatment as the tenant's packets. 272 Details of that use case are outside the scope of this specification.

GV> Clarification: Are the two scenarios discussed here examples, or are they formal procedures?

GV> The text usually references the requirements together with their sections. Is that needed? There are just a few requirements. Maybe, at least within this document, they could simply be referred to as requirement 1, 2, 3 and 4?

GV> I got confused by reading "but it is RECOMMENDED that the default value of 1 be used as the Management VNI". Why is that? Is it specified in Geneve that "1" is the recommended control channel? 
If yes, maybe add a reference.

GV> It is written that "The control channel of a Geneve tunnel MUST NOT carry tenant data." While this seems rather intuitive, is there a normative reference? Or is this a procedure specified by this document above and beyond the Geneve encapsulation specification?

GV> Note that RFC8014 uses the terminology "Facing the Tenant System" and "Facing the Data-Center Network". The current text in this document uses different terminology, i.e., "data-center-network-facing" and "tenant-facing side".

GV> It is written "so that Geneve encap/decap functionality is included in its scope". I am not sure what this means exactly. It could be that I am confused about the paragraph. Is the complete section an example, or is it intended as a formal procedure? I am not sure what the 'its' in "included in its scope" refers to.

GV> Lines 269-272 detail a use case that is out of scope. What is the use of this paragraph within this document? I am sure we can think of many more potential use cases that are outside the scope of this document.

274 2.2. OAM Encapsulation in Geneve 275 276 Active OAM over a Management VNI in the Geneve network uses an IP 277 encapsulation. Protocols such as BFD [RFC5880] and STAMP [RFC8762] 278 use UDP transport. The destination UDP port number in the inner UDP 279 header (Figure 2) identifies the OAM protocol. This approach is 280 well-known and has been used, for example, in MPLS networks 281 [RFC8029]. To use IP encapsulation for an active OAM protocol, the 282 Protocol Type field of the Geneve header MUST be set to the IPv4 283 (0x0800) or IPv6 (0x86DD) value.

GV> This section could use some detail on the claims about UDP vs BFD vs STAMP vs active OAM. For example, is the assumption that active OAM is different from STAMP or BFD, or is it the same?

GV> Maybe Figure 2 from RFC9521 could be added to detail BFD over Geneve?
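
To make the Section 2.2 text quoted above more concrete, here is a minimal, non-authoritative sketch of the 8-byte Geneve base header (field layout per RFC 8926) with the Protocol Type set to IPv4 or IPv6 and the VNI set to the Management VNI. The function names, the choice to leave the O and C bits clear, and the absence of options are assumptions made purely for illustration; the value 1 is the default that the quoted draft text recommends.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
MANAGEMENT_VNI = 1  # default value RECOMMENDED in the quoted draft text


def build_geneve_header(vni: int, protocol_type: int) -> bytes:
    """Build a Geneve base header with no options (hypothetical helper).

    Assumption: version 0, Opt Len 0, and the O and C flag bits left clear.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    ver_optlen = 0          # 2-bit version, 6-bit option length
    flags = 0               # O bit, C bit, 6 reserved bits
    vni_and_reserved = vni << 8  # 24-bit VNI followed by 8 reserved bits
    return struct.pack("!BBHI", ver_optlen, flags, protocol_type, vni_and_reserved)


def parse_geneve_header(data: bytes) -> dict:
    """Parse the first 8 bytes of a Geneve packet (hypothetical helper)."""
    ver_optlen, flags, protocol_type, vni_and_reserved = struct.unpack("!BBHI", data[:8])
    return {
        "version": ver_optlen >> 6,
        "opt_len": ver_optlen & 0x3F,
        "protocol_type": protocol_type,
        "vni": vni_and_reserved >> 8,
    }


def is_management_oam(data: bytes, management_vni: int = MANAGEMENT_VNI) -> bool:
    """True if the header carries IP on the Management VNI (illustrative check only)."""
    hdr = parse_geneve_header(data)
    return hdr["vni"] == management_vni and hdr["protocol_type"] in (
        ETHERTYPE_IPV4,
        ETHERTYPE_IPV6,
    )


header = build_geneve_header(MANAGEMENT_VNI, ETHERTYPE_IPV4)
print(header.hex(), is_management_oam(header))
```

A check like `is_management_oam` only illustrates how a tunnel endpoint might tell Management VNI traffic apart from tenant traffic; it does not say which OAM protocol (ICMP, BFD, STAMP) is carried, which is the point the comment above asks the draft to clarify.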

318 Destination IP: The IP address MUST be set to the loopback address 319 127.0.0.1/32 for IPv4, or the loopback address ::1/128 for IPv6 320 [RFC4291].

GV> RFC6890 specifies 127.0.0.0/8 as the loopback range. Why constrain to the exact /32? Would it not be more logical to indicate that the address MUST be within the loopback prefix 127.0.0.0/8 and SHOULD be 127.0.0.1/32?

324 TTL or Hop Limit: MUST be set to 255 per [RFC5082].

GV> Is the TTL/hop limit not set to something different when iptrace is used?

GV> I am confused about what security GTSM is supposed to provide here.

326 3. Echo Request and Echo Reply in Geneve Tunnel 327 328 ICMP and ICMPv6 ([RFC0792] and [RFC4443] respectively) provide 329 required on-demand defect detection and failure localization. ICMP 330 control messages immediately follow the inner IP header encapsulated 331 in Geneve. ICMP extensions for Geneve networks use mechanisms 332 defined in [RFC4884].

GV> In summary, is the intent of this document to provide a formal procedure to transport ICMP echo request/reply over a Geneve tunnel?

Many thanks again for this document,

Kind Regards,
Gunter Van de Velde, RTG AD