Hi Greg,

Thanks for the fast action. Appreciated.

Please see my follow-up on the remaining open items, tagged GV2>


Once there is an updated draft I’ll spin through the document again and see 
where we land.

Thanks again for the fast action Greg.
Be well,
G/

From: Greg Mirsky <gregimir...@gmail.com>
Sent: Tuesday, November 26, 2024 6:53 PM
To: Gunter van de Velde (Nokia) <gunter.van_de_ve...@nokia.com>
Cc: draft-ietf-nvo3-geneve-...@ietf.org; nvo3-cha...@ietf.org; nvo3@ietf.org
Subject: [nvo3] Re: [Shepherding AD review] review of 
draft-ietf-nvo3-geneve-oam-12



Hi Gunter,
thank you for your thorough review and helpful suggestions. Please find my 
notes below tagged GIM>>. I hope that I've not missed anything. I attached the 
working version of the draft and the diff that highlights the updates in it.

Regards,
Greg

On Mon, Nov 25, 2024 at 8:52 AM Gunter van de Velde (Nokia) 
<gunter.van_de_ve...@nokia.com> wrote:
# Gunter Van de Velde, RTG AD, comments for draft-ietf-nvo3-geneve-oam-12

# the referenced line numbers are derived from the idnits tool:
https://author-tools.ietf.org/api/idnits?url=https://www.ietf.org/archive/id/draft-ietf-nvo3-geneve-oam-12.txt

# Many thanks for this write-up. Having proper OAM tools for Geneve is useful 
and operationally highly important. Thank you for this work.

# Many thanks for the shepherd write-up from Matthew Bocci and the directorate 
reviews from Stig Venaas, Paul Kyzivat and Himanshu Shah

#GENERIC COMMENTS
#================

## The document deals with "active OAM", while the title of the draft just 
mentions "OAM". Maybe the draft title can be corrected?
GIM>> Makes sense to me. Would the following updates of long and short titles 
be helpful:
OLD TEXT:
OAM for use in GENEVE
NEW TEXT:
Active OAM for use in GENEVE

OLD TEXT:
OAM in GENEVE
NEW TEXT:
Active OAM in GENEVE

Also, since the draft's full title is still relatively short, could it be used 
as the short version?

GV2> I suspect that would be fine for the Datatracker tools


## When reading the document I got confused by the apparent difference 
between active OAM for Geneve and STAMP or BFD. What are the differences, and 
what is the reason for the existence of the current document? Could a small 
section be inserted to explain where the specific value of Active OAM lies and 
to compare it with other mechanisms?
GIM>> Now I am a bit confused. STAMP and BFD are examples of active OAM methods 
for observing networks. Could you help me by pointing out the text that left 
that impression? I think some editorial touches are needed to clarify that 
STAMP and BFD, echo request/reply, a.k.a. ping, are examples of active OAM 
methods.

GV2> I think I found my source of confusion, leading to your confusion 😊. I am 
not very familiar with the exact OAM science. Without the years and years of 
OAM experience that you have, I did not know what exactly active OAM 
represents. I was indeed missing the understanding that STAMP and BFD, echo 
request/reply, a.k.a. ping, are examples of active OAM methods. When I was 
reading the text, my assumption was that there is Active OAM, separately there 
is BFD, and separately there is STAMP, which led me to believe that active OAM 
was merely echo request/reply. If this is clarified somewhere at the start of 
the document, it will keep readers from making the same assumption.


## Is active OAM a novel way to say ICMP ping (and ICMP trace)? Is there 
anything additional? Maybe this needs to be explicitly mentioned somewhere 
early in this draft, with references added to the technologies expected to be 
utilized for active OAM. I also found that it is at the very end (i.e. section 
3) where ICMP and ICMPv6 are discussed for their impact on Geneve. This seems 
rather late in the document. If the Active OAM is solely about ICMP(v6) then 
maybe a helicopter perspective earlier in the document would help readers 
understand the objectives better?
GIM>> RFC 7799 defined active measurement methods as
   o  Active Methods generate packet streams.  Commonly, the packet
      stream of interest is generated as the basis of measurement.
The common understanding of Active OAM is as follows:
Active OAM uses specifically constructed packets to detect, troubleshoot,
localize defects, and measure performance.
I've re-arranged sections in the following manner:

GV2> I believe my comment was based upon an unclear understanding of what 
active OAM is: I assumed that BFD and STAMP were something different and that 
only ICMP echo request/reply was active OAM. When that is clarified as 
discussed earlier, then that resolves the issue, I think.

   2.  The Applicability of Active OAM Protocols in Geneve
           Networks  . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.1.  Requirements for Active OAM Protocols in Geneve
           Networks  . . . . . . . . . . . . . . . . . . . . . . . .   4
     2.2.  Defect Detection and Troubleshooting in Geneve Network with
           Active OAM  . . . . . . . . . . . . . . . . . . . . . . .   5
       2.2.1.  Echo Request and Echo Reply in Geneve Tunnel  . . . .   7
     2.3.  OAM Encapsulation in Geneve . . . . . . . . . . . . . . .   7
Does that make it clearer?

GV2> difficult to say at this stage, but it seems on the right track



## As I read the document for the first time, I was unsure whether some 
sections are formal procedures or informational sections. This could have been 
due to my lesser experience with OAM protocols, so judge accordingly. Maybe 
making it more obvious for generalists like me could make the document easier 
to process?

#DETAILED COMMENTS
#=================

15      Abstract
16
17         This document lists a set of general requirements for active OAM
18         protocols in the Geneve overlay network.  Based on these
19         requirements, the IP encapsulation of active Operations,
20         Administration, and Maintenance protocols in the Geneve protocol is
21         defined.  Considerations for using ICMP and UDP-based protocols are
22         discussed.

GV> What about the following alternative abstract:

"
Geneve (Generic Network Virtualization Encapsulation) is a flexible and 
extensible network virtualization overlay protocol designed to encapsulate 
network packets for transport across underlying physical networks. This 
document specifies the requirements and provides a framework for Operations, 
Administration, and Maintenance (OAM) in Geneve networks. It outlines the OAM 
functions necessary to monitor, diagnose, and troubleshoot Geneve overlay 
networks to ensure their proper operation and performance. The document aims to 
guide the implementation of OAM mechanisms within the Geneve protocol to 
support network operators in maintaining reliable and efficient virtualized 
network environments.
"
GIM>> Many thanks for the suggested text.

75      1.  Introduction
76
77         Geneve [RFC8926] is intended to support various scenarios of network
78         virtualization.  In addition to carrying multiple protocols, e.g.,
79         Ethernet, IPv4/IPv6, the Geneve message includes metadata.
80         Operations, Administration, and Maintenance (OAM) protocols support
81         fault management and performance monitoring functions necessary for
82         comprehensive network operation.  Active OAM protocols, as defined in
83         [RFC7799], use specially constructed packets that are injected into
84         the network.  To ensure that a performance metric or a detected
85         failure are related to a particular Geneve flow, it is critical that
86         these OAM test packets share fate with overlay data packets for that
87         flow when traversing the underlay network.
88
89         A set of general requirements for active OAM protocols in the Geneve
90         overlay network is listed in Section 2.  IP encapsulation conforms to
91         these requirements and is a suitable encapsulation of active OAM
92         protocols in a Geneve overlay network.  Active OAM in a Geneve
93         overlay network are exchanged between two Geneve tunnel endpoints,
94         which may be an NVE (Network Virtualization Edge) or another device
95         acting as a Geneve tunnel endpoint.  For simplicity, an NVE is used
96         to represent the Geneve tunnel endpoint.  Please refer to [RFC7365]
97         and [RFC8014] for detailed definitions and descriptions of an NVE.
98         The IP encapsulation of Geneve OAM defined in this document applies
99         to an overlay service by introducing a Management Virtual Network
100        Identifier (VNI) that could be used in combination with various
101        values of the Protocol Type field in the Geneve header, i.e.,
102        Ethertypes for IPv4 or IPv6.  The analysis and definition of other
103        types of OAM encapsulation in Geneve are outside the scope of this
104        document.

GV> What is unclear to me is how this aligns with ICMP, assuming active OAM is 
ICMP?
GIM>> ICMP packets are specially constructed to detect network defects and 
localize them. Hence, ICMP is an example of active OAM. Would you agree?

GV2> Yes, agree. What I was missing is what was mentioned earlier: that ICMP, 
BFD and STAMP are examples of active OAM.


GV> idnits rewrite:

"
Geneve [RFC8926] is designed to support various scenarios of network 
virtualization. It encapsulates multiple protocols, such as Ethernet and 
IPv4/IPv6, and includes metadata within the Geneve message.

Operations, Administration, and Maintenance (OAM) protocols provide fault 
management and performance monitoring functions necessary for comprehensive 
network operation. Active OAM protocols, as defined in [RFC7799], utilize 
specially constructed packets injected into the network. To ensure that 
performance metrics or detected failures are accurately related to a particular 
Geneve flow, it is critical that these OAM test packets share fate with the 
overlay data packets of that flow when traversing the underlay network.

Section 2 of this document lists the general requirements for active OAM 
protocols in the Geneve overlay network. IP encapsulation meets these 
requirements and is suitable for encapsulating active OAM protocols within a 
Geneve overlay network. Active OAM messages in a Geneve overlay network are 
exchanged between two Geneve tunnel endpoints, which may be a Network 
Virtualization Edge (NVE) or another device acting as a Geneve tunnel endpoint. 
For simplicity, this document uses an NVE to represent the Geneve tunnel 
endpoint. Refer to [RFC7365] and [RFC8014] for detailed definitions and 
descriptions of an NVE.

The IP encapsulation of Geneve OAM defined in this document applies to an 
overlay service by introducing a Management Virtual Network Identifier (VNI), 
which can be used in combination with various values of the Protocol Type field 
in the Geneve header, such as Ethertypes for IPv4 or IPv6. The analysis and 
definition of other types of OAM encapsulation in Geneve are outside the scope 
of this document.
"
GIM>> Excellent, thank you!

110        *  In-band OAM is an active OAM or hybrid OAM method ([RFC7799]) that
111           traverses the same set of links and interfaces receiving the same
112           QoS treatment as the monitored object, i.e., a Geneve tunnel as a
113           whole or a particular tenant flow within given Geneve tunnel.

GV> In a later section (section 2) this paragraph is used as a reference for 
the in-band need in REQUIREMENT#1. It is a bit unusual that a terminology 
section acts as a reference for a formal requirement. Maybe that should be 
documented in a different place.
GIM>> The intention was to establish the interpretation of the "in-band" term. 
Would removing the reference to the Terminology section (Section 1.1.1) help?

GV> From a readability perspective, fixing some idnits:

"
In-band OAM is an active or hybrid OAM method, as defined in [RFC7799], that 
traverses the same set of links and interfaces and receives the same Quality of 
Service (QoS) treatment as the monitored object. In this context, the monitored 
object refers to either the Geneve tunnel as a whole or a specific tenant flow 
within a given Geneve tunnel.
"
GIM>> I've noticed that that definition of "in-band" is the only place in the 
document where the QoS abbreviation is used. Hence, I don't use it in the 
update.

GV2> ok


123     1.1.3.  Acronyms
124
125        Geneve: Generic Network Virtualization Encapsulation
126
127        NVO3: Network Virtualization Overlays
128
129        OAM: Operations, Administration, and Maintenance
130
131        VNI: Virtual Network Identifier

GV> potentially missing acronyms: VNE, ICMP, ICMPv6
GIM>> I couldn't find VNE being used in the draft. Have I missed it?

GV2> My mistake. It was a fat finger typo. It should have read NVE (instead 
of the wrong VNE)

GIM>> ICMP seems to be on the RFC Editor's list of well-known 
abbreviations <https://www.rfc-editor.org/rpc/wiki/doku.php?id=abbrev_list>.

GV2> ok. I did not check that list. If it is on there, then all is good

GV> more accurate NVO3: Network Virtualization over Layer 3
GIM>> Thank you, I fixed it.

133     2.  Active OAM Protocols in Geneve Networks

GV> Would it be justified to rename this header to "Requirements for Active OAM 
Protocols in Geneve Networks"?
GIM>> I agree. Renamed.

GV> What seems missing from this discussion is why only "Active" is discussed 
and not "Passive". The draft title is "OAM for use in GENEVE" and that leaves 
potentially room for both active and passive OAM. Is it possible to expand a 
little bit on this?
GIM>> According to the definition in RFC 7799 that we use in this document, 
passive OAM methods observe a network flow without altering or impacting that 
flow. SNMP queries and NETCONF notifications that report, for example, counter 
values are examples of passive OAM methods. I cannot find any requirement 
specific to using a passive OAM method in Geneve other than "don't impact the 
monitored Geneve tunnel".

GV2> As discussed earlier, I was confused because I had missed that ICMP, BFD 
and STAMP are examples of active OAM. I will need to read the draft again after 
the updates with that aspect in mind.


146           Requirement#1: Geneve OAM test packets MUST share the fate with
147           data traffic of the monitored Geneve tunnel, i.e., be in-band
148           (Section 1.1.1) with the monitored traffic, follow the same
149           overlay and transport path as packets with data payload, in the
150           forward direction, i.e. from ingress toward egress endpoint(s) of
151           the OAM test.

GV> rewrite correcting idnits:

"
Requirement 1: Geneve OAM test packets MUST share the same fate as the data 
traffic of the monitored Geneve tunnel. Specifically, the OAM test packets MUST 
be in-band (see Section 1.1.1) with the monitored traffic and follow the same 
overlay and transport path as packets carrying data payloads in the forward 
direction, from the ingress toward the egress endpoint(s) of the OAM test.
"

GV2> I assume that the "(see Section 1.1.1)" reference will be removed here 
as well?

153        An OAM protocol MAY be used to monitor the particular Geneve tunnel
154        as a whole.  In that case, test packets could be in-band relative to
155        a sub-set of tenant flows transported over the Geneve tunnel.  If the
156        goal is to monitor the condition experienced by the flow of a
157        particular tenant, the test packets MUST be in-band with that
158        specific flow in the Geneve tunnel.  Both scenarios are discussed in
159        detail in Section 2.1.

GV> What does "Geneve tunnel as a whole" mean exactly? Can this be accurately 
described or referenced?
Would the following describe what was intended to be outlined:

"
An OAM protocol MAY be employed to monitor an entire Geneve tunnel. In this 
case, test packets could be in-band relative to a subset of tenant flows 
transported over the Geneve tunnel. If the goal is to monitor the conditions 
experienced by the flow of a particular tenant, the test packets MUST be 
in-band with that specific flow within the Geneve tunnel. Both scenarios are 
discussed in detail in Section 2.1.
"
GIM>> Thank you for your thoughtful consideration of this paragraph. The 
proposed text is much clearer and perfectly conveys our intention.

161           Requirement#2: The encapsulation of OAM control messages and data
162           packets in the underlay network MUST be indistinguishable from
163           each other from an underlay network IP forwarding point of view.
164
165           Requirement#3: The presence of an OAM control message in the
166           Geneve packet MUST be unambiguously identifiable to Geneve
167           functionality, e.g., at endpoints of Geneve tunnels.
168
169           Requirement#4: OAM test packets MUST NOT be forwarded to a tenant
170           system.

GV> Fixing small idnits:

"
Requirement 2: The encapsulation of OAM control messages and data packets in 
the underlay network MUST be indistinguishable from each other from the 
underlay network IP forwarding point of view.

Requirement 3: The presence of an OAM control message in a Geneve packet MUST 
be unambiguously identifiable to Geneve functionality, such as at endpoints of 
Geneve tunnels.

Requirement 4: OAM test packets MUST NOT be forwarded to a tenant system.
"
GIM>> Many thanks for these suggestions; updated accordingly.

172        A test packet generated by an active OAM protocol, either for a
173        defect detection or performance measurement, according to
174        Requirement#1, MUST be in-band (Section 1.1.1) with the tunnel or
175        data flow being monitored.  In an environment where multiple paths
176        through the domain are available, underlay transport nodes can be
177        programmed to use characteristic information to balance the load
178        across known paths.  It is essential that test packets follow the
179        same route, i.e., traverses the same set of nodes and links, as a
180        data packet of the monitored flow.  Thus, the following requirement
181        to support OAM packet fate-sharing with the data flow:
182
183           Requirement#5: It MUST be possible to express entropy for underlay
184           Equal Cost Multipath in the Geneve encapsulation of OAM packets.

GV> I am not sure why (Section 1.1.1) is referenced here as well. That section 
is a terminology section, and it looks unusual for it to be the justification 
for a formal procedure.
GIM>> As noted earlier, the reference to the Terminology section is 
unnecessary.  I agree with removing it.


GV> idnits fixing:

"
A test packet generated by an active OAM protocol, whether for defect detection 
or performance measurement, MUST be in-band with the tunnel or data flow being 
monitored, as specified in Requirement 1. In environments where multiple paths 
through the domain are available, underlay transport nodes can be programmed to 
use characteristic information to balance the load across known paths. It is 
essential that test packets follow the same route, that is, traverse the same 
set of nodes and links, as a data packet of the monitored flow. Therefore, the 
following requirement supports OAM packet fate-sharing with the data flow:

Requirement 5: It MUST be possible to express entropy for underlay Equal-Cost 
Multipath in the Geneve encapsulation of OAM packets.
"

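As a rough illustration of Requirement 5, the sketch below assumes the entropy 
is carried in the outer UDP source port, which is how Geneve [RFC8926] treats 
data packets, and that the OAM test packet reuses the hash inputs of the 
monitored flow so that the underlay ECMP decision matches. The CRC32 hash, the 
function name, and the example addresses are illustrative assumptions and are 
not taken from the draft.

```python
import zlib


def entropy_source_port(src_ip, dst_ip, proto, src_port, dst_port):
    """Derive a 16-bit outer UDP source port from the inner flow's 5-tuple.

    CRC32 is an illustrative choice of hash, not one mandated by the draft.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) & 0xFFFF


# Hypothetical tenant flow that the OAM test should share fate with.
monitored_flow = ("192.0.2.10", "198.51.100.20", 6, 49500, 443)

data_port = entropy_source_port(*monitored_flow)
# The OAM test packet reuses the monitored flow's hash inputs, so an underlay
# node hashing on the outer headers sees the same entropy value for both.
oam_port = entropy_source_port(*monitored_flow)
```

Because the same inputs always give the same port, the test packet and the 
data packets of the monitored flow present identical entropy to the underlay.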
186     2.1.  Defect Detection and Troubleshooting in Geneve Network with Active
187           OAM
188
189        This section considers two scenarios where active OAM is used to
190        detect and localize defects in a Geneve network.  Figure 1 presents
191        an example of a Geneve domain.
192
193            +--------+                                             +--------+
194            | Tenant +--+                                     +----| Tenant |
195            | VNI 28 |  |                                     |    | VNI 35 |
196            +--------+  |          ................           |    +--------+
197                        |  +----+  .              .  +----+   |
198                        |  | NVE|--.              .--| NVE|   |
199                        +--| A  |  .              .  | B  |---+
200                           +----+  .              .  +----+
201                           /       .              .
202                          /        .     Geneve   .
203            +--------+   /         .    Network   .
204            | Tenant +--+          .              .
205            | VNI 35 |             .              .
206            +--------+             ................
207                                          |
208                                        +----+
209                                        | NVE|
210                                        | C  |
211                                        +----+
212                                          |
213                                          |
214                                =====================
215                                  |               |
216                              +--------+      +--------+
217                              | Tenant |      | Tenant |
218                              | VNI 28 |      | VNI 35 |
219                              +--------+      +--------+
220
221                    Figure 1: An example of a Geneve domain
222
223        In the first case, consider when a communication problem between
224        Network Virtualization Edge (NVE) device A and NVE C exists.  Upon
225        the investigation, the operator discovers that the forwarding in the
226        IP underlay network is working accordingly.  Still, the Geneve
227        connection is unstable for all NVE A and NVE C tenants.  Detection,
228        troubleshooting, and localization of the problem can be done
229        regardless of the VNI value.
230
231        In the second case, traffic on VNI 35 between NVE A and NVE B has no
232        problems, as on VNI 28 between NVE A and NVE C.  But traffic on VNI
233        35 between NVE A and NVE C experiences problems, for example,
234        excessive packet loss.
235
236        The first case can be detected and investigated using any VNI value,
237        whether it connects tenant systems or not; however, to conform to
238        Requirement#4 (Section 2) OAM test packets SHOULD be transmitted on a
239        VNI that doesn't have any tenants.  Such a Geneve tunnel is dedicated
240        to carrying only control and management data between the tunnel
241        endpoints, hence it is referred to as a Geneve control channel and
242        that VNI is referred to as the Management VNI.  A configured VNI MAY
243        be used to identify the control channel, but it is RECOMMENDED that
244        the default value 1 be used as the Management VNI.  Encapsulation of
245        test packets using the Management VNI is discussed in Section 2.2.
246
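To make the Management VNI concrete, here is a small sketch that packs the 
8-byte Geneve base header of [RFC8926] with the default Management VNI 1 and 
the Ethertype for IPv4 or IPv6 as the Protocol Type. The function and constant 
names are hypothetical. Leaving the O and C bits cleared and carrying no 
options is an assumption of this sketch, since the text above does not discuss 
them.

```python
import struct

MANAGEMENT_VNI = 1        # default value recommended in the quoted text
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def geneve_base_header(vni, protocol_type, opt_len_words=0):
    """Pack the fixed 8-byte Geneve header (version 0, no flags set)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= opt_len_words < 2**6:
        raise ValueError("Opt Len must fit in 6 bits")
    first_byte = (0 << 6) | opt_len_words   # Ver (2 bits) | Opt Len (6 bits)
    flags_byte = 0                          # O, C and reserved bits cleared
    # The 24-bit VNI is followed by 8 reserved bits, hence the shift by 8.
    return struct.pack("!BBHI", first_byte, flags_byte, protocol_type, vni << 8)


header = geneve_base_header(MANAGEMENT_VNI, ETHERTYPE_IPV4)
```

An IPv6-encapsulated test packet would differ only in the Protocol Type, i.e. 
`geneve_base_header(MANAGEMENT_VNI, ETHERTYPE_IPV6)`.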
247        The control channel of a Geneve tunnel MUST NOT carry tenant data.
248        As no tenants are connected using the control channel, a system that
249        supports this specification, MUST NOT forward a packet received over
250        the control channel to any tenant.  A packet received over the
251        control channel MUST be forwarded if and only if it is sent onto the
252        control channel of the concatenated Geneve tunnel.  Else, it MUST be
253        terminated locally.  The Management VNI SHOULD be terminated on the
254        tenant-facing side of the Geneve encapsulation/decapsulation
255        functionality, not the DC-network-facing side (per definitions in
256        Section 4 of [RFC8014]) so that Geneve encap/decap functionality is
257        included in its scope.  This approach causes an active OAM packet,
258        e.g., an ICMP echo request, to be decapsulated in the same fashion as
259        any other received Geneve packet.  In this example, the resulting
260        ICMP packet is handed to NVE's local management functionality for the
261        processing which generates an ICMP echo reply.  The ICMP echo reply
262        is encapsulated in Geneve as specified in Section 2.2. for forwarding
263        back to the NVE that sent the echo request.  One advantage of this
264        approach is that a repeated ping test could detect an intermittent
265        problem in Geneve encap/decap hardware, which would not be tested if
266        the Management VNI were handled as a "special case" at the DC-
267        network-facing interface.
268
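The handling rules for the control channel in the paragraph above can be 
sketched as a small decision function. This is only an illustration: the names 
are hypothetical, and how an NVE learns that a packet continues on the control 
channel of a concatenated Geneve tunnel is reduced here to a boolean argument, 
which is an assumption of the sketch.

```python
from enum import Enum

MANAGEMENT_VNI = 1  # default Management VNI from the quoted text


class Action(Enum):
    DELIVER_TO_TENANT = "deliver to tenant"
    FORWARD_ON_CONTROL_CHANNEL = "forward on control channel"
    TERMINATE_LOCALLY = "terminate locally"


def dispatch(vni, continues_on_concatenated_control_channel=False,
             management_vni=MANAGEMENT_VNI):
    """Decide what to do with a decapsulated Geneve packet."""
    if vni != management_vni:
        # Ordinary tenant traffic is outside the control-channel rules.
        return Action.DELIVER_TO_TENANT
    # Control-channel packets are never handed to a tenant.
    if continues_on_concatenated_control_channel:
        return Action.FORWARD_ON_CONTROL_CHANNEL
    return Action.TERMINATE_LOCALLY
```

For example, `dispatch(1)` yields `Action.TERMINATE_LOCALLY`, which 
corresponds to handing an ICMP echo request to the NVE's local management 
functionality.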
269        The second case is when a test packet is transmitted using the VNI
270        value associated with the monitored service flow.  By doing that, the
271        test packet experiences network treatment as the tenant's packets.
272        Details of that use case are outside the scope of this specification