[OPSAWG]Re: draft-ietf-opsawg-rfc5706bis-01 early Iotdir review

Dave Thaler Wed, 11 Mar 2026 08:30:05 -0700

Thanks Carlos for the thoughtful reply.  I look forward to seeing an updated 
version!


Dave

> -----Original Message-----
> From: Carlos Pignataro <[email protected]>
> Sent: Wednesday, March 11, 2026 7:05 AM
> To: Dave Thaler <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]
> Subject: Re: draft-ietf-opsawg-rfc5706bis-01 early Iotdir review
> 
> Hi, Dave,
> 
> Thank you for the thorough and constructive IoTdir early review. The overall
> "On the Right Track" label is appreciated, and your comments are well-targeted.
> We've reviewed each point carefully. Responses inline below.
> 
> 
> > On Jan 23, 2026, at 7:01 PM, Dave Thaler via Datatracker <[email protected]> wrote:
> >
> > Document: draft-ietf-opsawg-rfc5706bis
> > Title: Guidelines for Considering Operations and Management in IETF
> > Specifications
> > Reviewer: Dave Thaler
> > Review result: On the Right Track
> >
> > I am the assigned iotdir reviewer for this draft. For background on
> > iotdir, please see the [FAQ](https://wiki.ietf.org/en/group/iotdir).
> > Please resolve these comments along with any other comments you may
> > receive.
> >
> > A marked-up PDF copy with my comments inline is at
> > https://1drv.ms/b/c/dc2b364f3f06fea8/IQBuO9rPPwGxRLZZi9kQSJncAT_Aktrn9MuIurcZp88NRjs?e=2boyoh
> 
> Thanks for sharing this. I am applying all editorials first, and addressing
> the main comments below.
> 
> >
> > I found the checklist of key questions in Appendix A to be well
> > written, useful, and widely applicable to areas across the IETF,
> > including IoT protocols.  Similarly the content in the body is well-written
> > and useful.
> 
> Great to see, as wide applicability across IETF areas and usefulness are two
> of our explicit goals.
> 
> >
> > I do have a bunch of comments on the body of the document however to
> > make it more widely applicable to areas across the IETF. A summary of
> > my main technical feedback follows, and many other minor editorial
> > points (which I don't expect should need WG discussion) can be found
> > in my marked up PDF copy.
> >
> > 1) Applicability of requirement: Three different places in the document make
> >   three different, contradictory, statements about which RFCs would be
> >   required to have this section.
> >      a) abstract says all "RFCs in the IETF Stream"
> >      b) 3.1 says all IETF RFCs "that document a technical specification"
> >      c) Appendix B says "all new Standards Track RFCs"
> >   I think section 3.1 is the best and the abstract and Appendix B should
> >   both be changed.
> 
> Indeed. Agreed. Section 3.1 is the most precise formulation.
> 
> >
> > 2) Architecture RFCs: Most places in the document are consistent in saying
> >   documents that specify a "New Protocol" or "Protocol Extension", but one
> >   place in section 3.1 throws in "or an architecture".  Generally speaking,
> >   an implementation does not claim conformance to an architecture/framework
> >   document, and so depending on how it is written and the content it may not
> >   be considered a “technical specification”, just a roadmap document. In
> >   that case, the previous paragraph would not require it in such an
> >   architecture document. Furthermore, other parts of the document, like the
> >   abstract, focus on requiring it in New Protocols and Protocol
> >   Extensions.
> >   As such, I’d remove “or an architecture”. It might be ok in the preceding
> >   paragraph to clarify that “anything an implementation would claim
> >   conformance to is considered a technical specification”, and in my view
> >   that would cover it.
> 
> 
> Partly agreed. We see the intent of including architecture documents, as they
> can establish operational patterns that downstream protocol specifications
> must respect. However, your point about conformance is well-taken. We will
> revise the text to clarify that architecture documents are expected to include
> the section only where they introduce new operational considerations with
> downstream normative implications, and that the exemption in Section 3.2
> applies otherwise. We will also remove "or an architecture" from the second
> paragraph of 3.1.
> 
> 
> >
> > 3) Requirements around individual draft -00 submissions: Section 3.1 says
> >   "early revisions of Internet-Drafts are expected to include an
> >   Operational Considerations section".  I'd find it a huge process hurdle
> >   to “expect” all -00 versions of individual drafts to have such a section
> >   as that would discourage many new entrants from participating in the IETF.
> >   I might say "encouraged" instead of "expected".
> 
> Agreed. We will change "expected" to "highly encouraged" for -00 revisions.
> The intent is to establish a good-practice norm, not create a barrier to
> participation.
> 
> >
> > 4) Operator: I found much of the document, as currently worded, to be way
> >   too _network_ operator focused, for a document that creates a requirement for
> >   all areas, including IoT. Some places say "network operator" and other
> >   places just say "operator".  If you widen the term "operator" to be any
> >   person or organization responsible for managing the protocol
> >   implementations, then "operator" is fine but it should be added to the
> >   Terminology section.  E.g., is a cloud hosting service an "operator"?  Is
> >   a standalone DNS server admin an "operator"?  Is an NTP server admin an
> >   "operator"?  In a home network, is the household member who configures
> >   devices an "operator"?  I'd want the definition to be such that the answer
> >   to all of those is Yes (or else pick a different term that is generic),
> >   so that the recommendations in the document are as widely applicable and
> >   useful as possible.  The checklist in Appendix A certainly is good
> >   already.
> 
> Agreed, and this is a substantive improvement. We will add a definition of
> "Operator" to the Terminology section (Section 2) that encompasses any person
> or organization responsible for deploying, configuring, and managing protocol
> implementations, explicitly including network operators, cloud service
> administrators, IoT fleet managers, home network administrators, DNS/NTP
> server administrators, and similar roles. We will then do a pass to replace
> "network operator" with "operator" except where the specific context warrants
> the narrower term. We will also broaden references to "their network" to
> include "their systems and devices" where appropriate (e.g., Sections 4.6 and
> 4.7).
> 
> 
> >
> >   Similarly there are a bunch of places that only talk about "their network"
> >   (e.g., section 4.6) and "impact ... on the network" (e.g., section 4.7),
> >   rather than about "their devices and bandwidth" or whatever.   Impact on
> >   the network is good to talk about but from an operations perspective, the
> >   impact on hosts/devices is also important in my view, and largely missing
> >   it seems.
> 
> Similarly agreed.
> 
> >
> > 5) Network Operation: Section 4.5 contains a statement:
> > > If the protocol specification requires changes to end hosts, it
> > > should also indicate whether safeguards exist to protect networks
> > > from potential overload.
> >
> >   This statement seems asymmetric and biased in terms of only being from the
> >   perspective of a network operator. Shouldn’t there be a similar statement
> >   that if a protocol specification requires changes to routers it should
> >   indicate whether safeguards exist to protect hosts from potential
> >   overload? My point is really that it seems to be more about protecting
> >   one organization from entities that aren’t under their control. In some
> >   cases the hosts/servers may be more strictly managed than the network
> >   boxes (e.g., in some home networks), and indexing on host vs network is,
> >   in my view, not the right axis here if one is going to be asymmetric in
> >   recommendation. My point is consistent with the wording in 2.1.2 of
> >   RFC 5218 “Protocols that can be deployed by a single group or team … have
> >   a greater chance of success than those that require cooperation across
> >   organizations” (which makes no distinction between network vs host per
> >   se).
> >
> 
> We will revise it to make the overload-protection guidance symmetric with
> respect to both network infrastructure and end hosts/devices.
> 
> >   Section 5.4.4 (Fault Isolation) is ok but seems overly network centric.
> >   Say you have a docker container that is misbehaving in some way… the host
> >   could isolate or quarantine the container. Same for VMs. Or say you have
> >   a process in a host that is misbehaving… the kernel could isolate or
> >   quarantine the process. I’d make the wording here more generic and less
> >   network operator centric. Operations and management is about more than
> >   just network operators per se.  The guidance is good and just using more
> >   generic terminology here in terms of stating the principles would make
> >   the section stronger and more impactful in my view.
> 
> For 5.4.4, we will broaden the fault isolation discussion beyond network-layer
> isolation to include host-level isolation mechanisms (e.g., process
> quarantine, container/VM isolation), consistent with the generic section title.
> 
> 
> >
> > 6) Internationalization: Section 4.8 suggests that English should be the
> >   default language in implementations for human readable messages.  I don't
> >   think this document should make any such recommendation.  I do, however,
> >   recommend adding that it must also be possible to identify which language
> >   a message intended for humans is in (e.g., via a language tag). Otherwise,
> >   it cannot be reliably displayed correctly.
> 
> 
> On Section 4.8: Agreed. We will remove the recommendation that English be the
> default language, and instead add a recommendation that human-readable
> messages include a language tag to enable correct identification and
> rendering.
> 
> 
> >
> >   Section 5.5 also has an internationalization issue.  It cites an IAB
> >   workshop RFC (where such RFCs reflect the consensus of workshop
> >   participants, not the IAB or IETF per se), and then makes a blanket
> >   statement about configuration files that "human-readable strings should
> >   utilize UTF-8" which comes across as saying this is now an IETF consensus
> >   statement.  There is IETF consensus on UTF-8 _in protocols_, and more
> >   specifically UTF-8 with NFC (see section 2 of RFC 5198, which can be
> >   cited as a normative reference here) but not in _device-local files_.
> >   The IETF has no recommendation about files since they’re outside the
> >   scope of Protocols per se. Different OS's already diverge in terms of
> >   both normalization form and UTF-8 vs UTF-16. Hence either change the
> >   text to be about strings in protocols (not textual configuration files
> >   like the preceding sentence says) or make it clear that it is not an
> >   IETF recommendation, or else be prepared for an IETF-wide discussion that
> >   will never converge.
> >
> 
> On Section 5.5: Agreed. The current text conflates protocol strings with
> device-local configuration files. We will revise the UTF-8 recommendation to
> apply specifically to strings in protocols (citing RFC 5198 as a normative
> reference for UTF-8 with NFC), and either remove or clearly qualify the
> statement about textual configuration files.
> 
> 
> > 7) Information Model Design: The document nicely recommends in point 1 of
> >   5.3.1 to "start with a small set of essential objects", which is great.
> >   I’ve seen cases where someone just exposes everything just because it’s
> >   there, not because there’s any need (“someone might want it”). As a
> >   result, querying all state can be burdensome since it can be large and/or
> >   expensive to query a given value, and can also discourage someone from
> >   implementing the mechanism for querying them as too burdensome to
> >   implement. To determine what is “essential”, I usually recommend
> >   determining what questions need to be answered to troubleshoot, configure,
> >   etc. and exposing the things that are needed to answer those questions.
> >   It might help to say something like this to help readers understand what
> >   is “essential” here.  And I think that's consistent with the purpose of
> >   Appendix A's checklist.
> >
> >   Point 2 in 5.3.1 says "Require that all objects be essential for
> >   management" but I don't follow what that means.  Elaborate.
> >
> >   Point 6 says "Avoid causing critical sections to be heavily instrumented"
> >   I think it’s not just “critical sections” per se, but anything that would
> >   be expensive. E.g., if someone wants to expose a summary object _rather
> >   than the components of it from which the client could do the computation_
> >   it would still meet criteria 4, but may be expensive to compute.
> 
> All very good suggestions. We will add clarifying text to Point 1 explaining
> that "essential" objects are those needed to answer the diagnostic,
> configuration, and operational questions the protocol is expected to support,
> avoiding the trap of exposing everything that is technically accessible. We
> will revise Point 2 to make its intent clearer. We will also revise Point 6 to
> generalise beyond "critical sections" to anything that would be expensive to
> query or compute.
> 
> 
> >
> > 8) Liveness Detection: section 5.4.1 says:
> > > Protocol Designers should always build in basic testing features
> > > (e.g., ICMP echo, UDP/TCP echo service, NULL RPCs (remote procedure
> > > calls)) that can be used to test for liveness, with an option to
> > > enable and disable them.
> >
> >   I’m not convinced there aren’t exceptions, such as maybe for very
> >   constrained IoT devices. Recommend removing “always” and just leaving
> >   it as lower case “should” like other statements in this doc.
> 
> Agreed. We will remove "always" and retain the lowercase "should",
> accommodating constrained-device scenarios.
> 
> >
> > 9) Configuration Management: Continuing my theme of making the document
> >   less network-operator-centric in order to apply more generally, including
> >   to IoT cases... section 5.5 (Configuration Management) comes across to
> >   me as overly network centric for the section title which is nicely
> >   generic. So if you manage a bunch of end hosts, or a bunch of Kubernetes
> >   pods, or a bunch of IoT devices, or a bunch of VMs on a cloud service,
> >   or a bunch of processes on one or more devices, this section should
> >   still apply, but it provides little or no guidance. Either change the
> >   section title and narrow the scope, or else (my preference) broaden the
> >   discussion. For example, it would be remiss to not mention Kubernetes
> >   in a general discussion of configuration management. Similarly, for
> >   IoT devices there are various centralized configuration management
> >   services such as Balena, SocketXP, Golioth, ThingsBoard, etc.  One need
> >   not name them (I wouldn't), but simply acknowledging the existence of
> >   popular centralized management platforms would seem appropriate.
> 
> Agreed. We will broaden Section 5.5 to make it applicable to diverse managed
> environments.
> 
> >
> > 10) Operational Consideration section: The rest of the document already
> >   says that this section shouldn't be required in documents that aren't
> >   technical specifications and section 3.1 specifically uses process
> >   documents as an example of when they're not required.  Since this
> >   document itself is a process document, it's not required, so why is
> >   it here?  If you do keep this section, you could say explicitly
> >   that it's not required in a document of this type, so people don’t try
> >   to use this as a precedent to create barriers that aren’t required.
> 
> The section is already written to explicitly acknowledge that no new
> requirements arise ("there are no new operations or manageability requirements
> introduced by this document"). However, we can add a sentence explicitly
> stating that this section is not required for a process document of this type,
> and that it is included purely to illustrate the exemption mechanism described
> in Section 3.2, not as a normative precedent.
> 
> >
> > 11) Network Device: This term is used in several places (e.g., section 9
> >   among others) without definition.  Is it "a device managed by a
> >   network operator"?  Is it "any device on the network, whether
> >   router or end host"?  Is it "a device that implements the New Protocol
> >   or Protocol Extension in question"?  If you use this term, an entry
> >   in the Terminology section might help.
> 
> Agreed. We will add an entry to the Terminology section.
> 
> >
> > 12) Password-based authentication: Section 9 (Security Considerations)
> >   says "The security implications of password-based authentication should
> >   be taken into account when designing a New Protocol or Protocol
> >   Extension."  True but this should already be stated in other RFCs,
> >   not specific to O&M considerations per se. So is this sentence really
> >   needed in _this_ document too?  It seems anachronistic to me, even
> >   though it's clearly good advice.
> >
> 
> Fair observation. The sentence is inherited from RFC 5706. We can reframe it as
> a reference rather than a standalone statement.
> 
> Thank you again for the high-quality review. We look forward to your
> confirmation that the above resolutions are satisfactory, and will be applying
> them in our working copy.
> 
> Best,
> 
> Carlos.
> 
> 
> > Dave Thaler
> >
> >


_______________________________________________
OPSAWG mailing list -- [email protected]
To unsubscribe send an email to [email protected]
