Thanks, Dave! You can see the WIP text at https://github.com/IETF-OPSAWG-WG/draft-opsarea-rfc5706bis/pull/232, and we would welcome any further feedback and input on it.
Thanks!

Carlos Pignataro
(He/Him)
Founder and Principal
Blue Fern Consulting<https://bluefern.consulting/>
Email: [email protected]<mailto:[email protected]>
Mobile: +1 919 345 3028<tel:+19193453028>
Mobile: +34 614 032 993<tel:+34614032993>

From: Dave Thaler <[email protected]>
Date: Wednesday, March 11, 2026 at 4:29 PM
To: Carlos Pignataro <[email protected]>, 'Dave Thaler' <[email protected]>
Cc: [email protected] <[email protected]>, [email protected] <[email protected]>, [email protected] <[email protected]>
Subject: RE: draft-ietf-opsawg-rfc5706bis-01 early Iotdir review

Thanks Carlos for the thoughtful reply. I look forward to seeing an updated version!

Dave

> -----Original Message-----
> From: Carlos Pignataro <[email protected]>
> Sent: Wednesday, March 11, 2026 7:05 AM
> To: Dave Thaler <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]
> Subject: Re: draft-ietf-opsawg-rfc5706bis-01 early Iotdir review
>
> Hi, Dave,
>
> Thank you for the thorough and constructive IoTdir early review - the
> overall "On the Right Track" label is appreciated, and your comments are
> well-targeted. We've reviewed each point carefully. Responses inline below.
>
> On Jan 23, 2026, at 7:01 PM, Dave Thaler via Datatracker <[email protected]> wrote:
> >
> > Document: draft-ietf-opsawg-rfc5706bis
> > Title: Guidelines for Considering Operations and Management in IETF
> > Specifications
> > Reviewer: Dave Thaler
> > Review result: On the Right Track
> >
> > I am the assigned iotdir reviewer for this draft. For background on
> > iotdir, please see the [FAQ](https://wiki.ietf.org/en/group/iotdir).
> > Please resolve these comments along with any other comments you may
> > receive.
> >
> > A marked-up PDF copy with my comments inline is at
> > https://1drv.ms/b/c/dc2b364f3f06fea8/IQBuO9rPPwGxRLZZi9kQSJncAT_Aktrn9MuIurcZp88NRjs?e=2boyoh
>
> Thanks for sharing this - I am applying all editorials first, and addressing
> the main comments below.
>
> > I found the checklist of key questions in Appendix A to be well
> > written, useful, and widely applicable to areas across the IETF,
> > including IoT protocols. Similarly the content in the body is
> > well-written and useful.
>
> Great to see - as wide applicability across IETF areas and usefulness are
> two of our explicit goals.
>
> > I do have a bunch of comments on the body of the document however to
> > make it more widely applicable to areas across the IETF. A summary of
> > my main technical feedback follows, and many other minor editorial
> > points (which I don't expect should need WG discussion) can be found
> > in my marked up PDF copy.
> >
> > 1) Applicability of requirement: Three different places in the document make
> >    three different, contradictory, statements about which RFCs would be
> >    required to have this section.
> >    a) abstract says all "RFCs in the IETF Stream"
> >    b) 3.1 says all IETF RFCs "that document a technical specification"
> >    c) Appendix B says "all new Standards Track RFCs"
> >    I think section 3.1 is the best and the abstract and Appendix B should
> >    both be changed.
>
> Indeed. Agreed. Section 3.1 is the most precise formulation.
>
> > 2) Architecture RFCs: Most places in the document are consistent in saying
> >    documents that specify a "New Protocol" or "Protocol Extension", but one
> >    place in section 3.1 throws in "or an architecture". Generally speaking,
> >    an implementation does not claim conformance to an architecture/framework
> >    document, and so depending on how it is written and the content it may not
> >    be considered a “technical specification”, just a roadmap document.
> >    In that case, the previous paragraph would not require it in such an
> >    architecture document. Furthermore, elsewhere in the document, like the
> >    abstract, focused on requiring it in New Protocols and Protocol
> >    Extensions.
> >    As such, I’d remove “or an architecture”. It might be ok in the preceding
> >    paragraph to clarify that “anything an implementation would claim
> >    conformance to is considered a technical specification”, and in my view
> >    that would cover it.
>
> Partly agreed. We see the intent of including architecture documents, as
> they can establish operational patterns that downstream protocol
> specifications must respect. However, your point about conformance is
> well-taken. We will revise the text to clarify that architecture documents
> are expected to include the section only where they introduce new
> operational considerations with downstream normative implications, and that
> the exemption in Section 3.2 applies otherwise. We will also remove "or an
> architecture" from the second paragraph of 3.1.
>
> > 3) Requirements around individual draft -00 submissions: Section 3.1 says
> >    "early revisions of Internet-Drafts are expected to include an
> >    Operational Considerations section". I'd find it a huge process hurdle
> >    to “expect” all -00 versions of individual drafts to have such a section
> >    as that would discourage many new entrants from participating in the IETF.
> >    I might say "encouraged" instead of "expected".
>
> Agreed. We will change "expected" to "highly encouraged" for -00 revisions.
> The intent is to establish a good-practice norm, not create a barrier to
> participation.
>
> > 4) Operator: I found much of the document, as currently worded, to be way
> >    too _network_ operator focused, for a document that creates a requirement
> >    for all areas, including IoT. Some places say "network operator" and other
> >    places just say "operator".
> >    If you widen the term "operator" to be any
> >    person or organization responsible for managing the protocol
> >    implementations, then "operator" is fine but it should be added to the
> >    Terminology section. E.g., is a cloud hosting service an "operator"? Is
> >    a standalone DNS server admin an "operator"? Is an NTP server admin an
> >    "operator"? In a home network, is the household member who configures
> >    devices an "operator"? I'd want the definition to be such that the answer
> >    to all of those is Yes (or else pick a different term that is generic),
> >    so that the recommendations in the document are as widely applicable and
> >    useful as possible. The checklist in Appendix A certainly is good
> >    already.
>
> Agreed, and this is a substantive improvement. We will add a definition of
> "Operator" to the Terminology section (Section 2) that encompasses any
> person or organization responsible for deploying, configuring, and managing
> protocol implementations - explicitly including network operators, cloud
> service administrators, IoT fleet managers, home network administrators,
> DNS/NTP server administrators, and similar roles. We will then do a pass to
> replace "network operator" with "operator" except where the specific context
> warrants the narrower term. We will also broaden references to "their
> network" to include "their systems and devices" where appropriate (e.g.,
> Sections 4.6 and 4.7).
>
> >    Similarly there are a bunch of places that only talk about "their network"
> >    (e.g., section 4.6) and "impact ... on the network" (e.g., section 4.7),
> >    rather than about "their devices and bandwidth" or whatever. Impact on
> >    the network is good to talk about but from an operations perspective, the
> >    impact on hosts/devices is also important in my view, and largely missing
> >    it seems.
>
> Similarly agreed.
>
> > 5) Network Operation: Section 4.5 contains a statement:
> >> If the protocol specification requires changes to end hosts, it
> >> should also indicate whether safeguards exist to protect networks
> >> from potential overload.
> >
> >    This statement seems asymmetric and biased in terms of only being from the
> >    perspective of a network operator. Shouldn’t there be a similar statement
> >    that if a protocol specification requires changes to routers it should
> >    indicate whether safeguards exist to protect hosts from potential
> >    overload? My point is really that it seems to be more about protecting
> >    one organization from entities that aren’t under their control. In some
> >    cases the hosts/servers may be more strictly managed than the network
> >    boxes (e.g., in some home networks), and indexing on host vs network is,
> >    in my view, not the right axis here if one is going to be asymmetric in
> >    recommendation. My point is consistent with the wording in 2.1.2 of
> >    RFC 5218 “Protocols that can be deployed by a single group or team … have
> >    a greater chance of success than those that require cooperation across
> >    organizations“ (which makes no distinction between network vs host per
> >    se).
>
> We will revise it to make the overload-protection guidance symmetric with
> respect to both network infrastructure and end hosts/devices.
>
> >    Section 5.4.4 (Fault Isolation) is ok but seems overly network centric.
> >    Say you have a docker container that is misbehaving in some way… the host
> >    could isolate or quarantine the container. Same for VMs. Or say you have
> >    a process in a host that is misbehaving… the kernel could isolate or
> >    quarantine the process. I’d make the wording here more generic and less
> >    network operator centric. Operations and management is about more than
> >    just network operators per se.
> >    The guidance is good and just using more
> >    generic terminology here in terms of stating the principles would make
> >    the section stronger and more impactful in my view.
>
> For 5.4.4, we will broaden the fault isolation discussion beyond
> network-layer isolation to include host-level isolation mechanisms (e.g.,
> process quarantine, container/VM isolation), consistent with the generic
> section title.
>
> > 6) Internationalization: Section 4.8 suggests that English should be the
> >    default language in implementations for human readable messages. I don't
> >    think this document should make any such recommendation. I do, however,
> >    recommend adding that it must also be possible to identify which language
> >    a message intended for humans is in (e.g., via a language tag). Otherwise,
> >    it cannot be reliably displayed correctly.
>
> On Section 4.8: Agreed. We will remove the recommendation that English be
> the default language, and instead add a recommendation that human-readable
> messages include a language tag to enable correct identification and
> rendering.
>
> >    Section 5.5 also has an internationalization issue. It cites an IAB
> >    workshop RFC (where such RFCs reflect the consensus of workshop
> >    participants, not the IAB or IETF per se), and then makes a blanket
> >    statement about configuration files that "human-readable strings should
> >    utilize UTF-8" which comes across as saying this is now an IETF consensus
> >    statement. There is IETF consensus on UTF-8 _in protocols_, and more
> >    specifically UTF-8 with NFC (see section 2 of RFC 5198, which can be
> >    cited as a normative reference here) but not in _device-local files_.
> >    The IETF has no recommendation about files since they’re outside the
> >    scope of Protocols per se. Different OS's already diverge in terms of
> >    both normalization form and UTF-8 vs UTF-16.
> >    Hence either change the
> >    text to be about strings in protocols (not textual configuration files
> >    like the preceding sentence says) or make it clear that it is not an
> >    IETF recommendation, or else be prepared for an IETF-wide discussion that
> >    will never converge.
>
> On Section 5.5: Agreed. The current text conflates protocol strings with
> device-local configuration files. We will revise the UTF-8 recommendation to
> apply specifically to strings in protocols (citing RFC 5198 as a normative
> reference for UTF-8 with NFC), and either remove or clearly qualify the
> statement about textual configuration files.
>
> > 7) Information Model Design: The document nicely recommends in point 1 of
> >    5.3.1 to "start with a small set of essential objects", which is great.
> >    I’ve seen cases where someone just exposes everything just because it’s
> >    there, not because there’s any need (“someone might want it”). As a
> >    result, querying all state can be burdensome since it can be large and/or
> >    expensive to query a given value, and can also disincent someone from
> >    implementing the mechanism for querying them as too burdensome to
> >    implement. To determine what is “essential”, I usually recommend
> >    determining what questions need to be answered to troubleshoot, configure,
> >    etc. and exposing the things that are needed to answer those questions.
> >    It might help to say something like this to help readers understand what
> >    is “essential” here. And I think that's consistent with the purpose of
> >    Appendix A's checklist.
> >
> >    Point 2 in 5.3.1 says "Require that all objects be essential for
> >    management" but I don't follow what that means. Elaborate.
> >
> >    Point 6 says "Avoid causing critical sections to be heavily instrumented"
> >    I think it’s not just “critical sections” per se, but anything that would
> >    be expensive.
> >    E.g., if someone wants to expose a summary object _rather
> >    than the components of it from which the client could do the computation_
> >    it would still meet criteria 4, but may be expensive to compute.
>
> All very good suggestions. We will add clarifying text to Point 1 explaining
> that "essential" objects are those needed to answer the diagnostic,
> configuration, and operational questions the protocol is expected to
> support - avoiding the trap of exposing everything that is technically
> accessible. We will revise Point 2 to make its intent clearer. We will also
> revise Point 6 to generalise beyond "critical sections" to anything that
> would be expensive to query or compute.
>
> > 8) Liveness Detection: section 5.4.1 says:
> >> Protocol Designers should always build in basic testing features
> >> (e.g., ICMP echo, UDP/TCP echo service, NULL RPCs (remote procedure
> >> calls)) that can be used to test for liveness, with an option to
> >> enable and disable them.
> >
> >    I’m not convinced there aren’t exceptions, such as maybe for very
> >    constrained IoT devices. Recommend removing “always” and just leaving
> >    it as lower case “should” like other statements in this doc.
>
> Agreed. We will remove "always" and retain the lowercase "should",
> accommodating constrained-device scenarios.
>
> > 9) Configuration Management: Continuing my theme of making the document
> >    less network-operator-centric in order to apply more generally, including
> >    to IoT cases... section 5.5 (Configuration Management) comes across to
> >    me as overly network centric for the section title which is nicely
> >    generic. So if you manage a bunch of end hosts, or a bunch of Kubernetes
> >    pods, or a bunch of IoT devices, or a bunch of VMs on a cloud service,
> >    or a bunch of processes on one or more devices, this section should
> >    still apply, but it provides little or no guidance.
> >    Either change the
> >    section title and narrow the scope, or else (my preference) broaden the
> >    discussion. For example, it would be remiss to not mention Kubernetes
> >    in a general discussion of configuration management. Similarly, for
> >    IoT devices there are various centralized configuration management
> >    services such as Balena, SocketXP, Golioth, ThingsBoard, etc. One need
> >    not name them (I wouldn't), but simply acknowledging the existence of
> >    popular centralized management platforms would seem appropriate.
>
> Agreed. We will broaden Section 5.5 to make it applicable to diverse managed
> environments.
>
> > 10) Operational Consideration section: The rest of the document already
> >     says that this section shouldn't be required in documents that aren't
> >     technical specifications and section 3.1 specifically uses process
> >     documents as an example of when they're not required. Since this
> >     document itself is a process document, it's not required, so why is
> >     it here? If you do keep this section, you could say that explicitly
> >     that it's not required in a document of this type, so people don’t try
> >     to use this as a precedent to create barriers that aren’t required.
>
> The section is already written to explicitly acknowledge that no new
> requirements arise ("there are no new operations or manageability
> requirements introduced by this document"). However, we can add a sentence
> explicitly stating that this section is not required for a process document
> of this type, and that it is included purely to illustrate the exemption
> mechanism described in Section 3.2, not as a normative precedent.
>
> > 11) Network Device: This term is used in several places (e.g., section 9
> >     among others) without definition. Is it "a device managed by a
> >     network operator"? Is it "any device on the network, whether
> >     router or end host"? Is it "a device that implements the New Protocol
> >     or Protocol Extension in question"?
> >     If you use this term, an entry
> >     in the Terminology section might help.
>
> Agreed. We will add.
>
> > 12) Password-based authentication: Section 9 (Security Considerations)
> >     says "The security implications of password-based authentication should
> >     be taken into account when designing a New Protocol or Protocol
> >     Extension." True but this should already be stated in other RFCs,
> >     not specific to O&M considerations per se. So is this sentence really
> >     needed in _this_ document too? It seems anachronistic to me, even
> >     though it's clearly good advice.
>
> Fair observation. The sentence is inherited from RFC 5706. We can reframe as
> a reference rather than standalone statement.
>
> Thank you again for the high-quality review. We look forward to your
> confirmation that the above resolutions are satisfactory, and will be
> applying them in our working copy.
>
> Best,
>
> Carlos.
>
> > Dave Thaler
