Document: draft-ietf-intarea-proxy-config
Title: Communicating Proxy Configurations in Provisioning Domains
Reviewer: Dale Worley
Review result: Ready with Issues

I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments.

For more information, please see the FAQ at <https://wiki.ietf.org/en/group/gen/GenArtFAQ>.

Document: draft-ietf-intarea-proxy-config-11
Reviewer: Dale R. Worley
Review Date: 2026-03-03
IETF LC End Date: 2026-03-03
IESG Telechat date: [not known]

Summary:

This draft is on the right track but has open issues, described in the review.

As far as I can tell, all of these issues are expository, but it's possible that some of them center on technical details that need further thought.

It seems to me that there's an overall organizational issue to the I-D: It is written by people who are very deeply involved in this area of proxy configuration and so the document doesn't surface the underlying conceptual structure, and the naive reader has to reconstruct that from all of the details, making it harder to read correctly. Additionally, this mechanism touches a lot of other mechanisms, so there are a lot of details, many of which have their own references.

I think a way to fix this is to provide a glossary that defines the relevant terms and their interrelationships and gives suitable references. (In some cases, a good reference seems to be simply a forward reference to one of the tables later in the document, but there are a number of places where it would have helped to have glanced over a later table to have the background for an earlier part of the document.)

The terms that seem to be most important are:

    provisioning domain
    provisioning domain information
    Provisioning Domain Additional Information, PvD

The idea seems to be that sections of network and the hosts therein can be a "provisioning domain", and that the provisioning domain has a document, "provisioning domain information", that describes its facilities.
The official phrasing is "Provisioning Domains (PvDs) are defined in [PVD] as consistent sets of network configuration information", which is unclear to me.

Within a "provisioning domain information" document, there is a section "Provisioning Domain Additional Information", although I haven't been able to track down exactly what that is; it's not mentioned in RFC 7556. Does "Provisioning Domain Additional Information" just mean "elements in the structure that have keys that are not defined in RFC 7556"?

"PvD" seems to be used indifferently for several of these terms, which is confusing.

One subtle implication is that all sources of provisioning domain information for a single provisioning domain will (up to version skew) return the *same* provisioning domain information. Is this assured?

    Proxy
    Proxy URI

and implied by the phrases:

    "not just defined as hostnames and ports, but can use URI templates"

    "Each proxy is defined by a proxy protocol and a proxy location (i.e., a hostname and port or a URI template [URITEMPLATE])"

Naively I think of a proxy as a "box" which might have multiple interfaces and multiple ports on those interfaces that provide various proxy services. This document seems to use a narrower definition, a specific hostname and port (and possibly other parts of a URI) that provide a specific proxy service. That's OK, but it brings out that a "proxy" has several important attributes:

    name/identifier
    protocol (what I think of as "the sort of proxying it does")
    hostname
    port
    URI template

Naively I would expect each proxy to have a distinct name, but the details of the document say that several proxies can share the same name. That ought to be foregrounded and there should be a term for the thing that an "identifier" refers to, such as "proxy group". And there are statements that aren't strictly correct; e.g. sec. 3.1 states "The value of identifier key is a string that can be used to refer to a particular proxy from other dictionaries".

The protocols are listed in Table 2, and it would help if that was referenced in the glossary definition of "proxy".

There are a large number of mechanisms that are referenced in only a few places in the draft, where their references are given. My reflex is that it would be better to mention all of them in the glossary along with their references so that the glossary would be an index of all of the mechanisms that are touched by this draft. But that might make the glossary overwhelmingly large.

In many situations, including URIs, either DNS hostnames or IP addresses can be used. It should be foregrounded whether that is true when specifying proxies.

The access information for proxies, the "Proxy Location Format" in table 2, seems to differ based on the protocol. It would be useful to state up-front that there are two forms: host and port, and URI template. Also, although table 2 gives this information a distinct name, "Proxy Location Format", the document doesn't seem to have a consistent name (such as "proxy location string") for this attribute of a proxy.

There are a lot of uses of the phrase 'the well-known PvD URI (".well-known/pvd")'. But ".well-known/pvd" is not a URI, it's just the path part of a URI. And it's not even that, since the syntactic "path-abempty" part starts with a slash: "/.well-known/pvd". So better phrasing is 'the well-known PvD path ("/.well-known/pvd")'.

There are a lot of places where dictionary keys are used in running text but are not quoted. This can sometimes be confusing and it seems to me to be better if the document consistently quoted such keys. E.g., change

    The values for the protocol key are ...

to

    The values for the "protocol" key are ...

Items related to specific sections of the draft:

1. Introduction

    In order to make use of multiple related proxies, clients need a way to understand which proxies are associated with one another, and which protocols can be used to communicate with the proxies.

    Additionally, this document outlines how these mechanisms might be used to discover proxies associated with a network (Section 5). However, this approach is not described for the purpose of generic proxy discovery, and requires careful security considerations for clients to limit usage to trusted scenarios.

    Using this mechanism a client can learn that a legacy insecure HTTP proxy that the client is configured with is also accessible using HTTPS. In this way, clients can upgrade to a more secure connection to the proxy.

    2. A way to list one or more proxy URIs in a PvD, allowing clients to learn about other proxy options given a known proxy (Section 3).

sec. 3:

    the proxies array describes equivalent proxies (potentially supporting other protocols) that can be used in addition to the known proxy.

sec. 4.2:

    Entries listed in a proxy-match object MUST NOT expand the set of destinations that a client is willing to send to a particular proxy. The array can only narrow the set of destinations that the client is willing to send through the proxy. For example, if the client has a

sec. 5:

    However, client systems MUST NOT automatically send traffic over proxies advertised in this way without explicit configuration, policy, or user permission. For example, a client can use this mechanism to choose between known proxies, such as if the client was already proxying traffic and has multiple options to choose between.

The document seems to be unclear and perhaps inconsistent regarding how a client is expected to use PvD proxy information. Naively, I would expect that a client would use whatever information it was configured with, together with PvD proxy information from router advertisements and DNS records, and recursively adding PvD proxy information queried from the hosts of proxies that it already knows about, to accumulate a set of proxies. But the text of sec.
4.2 suggests that PvD information can only narrow the traffic that a client is willing to send to a particular proxy relative to what the client was previously configured to send to the proxy. And the second paragraph above seems to hint that PvD *could* be used for proxy discovery but shouldn't be. That policy seems to drastically reduce the usefulness of the PvD mechanism. (In particular, if the client is configured to know a proxy can be contacted by non-TLS HTTP, then how can it learn from PvD that the same host:port can be contacted by HTTPS for the same function without in practice extending the set of proxies that it is allowed to utilize?)

Also, section 3 seems to suggest that if a client knows of a proxy, or perhaps is configured to allow use of a proxy, then the proxies listed in the PvD fetched from the proxy's hosts are implicitly allowed to be used. OTOH, PvDs obtained through DNS records or router advertisements don't seem to have the same implied permission.

It seems to me that the philosophy to be used needs to be thought through and stated clearly in all of the places I've referenced.

    Provisioning Domains (PvDs) are defined in [PVD] as consistent sets of network configuration information, which can include proxy configuration details Section 2 of [PVD].

The final "Section 2 of [PVD]" reads awkwardly in this context, but I expect the RFC Editor will provide a better form.

1.1. Background

    Other non-standard mechanisms for proxy configuration and discovery have been used historically, some of which are described in [RFC3040]. ...

    These common (but non-standard) mechanisms only support defining proxies by hostname and port, and do not support configuring a full URI template [URITEMPLATE].

To clarify to the reader how these mechanisms are non-standard, I suggest amending "[RFC3040]" in the first paragraph to "the informational [RFC3040]".

I would clarify these paragraphs by:

(1) Starting the text with "Non-standard mechanisms..."
because there is no previous standard mechanism for "Other" to reference here.

(2) Changing "[RFC3040]" to "the informational [RFC3040]" so the reader is clear that RFC 3040 is not a standard.

(3) Replacing the period at the end of the first sentence with a colon, to make it clear that the first sentence introduces the next several sentences.

(4) Collapsing all of these paragraphs into one, so that it's clear that the first sentence governs the next four sentences.

    ... executing Javascript scripts, which are prone to implementation-specific inconsistencies and can open up security vulnerabilities.

Probably could be abbreviated to "...prone to implementation inconsistencies and security vulnerabilities."

1.2. Requirements

Since this section doesn't give requirements (indeed, the requirements are informally described in sec. 1 and 1.1), it is probably better to title this section "Requirements keywords" or something like that.

2. Fetching PvD Additional Information for proxies

    In order to fetch PvD Additional Information associated with a proxy, a client issues an HTTP GET request for the well-known PvD URI (".well-known/pvd") as defined in Section 4.1 of [PVDDATA] and the host authority of the proxy. This is applicable for both proxies that are identified by a host and port only (such as SOCKS proxies and HTTP CONNECT proxies) and proxies that are identified by a URI or URI template.

    The fetch MUST use the "https" scheme. By default, the fetch SHOULD use the standard port for HTTP over TLS (443) and the ".well-known/pvd" path. However, both the port and the path MAY be overridden by local configuration policy on the client.

I had a hard time reading this. It wasn't as clear about "HTTP vs. HTTPS" as I think it should have been given the critical security involved, and the use of ports was not as clear as I'd like.

I would prefer something more like:

    In order to fetch PvD Additional Information associated with a proxy, a client issues an HTTPS GET request for the well-known PvD path ("/.well-known/pvd") (as defined in Section 4.1 of [PVDDATA]) to the host of the proxy and the standard HTTPS port (443).

This sentence is redundant but might be valuable for clarity:

    This is applicable for both proxies that are identified by a host and port only (such as SOCKS proxies and HTTP CONNECT proxies) and proxies that are identified by a URI or URI template.

This sentence seems to be redundant:

    The fetch MUST use the "https" scheme.

I would leave these sentences unchanged:

    By default, the fetch SHOULD use the standard port for HTTP over TLS (443) and the "/.well-known/pvd" path. However, both the port and the path MAY be overridden by local configuration policy on the client.

As a technical matter, I think you want to require that the GET use the absolute-URI form for the request-URI, that is, include the scheme and authority, so that the PvD server knows the host name by which the client is accessing it and can customize the PvD accordingly.

    It is not necessary for the client to re-fetch PvD Additional Information unless one of the following conditions occurs:

    * The current time is beyond the "expires" value defined in Section 4.3 of [PVDDATA]

    * A new Sequence Number for that PvD is received in a Router Advertisement (RA)

This statement is correct as far as it goes, but it only states the requirements on the client implicitly. What it really means is closer to:

    A client MAY cache the information it obtained from PvD Additional Information, but it MUST discard cached information if:

    * The current time is beyond the "expires" value defined in Section 4.3 of [PVDDATA]

    * A new Sequence Number for that PvD is received in a Router Advertisement (RA)

2.1. Discovery via HTTPS/SVCB Records

    this key generically provides a hint that PvD Additional Information is available, and can be used for use cases unrelated to proxies.

This would be clarified by

    this key generically provides a hint that PvD Additional Information is available, which may include information unrelated to proxies.

3. Enumerating proxies within a PvD

    the proxies array describes equivalent proxies (potentially supporting other protocols) that can be used in addition to the known proxy.

In what sense are two proxies "equivalent" if they support different protocols?

3.1. Proxy dictionary keys

    Other optional keys can be added to the dictionary to further define or restrict the use of a proxy.

For clarity, this should forward-reference sec. 3.2: "can be added to the dictionary as described in sec. 3.2 to further".

    +==========+========+=============+=======+===========================+
    |JSON Key  |Optional|Description  |Type   |Example                    |
    +==========+========+=============+=======+===========================+

If it can be made to fit, it is nicer to have the column *values* be "optional" and "required", rather than "Yes" and "No", which require the reader to refer to the column header to know the polarity of the datum.

It might be too late to change this, but the word "proxy" isn't a good word to use as a key because the word's meaning is so broad. Given the column header in table 2, a better key would be "proxy_location_string" (or "proxy_locator"?).

It might be worth stating explicitly that port numbers are mandatory in the "host:port" proxy location formats, as they are often omittable in URIs.

3.2. Proprietary keys in proxy configurations

    For example, "acme_tech_authmode" could be a proprietary key indicating an authentication mode defined by a vendor named "acme_tech".

You probably want s/acme_tech/example_tech/.
Also the vendor doesn't actually have the name "acme_tech", but rather is probably "Example Technology, Inc.":

    For example, "example_tech_authmode" could be a proprietary key indicating an authentication mode defined by a vendor named Example Technology.

4. Destination accessibility information for proxies

4.1. Destination Rule Keys

    Each key MUST correspond to an array with at least one entry.

Don't you want to say "Each key's value must be an array with at least one entry."?

    Entries that include the wildcard prefix also MUST be treated as if they match an FQDN that only contains the string after the prefix, with no subdomain.

Simpler to say "Entries that include the wildcard prefix also match an FQDN that only contains the string after the prefix, with no subdomain."

    A trailing dot (".") at the end of a domain name is not required; the matching logic is the same regardless of its presence or absence.

"A string MAY have a trailing dot ("."); it does not affect the matching logic."

4.2. Using Destination Rules

My experience is that describing how rules like this work is treacherous. The temptation is to describe various features of the rules and how they work, but often the reader must interpolate some of the overall algorithm (especially the ordering of operations), leading to divergent implementations. I think a better approach would be to give the algorithm explicitly so one can examine it holistically. In this case, something like:

    Given a PvD document, a destination host (IP address or DNS name), port, and proxy protocol (e.g., "connect-udp"), the client determines which proxies described by the document may be used for proxying by:

    Examine the dictionaries in the "proxy-match" value in order. If a value contains a proprietary key that the client does not understand, it skips the value and continues with the next.

    A rule matches the destination if all of the following are true:

    - if the "proxies" list is empty, or if at least one of the proxies in the proxy groups named by "proxies" has the given proxy protocol as its "protocol" value,

    - if the destination is given by a host name and the "domains" value is present, then the host name matches one of the entries of "domains",

    - if the destination is given by an IP address and the "subnets" value is present, then the IP address matches one of the entries of "subnets",

    - if the "ports" value is present, then the destination port matches one of the entries of "ports", and

    - if the tests specified by any proprietary keys that are present are satisfied.

    If a rule matches and the "proxies" value is empty, then this PvD does not allow any of its proxies to be used to reach the destination.

    If a rule matches and the "proxies" value is not empty, no further entries are examined. This PvD allows the client to use any of the proxies in the proxy groups listed in the "proxies" value that have the matching "protocol" value; the proxy groups should be prioritized in the order they are listed, but the proxies within any one group are not prioritized.

This points out some complexities:

The client may possess several PvD documents, and a single destination may obtain different proxies from each one.

The assumption is that a destination is specified by either a FQDN or IP address but not both, and filtering on each is done separately -- and it's possible that the results of filtering on FQDN may be different than filtering on IP address.

There is also the question of exactly how domain and IP address filters interact. Clearly if both are absent, then that part of the match always passes. But if an entry contains only "subnets" and the connection request is specified by host name, does that automatically pass (since the client may not know the address)? Conversely for "domains". And what if "subnets" and "domains" are both present, or is that forbidden?

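To make the ordering concrete, here is a minimal Python sketch of the procedure outlined above, under the first reading of the "domains" and "subnets" tests (each test is applied only when the destination is of the corresponding kind). It illustrates the proposed algorithm and is not text from the draft. Several details are assumptions: the function and helper names are hypothetical, "ports" entries are taken to be strings holding either a single port or an inclusive "low-high" range, the client is taken to understand no proprietary keys, and the "mandatory" corner case is not handled.

```python
import ipaddress

# Rule keys this sketch understands; any other key is treated as an
# unknown proprietary key (assumption: the client understands none).
KNOWN_RULE_KEYS = {"domains", "subnets", "ports", "proxies"}


def domain_matches(host, pattern):
    # A trailing dot does not affect matching; compare case-insensitively.
    host = host.rstrip(".").lower()
    pattern = pattern.rstrip(".").lower()
    if pattern.startswith("*."):
        base = pattern[2:]
        # A wildcard entry matches subdomains and the bare name itself.
        return host == base or host.endswith("." + base)
    return host == pattern


def port_matches(port, entry):
    # Assumed entry syntax: "443" or an inclusive range "8000-8999".
    low, _, high = str(entry).partition("-")
    return int(low) <= port <= int(high or low)


def parse_ip(dest):
    # Return an address object for IP literals, None for host names.
    try:
        return ipaddress.ip_address(dest)
    except ValueError:
        return None


def select_proxies(pvd, dest, port, protocol):
    """Return usable proxies in priority order, [] if a matching rule
    excludes the destination, or None if no rule matched."""
    groups = {}
    for proxy in pvd.get("proxies", []):
        groups.setdefault(proxy.get("identifier"), []).append(proxy)
    addr = parse_ip(dest)

    for rule in pvd.get("proxy-match", []):
        if set(rule) - KNOWN_RULE_KEYS:
            continue  # unknown proprietary key: skip this rule
        named = rule.get("proxies", [])
        candidates = [p for ident in named for p in groups.get(ident, [])
                      if p.get("protocol") == protocol]
        if named and not candidates:
            continue  # no listed proxy speaks the requested protocol
        if addr is None and "domains" in rule:
            if not any(domain_matches(dest, d) for d in rule["domains"]):
                continue
        if addr is not None and "subnets" in rule:
            nets = (ipaddress.ip_network(s, strict=False)
                    for s in rule["subnets"])
            if not any(addr in net for net in nets):
                continue
        if "ports" in rule and not any(
                port_matches(port, e) for e in rule["ports"]):
            continue
        return candidates  # the first matching rule ends the search
    return None
```

Under this reading, a rule that contains only "domains" matches every destination given as an IP address, because the domain test is skipped for such destinations. That is one of the interactions questioned above.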
It may be that you want those rules to be:

    - if the "domains" value is present, then the destination is given by a host name and the host name matches one of the entries of "domains",

    - if the "subnets" value is present, then the destination is given by an IP address and the IP address matches one of the entries of "subnets",

That change has the interesting consequence that if both "subnets" and "domains" are present, the rule matches *no* connection attempts, as a connection attempt specifies an FQDN or an IP address but not both.

There is an additional complexity: a rule may match regarding subnet/domain/port/protocol while all of the proxies listed are ruled out because their entries contain a "mandatory" value that the client doesn't understand. The rule doesn't provide the client with a usable proxy, but also the rule seems to terminate rule testing because it matches. This corner case should probably be handled explicitly.

--

    In order to match a destination rule in the proxy-match array, all properties MUST apply.

Be careful to specify "all properties that are present MUST apply".

4.3. Proprietary Keys in Destination Rules

    This handling applies uniformly across all match rules, including fallback rules.

What is a "fallback rule"?

4.4. Examples

    In the next example, two proxies are defined with a separate identifier, and there are three destination rules:

I think you mean "two proxies are defined with distinct identifiers".

[END]
