Document: draft-ietf-intarea-proxy-config
Title: Communicating Proxy Configurations in Provisioning Domains
Reviewer: Dale Worley
Review result: Ready with Issues

I am the assigned Gen-ART reviewer for this draft.  The General Area
Review Team (Gen-ART) reviews all IETF documents being processed by
the IESG for the IETF Chair.  Please treat these comments just like
any other last call comments.

For more information, please see the FAQ at

<https://wiki.ietf.org/en/group/gen/GenArtFAQ>.

Document:  draft-ietf-intarea-proxy-config-11
Reviewer:  Dale R. Worley
Review Date:  2026-03-03
IETF LC End Date:  2026-03-03
IESG Telechat date:  [not known]

Summary:

    This draft is on the right track but has open issues, described in
    the review.

As far as I can tell, all of these issues are expository, but it's
possible that some of them center on technical details that need
further thought.

It seems to me that there's an overall organizational issue to the
I-D:  It is written by people who are very deeply involved in this
area of proxy configuration and so the document doesn't surface the
underlying conceptual structure, and the naive reader has to
reconstruct that from all of the details, making it harder to read
correctly.

Additionally, this mechanism touches a lot of other mechanisms, so
there are a lot of details, many of which have their own references.

I think a way to fix this is to provide a glossary that defines the
relevant terms and their interrelationships and gives suitable
references.  (In some cases, a good reference seems to be simply a
forward reference to one of the tables later in the document, but
there are a number of places where it would have helped to have
glanced over a later table to have the background for an earlier part
of the document.)

The terms that seem to be most important are:

    provisioning domain
    provisioning domain information
    Provisioning Domain Additional Information
    PvD

The idea seems to be that a section of a network and the hosts therein
can be a "provisioning domain", and that the provisioning domain has a
document, "provisioning domain information", that describes its
facilities.  The official phrasing is "Provisioning Domains (PvDs) are
defined in [PVD] as consistent sets of network configuration
information", which is unclear to me.

Within a "provisioning domain information" document, there is a
section "Provisioning Domain Additional Information", although I
haven't been able to track down exactly what that is; it's not
mentioned in RFC 7556.  Does "Provisioning Domain Additional
Information" just mean "elements in the structure that have keys that
are not defined in RFC 7556"?

"PvD" seems to be used indifferently for several of these terms, which
is confusing.

One subtle implication is that all sources of provisioning domain
information for a single provisioning domain will (up to version skew)
return the *same* provisioning domain information.  Is this assured?

The next group of important terms is:

   Proxy
   Proxy URI

along with what is implied by the phrases:
   "not just defined as hostnames and ports, but can use URI templates"
   "Each proxy is defined by a proxy protocol and a proxy
   location (i.e., a hostname and port or a URI template [URITEMPLATE])"

Naively I think of a proxy as a "box" which might have multiple
interfaces and multiple ports on those interfaces that provide various
proxy services.  This document seems to use a narrower definition, a
specific hostname and port (and possibly other parts of a URI) that
provide a specific proxy service.  That's OK, but it brings out that a
"proxy" has several important attributes:

   name/identifier
   protocol (what I think of as "the sort of proxying it does")
   hostname
   port
   URI template

Naively I would expect each proxy to have a distinct name, but the
details of the document say that several proxies can share the same
name.  That ought to be foregrounded and there should be a term for
the thing that an "identifier" refers to, such as "proxy group".  And
there are statements that aren't strictly correct; e.g. sec. 3.1
states "The value of identifier key is a string that can be used to
refer to a particular proxy from other dictionaries".
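
To make the "proxy group" idea concrete, here is a minimal Python
sketch that groups the entries of a "proxies" array by their
"identifier" value.  The sample entries, the hostnames, and the
"http-connect" protocol name are assumptions made for illustration;
they are not taken from the draft.

```python
from collections import defaultdict

# Hypothetical "proxies" array; every value here is illustrative only.
proxies = [
    {"protocol": "http-connect", "proxy": "proxy.example.org:80",
     "identifier": "default"},
    {"protocol": "connect-udp",
     "proxy": "https://proxy.example.org/masque/{target_host}/{target_port}/",
     "identifier": "default"},
    {"protocol": "http-connect", "proxy": "backup.example.org:80",
     "identifier": "backup"},
]


def group_by_identifier(entries):
    """Map each "identifier" value to the list of proxies that share it."""
    groups = defaultdict(list)
    for entry in entries:
        # Assumption: an entry without "identifier" cannot be referenced
        # from other dictionaries, so it belongs to no group.
        if "identifier" in entry:
            groups[entry["identifier"]].append(entry)
    return dict(groups)


groups = group_by_identifier(proxies)
for name, members in groups.items():
    print(name, [m["protocol"] for m in members])
```

Here the identifier "default" names two proxies, so it refers to a
group rather than to "a particular proxy".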

The protocols are listed in Table 2, and it would help if that was
referenced in the glossary definition of "proxy".

There are a large number of mechanisms that are referenced in only a
few places in the draft, where their references are given.  My reflex is
that it would be better to mention all of them in the glossary along
with their references so that the glossary would be an index of all of
the mechanisms that are touched by this draft.  But that might make
the glossary overwhelmingly large.

In many situations, including URIs, either DNS hostnames or IP
addresses can be used.  It should be foregrounded whether that is true
when specifying proxies.

The access information for proxies, the "Proxy Location Format" in
table 2, seems to differ based on the protocol.  It would be useful to
state up-front that there are two forms:  host and port, and URI
template.  Also, although table 2 gives this information a distinct
name, "Proxy Location Format", the document doesn't seem to have a
consistent name (such as "proxy location string") for this attribute
of a proxy.

There are a lot of uses of the phrase 'the well-known PvD URI
(".well-known/pvd")'.  But ".well-known/pvd" is not a URI, it's just
the path part of a URI.  And it's not even that, since the syntactic
"path-abempty" part starts with a slash:  "/.well-known/pvd".  So
better phrasing is 'the well-known PvD path ("/.well-known/pvd")'.
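
The distinction can be checked with a generic URI parser.  A small
Python sketch, using a hypothetical proxy host name:

```python
from urllib.parse import urlsplit

# Hypothetical host name, used only for illustration.
full = urlsplit("https://proxy.example.org/.well-known/pvd")
bare = urlsplit(".well-known/pvd")

# A complete URI has a scheme and an authority, and its path
# component starts with a slash.
print(full.scheme, full.netloc, full.path)

# The bare string has neither scheme nor authority; it parses as a
# relative reference whose path has no leading slash.
print(repr(bare.scheme), repr(bare.netloc), bare.path)
```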

There are a lot of places where dictionary keys are used in running
text but are not quoted.  This can sometimes be confusing and it seems
to me to be better if the document consistently quoted such keys.
E.g., change

   The values for the protocol key are ...

to

   The values for the "protocol" key are ...

Items related to specific sections of the draft:

1.  Introduction

   In order to make use of multiple related proxies, clients need a way
   to understand which proxies are associated with one another, and
   which protocols can be used to communicate with the proxies.

   Additionally, this document outlines how these mechanisms might be
   used to discover proxies associated with a network (Section 5).
   However, this approach is not described for the purpose of generic
   proxy discovery, and requires careful security considerations for
   clients to limit usage to trusted scenarios.

   Using this mechanism a client can learn that a legacy insecure HTTP
   proxy that the client is configured with is also accessible using
   HTTPS.  In this way, clients can upgrade to a more secure connection
   to the proxy.

   2.  A way to list one or more proxy URIs in a PvD, allowing clients
       to learn about other proxy options given a known proxy
       (Section 3).

   sec. 3:
   the proxies array
   describes equivalent proxies (potentially supporting other protocols)
   that can be used in addition to the known proxy.

   sec. 4.2:
   Entries listed in a proxy-match object MUST NOT expand the set of
   destinations that a client is willing to send to a particular proxy.
   The array can only narrow the set of destinations that the client is
   willing to send through the proxy.  For example, if the client has a

   sec. 5:
   However, client systems MUST NOT automatically send traffic over
   proxies advertised in this way without explicit configuration,
   policy, or user permission.  For example, a client can use this
   mechanism to choose between known proxies, such as if the client was
   already proxying traffic and has multiple options to choose between.

The document seems to be unclear and perhaps inconsistent regarding
how a client is expected to use PvD proxy information.  Naively, I
would expect that a client would use whatever information it was
configured with, together with PvD proxy information from router
advertisements and DNS records, and recursively adding PvD proxy
information queried from the hosts of proxies that it already knows
about, to accumulate a set of proxies.

But the text of sec. 4.2 suggests that PvD information can only narrow
the traffic that a client is willing to send to a particular proxy
relative to what the client was previously configured to send to the
proxy.  And the second paragraph above seems to hint that PvD *could*
be used for proxy discovery but shouldn't be.  That policy seems to
drastically reduce the usefulness of the PvD mechanism.

(In particular, if the client is configured to know a proxy can be
contacted by non-TLS HTTP, then how can it learn from PvD that the
same host:port can be contacted by HTTPS for the same function without
in practice extending the set of proxies that it is allowed to
utilize?)

Also, section 3 seems to suggest that if a client knows of a proxy, or
perhaps is configured to allow use of a proxy, then the proxies
listed in the PvD fetched from the proxy's host are implicitly allowed
to be used.  OTOH, PvDs obtained through DNS records or router
advertisements don't seem to have the same implied permission.

It seems to me that the philosophy to be used needs to be thought
through and stated clearly in all of the places I've referenced.

   Provisioning Domains (PvDs) are defined in
   [PVD] as consistent sets of network configuration information, which
   can include proxy configuration details Section 2 of [PVD].

The final "Section 2 of [PVD]" reads awkwardly in this context, but I
expect the RFC Editor will provide a better form.

1.1.  Background

   Other non-standard mechanisms for proxy configuration and discovery
   have been used historically, some of which are described in
   [RFC3040].
   ...
   These common (but non-standard) mechanisms only support defining
   proxies by hostname and port, and do not support configuring a full
   URI template [URITEMPLATE].

I would clarify these paragraphs by:  (1) Starting the text with
"Non-standard mechanisms..." because there is no previous standard
mechanism for "Other" to reference here.  (2) Changing "[RFC3040]" to
"the informational [RFC3040]" so the reader is clear that RFC 3040 is
not a standard, which also shows how these mechanisms are
non-standard.  (3) Replacing the period at the end of the first
sentence with a colon, to make it clear that the first sentence
introduces the next several sentences.  (4) Collapsing all of these
paragraphs into one, so that it's clear that the first sentence
governs the next four sentences.

   ... executing Javascript scripts, which are prone to
   implementation-specific inconsistencies and can open up security
   vulnerabilities.

Probably could be abbreviated to "...prone to implementation
inconsistencies and security vulnerabilities."

1.2.  Requirements

Since this section doesn't give requirements (indeed, the requirements
are informally described in sec. 1 and 1.1), it is probably better to
title this section "Requirements keywords" or something like that.

2.  Fetching PvD Additional Information for proxies

   In order to fetch PvD Additional Information associated with a proxy,
   a client issues an HTTP GET request for the well-known PvD URI
   (".well-known/pvd") as defined in Section 4.1 of [PVDDATA] and the
   host authority of the proxy.  This is applicable for both proxies
   that are identified by a host and port only (such as SOCKS proxies
   and HTTP CONNECT proxies) and proxies that are identified by a URI or
   URI template.  The fetch MUST use the "https" scheme.  By default,
   the fetch SHOULD use the standard port for HTTP over TLS (443) and
   the ".well-known/pvd" path.  However, both the port and the path MAY
   be overridden by local configuration policy on the client.

I had a hard time reading this.  It wasn't as clear about "HTTP
vs. HTTPS" as I think it should have been, given how security-critical
this is, and the use of ports was not as clear as I'd like.  I would
prefer something more like:

   In order to fetch PvD Additional Information associated with a proxy,
   a client issues an HTTPS GET request for the well-known PvD path
   ("/.well-known/pvd") (as defined in Section 4.1 of [PVDDATA]) to
   the host of the proxy and the standard HTTPS port (443).
This sentence is redundant but might be valuable for clarity:
   This is applicable for both proxies
   that are identified by a host and port only (such as SOCKS proxies
   and HTTP CONNECT proxies) and proxies that are identified by a URI or
   URI template.
This sentence seems to be redundant:
   The fetch MUST use the "https" scheme.
I would leave these sentences unchanged:
   By default,
   the fetch SHOULD use the standard port for HTTP over TLS (443) and
   the "/.well-known/pvd" path.  However, both the port and the path MAY
   be overridden by local configuration policy on the client.

As a technical matter, I think you want to require that the GET use
the absolute-URI form for the request-URI, that is, include the scheme
and authority, so that the PvD server knows the host name by which the
client is accessing it and can customize the PvD accordingly.
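
A minimal Python sketch of the fetch target as I read the text:  the
scheme is fixed to "https", while the port and path default to 443 and
"/.well-known/pvd" but can be overridden by local policy.  The
function names are hypothetical, and the sketch assumes the proxy host
is a DNS name (or an already-bracketed IPv6 literal).  The request
line shows the absolute-URI form in HTTP/1.1 syntax.

```python
from urllib.parse import urlunsplit

DEFAULT_PORT = 443
DEFAULT_PATH = "/.well-known/pvd"


def pvd_fetch_url(proxy_host, port=DEFAULT_PORT, path=DEFAULT_PATH):
    """Build the absolute URI used to fetch PvD Additional Information."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    # Leave the port out of the authority when it is the HTTPS default.
    authority = proxy_host if port == DEFAULT_PORT else f"{proxy_host}:{port}"
    # The scheme is always "https"; it is deliberately not a parameter.
    return urlunsplit(("https", authority, path, "", ""))


def request_line(url):
    """Request line in absolute-URI form: scheme and authority included."""
    return f"GET {url} HTTP/1.1"


print(request_line(pvd_fetch_url("proxy.example.org")))
print(request_line(pvd_fetch_url("proxy.example.org", port=8443)))
```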

   It is not necessary for the client to re-fetch PvD Additional
   Information unless one of the following conditions occurs:

   *  The current time is beyond the "expires" value defined in
      Section 4.3 of [PVDDATA]

   *  A new Sequence Number for that PvD is received in a Router
      Advertisement (RA)

This statement is correct as far as it goes, but it only states the
requirements on the client implicitly.  What it really means is closer
to:

   A client MAY cache the information it obtained from PvD Additional
   Information, but it MUST discard cached information if:

   *  The current time is beyond the "expires" value defined in
      Section 4.3 of [PVDDATA]

   *  A new Sequence Number for that PvD is received in a Router
      Advertisement (RA)
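
The two discard conditions can be sketched in Python as follows.  The
CachedPvd record and the function name are hypothetical, and the
sketch assumes the "expires" value has already been parsed into a
timezone-aware datetime and that the client remembers the Sequence
Number that was current when it fetched the information.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CachedPvd:
    expires: datetime       # parsed from the "expires" value
    sequence_number: int    # Sequence Number in effect at fetch time
    info: dict              # the cached PvD Additional Information


def must_discard(cached, now, ra_sequence_number=None):
    """Return True if the cached information must no longer be used."""
    # Condition 1: the current time is beyond "expires".
    if now > cached.expires:
        return True
    # Condition 2: an RA carried a Sequence Number different from the
    # one the cached copy was fetched under.
    if ra_sequence_number is not None \
            and ra_sequence_number != cached.sequence_number:
        return True
    return False


cached = CachedPvd(datetime(2026, 3, 3, tzinfo=timezone.utc), 5, {})
print(must_discard(cached, datetime(2026, 3, 2, tzinfo=timezone.utc)))
```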

2.1.  Discovery via HTTPS/SVCB Records

   this key generically provides a hint that PvD Additional Information
   is available, and can be used for use cases unrelated to proxies.

This would be clarified by

   this key generically provides a hint that PvD Additional Information
   is available, which may include information unrelated to proxies.

3.  Enumerating proxies within a PvD

   the proxies array
   describes equivalent proxies (potentially supporting other protocols)
   that can be used in addition to the known proxy.

In what sense are two proxies "equivalent" if they support different
protocols?

3.1.  Proxy dictionary keys

   Other optional keys can
   be added to the dictionary to further define or restrict the use of a
   proxy.

For clarity, this should forward-reference sec. 3.2:  "can be added to
the dictionary as described in sec. 3.2 to further".

   +==========+========+=============+=======+===========================+
   |JSON Key  |Optional|Description  |Type   |Example                    |
   +==========+========+=============+=======+===========================+

If it can be made to fit, it is nicer to have the column *values* be
"optional" and "required", rather than "Yes" and "No", requiring the
reader to refer to the column header to know the polarity of the
datum.

It might be too late to change this, but the word "proxy" isn't a good
word to use as a key because the word's meaning is so broad.  Given
the column header in table 2, a better key would be
"proxy_location_string" (or "proxy_locator"?).

It might be worth stating explicitly that port numbers are mandatory
in the "host:port" proxy location formats, as they are often omittable
in URIs.
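
A strict parser makes the point:  with a mandatory port, a bare host
name is simply an error.  This Python sketch is an illustration only;
the function name is hypothetical, and accepting bracketed IPv6
literals is an assumption, since the draft would need to say whether
IP addresses are allowed at all.

```python
def parse_host_port(location):
    """Split a "host:port" proxy location; the port is mandatory."""
    host, sep, port_text = location.rpartition(":")
    if not sep or not host:
        raise ValueError(f"missing port in {location!r}")
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"invalid port in {location!r}")
    # Assumption: a host containing ":" must be a bracketed IPv6 literal.
    bracketed = host.startswith("[") and host.endswith("]")
    if ":" in host and not bracketed:
        raise ValueError(f"unbracketed IPv6 literal in {location!r}")
    port = int(port_text)
    if not 0 < port <= 65535:
        raise ValueError(f"port out of range in {location!r}")
    return host, port


print(parse_host_port("proxy.example.org:1080"))
print(parse_host_port("[2001:db8::1]:443"))
```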

3.2.  Proprietary keys in proxy configurations

   For example, "acme_tech_authmode" could be a proprietary key
   indicating an authentication mode defined by a vendor named
   "acme_tech".

You probably want s/acme_tech/example_tech/.  Also the vendor doesn't
actually have the name "acme_tech", but rather is probably "Example
Technology, Inc.":

   For example, "example_tech_authmode" could be a proprietary key
   indicating an authentication mode defined by a vendor named
   Example Technology.

4.  Destination accessibility information for proxies

4.1.  Destination Rule Keys

   Each key MUST correspond to an
   array with at least one entry.

Don't you want to say "Each key's value must be an array with at least
one entry."?

   Entries that include the wildcard prefix also MUST be treated as if
   they match an FQDN that only contains the string after the prefix,
   with no subdomain.

Simpler to say "Entries that include the wildcard prefix also
match an FQDN that only contains the string after the prefix, with no
subdomain."

   A
   trailing dot (".") at the end of a domain name is not required; the
   matching logic is the same regardless of its presence or absence.

"A string MAY have a trailing dot ("."); it does not affect the
matching logic."
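
Both rules fit in a few lines of Python.  This is a sketch under two
assumptions that the quoted text does not state:  the wildcard prefix
is written "*." and comparison is case-insensitive.  The function
names are hypothetical.

```python
def normalize(name):
    """Lowercase a domain name and drop one trailing dot, if present."""
    name = name.lower()
    return name[:-1] if name.endswith(".") else name


def domain_matches(entry, fqdn):
    """Check a "domains" entry against a destination FQDN."""
    entry, fqdn = normalize(entry), normalize(fqdn)
    if entry.startswith("*."):
        suffix = entry[2:]
        # The wildcard matches any subdomain, and also the bare name
        # after the prefix with no subdomain at all.
        return fqdn == suffix or fqdn.endswith("." + suffix)
    return fqdn == entry


print(domain_matches("*.example.com", "www.example.com."))
print(domain_matches("*.example.com", "example.com"))
print(domain_matches("*.example.com", "badexample.com"))
```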

4.2.  Using Destination Rules

My experience is that describing how rules like this work is
treacherous.  The temptation is to describe various features of the
rules and how they work, but often the reader must interpolate some of
the overall algorithm (especially the ordering of operations), leading
to divergent implementations.  I think a better approach would be to
give the algorithm explicitly so one can examine it holistically.  In
this case, something like:

    Given a PvD document, a destination host (IP address or DNS name),
    port, and proxy protocol (e.g., "connect-udp"), the client
    determines which proxies described by the document may be used for
    proxying by:

    Examine the dictionaries in the "proxy-match" value in order.

    If a dictionary (a "rule") contains a proprietary key that the
    client does not understand, the client skips that rule and
    continues with the next.

    A rule matches the destination if all of the following are true:

    - if the "proxies" list is empty, or if at least one of the proxies
      in the proxy groups named by "proxies" has the requested proxy
      protocol as its "protocol" value,

    - if the destination is given by a host name and the "domains" value
      is present, then the host name matches one of the entries of
      "domains",

    - if the destination is given by an IP address and the "subnets" value
      is present, then the IP address matches one of the entries of
      "subnets",

    - if the "ports" value is present, then the destination port matches
      one of the entries of "ports", and

    - if the tests specified by any proprietary keys that are present are
      satisfied.

    If a rule matches and the "proxies" value is empty, then this PvD does
    not allow any of its proxies to be used to reach the destination.

    If a rule matches and the "proxies" value is not empty, no further
    entries are examined.  This PvD allows the client to use any of
    the proxies in the proxy groups listed in the "proxies" value that
    have the matching "protocol" value; the proxy groups should be
    prioritized in the order they are listed, but the proxies within
    any one group are not prioritized.
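
The algorithm above can be sketched in Python.  This is an
illustration of my reading, not a reference implementation.  It
assumes that "subnets" entries are CIDR strings, that "ports" entries
are single ports or "low-high" ranges, that the wildcard prefix is
"*.", and that the client understands no proprietary keys.  The
function names and sample data are hypothetical.  A return value of
None means that no rule matched, a case the text above leaves open.

```python
import ipaddress

STANDARD_KEYS = {"proxies", "domains", "subnets", "ports"}


def domain_matches(entry, host):
    entry, host = entry.lower().rstrip("."), host.lower().rstrip(".")
    if entry.startswith("*."):
        return host == entry[2:] or host.endswith(entry[1:])
    return host == entry


def port_matches(entry, port):
    low, _, high = str(entry).partition("-")
    return int(low) <= port <= int(high or low)


def select_proxies(pvd, dest, port, protocol):
    """Return usable proxies, [] if proxying is disallowed, None if no rule matched."""
    groups = {}
    for p in pvd.get("proxies", []):
        groups.setdefault(p.get("identifier"), []).append(p)
    try:
        ip = ipaddress.ip_address(dest)
    except ValueError:
        ip = None  # destination is a host name
    for rule in pvd.get("proxy-match", []):
        if set(rule) - STANDARD_KEYS:
            continue  # proprietary key not understood: skip the rule
        names = rule.get("proxies", [])
        # Group order is preserved; proxies inside a group are unordered.
        usable = [p for n in names for p in groups.get(n, [])
                  if p.get("protocol") == protocol]
        if names and not usable:
            continue
        if ip is None and "domains" in rule and not any(
                domain_matches(d, dest) for d in rule["domains"]):
            continue
        if ip is not None and "subnets" in rule and not any(
                ip in ipaddress.ip_network(s, strict=False)
                for s in rule["subnets"]):
            continue
        if "ports" in rule and not any(
                port_matches(e, port) for e in rule["ports"]):
            continue
        return usable  # first matching rule ends the search
    return None


pvd = {
    "proxies": [
        {"protocol": "http-connect", "proxy": "p1.example.org:80",
         "identifier": "p1"},
        {"protocol": "connect-udp", "identifier": "p2",
         "proxy": "https://p2.example.org/masque/{target_host}/{target_port}/"},
    ],
    "proxy-match": [
        {"domains": ["*.internal.example.org"], "proxies": ["p1"]},
        {"subnets": ["192.0.2.0/24"], "ports": ["443"],
         "proxies": ["p1", "p2"]},
        {"domains": ["blocked.example.com"], "proxies": []},
    ],
}
print(select_proxies(pvd, "host.internal.example.org", 443, "http-connect"))
```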

This points out some complexities:  The client may possess several PvD
documents, and a single destination may obtain different proxies from
each one.  The assumption is that a destination is specified by either
a FQDN or IP address but not both, and filtering on each is done
separately -- and it's possible that the results of filtering on FQDN
may be different than filtering on IP address.

There is also the question of exactly how domain and IP address
filters interact.  Clearly if both are absent, then that part of the
match always passes.  But if an entry contains only "subnets" and the
connection request is specified by host name, does that automatically
pass (since the client may not know the address)?  Conversely for
"domains".  And what if "subnets" and "domains" are both present, or
is that forbidden?  It may be that you want those rules to be:

    - if the "domains" value is present, then the destination is given
      by a host name and the host name matches one of the entries of
      "domains",

    - if the "subnets" value is present, then the destination is given
      by an IP address and the IP address matches one of the entries
      of "subnets",

That change has the interesting consequence that if both "subnets" and
"domains" are present, the rule matches *no* connection attempts, as a
connection attempt specifies an FQDN or an IP address but not both.
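
The consequence is easy to see in a Python sketch of just these two
stricter tests.  The function name is hypothetical, "subnets" entries
are assumed to be CIDR strings, and wildcard handling for "domains" is
left out for brevity (exact matches only).

```python
import ipaddress


def host_tests_pass(rule, dest):
    """Stricter variant: a key that is present requires a destination
    of the corresponding kind."""
    try:
        ip = ipaddress.ip_address(dest)
    except ValueError:
        ip = None  # destination is a host name
    if "domains" in rule:
        names = [d.lower().rstrip(".") for d in rule["domains"]]
        if ip is not None or dest.lower().rstrip(".") not in names:
            return False
    if "subnets" in rule:
        if ip is None or not any(
                ip in ipaddress.ip_network(s, strict=False)
                for s in rule["subnets"]):
            return False
    return True


both = {"domains": ["example.com"], "subnets": ["192.0.2.0/24"]}
print(host_tests_pass(both, "example.com"))   # fails the "subnets" test
print(host_tests_pass(both, "192.0.2.1"))     # fails the "domains" test
```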

There is an additional complexity:  A rule may match regarding
subnet/domain/port/protocol while all of the proxies listed are ruled
out because their entries contain a "mandatory" value that the client
doesn't understand.  In that case the rule doesn't provide the client
with a usable proxy, but the rule also seems to terminate rule testing
because it matches.  This corner case should probably be handled
explicitly.

--

   In order to match a destination rule in the proxy-match array, all
   properties MUST apply.

Be careful to specify "all properties that are present MUST apply".

4.3.  Proprietary Keys in Destination Rules

   This handling applies uniformly across all match rules, including
   fallback rules.

What is a "fallback rule"?

4.4.  Examples

   In the next example, two proxies are defined with a separate
   identifier, and there are three destination rules:

I think you mean "two proxies are defined with distinct identifiers".

[END]


