[TLS] Re: Trust Anchor IDs and PQ

David Benjamin Tue, 04 Feb 2025 16:56:13 -0800

Thanks for the thoughts!

> To that end, perhaps it's most useful to focus in on the post-quantum
case, as I think that's the one that the WG finds most compelling.

That's certainly not the use case I find most compelling. It's one among a
class of PKI scenarios, just as PQ is not the only reason to do NamedGroup
negotiation. From both the feedback we've gotten about the draft and the
interim discussion where the WG decided to take on the problem, my sense is
there's interest in a broad space of problems that this building block
applies to.

But to focus on just this narrow slice of why trust anchor negotiation is
important:

> This means that (again simplifying) there are at least four kinds
of clients. [...]

Yeah, I think this is a good division, though I would add a bit of color:

A client may be type (2) for some trust anchors and type (3) for others.
CAs may not spin up PQ roots at the same time, clients may not decide to
trust (i.e. become vulnerable to) them at the same rate, etc. In the long
run, there may also be CAs for which T_i (traditional) doesn't exist, only
Tp_i (PQ).

So when we talk about a "type (3)" client, we're really talking about one
aspect of that client.

> It allows servers to distinguish clients type (2) and (3), so that they
can elide the extra certificate for type (3).

It's a bit more than that. I think modeling this scenario as intermediate
*elision* starts from a baseline that isn't actually the right state for
anyone. In trust anchor negotiation, the intermediate isn't compressed out.
It isn't involved at all.

Ultimately, the goal in the transition would be to chain up to some Tp_i
when the client supports it, and some T_i when the client does not. Like
you say, this is true even for type (3) clients. (Even if the type (3)
client is still vulnerable to T_i, serving Tp_i is valuable. We can look to
other algorithm transitions for why this is important[1][2].)

Now, if the client is of type (2), sending T_i -> Tp_i -> ... is not
useful. It does not model the cost of PQ, because the extra T_i -> Tp_i
certificate costs another PQ key. Nor has it helped the server's transition
because that client doesn't support its choice of PQ CA. The server may
even be misled into thinking it has completed the transition for that
client, when it hasn't. The server has no way to know that the cross-sign
was load-bearing. It would be better to simply send that client a purely
traditional chain, which the server probably has available because clients
of type (1) still exist and will not support the PQ chain anyway.
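
As a rough illustration of the "another PQ key" cost, here is a
back-of-the-envelope sketch. The sizes are assumptions for illustration
only: ML-DSA-44 figures stand in for "PQ", approximate ECDSA P-256 figures
stand in for "traditional", and all X.509 framing and the TLS
CertificateVerify signature are ignored. It also assumes the cross-signed
Tp_i certificate carries a PQ key and a traditional signature from T_i.

```python
# Rough illustration only: sizes are assumptions, X.509 framing is ignored.
PQ_KEY, PQ_SIG = 1312, 2420    # ML-DSA-44 public key / signature, in bytes
TRAD_KEY, TRAD_SIG = 65, 72    # approximate ECDSA P-256 key / DER signature

SIZES = {"trad": (TRAD_KEY, TRAD_SIG), "pq": (PQ_KEY, PQ_SIG)}


def cert_bytes(subject_alg, issuer_alg):
    """Subject's public key plus the signature made by the issuer."""
    return SIZES[subject_alg][0] + SIZES[issuer_alg][1]


def chain_bytes(certs):
    """Sum the key and signature bytes over the certificates sent."""
    return sum(cert_bytes(subject, issuer) for subject, issuer in certs)


# Certificates actually sent; the trust anchor itself is never sent.
# Each entry is (subject algorithm, issuer algorithm).
traditional = [("trad", "trad"), ("trad", "trad")]   # I_i, EE_s_i
pq_direct = [("pq", "pq"), ("pq", "pq")]             # Ip_i, EEp_s_i
pq_cross_signed = [("pq", "trad")] + pq_direct       # Tp_i signed by T_i first

# Extra bytes paid for the cross-sign: one PQ key plus one traditional signature.
extra = chain_bytes(pq_cross_signed) - chain_bytes(pq_direct)
print(extra)
```

Under these assumed sizes, the cross-signed chain costs one extra PQ key
plus a small traditional signature on every connection that receives it.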

Likewise, sending that chain to clients of type (3) is bad because now you
waste a whole PQ key's worth of bytes. The true client categorization is
(1+2) vs (3+4), not the (1) vs (2+3+4) you get from sigalgs. The cross-sign
comes from misclassifying (2) as (3)-ish and needing to inefficiently
correct for it.
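
The two categorizations can be sketched in a few lines. This is only a toy
model of the four simplified client types from the quoted message; the
dictionary layout and the `group_by` helper are made up for illustration
and are not anything from the draft.

```python
from collections import defaultdict

# The four simplified client types from the quoted message.
CLIENTS = {
    1: {"sigalgs": {"traditional"}, "anchors": {"T_i"}},
    2: {"sigalgs": {"traditional", "pq"}, "anchors": {"T_i"}},
    3: {"sigalgs": {"traditional", "pq"}, "anchors": {"T_i", "Tp_i"}},
    4: {"sigalgs": {"pq"}, "anchors": {"Tp_i"}},
}


def group_by(predicate):
    """Group client types by the boolean value of a predicate."""
    groups = defaultdict(list)
    for kind, client in CLIENTS.items():
        groups[predicate(client)].append(kind)
    return dict(groups)


# What the server can see from signature_algorithms alone: (1) vs (2+3+4).
by_sigalgs = group_by(lambda c: "pq" in c["sigalgs"])

# What actually decides whether a chain to Tp_i verifies: (1+2) vs (3+4).
by_anchor = group_by(lambda c: "Tp_i" in c["anchors"])

print(by_sigalgs)
print(by_anchor)
```

Type (2) lands on different sides of the two splits, which is exactly the
misclassification the cross-sign has to paper over.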

Now, a CA might issue that cross-sign anyway. As long as some subscriber
cannot correctly classify some client, the cross-sign can work around the
misclassification. But that's ultimately a workaround. To take that
workaround as a baseline and then "compress" it out means that everyone
(servers, type (3) clients, and type (4) clients) in the ecosystem must be
aware of *all* intermediates in the form of pre-distributed compression
information. Per-connection data is more expensive than per-endpoint data,
but the latter isn't free and adds up. E.g. consider a more
resource-constrained, non-browser client with size limits on its TLS stack.

Trust anchor negotiation classifies (2) vs (3) correctly from the
beginning, so the cross-sign isn't involved at all.
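
A minimal sketch of the server-side choice this enables, under heavy
simplification: it assumes the client's trusted anchors are simply visible
to the server as a set of names, and it does not model the draft's actual
wire format, identifiers, or retry behavior. `select_chain` and
`SERVER_CHAINS` are hypothetical names.

```python
def select_chain(client_anchors, server_chains):
    """Return the first chain, in server preference order, whose trust
    anchor the client says it trusts, or None if there is no match."""
    for anchor, chain in server_chains:
        if anchor in client_anchors:
            return chain
    return None


# Server preference order: PQ first, traditional as the fallback.
# Chains list only the certificates sent; no cross-sign appears anywhere.
SERVER_CHAINS = [
    ("Tp_i", ["Ip_i", "EEp_s_i"]),
    ("T_i", ["I_i", "EE_s_i"]),
]

for kind, anchors in [(1, {"T_i"}), (2, {"T_i"}),
                      (3, {"T_i", "Tp_i"}), (4, {"Tp_i"})]:
    print(kind, select_chain(anchors, SERVER_CHAINS))
```

Types (1) and (2) get the purely traditional chain, types (3) and (4) get
the purely PQ chain, and the T_i -> Tp_i certificate is never needed.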

> Have I missed other important value propositions for TA for the PQ
transition?

Still only focused on this very narrow slice of PQ-specific use cases, I'll
add a few more:

- Post-transition, but still PQ, there is the common practice of going from
a long-lived key to a medium-lived key (same operator and same algorithm) in
the form of an intermediate. This particular use case is a more direct
duality between trust anchor negotiation and intermediate elision, because
we can generally expect clients to recognize both keys. Even so, solving it
with trust anchor negotiation avoids needing to ship extra expensive PQ
signatures because trust anchors don't need signatures. (See RFC 5914 and
draft-davidben-x509-alg-none for two ways to do that.)

- Depending on what algorithms we end up having available (we're navigating
some tricky trade-offs with PQ sigalgs), it also enables other strategies
with signature schemes that trade off small signatures for large keys. I
sketched out that idea here:
https://github.com/davidben/tls-trust-expressions/blob/main/pki-transition-strategies.md#trust-anchor-negotiation-with-parallel-issuance-2

- This TA work was extracted out of Merkle Tree Certificates, where trust
anchor negotiation is a critical dependency. It's extracted because it's
applicable to more than just MTCs, but what I've learned working on this
over the last while is that this problem is practically universal among PKI
systems. Systems that maintain trusted entities are necessarily dynamic. Of
course, while nice and tidy, there's no hard requirement that
X.509-trust-anchor-negotiation and MTC-trust-anchor-negotiation (or
CT-log-negotiation) use the same mechanism. The important thing is that the
problems are analogous, with the same ecosystem ramifications, so the
thinking was to focus the WG discussion on the X.509 version, where the
need is already very clear, and the problem space most concrete. Beyond
just MTCs, I think it's clear there will be a lot of ideas to explore here,
and I don't think we'll get away from *some* kind of lists of trusted
entities and all that that entails.

David

[1] When we dropped SHA-1 X.509 signatures and later SHA-1 TLS signatures
(RFC 9155) in Chrome, it was crucial that servers would pick SHA-2 over
SHA-1 during the period where the client supported both. Without it, we
cannot monitor anything, make sure the SHA-2 paths work, etc. Yet, from a
purely performance perspective, it makes sense for the server to prioritize
SHA-1. Indeed there were SHA-1-preferring servers out there, and they
indeed complicated the transition. This was for an easy transition to just
swap out a hash, done so late that SHA-2 support was practically
ubiquitous. The PQ transition is dramatically more complex, and I do not
think we can bank on reaching the same levels of ubiquity.

[2] Regarding the HSTS analogs, that seems orthogonal to the discussion of
negotiation, and more for the application than TLS itself. Dynamic HSTS, in
particular, has storage requirements and application-specific privacy
implications as a result. From experience working with both static and
dynamic HSTS, Bas's sketch of a certificate-based design is much more
compelling.

On Sat, Feb 1, 2025 at 1:01 PM Eric Rescorla <e...@rtfm.com> wrote:

> Starting a new thread to keep it off the adoption call thread.
>
> I'm still forming my opinion on this topic. To that end, perhaps it's
> most useful to focus in on the post-quantum case, as I think that's
> the one that the WG finds most compelling. This message tries to work
> through that case and the impact of TAI.
>
> I apologize in advance for the length of this message, but I wanted
> to show my thinking, as well as make it easier to pinpoint where I may
> have gone wrong if people disagree with this analysis.
>
>
> CURRENT SETUP
> Here's what I take as the setting now:
>
> 1. We have a set of existing CAs, numbered 1, 2, 3...
> 2. CA_i has a trust anchor T_i which is embedded in clients and then
>    used to sign an intermediate certificate I_i.
> 3. Servers have end-entity certificates signed by intermediates,
>    so we can denote server s's certificate signed by CA_i as
>    EE_s_i. The chain for this certificate is (proceeding from the
>    root): T_i -> I_i -> EE_s_i
>
> These all use traditional algorithms (assume there's just one
> traditional algorithm for simplicity).
>
>
> ADDING PQ
> When the CA wants to roll out PQ certificates, the following happens.
>
> 1. It generates a new separate PQ trust hierarchy, that looks like:
>    Tp_i -> Ip_i -> EEp_s_i.
> 2. It cross-signs its own PQ trust anchor with its traditional trust
>    anchor.
>
> So abusing notation a bit, a server would have two certificate chains:
>
> - Traditional: T_i -> I_i -> EE_s_i
> - PQ:          T_i -> Tp_i -> Ip_i -> EEp_s_i
>
> Note that I'm assuming that there's just one CA, but of course
> there could be two CAs, in which case the chains will be entirely
> distinct:
>
> - Traditional: T_i -> I_i -> EE_s_i
> - PQ:          T_j -> Tp_j -> Ip_j -> EEp_s_j
>
> This actually doesn't matter (I think) for the purposes of this
> analysis because the server can only send one EE cert.
>
>
> CERTIFICATE CHAIN NEGOTIATION
> When the client connects, it signals which algorithms it supports in
> signature_algorithms. The server then selects either the traditional
> chain or the PQ chain and sends it to the client depending on the
> algorithm. This is how we've done previous transitions so there
> shouldn't be anything new here.
>
> The entire logic above is rooted in trusting whatever traditional
> algorithm is in T_i. But the reason we want to deploy PQ certificates
> is not for efficiency (as with EC) but because we want to stop
> trusting the traditional algorithms. We do that by a two-step process
> of:
>
> 1. Clients embed Tp_i in their trust list.
> 2. At some point in the (probably distant) future, they just deprecate
>    support for existing traditional trust anchors.
>
> This means that (again simplifying) there are at least four kinds of
> clients.
>
> 1. Trust T_i. No PQ support.
> 2. Trust T_i. Traditional and PQ support.
> 3. Trust T_i and Tp_i. Traditional and PQ support.
> 4. Trust Tp_i. No traditional support.
>
> However, the server only gets the "signature_algorithms" extension,
> which looks like so:
>
>               Client Status                 signature_algorithms
>     Algorithms               Trust Anchors
>     --------------------------------------  --------------------
> 1.  Traditional              T_i            traditional
> 2.  Traditional + PQ         T_i            traditional + pq
> 3.  Traditional + PQ         T_i + Tp_i     traditional + pq
> 4.  PQ                       Tp_i           pq
>
>
> Cases (1) and (4) are straightforward, because the server only has one
> option. However, the server can't distinguish (2) and (3). There are
> two possibilities here:
>
> * The server wants to use a traditional certificate chain (e.g.,
>   for performance reasons). In this case, there isn't an issue
>   because it can just send the traditional chain.
>
> * The server wants to use a PQ chain. In this case, because it
>   can't distinguish (2) and (3) it has to send the cross-signed Tp_i,
>   even though the client may already have it.
>
> On the more global scale, the server has no way of measuring how many
> clients trust Tp_i, and so isn't able to determine when it's largely
> safe to unilaterally elide T_i when using the PQ chain. Note that the
> server *can* determine when it's safe to stop presenting a traditional
> EE cert at all by measuring the rate at which clients offer PQ
> algorithms in signature_algorithms, because those clients are either
> type (2) or type (3) and will in any case accept the longer chain.
>
> As far as I can tell, none of this is relevant to the question of
> security against quantum computers, because what provides that
> property is that clients refuse to accept traditional algorithms at
> all (type (4)), which is easily determined from signature_algorithms.
>
>
> TRUST ANCHOR IDENTIFIERS
> As far as I can tell, TAI changes the situation in two main ways:
>
> 1. It allows servers to distinguish clients type (2) and (3), so that
>    they can elide the extra certificate for type (3). This is
>    effectively an alternative to the approach provided by
>    draft-ietf-tls-cert-abridge (I see that S 7.5 provides
>    a comparison of these mechanisms, but I'm not going to get
>    into detail in this message).
>
> 2. It allows clients to safely force the server to offer a PQ chain
>    even if the client actually is type (3). Normally it wouldn't be
>    safe to only advertise PQ algorithms in signature_algorithms, but
>    if the server advertises a PQ TA, then the client can safely
>    provide only that TA in the ClientHello while offering a wider set
>    of TAs to other servers. This can also be used on the client
>    side to measure PQ support on servers.
>
> Have I missed other important value propositions for TA for the PQ
> transition?
>
> Thanks,
> -Ekr
>
>
>
>
>
> _______________________________________________
> TLS mailing list -- tls@ietf.org
> To unsubscribe send an email to tls-le...@ietf.org
>

