Re: [TLS] ESNI PRs #124 and #125: Improving ESNI Robustness and GREASE

Mike Bishop Wed, 13 Feb 2019 17:01:59 -0800

Perhaps a better way to phrase it is that a server which successfully 
authenticates as the public_name but does not support ESNI has securely 
disabled ESNI for that origin, subject to the same rules as if it had supplied 
a new ESNI key (i.e. use for now, but don't persist). Leave to the 
implementation any heuristics for extrapolating that the current network will 
securely disable ESNI for all origins.

-----Original Message-----
From: TLS <tls-boun...@ietf.org> On Behalf Of Stephen Farrell
Sent: Wednesday, February 13, 2019 4:24 PM
To: David Benjamin <david...@chromium.org>
Cc: TLS WG <tls@ietf.org>
Subject: Re: [TLS] ESNI PRs #124 and #125: Improving ESNI Robustness and GREASE


Hiya,

Various bits'n'pieces below...

On 13/02/2019 23:55, David Benjamin wrote:
> Thanks for the comments! Responses inline.
> 
> On Wed, Feb 13, 2019 at 5:00 PM Stephen Farrell 
> <stephen.farr...@cs.tcd.ie>
> wrote:
> 
>>
>> Hiya,
>>
>> On 13/02/2019 22:15, Christopher Wood wrote:
>>>
>>>
>>> [1] https://github.com/tlswg/draft-ietf-tls-esni/pull/124
>>
>> I just re-read that. I've a question: why tie the version change to 
>> this PR and not the I-D rev? I'd prefer if we make the change to 
>> 0xff02 when -03 is published. (I don't care which PR does that, but 
>> wouldn't like to see every PR bump that number.)
>>
> 
> I bumped it because the PR was making a wire-incompatible change, but 
> batching them up to I-D revs makes sense too. No preferences here. 
> (Spec editors, do you care? I can remove that part of the change.)
> 
> 
>> Aside from that I think it's ok (though I may find other issues when 
>> implementing later) with two caveats that can be handled now or 
>> later:
>>
>> 1. The DNS RR structure still needs fixing but that is likely better 
>> handled separately. (By "fixing" I mean a new RR type but also that 
>> it needs to be designed for DNS admins as well and not only for TLS 
>> implementers which is the current case.)
>> (I'm not saying what might or might not make it more dnsops friendly, 
>> just that that needs to be checked/done sometime.)
>>
> 
> I'm guessing this is to do with the various discussions around 
> multi-CDN and whatnot? That seems orthogonal to that PR. (I don't 
> personally have particularly strong opinions or experience there.)

Not just that. ESNIKeys is a fairly complex structure that has a number of 
sets/sequences of values. There are various possible deployment models. The 
current structure assumes that it's ok to encode all that stuff up in one TLS 
style blob and then put that in the DNS. That might not make sense for some 
deployments, in particular where the DNS operator is not closely tied to the 
TLS server admin or where there's 1 DNS operator and N TLS server admins (for 
the same origin).
And then there's the use of the TXT RR. But I agree this is orthogonal - I was 
mostly saying that in agreeing with this PR I continue to think the DNS RR 
stuff needs other work.

> 
> 
>> 2. I don't think the text below ought be included, it's not the job 
>> of this WG to design MITM attacks. (Or maybe I've misinterpreted it?)
>>
>> "The public name, however, makes this protocol compatible with 
>> deployments that use correctly-implemented MITM proxies. If the 
>> client has cached an ESNIKey for the origin server, the MITM proxy 
>> will process the cleartext SNI field and terminate a connection to 
>> the public name instead. If the client is configured to trust the 
>> proxy's certificate, it will accept the connection as valid for the 
>> public name and retry with ESNI disabled."
>>
> 
> That text isn't prescribing any particular behavior. 

Then delete it:-)

> It's just describing
> the effect the rest of the text has on "trusted MITM proxies" and the like.

Disagree. "Just describing" does not match the use of a meaningless marketing 
term like "trusted blah proxy" sorry. Better to avoid all that fuss really, no?

> Note this is specifically about "MITM proxies" which control a root 
> certificate that the client is configured to trust. I will refrain 
> from commenting on whether this deployment model is at all wise, but 
> it does exist (antivirus, etc). The text itself is just an updated 
> version of existing bits here:
> https://tools.ietf.org/html/draft-ietf-tls-esni-02#section-6.2

I don't recall that having a concept of it being possible to "correctly" 
implement breaking the TLS protocol? (As is inherent in being an MITM.)

> 
> As for where this effect comes from, it's a consequence of the 
> rollback and partial deployment robustness mechanism in the PR:
> 
>> If the server negotiates an earlier version of TLS, or if it does not 
>> provide an "encrypted_server_name" extension in EncryptedExtensions, 
>> the client proceeds with the handshake, verifying the certificate 
>> against ESNIKeys.public_name as described in {{verify-public-name}}. 
>> The client MUST NOT enable the False Start optimization {{RFC7918}}. 
>> If the handshake completes successfully, the client MUST abort the 
>> connection with an "esni_required" alert. It then SHOULD retry the 
>> handshake with a new transport connection and ESNI disabled.
> 
> Or, in other words, IF the server is able to speak authoritatively for 
> the public name (relative to the client's trust anchors) but appears 
> to not understand ESNI at all, that is a signal to the client that 
> ESNI got turned off or hasn't fully been turned on yet. ESNI involves 
> client state and, without this, it's risky for a server operator to 
> deploy ESNI. Deployments are complex systems, and complex systems 
> react to change unpredictably. For most changes, if some show-stopping 
> problem is found, it is safe to rollback the change. For instance, 
> clients naturally tolerate servers turning TLS 1.3 on and off, a fact 
> I'm sure every early adopter of TLS 1.3 has used many times. ESNI, a 
> priori, breaks this property. The intent of this text is to restore 
> it, without totally breaking it (the stop signal is authenticated 
> under the public name, the same standard this PR sets for replacing the key).

It may be ok to live with such fallback whilst ESNI is being introduced but I 
think ongoing support for it would be unwise.
And such a fallback requires no discussion of MITM attacks.
(Other than as an attack.)
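
[Editorial sketch, not part of the original message.] The client-side rule 
quoted from the PR above can be read roughly as the decision function below. 
This is only an illustration under stated assumptions: `HandshakeResult`, its 
fields, `Action` and `next_action` are hypothetical names, not taken from the 
PR or from any TLS library. Only the "esni_required" alert and the 
abort-then-retry steps come from the quoted text.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE_WITH_ESNI = "continue"         # server acted on the ESNI extension
    ABORT_AND_RETRY_WITHOUT_ESNI = "retry"  # authenticated "ESNI is off" signal
    FAIL = "fail"                           # nothing authenticated: hard failure


@dataclass
class HandshakeResult:
    # Hypothetical fields; a real TLS stack would expose these differently.
    negotiated_tls13: bool            # False if an earlier version was negotiated
    server_sent_esni_ext: bool        # "encrypted_server_name" in EncryptedExtensions
    cert_valid_for_public_name: bool  # chain verified against ESNIKeys.public_name
    handshake_completed: bool


def next_action(r: HandshakeResult) -> Action:
    """Decide what a client that offered ESNI does once the handshake ends."""
    if r.negotiated_tls13 and r.server_sent_esni_ext:
        # Normal ESNI path (the usual origin checks are omitted in this sketch).
        return Action.CONTINUE_WITH_ESNI
    # The server ignored ESNI. That only counts as a rollback signal if the
    # server authenticated as the public name and the handshake completed.
    if r.handshake_completed and r.cert_valid_for_public_name:
        # Per the quoted text: send "esni_required", then retry on a new
        # transport connection with ESNI disabled.
        return Action.ABORT_AND_RETRY_WITHOUT_ESNI
    return Action.FAIL
```

Note that the retry branch depends only on authentication for the public 
name, which is the point made in the quoted explanation.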

> 
> It turns out that a user moving in and out of one of these "trusted MITMs"

I object to that term again, sorry:-) It's marketing meaninglessness.

> looks the same as a rollback, so we get robustness to that scenario too.
> The current draft attempted to handle this case too, but did so with 
> extra machinery. From the link above:
> 
>>   A Web client can securely detect case (2) because it will
>>   result in a connection which has an invalid identity (most likely)
>>   but which is signed by a certificate which does not chain to a
>>   publicly known trust anchor.  The client can detect this case and
>>   disable ESNI while in that network configuration
> 
> Here, there is no name check, so any publicly known root that the 
> client trusts can be used to trigger that fallback, even if it's just, 
> say, an intranet server and not a MITM proxy. The PR fixes this by 
> checking the public name. It only counts as a signal if the server can 
> speak for the service's public name, not any arbitrary name.
> 
>> I also don't think I'd implement that fallback in my openssl
>> fork. Earlier text says to complete the h/s but to not make that 
>> visible to the application layer and the above seems to conflict with 
>> that. I could hold my nose were that in -03 as it has no effect 
>> really, but I'll whine about it later for sure;-)
>>
> 
> I'm guessing you're referring to this?
> 
>> Note that verifying a connection for the public name does not verify 
>> it for the origin. The TLS implementation MUST NOT surface such 
>> connections as successful to the application.
> 
> That just says you can't report it as successful. 

"MUST NOT surface" seems like bad phrasing then.

> Whether you report it to
> the application at all depends on your API shape. For a low-level TLS 
> library that doesn't create transport connections, I'm envisioning it 
> report failure with a special error code, which higher-level logic can 
> interpret as a retry.

That would be ok. Saying that requires no discussion of MITM attacks.
(My openssl code would actually make that visible to the application if the 
application called a status API, but the TLS h/s would have failed from the 
application POV even if the h/s looks like it worked on the wire.)
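
[Editorial sketch, not part of the original message.] The API shape discussed 
here, where a low-level library reports failure with a special error and 
higher-level logic turns that into a retry, might look something like the 
following. Everything named here is an assumption for illustration: 
`EsniRetryRequired`, `connect_with_retry` and the `do_handshake` callable are 
hypothetical and do not correspond to BoringSSL or OpenSSL APIs.

```python
class EsniRetryRequired(Exception):
    """Hypothetical error: the handshake was authenticated only for the
    public name, so it must not be treated as a successful connection."""


def connect_with_retry(do_handshake, origin):
    """Try once with ESNI; on the special error, retry once without it.

    `do_handshake(origin, use_esni=...)` is a caller-supplied callable that
    opens a fresh transport connection and runs the TLS handshake.
    """
    try:
        return do_handshake(origin, use_esni=True)
    except EsniRetryRequired:
        # From the application's point of view the first attempt failed,
        # even if it looked complete on the wire. Retry with ESNI disabled
        # on a new transport connection.
        return do_handshake(origin, use_esni=False)
```

Any other error propagates unchanged, so only the authenticated "ESNI is off" 
case triggers the retry.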

> (This is how we handle 0-RTT rejects in BoringSSL.
> 0-RTT rejects affect connection usage, such as if you thought you were 
> speaking HTTP/2, but the server rejected and then negotiated HTTP/1.1. 
> It's ultimately a very funny retry.) For higher-level code that knows 
> how to make transport connections, one option then is for the retry to 
> be handled more transparently.
> 
> 
>>> [2] https://github.com/tlswg/draft-ietf-tls-esni/pull/125
>>
>> I like this one. But the "SHOULD send" and "MUST pad to 260"
>> seem a bit OTT to me, though I could live with 'em for now.
>> So consider these nits, not objections:
>>
> 
> (260 is just a SHOULD rather than a MUST. More below.)
> 
> 
>> - Rather than "SHOULD send", it may be better to say something like 
>> "SHOULD frequently send" and leave it to clients to decide how often 
>> to grease. Wouldn't that have the same effect?
>>
> 
> Is there any reason not to just do it in every ClientHello?

I'd guess that some implementers may be put off by the wasted bandwidth. Others 
won't care but if it's the case that sending a bogus ESNI x% of the time gets 
as good a result as sending all the time (which is how I interpret what the 
SHOULD calls for) then it seems better to encourage x% and not 100%.
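
[Editorial sketch, not part of the original message.] Greasing "x% of the 
time" could be as simple as the check below. The function name and the `rate` 
parameter are hypothetical, and the sketch takes no position on whether a 
rate below 100% really gives as good a result, which is the open question in 
this exchange.

```python
import random


def should_send_grease(rate: float, rng: random.Random) -> bool:
    """Return True if this ClientHello should carry a GREASE ESNI extension.

    `rate` is the fraction of eligible ClientHellos to grease, in [0.0, 1.0].
    A rate of 1.0 corresponds to greasing every ClientHello.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    # rng.random() is in [0.0, 1.0), so rate=1.0 always greases
    # and rate=0.0 never does.
    return rng.random() < rate
```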

> 
> 
>> - Maybe add that even TLS1.3 clients that don't really do ESNI MAY 
>> grease ESNI? (A bit weird but hey why not?:-)
>>
> 
> Indeed! The PR currently says:
> 
>> If the client attempts to connect to a server and does not have an 
>> ESNIKeys structure available for the server,

"The client" in almost all IETF specs means a thing that really implements the 
spec. So in this case, yeah, I think it might be good to say that a TLS client 
implementer who only has time to grease and not do the real thing is still 
doing good overall by just doing that. (That said, I've no idea if there'd ever 
be such an implementer, though when we did DKIM there were many mail receivers 
who parsed but didn't actually check signatures - maybe mail implementers are 
more slapdash though:-)

> 
> Knowing what ESNI is but not really doing it I think counts as "does 
> not have an ESNIKeys structure available". Do you think it should be 
> called out more explicitly?
> 
> 
>> - Padding to 260 is IIRC the max that works. A CH with such padding 
>> is ~600 bytes. Seems like a waste of bytes to me and it could stick 
>> out depending on real ESNI padding_lengths.
>> Maybe consider something like "pad to a length that matches recent 
>> traffic" and leave it to clients to figure out what might stick out 
>> less. (I'm not sure every server will ask for padding to 260 as CF 
>> do;-)
>>
> 
> 260 comes from elsewhere which says:
> 
>> The length to pad the ServerNameList value to prior to encryption.
>> This value SHOULD be set to the largest ServerNameList the server 
>> expects to support rounded up to the nearest multiple of 16. If the 
>> server supports wildcard names, it SHOULD set this value to 260.

Yeah, and I disagree with that too:-)

> I figured that assuming, by default, that the server supports wildcard 
> names seemed reasonable so the text mirrored the other SHOULD. Though 
> maybe that's a bit much? I don't have strong feelings here. What do 
> you think the text should say? Perhaps it should pick a random 
> multiple-of-16 length between 64 and 260? Or maybe we should decrease 
> the wildcard value... a 260-byte SNI name is kind of excessive, honestly.

Regardless of the above disagreement about 260, if a client has been sending a 
bunch of CHs including real ESNI that are about 400 bytes because some server's 
ESNIKeys says to pad to 160, and that client then sends 600-byte CHs to 
everyone else, I think it'd be crystal clear which is grease and which not.
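
[Editorial sketch, not part of the original message.] The padding options 
floated in this exchange can be compared with a few helpers. All function 
names are hypothetical. The 64 and 260 bounds and the multiple-of-16 rounding 
come from the quoted text; picking the most common recently seen 
padded_length is just one possible reading of "matches recent traffic".

```python
import random
from collections import Counter


def round_up_to_16(n: int) -> int:
    """Round n up to the nearest multiple of 16."""
    return (n + 15) // 16 * 16


def random_grease_length(rng: random.Random, low: int = 64, high: int = 260) -> int:
    """Pick a random multiple of 16 in [low, high].

    260 itself is not a multiple of 16, so the largest value returned
    with the default bounds is 256.
    """
    choices = list(range(round_up_to_16(low), high + 1, 16))
    return rng.choice(choices)


def grease_length_from_recent(recent_padded_lengths, default: int = 260) -> int:
    """Reuse the most common padded_length seen in recent real ESNIKeys.

    Falls back to `default` when the client has seen no real ESNIKeys.
    """
    if not recent_padded_lengths:
        return default
    return Counter(recent_padded_lengths).most_common(1)[0][0]
```

With the second helper, a client that has mostly been padding to 160 would 
also grease at 160, which avoids the size mismatch described above.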

Cheers,
S.

> 
> David
> 
_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls
