On 06/14/2017 02:55 PM, David Benjamin wrote:
> On Wed, Jun 14, 2017 at 2:17 AM Petr Špaček <petr.spa...@nic.cz> wrote:
> >
> > On 13.6.2017 22:55, Ilari Liusvaara wrote:
> > > On Tue, Jun 13, 2017 at 06:57:05PM +0000, Andrei Popov wrote:
> > > > Regarding RFC language, I think we could be more specific:
> > > >
> > > > 1. A TLS implementation SHOULD/MUST only send 0-RTT application
> > > >    data if the application has explicitly opted in;
> > > >
> > > > 2. A TLS implementation SHOULD/MUST only accept 0-RTT application
> > > >    data if the application has explicitly opted in;
> > > >
> > > > 3. When delivering 0-RTT application data to the application, a
> > > >    TLS implementation SHOULD/MUST provide a way for the
> > > >    application to distinguish it from the rest of the application
> > > >    data.
> > >
> > > First of these has to be MUST, or you get problems like I outlined
> > > earlier.
> > >
> > > And to implement checking for the client only sending "safe" data,
> > > you need the second and third.
> >
> > I support MUST for the three points above.
>
> The third one is not practical as one moves up layers. Instead, I
> believe we can get the same benefit from a simpler signal.
>
> TLS fundamentally is a transformation from a vaguely TCP-like
> transport to another vaguely TCP-like transport. Consider TLS
> records: TLS could have decided that record boundaries were
> meaningful, so that applications could use them in their framing
> layers, but instead TLS exposes a byte stream, because it
> intentionally looks like TCP.

It is good to keep this in mind, yes.
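(To make that concrete for anyone following along: with OpenSSL, for
example, a read returns however many plaintext bytes fit in the
caller's buffer, and nothing in the API reports how those bytes were
split into records. A minimal sketch:

    unsigned char buf[4096];
    /* n is a byte count; the boundaries of the TLS records that
     * carried these bytes are not exposed, just as TCP segment
     * boundaries are not exposed through read(). */
    int n = SSL_read(ssl, buf, sizeof(buf));

Other stacks are similar in this respect.)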
> Of course, 0-RTT unavoidably must stretch the "vaguely". Suppose
> there is a semantically meaningful difference between 0-RTT and
> 1-RTT data. I can see why this is attractive. It moves the problem
> out of TLS. But this signal is pointless if applications don't use
> it. If everyone did the following, we haven't solved anything:
>
>     /* same handling either way: */
>     if (InEarlyData()) {
>       return EarlyDataRead(...);
>     } else {
>       return NormalRead(...);
>     }

Sure, that's not very useful. But you can also come at it from a
different point of view: right now, all applications just do
NormalRead(), and EarlyDataRead() is simply not defined for any
protocol. It "doesn't exist", as it were. We can of course learn from
the ongoing experiments that use EarlyDataRead() for HTTP as part of
designing what the actual formal semantics should be.

> So we must consider this signal's uses. Consider HTTP/2. The goal
> may be to tag requests as "0-RTT", because we wish to reject 0-RTT
> POSTs or so.
>
> What if the server receives data with the 0-RTT boundary spanning an
> HTTP/2 frame? Is that a 0-RTT request? 1-RTT? Invalid? If I'm
> parsing that, I have to concatenate, and we're back to that if/else
> strawman above. HTTP/2 is arguably an easy case. Maybe my protocol
> is a compressed stream. Carrying a low-level byte boundary through
> layers of application-data parsing and processing is not practical.

I think it would be valid to call it a 0-RTT request and reject it,
and also valid to call it a 1-RTT request and handle it -- it can be
at the implementation's discretion, and I don't know that TLS itself
needs to specify which.

> We could say that the application profile should modify the protocol
> to reject such cases. Now we're taking on complexity in every
> protocol specification and parser.

But we're *already* taking on complexity in every protocol spec by
requiring an application profile. If you start from the assumption
that there is just one data stream and a transition point, then yes,
this is a lot of complexity for each application. But I see the
starting point as being that only NormalRead() exists, and adding
EarlyDataRead() requires a protocol extension and implementations of
that extension. So we should be talking about how much complexity is
required, not just complaining that we can't drop ReadEarlyOrNormal()
in wherever we currently have NormalRead().

> It also brings complexity on the sender side. Perhaps I am halfway
> through writing an HTTP/2 frame and, in parallel, I receive the
> ServerHello. We moved 0-RTT data out of a ClientHello extension way
> back in draft -07 so that 0-RTT data can be streamed while waiting
> for the ServerHello. This is especially useful for HTTP/2, where
> reads and writes flow in parallel. This means the sender must
> synchronize with the TLS socket to delay the 1-RTT transition.
>
> Now suppose the TLS stack receives that atomic data and it doesn't
> fit in the 0-RTT send limit. That won't work, so the sender must
> query the early data size and send over 1-RTT if it doesn't fit.
>
> Now suppose this HTTP request takes multiple frames to send. One can
> send multiple HEADERS frames in HTTP/2. That won't work, so we
> actually need the synchronization to cover the entire request. Maybe
> my request has a body. We need to cover that too, and we need to
> know the size for send-limit purposes.
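(The "query the early data size" step by itself seems tractable; a
sketch, with invented function names since no stack has settled on
this API:

    /* Decide at write time whether a complete protocol unit can go
     * out as 0-RTT data or must wait for the handshake. */
    size_t limit = GetMaxEarlyDataSize(tls);   /* from the ticket */
    size_t sent = GetEarlyDataBytesSent(tls);
    if (sent < limit && frame_len <= limit - sent) {
      WriteEarlyData(tls, frame, frame_len);
    } else {
      /* Doesn't fit: hold this unit, and everything written after
       * it, until the 1-RTT point. */
      QueueFor1RTT(tls, frame, frame_len);
    }

It's the synchronization across multiple frames and layers that you
describe next which gets painful, agreed.)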
> Now suppose assembling the HTTP request takes an asynchronous step
> after connection binding. Maybe I'm signing something for tokbind.
> Maybe I have some weird browser extension API. In this worldview,
> all the while, the HTTP/2 multiplexer must lock out all other
> requests and the client Finished, even if the ServerHello has
> already come in. That last one is particularly nasty if the server
> is delaying an already-sent request until the 1-RTT point (cf.
> https://www.ietf.org/mail-archive/web/tls/current/msg21486.html).
>
> Perhaps I keep the request-assembly logic, HTTP/2 multiplexers, and
> TLS sockets in different threads or processes. Now this
> synchronization must span all of them. As one adds layers, the
> complexity grows.

Yes, this hypothetical situation is probably an unreasonable amount
of complexity. But it presumes some sense of what HTTP/2 will want to
permit in early data, which is currently formally undefined -- if
HTTP/2 *really* wants to keep requests from splitting the 0/1-RTT
boundary, it could define some extremely restricted subset of
functionality that allows only a single self-contained frame in early
data. I don't expect that to happen, but it would be a valid
application profile as far as TLS is concerned.

One might also consider an RPC protocol that wants to switch to TLS
1.3 for its security benefits. It would be natural for that
protocol's application profile to say that only certain RPCs are
permitted in 0-RTT, and that a given marshalled request must not
cross the 0/1-RTT boundary. This is clearly a silly example, as it's
not really using a "TCP-like stream" of data from TLS, but it is
arguably less complex to have the strict separation of semantics
between 0-RTT and 1-RTT data. On the other hand, I'm not actually
convinced that the strict separation buys any security, so at its
core this remains an argument about aesthetics, not much different
from the arguments that have been presented so far.

> The root problem here is that we've changed TLS's core abstraction.
> One might argue all this complexity is warranted for 0-RTT. We are
> trying to solve a problem here. But I think we can solve it more
> simply:
>
> Our problem is that a server wishes not to process some HTTP
> requests (or other protocol units) at 0-RTT and needs to detect this
> case. So check a boolean signal for whether the connection has
> passed the 1-RTT point before any unsafe processing. A "1-RTT
> request" will always return 1-RTT. A "0-RTT request" will usually
> return 0-RTT, but if it spans the boundary or the processing
> pipeline was just slow, perhaps we don't query until after the
> client Finished. That's actually fine. We get the replay protection
> we need out of the client Finished.

I agree with Andrei that this scheme satisfies point #3 (or, if it
does not, that we should rewrite condition #3 to make it a
satisfactory option). Your proposal later in the thread seems about
as good as the original #3 to me (though I might wordsmith either a
bit more).

> This solves our problem without the complexity. Two APIs or an
> explicit boundary is one way to expose this boolean, but the
> boolean, unlike the hard boundary, is much easier to carry across
> higher-level protocol layers, and does not imply impractical sender
> constraints.
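(For concreteness, I picture the server-side usage of that boolean
looking something like the sketch below -- function names invented,
since nothing here is tied to a particular stack:

    /* Before any processing with replay-sensitive side effects, ask
     * the TLS stack whether the client Finished has been verified,
     * i.e. whether we are past the 1-RTT point. */
    void HandleRequest(TlsConn *tls, HttpRequest *req) {
      if (IsUnsafeAt0RTT(req) && !HandshakeComplete(tls)) {
        Reject(req);  /* or hold it until the handshake completes */
        return;
      }
      Process(req);
    }

The check happens once per protocol unit, at the point of unsafe
processing, rather than per byte.)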
It is easier to carry across layers, yes. I don't think the TLS spec
should mandate either of the two options under discussion (or a
different one); we should instead describe the properties that need
to be available for the application to make safe decisions. I suspect
that what Brian was trying to say is that this single boolean being
easier to carry across layers may also mean it is easier to write
code that ends up being unsafe. I don't know; I don't think anyone
*can* know, and all we have is supposition and gut instinct. I'm very
worried that no matter what APIs libraries offer, some high-profile
application/implementation is going to do something insecure and we
will have another named vulnerability. I'm also not sure there's any
useful discussion we can have right now on that front, since there's
not really hard data. The closest we have seems to be Colm's security
analysis of 0-RTT data (and it would be great to have more such
analysis), which leaves me thinking that we should strictly disallow
"unlimited" (millions of) replays and set some specific limit. (We'd
of course have to argue about what that limit is, which would not be
fun, but it would probably be better than millions of replays.)

If I could try to synthesize something concise from these
discussions: a reasonable balance of flexibility and simplicity would
be to provide semantics that let an application write data with a
boolean indicating either that the data must be sent at 1-RTT or that
the stack should feel free to use 0- or 1-RTT as it sees fit, and
read data with a boolean indicating whether it was 0-RTT or 1-RTT,
most likely based on whether the client Finished has been verified
but potentially based on record-layer information.

Circling back to my earlier question about how much complexity burden
we place on application profiles and implementations, this seems not
too bad. An application currently uses a write API that is presumed
to mean "must be 1-RTT" until changed, and call sites writing
0-RTT-safe data can switch to the new form in a targeted fashion,
leaving the TLS stack to hold 0-RTT-unsafe data (and all subsequent
data) until the handshake is finished. The reading story is a little
less simple, as the current API doesn't really have the requisite
semantics, so the application is living in a gray zone. It seems that
a prerequisite for making the API call that says "enable 0-RTT on
this server socket" is to convert all read calls to the new form and
reject 0-RTT versions of structures that are not permitted by the
profile -- the converse of the write side. But doing anything less
seems unsafe, and perhaps this is the minimum viable level of
complexity. I think it at least addresses the concerns you raise in
the context of your HTTP/2 example. Libraries and applications will
of course be free to use separate early/regular read and write
functions if they want, but I think it is fine for TLS to structure
the requirements on implementations in such a way that the simpler
API is permitted.
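(Concretely, I imagine that API surface looking roughly like the
following -- again, names invented for illustration, not a proposal
for any particular library:

    /* Write: the caller marks each chunk as either 0-RTT-safe or
     * 1-RTT-only; the stack buffers 1-RTT-only data, and everything
     * written after it, until the handshake completes. */
    int TlsWrite(TlsConn *tls, const void *buf, size_t len,
                 bool early_data_ok);

    /* Read: the stack reports, per chunk, whether the data arrived
     * before the client Finished was verified. */
    int TlsRead(TlsConn *tls, void *buf, size_t len,
                bool *was_early_data);

A library could equally expose these as separate early/regular
functions; the point is only that the semantics are expressible
either way.)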
It is probably also useful to provide guidance to authors of
application profiles on how to incorporate 0-RTT data into their
protocols. That could be something like "philosophically, TLS
provides a TCP-like stream of data to the application, but until the
client Finished is verified that stream has weaker security
properties than it does afterwards, and certain classes of traffic
should be deferred until after that point". It could also be
something like "TLS provides two qualitatively different streams of
data with different security properties; one has a finite length
limit dictated by the server and should only receive a limited class
of protocol messages of bounded size, and the other resembles
historical TLS and is safe for all traffic".

I don't actually know what consensus we could come to on that point,
or whether the chairs are interested in running a contentious
consensus call on the question -- after all, a TLS 1.3 spec could be
considered complete without such guidance. But leaving it out might
just lead to the argument being rehashed over and over for each
protocol. The deciding point is probably whether we want to ensure
that we can support complicated protocols that will not want to limit
themselves to the stringent requirements of the latter formulation,
such as your HTTP/2 example above. If we can do that safely (in terms
of not leaving applications wide open to future security holes), we
would end up with the first (single stream with a state change)
option. I'm just not sure how I can become comfortably convinced that
we can do so safely. Right now I'm something like uncomfortably
mostly convinced, which is not a very pleasant feeling. In a world of
millions of replays, maybe it's safer to tell applications that 0-RTT
is a distinct, separate thing that needs to go in its own bucket.

-Ben