I wanted to start a separate thread on this, just to make some small aspects of replay mitigation clear, because I'd like to make a case for TLS providing a single stream, which is what people seem to be doing anyway.
Let's look at the DKG attack. There are two forms of the attack. The first is as follows: "Client sends a request with a 0-RTT section. The attacker lets the server receive it, but suppresses the server's responses, so the client downgrades and retries as a 1-RTT request over a new connection, repeating the request." In this case, server-side signaling such as the X-header trick doesn't work at all. But thankfully this attack is equivalent to an ordinary socket interference attack: if an attacker today suppresses a server response to an HTTP request, the client will run its retry logic. It's the same situation, and I think everything here is compatible with today's behavior.

Next is the more interesting form of the attack: "Client sends a request with a 0-RTT section. For some reason the server can't reach the strike register or single-use cache, and falls back to 1-RTT. The server accepts the request over 1-RTT. Then, a short time later, the attacker replays the original 0-RTT section." In this case, server-side signaling to the application (such as the neat X-header trick) also doesn't work, and is not backwards compatible or secure by default. It doesn't work because the server application can't be made idempotent from "outside" the application, so any signaling is insufficient; this is equivalent to the Exactly-Once message delivery problem in distributed systems. Since a request might be retried as in case 1, it needs an application-level idempotency key, or a delay-and-retry strategy (but replay will break the latter). There's some detail on all of this in the review. The end result is that a server-side application that was never designed to handle duplicates may suddenly be getting exactly one duplicate (that's all the attack allows, if servers reject duplicate 0-RTT sections).

What is actually needed here, I think, is client-side signaling: careful clients need to be made aware of the original 0-RTT failure. So, for example, an SDK that writes to an eventually consistent data store may treat any 0-RTT failure as a hard failure, and *not* proceed to sending the request over 1-RTT. Instead it might wait out its retry period, do a poll, and only then retry the request. If the TLS implementation signals the original 0-RTT failure to the client as if it were a connection error, everything is backwards compatible again. Well, mostly: to be properly defensive, the client's retry time or polling interval needs to be greater than the potential replay window, because only then can it reason about whether the original request succeeded or not. If there is a strict maximum replay window, then this behavior is enforceable in a TLS implementation, by delaying the original failure notification to the client application by that amount.

Of course browsers won't do this, and that's OK; browsers have decided that aggressive retrying is best for their application space. But careful clients /need/ this, and it's not just about backwards compatibility: it is a fundamental first-principles requirement for anything that uses an eventually consistent data store. We can say "don't use 0-RTT", but that's not practical, for reasons also in the review. So if we want to fully mitigate DKG attacks, I think it is useful to hard-cap the replay window, say that it MUST be at most 10 seconds. Then, worst case, a client that needs to be careful can wait 10 seconds. Note that the TLS implementation can do this on the client's behalf, by inserting a delay.
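To make the careful-client pattern concrete, here's a minimal sketch in C. Every function named here is hypothetical, standing in for whatever a real SDK exposes; the point is only the order of operations and the mandatory wait:

  #include <unistd.h>

  #define REPLAY_WINDOW_SECS 10   /* the hard cap argued for above */

  /* Hypothetical SDK calls, not a real API. */
  int send_request_0rtt(const char *req);  /* 0 on success, -1 on 0-RTT failure */
  int poll_for_effect(const char *req_id); /* 1 if the write is visible */
  int send_request_1rtt(const char *req);

  int careful_send(const char *req, const char *req_id)
  {
      if (send_request_0rtt(req) == 0)
          return 0;  /* 0-RTT accepted; no replay hazard */

      /* The 0-RTT section was rejected, so it may be replayed for up
       * to REPLAY_WINDOW_SECS. Treat this as a hard failure and wait
       * out the window; only then can we reason about what happened. */
      sleep(REPLAY_WINDOW_SECS);

      if (poll_for_effect(req_id))
          return 0;  /* the original request (or its replay) landed */

      return send_request_1rtt(req);  /* safe to retry now */
  }

As noted above, the sleep() could just as well live inside the TLS implementation, so the SDK gets this behavior by default.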
Of course, for these kinds of applications, this means that 0-RTT delivers speed most of the time, but may occasionally slow things down by 10 seconds. I think that's an OK trade-off to make for backwards compatibility. But it also has implications for middle-boxes: a TLS reverse proxy needs to either not use 0-RTT on the "backend" side, or it needs to use it in a very careful way, accepting 0-RTT from the original client only if the backend also accepts a 0-RTT section from the proxy. This is to avoid the case where the client can't reason about the potential for a replay between the proxy and the backend. It's doable, but gnarly, and it slows 0-RTT acceptance down to the round trip between the client and the backend, via the proxy. That's one reason why the review suggests something else too: just lock careful applications out, but in a mechanistic way rather than a "good intentions" way, by having TLS implementations *intentionally* duplicate 0-RTT sections.

O.k., so all of the above might be a bit hairy, but what I want to take away from it at this stage is that splitting early_data and application_data at the application level isn't particularly helpful; the server side can't really use this information anyway, because of the Exactly-Once problem. Client-side signaling does help though, and a simple safe-by-default mechanism there is to behave as if the connection has failed, but after writing the first section of data. E.g. in s2n this would be:

  conn = s2n_connect();  // Client makes a connection.

  // Client writes some data; we stuff it in the 0-RTT section and
  // send it. This write succeeds. From the client's perspective, it
  // may or may not have been received; that's normal.
  r = s2n_write(conn, "GET / HTTP/1.1 ... ");

  /* At this point the 0-RTT data is rejected by the server, and so it
   * might be replayable ... iff the server-side strike register or
   * cache had a problem. A pedantically correct TLS library might then
   * pause here for 10 seconds, or if it's non-blocking, set a timer so
   * that nothing can happen on conn for the next 10 seconds. Browsers
   * could turn this behavior off, since they retry aggressively
   * anyway. But it's a secure default that is backwards compatible. */

  // At this point, s2n returns failure. It's as if the connection
  // failed. The client can implement its retry strategy, if any,
  // safely; the request won't be replayed at this point.
  r = s2n_read()/s2n_write()/s2n_shutdown();

  r = s2n_connect();  // Client makes a new connection for a retry.

A slightly higher-level API is probably more realistic, because there's potential to optimize for connection re-use. There's really no need to tear down the whole connection and start over; it's safe to proceed to 1-RTT once the delay has expired. A higher-level API would fix that, but the above is just the "safe by default" API I'm outlining. Again, all I want to take away from this is that it's all doable safely with a single stream.
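For what it's worth, here's a rough sketch of what that connection-reusing, higher-level API might look like. It's purely illustrative: s2n has no careful_conn wrapper, no early_data_rejected_at field, and no s2n_safe_write() call; the only real API used here is s2n_send().

  #include <errno.h>
  #include <time.h>
  #include <sys/types.h>
  #include <s2n.h>

  #define REPLAY_WINDOW_SECS 10

  struct careful_conn {
      struct s2n_connection *conn;    /* the underlying s2n connection */
      time_t early_data_rejected_at;  /* 0 unless the 0-RTT section was rejected */
  };

  /* Refuse to proceed to 1-RTT until the replay window has passed,
   * instead of failing the whole connection. Returns -1/EAGAIN while
   * the window is still open, so a non-blocking caller can arm a
   * timer rather than spin. */
  ssize_t s2n_safe_write(struct careful_conn *cc, const void *buf, ssize_t len)
  {
      s2n_blocked_status blocked;

      if (cc->early_data_rejected_at != 0) {
          if (time(NULL) - cc->early_data_rejected_at < REPLAY_WINDOW_SECS) {
              errno = EAGAIN;  /* replay window still open: caller must wait */
              return -1;
          }
          /* Window has passed: safe to continue over 1-RTT on the
           * same connection; no need to tear it down and start over. */
          cc->early_data_rejected_at = 0;
      }

      return s2n_send(cc->conn, buf, len, &blocked);
  }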
-- Colm