[TLS] Re: DTLS 1.3 ACKs near the version transition
Ah fun, another issue in this document. So not only are write epoch lifetimes unspecified and complex with 0-RTT, but read epoch lifetimes *are* specified but *wrong*.

Section 4.2.1 says:

> Because DTLS records could be reordered, a record from epoch M may be received after epoch N (where N > M) has begun. Implementations SHOULD discard records from earlier epochs but MAY choose to retain keying material from previous epochs for up to the default MSL specified for TCP [RFC0793] to allow for packet reordering. (Note that the intention here is that implementers use the current guidance from the IETF for MSL, as specified in [RFC0793] or successors, not that they attempt to interrogate the MSL that the system TCP stack is using.)

https://www.rfc-editor.org/rfc/rfc9147.html#section-4.2.1

First, it's a bit weird to say you SHOULD discard *records* but MAY retain *keying material*. I assume that meant SHOULD discard records but MAY process records anyway up to MSL. Anyway, this model implies that only one read epoch is active at once, but this isn't true. You basically have to read epoch 1 (early data) as unordered relative to epochs 0 and 2. Consider a DTLS 1.3 server:

1. The server reads ClientHello with early_data extension at epoch 0 and accepts early data.
2. The server sends ServerHello (epoch 0), EE..Finished (epoch 2), and activates write epoch 3 for half-RTT application data.
3. The server reads early data (epoch 1) from the client. The RFC would lead you to think the server can close read epoch 0 now, but...
4. ServerHello gets lost and, if we are to believe https://www.rfc-editor.org/rfc/rfc9147.html#section-7.1-8, the client might send an empty plaintext ACK to trigger a retransmit. This ACK will be at epoch 0. This only works if the server keeps read epoch 0 open!
5. Client eventually gets the ServerHello but now it only gets half of the epoch 2 data. It sends an ACK to trigger another retransmit. This ACK will come at epoch 2.
6. Server receives that ACK at epoch 2 and retransmits. The RFC would lead you to think the server can close read epoch 1 now, but...
7. Let's say that retransmit is lost again, or hasn't arrived yet. From the client's perspective, it has a connection that has yet to reach the 1-RTT point, so any data from the calling application will still be sent as early data. That means the client will continue to send early data at epoch 1. This only works if the server keeps read epoch 1 open!
8. The handshake progresses and the server finally gets 1-RTT data at epoch 3 from the client. *Now* the spirit of the rule in the text applies to epoch 1 and the server can close the epoch (after optionally waiting a spell for reordering).

So the rule is actually that we close according to a partially ordered set:

- 0 (unencrypted) < 2 (handshake) < 3 (first app data) < 4 < 5 < ...
- 1 (early data) < 3 (first app data) < 4 < 5 < ...
- 1 is not ordered relative to 0 and 2.

On Wed, Sep 18, 2024 at 3:47 PM David Benjamin wrote:

> One more wriggle if we wish to allow unencrypted ACKs, though it is
> fixable. Section 7 says:
>
> > During the handshake, ACK records MUST be sent with an epoch which is
> > equal to or higher than the record which is being acknowledged. [...]
> > Implementations SHOULD simply use the highest current sending epoch, which
> > will generally be the highest available. After the handshake,
> > implementations MUST use the highest available sending epoch.
>
> Taken at face value, that text implies that a client sending 0-RTT data
> should send its ACKs at the highest current sending epoch, epoch 1 (0-RTT).
> But if the server has rejected 0-RTT data, it will not (and cannot)
> instantiate epoch 1 at all, so it won't get the ACKs! That guidance needs a
> special case: if you would have ACKed at epoch 1, you should ACK at epoch 0
> instead.
>
> Alternatively, one might interpret that situation as 0 being the sending
> epoch and 1 being some magical epoch on the side. This isn't supported by
> the document, but honestly no interpretation is supported by the document
> because the document never tells you what a "current sending epoch" even
> is. While 4.2.1 gives some rough guidance on when to close out receiving
> epochs, I could not find any text on send epoch management at all.
> Reasoning through the protocol, you might arrive at this *almost* correct
> rule:
>
> A write epoch may be discarded IF:
> 1. It is not the highest available epoch. AND
> 2. There are no unacked, outgoing messages at that epoch
>
> That rule, however, does not work in 0-RTT. If the highest epoch is 1, you
> cannot discard 0. The server might reject 0-RTT and then send
> HelloRetryRequest, at which point you will need to discard epoch 1 and
> reactivate epoch 0, maintaining continuity of sequence numbers. The
> 0-RTT/1-RTT transition is also interesting on the write side, though I'll
> start a separate thread for that.
>
> All this is subtle enough that it should not be left as an exercise to the
> r
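
For concreteness, the partial order on read epochs listed earlier in this message (0 < 2 < 3 < ..., 1 < 3 < ..., with 1 unordered relative to 0 and 2) could be sketched roughly as below. This is only an illustration: the names `supersedes` and `closable_epochs` are made up here and do not come from RFC 9147, and "closable" ignores the optional delay an implementation may add to tolerate reordering.

```python
from typing import Set


def supersedes(newer: int, older: int) -> bool:
    """Return True if `older` precedes `newer` in the partial order above.

    Hypothetical helper: it encodes only the three ordering bullets from
    this message, nothing else about the protocol.
    """
    if newer <= older:
        return False
    if older == 1:
        # Early data (epoch 1) is only ordered before application data (3+).
        return newer >= 3
    if newer == 1:
        # Epoch 1 is unordered relative to epoch 0.
        return False
    # Remaining case: the chain 0 < 2 < 3 < 4 < ...
    return True


def closable_epochs(open_epochs: Set[int], seen: int) -> Set[int]:
    """Read epochs that activity at epoch `seen` would allow closing."""
    return {e for e in open_epochs if supersedes(seen, e)}
```

Under this sketch, seeing epoch 1 or epoch 2 traffic never makes epoch 1 closable, and seeing epoch 1 traffic never makes epoch 0 closable, which matches steps 3 through 7 of the walkthrough.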
[TLS] Re: DTLS 1.3 ACKs near the version transition
On Thu, Sep 19, 2024 at 1:31 PM David Benjamin wrote:

> Ah fun, another issue in this document. So not only are write epoch
> lifetimes unspecified and complex with 0-RTT, but read epoch lifetimes
> *are* specified but *wrong*.
>
> Section 4.2.1 says:
>
> > Because DTLS records could be reordered, a record from epoch M may be
> > received after epoch N (where N > M) has begun. Implementations SHOULD
> > discard records from earlier epochs but MAY choose to retain keying
> > material from previous epochs for up to the default MSL specified for TCP
> > [RFC0793] to allow for packet reordering. (Note that the intention here is
> > that implementers use the current guidance from the IETF for MSL, as
> > specified in [RFC0793] or successors, not that they attempt to interrogate
> > the MSL that the system TCP stack is using.)
>
> https://www.rfc-editor.org/rfc/rfc9147.html#section-4.2.1
>
> First, it's a bit weird to say you SHOULD discard *records* but MAY
> retain *keying material*. I assume that meant SHOULD discard records but
> MAY process records anyway up to MSL. Anyway, this model implies that only
> one read epoch is active at once, but this isn't true. You basically have
> to read epoch 1 (early data) as unordered relative to epochs 0 and 2.
> Consider a DTLS 1.3 server:
>
> 1. The server reads ClientHello with early_data extension at epoch 0 and
> accepts early data.
> 2. The server sends ServerHello (epoch 0), EE..Finished (epoch 2), and
> activates write epoch 3 for half-RTT application data.
> 3. The server reads early data (epoch 1) from the client. The RFC would
> lead you to think the server can close read epoch 0 now, but...
> 4. ServerHello gets lost and, if we are to believe
> https://www.rfc-editor.org/rfc/rfc9147.html#section-7.1-8, the client
> might send an empty plaintext ACK to trigger a retransmit. This ACK will be
> at epoch 0. This only works if the server keeps read epoch 0 open!
> 5. Client eventually gets the ServerHello but now it only gets half of the
> epoch 2 data. It sends an ACK to trigger another retransmit. This ACK will
> come at epoch 2.
> 6. Server receives that ACK at epoch 2 and retransmits. The RFC would lead
> you to think the server can close read epoch 1 now, but...
> 7. Let's say that retransmit is lost again, or hasn't arrived yet. From
> the client's perspective, it has a connection that has yet to reach the
> 1-RTT point, so any data from the calling application will still be sent as
> early data. That means the client will continue to send early data at epoch
> 1. This only works if the server keeps read epoch 1 open!
> 8. The handshake progresses and the server finally gets 1-RTT data at
> epoch 3 from the client. *Now* the spirit of the rule in the text applies
> to epoch 1 and the server can close the epoch (after optionally waiting a
> spell for reordering)
>

Ah right, Nick Harper points out that servers really should close read epoch 1 [up to a delay to accommodate reordering] as soon as they receive the Finished message (epoch 2) and complete the handshake, not wait for an epoch 3 record. (But it must specifically be on handshake completion, not *any* epoch 2 record. Record-layer only logic cannot assume 1 < 2 because 2 might contain pre-Finished ACKs.)

All this is missing from the specification. :-) I think we need to rewrite the spec text on epochs to more explicitly discuss their lifetimes.

> So the rule is actually that we close according to a partially ordered set:
> - 0 (unencrypted) < 2 (handshake) < 3 (first app data) < 4 < 5 < ...
> - 1 (early data) < 3 (first app data) < 4 < 5 < ...
> - 1 is not ordered relative to 0 and 2.
>
>
> On Wed, Sep 18, 2024 at 3:47 PM David Benjamin wrote:
>
>> One more wriggle if we wish to allow unencrypted ACKs, though it is
>> fixable. Section 7 says:
>>
>> > During the handshake, ACK records MUST be sent with an epoch which is
>> > equal to or higher than the record which is being acknowledged. [...]
>> > Implementations SHOULD simply use the highest current sending epoch, which
>> > will generally be the highest available. After the handshake,
>> > implementations MUST use the highest available sending epoch.
>>
>> Taken at face value, that text implies that a client sending 0-RTT data
>> should send its ACKs at the highest current sending epoch, epoch 1 (0-RTT).
>> But if the server has rejected 0-RTT data, it will not (and cannot)
>> instantiate epoch 1 at all, so it won't get the ACKs! That guidance needs a
>> special case: if you would have ACKed at epoch 1, you should ACK at epoch 0
>> instead.
>>
>> Alternatively, one might interpret that situation as 0 being the sending
>> epoch and 1 being some magical epoch on the side. This isn't supported by
>> the document, but honestly no interpretation is supported by the document
>> because the document never tells you what a "current sending epoch" even
>> is. While 4.2.1 gives some rough guidance on when to close out receiving
>> epochs, I could not find any text on send
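
To illustrate the point made earlier in this reply, that read epoch 1 should be retired on handshake completion rather than on arrival of any epoch 2 record, here is a rough sketch. The class `ServerReadEpochs` and its method names are hypothetical and only model a server that has accepted early data; a real implementation would likely delay the actual key disposal to tolerate reordering, and this sketch says nothing about when epoch 0 closes.

```python
class ServerReadEpochs:
    """Hypothetical tracker for a server that accepted 0-RTT.

    Only models read epochs 0 (plaintext), 1 (early data) and
    2 (handshake); later epochs are out of scope for this sketch.
    """

    def __init__(self) -> None:
        self.open = {0, 1, 2}
        self.handshake_complete = False

    def on_record(self, epoch: int) -> bool:
        """Record-layer hook: report whether a record at `epoch` is readable.

        Deliberately does NOT close epoch 1 when an epoch 2 record arrives,
        since that record may be a pre-Finished ACK.
        """
        return epoch in self.open

    def on_handshake_complete(self) -> None:
        """Handshake-layer hook: the client Finished has been processed.

        This is the only event in the sketch that retires epoch 1.
        """
        self.handshake_complete = True
        self.open.discard(1)


# Usage: an epoch 2 ACK arrives before Finished, then the handshake completes.
server = ServerReadEpochs()
server.on_record(2)                         # pre-Finished ACK at epoch 2
early_data_still_ok = server.on_record(1)   # early data is still accepted
server.on_handshake_complete()
early_data_after = server.on_record(1)      # epoch 1 is now closed
```

The design choice here is simply that the closing decision lives in the handshake layer, which knows whether Finished has been processed, and not in the record layer, which only sees epoch numbers.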