[TLS] Re: DTLS 1.3 ACKs near the version transition

Bob Beck Tue, 17 Sep 2024 21:40:26 -0700


> On Sep 17, 2024, at 5:28 PM, David Benjamin 
> <davidben=40google....@dmarc.ietf.org> wrote:
> 
> Ah, I just noticed this text at the end of Section 7.1:
> 
> > Note that in some cases it may be necessary to send an ACK which does not 
> > contain any record numbers. For instance, a client might receive an 
> > EncryptedExtensions message prior to receiving a ServerHello. Because it 
> > cannot decrypt the EncryptedExtensions, it cannot safely acknowledge it (as 
> > it might be damaged). If the client does not send an ACK, the server will 
> > eventually retransmit its first flight, but this might take far longer than 
> > the actual round trip time between client and server. Having the client 
> > send an empty ACK shortcuts this process.
> 
> https://www.rfc-editor.org/rfc/rfc9147.html#section-7.1-8
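
For reference, the ACK body in RFC 9147 Section 7 is a vector of record
numbers with a 16-bit length prefix, each entry a 64-bit epoch followed by a
64-bit sequence number, so the "empty ACK" above is just a zero-length vector.
A minimal Python sketch of the body encoding; the function name is made up for
illustration, and record-layer framing and encryption are left out:

```python
import struct

def encode_ack_body(record_numbers):
    """Encode an ACK body: 2-byte length, then 16-byte RecordNumber entries.

    record_numbers is a list of (epoch, sequence_number) pairs.
    (Hypothetical helper, not taken from any real DTLS library.)
    """
    entries = b"".join(
        struct.pack("!QQ", epoch, seq) for epoch, seq in record_numbers
    )
    if len(entries) > 0xFFFF:
        raise ValueError("too many record numbers for a 16-bit length prefix")
    return struct.pack("!H", len(entries)) + entries

# The "empty ACK" from Section 7.1: a zero-length vector.
empty_ack = encode_ack_body([])
# An ACK for a single record: epoch 2, sequence number 5.
one_ack = encode_ack_body([(2, 5)])
```

Note that this only shows what goes on the wire; whether the client is allowed
to send it before it knows the version is exactly the question in this thread.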
> 
> I guess then the intent is indeed that if you receive some random encrypted 
> DTLS 1.3 header, even though you don't know it's DTLS 1.3 yet, you interpret 
> it as activating the ACKing mechanism? But that seems to prompt more 
> questions than it answers. For instance, what happens if you do that, but 
> then finally receive the ServerHello and it turns out this was just some junk 
> packet and we're really negotiating DTLS 1.2? Do you check that the ACK 
> mechanism has been activated and return an error? Do you just pause the ACK 
> mechanism and hope you're in an OK state? This seems quite prone to sending 
> the implementation into unexpected and untested states.
> 
>



Yeah, I think the RFC has missed a nasty corner case here for implementations 
that support both 1.2 and 1.3. 

I think I also lean towards option A) (from below) here. Anyone else who has 
gotten at least their hands mildly dirty in a DTLS implementation that supports 
both 1.2 and 1.3 care to chime in as well? 


> On Thu, Sep 12, 2024 at 4:31 PM David Benjamin <david...@google.com> wrote:
> Hi all,
> 
> I noticed another issue with the DTLS 1.3 ACK design. :-)
> 
> So, DTLS 1.3 uses ACKs. DTLS 1.2 does not use ACKs. But you only learn what 
> version you're speaking partway through the lifetime of the connection, so 
> there are some interesting corner cases to answer. As an illustrative 
> example, I believe the diagram in section 6 is [probably] incorrect:
> https://www.rfc-editor.org/rfc/rfc9147.html#section-6
> 
> If the client loses the first packet, it never sees the ServerHello and thus 
> never learns it's speaking DTLS 1.3. While it does see the second packet, that 
> packet only contains ciphertext that it cannot decrypt. Unless it decides to 
> say "this looks like a 1.3 record header, therefore I will turn on the 1.3 
> state machine", which isn't supported by the RFC (maybe TLS 1.4 will use the 
> same record header but redo ACKs once again), it shouldn't activate the 1.3 
> state machine yet. I expect what will actually happen is that the client will 
> wait for the retransmission timeout a la DTLS 1.2.
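
To make the "looks like a 1.3 record header" point concrete, here is a rough
Python sketch of first-byte classification as I read the demultiplexing figure
in RFC 9147 Section 4.1, next to a client that deliberately does not treat the
result as version negotiation. The class, the method names, and the choice to
drop rather than buffer are all assumptions for illustration:

```python
# Content types that appear as the first byte of a plaintext record.
PLAINTEXT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
    26: "ack",
}

def classify_record(first_byte):
    """Classify a datagram's record by its first byte."""
    if first_byte in PLAINTEXT_TYPES:
        return PLAINTEXT_TYPES[first_byte]
    if first_byte == 25:
        return "tls12_cid"            # DTLS 1.2 record with a connection ID
    if 31 < first_byte < 64:
        return "dtls13_ciphertext"    # unified header, top bits 001
    return "invalid"

class ClientState:
    """Hypothetical client that has not yet processed a ServerHello."""

    def __init__(self):
        self.negotiated_version = None  # unknown until ServerHello arrives

    def on_record(self, first_byte):
        kind = classify_record(first_byte)
        if kind == "dtls13_ciphertext" and self.negotiated_version is None:
            # Assumption: drop the record and leave the version unknown,
            # rather than switching on the 1.3 state machine.
            return "drop"
        return kind
```

The header shape is a property of the record, not of the negotiated version,
which is why the sketch keeps negotiated_version at None in that branch.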
> 
> More generally, I believe these are the situations to worry about:
> 
> 1. If a DTLS 1.2 (i.e. does not implement RFC 9147 at all) implementation 
> receives an ACK record for whatever reason, what happens? This decision we 
> don't get to change. Rather, it is a design constraint. Both OpenSSL and 
> BoringSSL treat unexpected record types as a fatal error. I haven't checked 
> other implementations. So I think we must take as a constraint that you 
> cannot send an ACK unless you know the peer is 1.3-capable.
> 
> 2. Do plaintext ACKs exist? Or is the plaintext epoch permanently at the old 
> state machine? Honestly, I wish the answer here was "no". That would have 
> avoided so many problems, because then epochs never change state machines. 
> Unfortunately, the RFC does not support this interpretation. Section 4.1 
> talks about how to demux a plaintext ACK, and section 6, though wrong, 
> clearly depicts a plaintext ACK. So instead we get to worry about the 
> transition within an epoch. Keep in mind that transitions happen at different 
> times on both sides. Keep in mind that there is a portion of the plaintext 
> epoch that lasts after version negotiation in HelloRetryRequest handshakes.
> 
> 3. If a 1.3-capable server receives half of a ClientHello, does it send an 
> ACK? I believe (1) means the answer must be "no". If you haven't read the 
> ClientHello, you haven't selected the version, so you don't know if the 
> client is 1.3-capable or not. If the client is not 1.3-capable, sending an 
> ACK may be incompatible.
> 
> 4. Is it possible for a 1.3-capable client to receive an ACK before it 
> receives a ServerHello? If so, how does the client respond? I believe the 
> answer to this question, if plaintext ACKs exist, is unavoidably "yes". 
> Suppose the server receives a 1.3 ClientHello and then negotiates DTLS 1.3. 
> That is a complete flight, so Section 7.1 discourages ACKing explicitly (you 
> can ACK implicitly), but it does not forbid an explicit ACK. An explicit ACK 
> may be sent if the server cannot generate its responding flight immediately. 
> That means a server could well send ACK followed by ServerHello. Now suppose 
> ServerHello is lost but the ACK gets through. Now the client must decide what 
> it's doing. Rejecting the ACK would result in connection failure, so we must 
> either drop the ACK on the floor, or process it. While processing it would be 
> more efficient (you don't need to retransmit the whole ClientHello), it means 
> the plaintext epoch must support this hybrid state where 1.3 ACKs are 
> processed but never sent! Or perhaps receiving that ACK transitions you to 
> the 1.3 state machine even though you don't know the version yet. That all 
> sounds like a mess, so I would advocate you simply drop it on the floor.
> 
> 5. If a 1.3-capable client receives half of the server's first message (HRR 
> or ServerHello), does it send an ACK? Again, because of (1), I believe the 
> answer must be "no". If you don't know the server's selected version, the 
> server may not be 1.3-capable and may not be compatible with the ACK.
> 
> 6. What does a 1.3-capable server do if it receives an ACK prior to picking 
> the TLS version? Unlike (4), I believe this is impossible. If the client has 
> something to ACK, the server must have sent something, which the server will 
> only do once it's received the full ClientHello and thus picked the version. 
> However, given (4), I suspect an implementation will naturally just drop that 
> ACK. In this state error vs drop is kinda academic.
> 
> From what I can tell, RFC 9147 is silent on all of this. I think it should 
> say something. I believe these are the plausible options:
> 
> OPTION A -- There are no ACKs in epoch 0.
> 
> We avoid this ridiculous transition point and say that ACKs only exist 
> starting epoch 1. Epoch 0 uses the old DTLS 1.2 state machine. This is very 
> attractive from a simplicity perspective, but since RFC 9147 was already 
> published with this ambiguity, I think we need to, at minimum, say that DTLS 
> 1.3 implementations drop epoch 0 ACKs on the floor. It also means that packet 
> loss in HelloRetryRequest flows may be less efficient. That said, if your 
> HelloRetryRequest is stateless (not applicable to all DTLS uses), you're 
> probably not doing anything with ACKs anyway. Saying those ACKs don't exist 
> avoids having to think about that case, at the cost of a worse transport for 
> stateful HelloRetryRequest.
> 
> OPTION B -- Epoch 0 enables ACKing once the version is learned.
> 
> Once you know the version, you start sending and processing ACKs. Before you 
> know the version, you drop ACKs on the floor and never send them. This 
> requires convincing ourselves that the transition point works out, notably 
> when one side is still ACK-less and the other side is still ACK-ful, but I 
> believe it works out.
> 
> OPTION C -- Epoch 0 always receives and acts on ACKs, but it doesn't send 
> ACKs until the version is learned.
> 
> This is the same as above, but instead of dropping ACKs, you go ahead and let 
> that drive your state machine. But you don't send them. This makes reasoning 
> about the protocol even more complicated because there are even more states 
> you can be in w.r.t. your known version vs the state of your transport. It 
> does improve behavior around packet loss, but I think it only helps this edge 
> case in question (4) above, which is already a case where servers aren't 
> expected to send ACKs anyway.
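
The three options can be compared side by side as two predicates: when may an
endpoint send an ACK, and when does it act on one it receives. This is only a
sketch of my reading of the options above; the enum, the function names, and
the use of None for "version not yet known" are all hypothetical. In every
case a False from should_process_ack means "drop on the floor":

```python
from enum import Enum

class Epoch0Policy(Enum):
    A = "no ACKs in epoch 0"
    B = "epoch 0 ACKs once the version is learned"
    C = "always process epoch 0 ACKs, send once the version is learned"

DTLS12 = "DTLS 1.2"
DTLS13 = "DTLS 1.3"

def may_send_ack(policy, epoch, version):
    """version is DTLS12, DTLS13, or None if not yet negotiated."""
    if version != DTLS13:
        return False  # constraint (1): never ACK a peer not known to be 1.3
    if epoch > 0:
        return True
    return policy is not Epoch0Policy.A

def should_process_ack(policy, epoch, version):
    """True to act on a received ACK, False to drop it."""
    if epoch > 0:
        return version == DTLS13
    if policy is Epoch0Policy.A:
        return False
    if policy is Epoch0Policy.B:
        return version == DTLS13
    # Option C. Assumption: once 1.2 is negotiated, stop acting on ACKs.
    return version != DTLS12
```

Written this way, the only input where Option C differs from Option B is an
epoch 0 ACK arriving while the version is still unknown, which is the
situation in question (4).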
> 
> I think I lean towards Option A for simplicity, even though it decidedly 
> contradicts a lot of text in the RFC right now. That will be hard to encode 
> in an erratum as a few things need to change. But I also have 7 other errata 
> open against this document, so maybe it's time for rfc9147bis.
> 
> David
> _______________________________________________
> TLS mailing list -- tls@ietf.org
> To unsubscribe send an email to tls-le...@ietf.org

