Re: [TLS] DTLS 1.3 epochs vs message_seq overflow

2024-04-16 Thread Tschofenig, Hannes
Hi David,

thanks for your reviews and the detailed comments.

Let me search for the exchanges that led to the increase in the epoch space. I 
recall that this was very late in the process, based on feedback from John, who 
noticed that the smaller epoch space helps IoT communication but not the DTLS 
use in SCTP.

Regarding your statement: “65K epochs should be enough for anybody, perhaps 
DTLS 1.4 should update the RecordNumber structure accordingly and save a few 
bytes in the ACKs”. Possibly correct. I am going to ask the SCTP community for 
feedback to find out whether that is also true for them.

Ciao
Hannes

From: TLS  On Behalf Of David Benjamin
Sent: Friday, April 12, 2024 1:16 AM
To:  
Cc: Nick Harper 
Subject: Re: [TLS] DTLS 1.3 epochs vs message_seq overflow

On Thu, Apr 11, 2024 at 7:12 PM David Benjamin <david...@chromium.org> wrote:
Hi all,

In reviewing RFC 9147, I noticed something a bit funny. DTLS 1.3 changed the 
epoch number from 16 bits to 64 bits, though with a requirement that you not 
exceed 2^48-1. I assume this was so that you're able to rekey more than 65K 
times if you really wanted to.

I'm not sure we actually achieved this. In order to change epochs, you need to 
do a KeyUpdate, which involves sending a handshake message. That means burning 
a handshake message sequence number. However, section 5.2 says:

> Note: In DTLS 1.2, the message_seq was reset to zero in case of a rehandshake 
> (i.e., renegotiation). On the surface, a rehandshake in DTLS 1.2 shares 
> similarities with a post-handshake message exchange in DTLS 1.3. However, in 
> DTLS 1.3 the message_seq is not reset, to allow distinguishing a 
> retransmission from a previously sent post-handshake message from a newly 
> sent post-handshake message.

This means that the message_seq space is never reset for the lifetime of the 
connection. But message_seq is a 16-bit field! So I think you would overflow 
message_seq before you manage to overflow a 16-bit epoch.
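The bound is easy to check with back-of-envelope arithmetic (a sketch, not text from RFC 9147):

```python
# Each KeyUpdate consumes one message_seq value, and message_seq is a 16-bit
# counter that DTLS 1.3 never resets for the lifetime of the connection.
MESSAGE_SEQ_SPACE = 2 ** 16    # 65536 handshake messages, total
EPOCH_LIMIT = 2 ** 48 - 1      # RFC 9147's cap on the 64-bit epoch field

# Every epoch after the handshake requires at least one KeyUpdate message,
# so the number of rekeys is bounded by the message_seq space:
max_rekeys = MESSAGE_SEQ_SPACE  # upper bound, ignoring all other messages
assert max_rekeys < EPOCH_LIMIT  # message_seq overflows long before 2^48
```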

Now, I think the change here was correct, because DTLS 1.2's resetting on 
rehandshake was a mistake. In DTLS 1.2, the end of the previous handshake and 
the start of the next handshake happen in the same epoch, which meant that 
messages were ambiguous and you needed knowledge of the handshake state machine 
to resolve them. However, given the wider epoch, perhaps we should have said 
that message_seq resets on each epoch or something. (Too late now, of course... 
DTLS 1.4 I suppose?)

Alternatively, if we think 65K epochs should be enough for anybody, perhaps 
DTLS 1.4 should update the RecordNumber structure accordingly and save a few 
bytes in the ACKs. :-)

Does all that check out, or did I miss something?

David
___
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls


Re: [TLS] DTLS 1.3 sequence number lengths and lack of ACKs

2024-04-16 Thread Tschofenig, Hannes
Hi David,

thanks again for these comments.

Speaking for myself, this exchange was not designed based on QUIC. I believe it 
pre-dated the corresponding work in QUIC.

Anyway, there are different usage environments and, as you said, there is a 
difference in the number of messages that may be lost. For some environments 
the loss of 255 messages amounts to losing the message exchanges of several 
days, potentially weeks. As such, for those use cases the shorter sequence 
number space is perfectly fine. For other environments this is obviously an 
issue and you have to select the bigger sequence number space.

More explanation about this aspect never hurts. Of course, nobody raised the 
need for such text so far and hence we didn’t add anything. As a way forward, 
we could add text to the UTA document. In the UTA document(s) we already talk 
about other configurable parameters, such as the timeout.

Ciao
Hannes

From: TLS  On Behalf Of David Benjamin
Sent: Friday, April 12, 2024 11:36 PM
To:  
Cc: Nick Harper 
Subject: [TLS] DTLS 1.3 sequence number lengths and lack of ACKs

Hi all,

Here's another issue we noticed with RFC 9147: (There's going to be a few of 
these emails. :-) )

DTLS 1.3 allows senders to pick an 8-bit or 16-bit sequence number. But, unless 
I missed it, there isn't any discussion or guidance on which to use. The draft 
simply says:

> Implementations MAY mix sequence numbers of different lengths on the same 
> connection

I assume this was patterned after QUIC, but looking at QUIC suggests an issue 
with the DTLS 1.3 formulation. QUIC uses ACKs to pick the minimum number of 
bytes needed for the peer to recover the sequence number:
https://www.rfc-editor.org/rfc/rfc9000.html#packet-encoding

But the bulk of DTLS records, app data, are unreliable and not ACKed. DTLS 
leaves all that to the application. This means a DTLS implementation does not 
have enough information to make this decision. It would need to be integrated 
into the application-protocol-specific reliability story, if the application 
protocol even maintains that information.

Without ACK feedback, it is hard to size the sequence number safely. Suppose a 
DTLS 1.3 stack unconditionally picked the 1-byte sequence number because it's 
smaller, and the draft didn't say not to do it. That means after getting out of 
sync by 256 records, either via reordering or loss, the connection breaks. For 
example, if there was a blip in connectivity and you happened to lose 256 
records, your connection is stuck and cannot recover. All future records will 
be at higher and higher sequence numbers. A burst of 256 lost packets seems 
within the range of situations one would expect an application to handle.

(The 2-byte sequence number fails at 65K losses, which is hopefully high enough 
to be fine? Though it's far, far less than what QUIC's 1-4-byte sequence number 
can accommodate. It was also odd to see no discussion of this anywhere.)
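To make the failure mode concrete, here is a simplified closest-match reconstruction sketch (my own illustration in the style of RFC 9147's Section 4.2.2 procedure, not the exact algorithm), showing how an 8-bit field breaks after a 256-record burst loss:

```python
def reconstruct(expected: int, low_bits: int, bits: int) -> int:
    """Recover a full sequence number from its low `bits` bits by picking
    the candidate closest to the next expected value (simplified sketch)."""
    window = 1 << bits
    base = expected & ~(window - 1)
    candidates = [base - window + low_bits, base + low_bits, base + window + low_bits]
    return min((c for c in candidates if c >= 0), key=lambda c: abs(c - expected))

# In-order delivery: the receiver tracks the true sequence number.
assert reconstruct(expected=1000, low_bits=1001 % 256, bits=8) == 1001

# After 256 consecutive losses, the true number is expected + 256, whose low
# 8 bits collide with the old expected value; reconstruction picks the stale
# number, decryption fails, and every later record is even further ahead.
assert reconstruct(expected=1000, low_bits=(1000 + 256) % 256, bits=8) == 1000
```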

David


Re: [TLS] Issues with buffered, ACKed KeyUpdates in DTLS 1.3

2024-04-16 Thread Tschofenig, Hannes
Hi David,

this is great feedback. Give me a few days to respond to this issue with my 
suggestion for moving forward.

Ciao
Hannes

From: TLS  On Behalf Of David Benjamin
Sent: Saturday, April 13, 2024 7:59 PM
To:  
Cc: Nick Harper 
Subject: Re: [TLS] Issues with buffered, ACKed KeyUpdates in DTLS 1.3

Another issue with DTLS 1.3's state machine duplication scheme:

Section 8 says implementations must not send a new KeyUpdate until the previous 
KeyUpdate is ACKed, but it says nothing about other post-handshake messages. 
Suppose KeyUpdate(5) is in flight and the implementation decides to send 
NewSessionTicket. (E.g. the application called some "send NewSessionTicket" 
API.) The new epoch doesn't exist yet, so naively one would start sending 
NewSessionTicket(6) in the current epoch. Now the peer ACKs KeyUpdate(5), so we 
transition to the new epoch. But retransmissions must retain their original 
epoch:

> Implementations MUST send retransmissions of lost messages using the same 
> epoch and keying material as the original transmission.
https://www.rfc-editor.org/rfc/rfc9147.html#section-4.2.1-3

This means we must keep sending the NST at the old epoch. But the peer may have 
no idea there's a message at that epoch due to packet loss! Section 8 does ask 
the peer to keep the old epoch around for a spell, but eventually the peer will 
discard the old epoch. If NST(6) didn't get through before then, the entire 
post-handshake stream is now wedged!

I think this means we need to amend Section 8 to forbid sending *any* 
post-handshake message after KeyUpdate. That is, rather than saying you cannot 
send a new KeyUpdate, a KeyUpdate terminates the post-handshake stream at that 
epoch and all new post-handshake messages, be they KeyUpdate or anything else, 
must be enqueued for the new epoch. This is a little unfortunate because a TLS 
library which transparently KeyUpdates will then inadvertently introduce 
hiccups where post-handshake messages triggered by the application, like 
post-handshake auth, are blocked.

That then suggests some more options for fixing the original problem.

7. Fix the sender's KeyUpdate criteria

We tell the sender to wait for all previous messages to be ACKed too. Fix the 
first paragraph of section 8 to say:

> As with other handshake messages with no built-in response, KeyUpdates MUST 
> be acknowledged. Acknowledgements are used to both control retransmission and 
> transition to the next epoch. Implementations MUST NOT send records with the 
> new keys until the KeyUpdate and all preceding messages have been 
> acknowledged. This facilitates epoch reconstruction (Section 4.2.2) and 
> avoids too many epochs in active use, by ensuring the peer has processed the 
> KeyUpdate and started receiving at the new epoch.
>
> A KeyUpdate message terminates the post-handshake stream in an epoch. After 
> sending KeyUpdate in an epoch, implementations MUST NOT send any new 
> post-handshake messages in that epoch. Note that, if the implementation has 
> sent KeyUpdate but is waiting for an ACK, the next epoch is not yet active. 
> In this case, subsequent post-handshake messages may not be sent until 
> receiving the ACK.

And then on the receiver side, we leave things as-is. If the sender implemented 
the old semantics AND had multiple post-handshake transactions in parallel, it 
might update keys too early and then we get into the situation described in 
(1). We then declare that, if this happens, and the sender gets confused as a 
result, that's the sender's fault. Hopefully this is rare enough (did 
anyone even implement 5.8.4, or does everyone just serialize their 
post-handshake transactions?) not to be a serious protocol break? That risk 
aside, this option seems the most in spirit with the current design to me.
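A minimal sketch of the sender-side gating in option 7 (the class and method names are invented for illustration; this is not code or text from RFC 9147):

```python
class PostHandshakeSender:
    """Option 7's rule: records at the new epoch may be sent only once the
    KeyUpdate AND every earlier post-handshake message are ACKed, and the
    KeyUpdate terminates the old epoch's post-handshake stream."""

    def __init__(self) -> None:
        self.next_msg_seq = 0
        self.unacked: set[int] = set()       # message_seq values awaiting ACK
        self.key_update_seq: int | None = None

    def send(self, is_key_update: bool = False) -> int:
        if self.key_update_seq is not None:
            # KeyUpdate terminated this epoch's stream.
            raise RuntimeError("queue message for the next epoch")
        seq, self.next_msg_seq = self.next_msg_seq, self.next_msg_seq + 1
        self.unacked.add(seq)
        if is_key_update:
            self.key_update_seq = seq
        return seq

    def on_ack(self, seq: int) -> None:
        self.unacked.discard(seq)

    def may_use_new_epoch(self) -> bool:
        # KeyUpdate and all preceding messages must be acknowledged.
        return self.key_update_seq is not None and not self.unacked


sender = PostHandshakeSender()
nst = sender.send()                    # NewSessionTicket(0)
ku = sender.send(is_key_update=True)   # KeyUpdate(1)
sender.on_ack(ku)
assert not sender.may_use_new_epoch()  # NST(0) still outstanding
sender.on_ack(nst)
assert sender.may_use_new_epoch()      # now safe to switch keys
```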

8. Decouple post-handshake retransmissions from epochs

If we instead say that the same epoch rule only applies for the handshake, and 
not post-handshake messages, I think option 5 (process KeyUpdate out of order) 
might become viable? I'm not sure. Either way, this seems like a significant 
protocol break, so I don't think this is an option until some hypothetical DTLS 
1.4.


On Fri, Apr 12, 2024 at 6:59 PM David Benjamin <david...@chromium.org> wrote:
Hi all,

This is going to be a bit long. In short, DTLS 1.3 KeyUpdates seem to conflate 
the peer receiving the KeyUpdate with the peer processing the KeyUpdate, in 
ways that appear to break some assumptions made by the protocol design.

When to switch keys in KeyUpdate

So, first, DTLS 1.3, unlike TLS 1.3, applies the KeyUpdate on the ACK, not when 
the KeyUpdate is sent. This makes sense because KeyUpdate records are not 
intrinsically ordered with app data records sent after them:

> As with other handshake messages with no built-in response, KeyUpdates MUST 
> be acknowledged. In order to facilitate epoch reconstruction (Section 4.2.2), 
> implementations MUST NOT send records with the new keys or send a new 
> KeyUpdate

Re: [TLS] TLS 1.3, Raw Public Keys, and Misbinding Attacks

2024-04-16 Thread Tschofenig, Hannes
Hi John,

I missed this email exchange and I largely agree with what has been said by 
others before.

I disagree with your conclusion, since the “identity” in the raw public key 
case is the public key.
With a self-signed certificate there would be the danger that the self-asserted 
identity in the certificate is actually used for something.

Ciao
Hannes


From: TLS  On Behalf Of John Mattsson
Sent: Thursday, March 28, 2024 4:22 PM
To: TLS@ietf.org
Subject: [TLS] TLS 1.3, Raw Public Keys, and Misbinding Attacks

Hi,

I looked into what RFC 8446(bis) says about Raw Public Keys. As correctly 
stated in RFC 8446, TLS 1.3 with signatures and certificates is an 
implementation of SIGMA-I:

SIGMA does, however, require that the identities of the endpoints (called A and 
B in [SIGMA]) are included in the messages. This is not true for TLS 1.3 with 
RPKs, which is therefore not SIGMA. TLS 1.3 with RPKs is 
vulnerable to what Krawczyk’s SIGMA paper calls misbinding attacks:

“This attack, to which we refer as an “identity misbinding attack”, applies to 
many seemingly natural and intuitive protocols. Avoiding this form of attack 
and guaranteeing a consistent binding between a session key and the peers to 
the session is a central element in the design of SIGMA.”

“Even more significantly we show here that the misbinding attack applies to 
this protocol in any scenario where parties can register public keys without 
proving knowledge of the corresponding signature key.”

As stated in Appendix E.1, at the completion of the handshake, each side 
outputs its view of the identities of the communicating parties. One of the TLS 
1.3 security properties is “Peer Authentication”, which says that the client’s 
and server’s view of the identities match. TLS 1.3 with RPKs does not fulfill 
this unless the out-of-band mechanism to register public keys proved knowledge 
of the private key. RFC 7250 does not say anything about this either.

I think this needs to be clarified in RFC8446bis. The only reason to ever use 
an RPK is in constrained IoT environments. Otherwise a self-signed certificate 
is a much better choice. TLS 1.3 with self-signed certificates is SIGMA-I.

It is worrying to find comments like this:

“I'd like to be able to use wireguard/ssh-style authentication for my app. This 
is possible currently with self-signed certificates, but the proper solution is 
RFC 7250, which is also part of TLS 1.3.”
https://github.com/openssl/openssl/issues/6929

RPKs are not the proper solution.

(Talking about misbinding, does RFC 8446 say anything about how to avoid selfie 
attacks where an entity using PSK authentication ends up talking to itself?)

Cheers,
John Preuß Mattsson

[SIGMA] https://link.springer.com/chapter/10.1007/978-3-540-45146-4_24



Re: [TLS] Deprecating Static DH certificates in the obsolete key exchange document

2024-04-16 Thread Filippo Valsorda
2024-04-15 20:14 GMT+02:00 Joseph Salowey :
> Should the draft deprecate these ClientCertificateTypes and mark the entries 
> (rsa_fixed_dh, dss_fixed_dh, rsa_fixed_ecdh, ecdsa_fixed_ecdh) as 'D' 
> discouraged?

Oh, yes.


Re: [TLS] Dnsdir early review of draft-ietf-tls-wkech-04

2024-04-16 Thread Stephen Farrell


Hi David,

Thanks for taking the time to review this.

On 15/04/2024 23:44, David Blacka via Datatracker wrote:

Reviewer: David Blacka
Review result: Ready with Issues

This is an early review, so the actual status simply means that I didn't find
anything alarming in this draft.


Ta. The authors do agree that it's not actually ready though
so no need to be so nice:-)


At its core, this I-D is a registration for a well-known URI, using the
criteria described in RFC 8615.  The use of this well-known URI is that a
separate software component from the web server itself can poll the URI and,
based on the response, update DNS RRsets.  This seems pretty straightforward.

The JSON format is an encoding of the SVCB ServiceParams, plus the priority,
target, and "regeninterval" fields.  This makes sense, since we are asking the
"Zone Factory" to generate a SVCB or HTTPS record from the data.  This leads to
some obvious questions:

* What happens if there are unknown keys in the JSON?  (e.g., is the response
considered invalid?


Yeah, that's TBD for now. Authors need to chat about it and
probably reflect on how we've seen new tags for SVCB being
added over the last while.


Or does the Zone Factory ignore them and create the RRs
anyway?)

* How are changes to the underlying SVCB service parameter registry
handled?  This I-D asks IANA to create another registry for the JSON fields.
Does this have to "keep up" with the SVCB IANA registry?


Good questions that'll need answering. Will create GH issues
for those (and for issues raised in Martin Thomson's review);
I plan to create those this week when I get time.


This I-D talks about web servers running in "split mode".  Is this a common
term in the web world?  Is there a reference to this practice?  


It's not common in the web world, but is part of the design
of ECH and defined in that draft. I added a sentence saying
that to my local copy.


If not, could
it be described more completely? I found the abbreviation "BE" to be jarring,
possibly more so because it is used without any English articles.


We already changed some terminology based on Martin's review
so those all now say "backend" rather than "BE."


Since I don't really understand "split-mode" (which is presumed to be the norm
based on the example), I don't understand why the distinction is relevant to
the proposal?  Does the Zone Factory behave differently if the web server is in
"split-mode"?  Does the Zone Factory behave differently if the web server is in
"split-mode"?  Section 5 suggests that it does, but I'm not sure exactly what
is going on there.


Yeah, there're a few things still to be figured out wrt split
mode but we'll be working on it next week or so - probably ok
to keep an open issue for that.


I found the term "Zone Factory" a bit odd as well, but I couldn't think of a
better name.  "Zone Agent"?  "SVCB Update Client"?


ZF still seems better to me, but we'll doubtless get feedback
as the draft progresses in the WG and gets further dnsdir
review as things settle down.


The I-D in section 6 says:

 ZF SHOULD set a DNS TTL short enough so that any cached DNS resource
 records are likely to have expired before the JSON object's content is
 likely to have changed. The ZF MUST attempt to refresh the JSON object and
 regenerate the zone before this time. This aims to ensure that ECHConfig
 values are not used longer than intended by BE.

This could be couched more precisely in terms of "regeninterval".  We might
want to avoid being overly prescriptive, though.  Something like "The ZF SHOULD
set a DNS TTL less than 'regeninterval'", perhaps.


WFM. Made that change locally.


In Section 6 (and maybe section 3), it isn't spelled out how the Zone Factory
determines the "owner" of the SVCB and/or HTTPS records.  I only ask about this
because, if it isn't the domain part of the well-known URI used, then it should
be accounted for in the JSON format.

I'll also note that this early I-D does have a number of obvious typos, at
least one was noticed by the ART reviewer:

* "For many applications, that requires publication of ECHConflgList data
structures in the DNS" -- there is an ell masquerading as an i.

* "Zone factory (ZF): an entity that has write-accsss to the DNS" -- should be
"access".


Ta. Fixed locally.


There are likely others.


Doubtless:-)

Thanks again,
S.

PS: In case someone cares - the draft may well expire before we
get -05 out, but it should be a short interregnum:-)










Re: [TLS] Deprecating Static DH certificates in the obsolete key exchange document

2024-04-16 Thread Peter Gutmann
Joseph Salowey  writes:

>At IETF 119 we had discussion that static DH certificates lead to static key
>exchange which is undesirable.

Has anyone ever seen one of these things, meaning a legitimate CA-issued one
rather than something someone ran up in their basement for fun?  If you have,
can I have a copy for the archives?

The only time I've ever seen one was some custom-created ones for S/MIME when
the RSA patent was still in force and we were supposed to pretend to use
static-ephemeral DH for key transport instead of RSA.

Peter.



Re: [TLS] DTLS 1.3 sequence number lengths and lack of ACKs

2024-04-16 Thread David Benjamin
Regarding UTA or elsewhere, let's see how the buffered KeyUpdates issue
pans out. If I haven't missed something, that one seems severe enough to
warrant an rfc9147bis, or at least a slew of significant errata, in which
case we may as well put the fixups into the main document where they'll be
easier for an implementer to find.

Certainly, as someone reading the document now to plan an implementation, I
would have found it much, much less helpful to put crucial information like
this in a separate UTA document instead of the main one, as these details
influence how and whether to expose the 8- vs 16-bit choice to Applications
Using TLS at all.

David





Re: [TLS] DTLS 1.3 sequence number lengths and lack of ACKs

2024-04-16 Thread Tschofenig, Hannes
Fair enough. I don’t have a strong preference as long as we document it 
somewhere.

Ciao
Hannes



Re: [TLS] Issues with buffered, ACKed KeyUpdates in DTLS 1.3

2024-04-16 Thread David Benjamin
Thanks, Hannes!

Since it was buried in there (my understanding of the issue evolved as I
described it), I currently favor option 7, i.e., the sender-only fix to the
KeyUpdate criteria.

At first I thought we should also change the receiver to mitigate unfixed
senders, but this situation should be pretty rare (most senders will send
NewSessionTicket well before they KeyUpdate), DTLS 1.3 isn't very widely
deployed yet, and ultimately, it's on the sender implementation to make
sure all states they can get into are coherent.

If the sender crashed, that's unambiguously on the sender to fix. If the
sender still correctly retransmits the missing messages, the connection
will perform suboptimally for a blip but still recover.

David



Re: [TLS] [rfc9147] Clarification on DTLS 1.3 CID Retirement and Usage

2024-04-16 Thread Tschofenig, Hannes
Hi Kristijan,

searching through the mailing list I found this mail. Sorry for the late 
response.

The CID design in DTLS 1.3 was not focused on multi-homing use cases. It was 
not a design goal; you would have to design an extension in the style of what 
is currently happening with QUIC or what was previously done with MOBIKE.

Ciao
Hannes

From: TLS  On Behalf Of Kristijan Sedlak
Sent: Sunday, December 10, 2023 11:50 AM
To:  
Subject: [TLS] [rfc9147] Clarification on DTLS 1.3 CID Retirement and Usage

Dear IETF TLS Working Group,

I am reaching out to seek clarification on specific aspects of Connection ID 
(CID) management in DTLS 1.3, as detailed in RFC 9147.

The current specification delineates the process for issuing new CIDs via a 
NewConnectionId message. However, the methodology for retiring old CIDs seems 
subject to various interpretations.

Is it correct to assume that an endpoint dictates the number of active CIDs it 
manages and that CIDs should be utilized in the sequence they are provided? For 
example, if the initial negotiated CID is 0 and an endpoint subsequently issues 
NewConnectionId with CIDs 1, 2, and 3, my interpretation is that upon receiving 
the first datagram from a new path (which is also applicable for an existing 
path), the records should ideally be tagged with the next CID (1, 2, or 3) 
rather than CID 0. This approach suggests that upon the reception of a higher 
CID, lower CIDs should be considered retired and later removed.

This understanding implies that CIDs in DTLS 1.3 are not designed for multipath 
operations, and it is anticipated that only one path (one CID) would be active 
at a given time. Could you please confirm if this interpretation is in 
alignment with the intended specifications, or offer additional insights into 
the appropriate management of CIDs in DTLS 1.3? Including such clarification in 
the RFC would be invaluable in mitigating potential confusion.

Thank you.
Kristijan
