Ilari Liusvaara writes:
> Security review of X-wing only needs to be done once.

"Of course we hope that any particular piece of security review can be
done just once and that's the end of it (OK Google, please read once
through the Chrome source code and remove the buffer overflows), but the
bigger picture is that many security failures are easily explained by
reviewer overload, so minimizing the complexity of what has to be
reviewed is a useful default policy. Minimizing TCBs is another example
of the same approach."

> Generic combiner would likely lead to lots of combinations.

What does "generic" mean here? What's the mechanism that will have a
"generic" combiner triggering "lots of combinations", and why _won't_
the same mechanism apply to QSF---the construction presented in the
X-Wing paper, used in X-Wing, and labeled in the paper as "generic"?

Some arguments seem to be starting from the idea that X-Wing _isn't_
using a "generic" combiner---but the paper says it is. People seem to
be using multiple incompatible meanings of the word "generic" here. How
about we just stop using the word? Any valid argument will survive
elimination of ambiguous terminology.

The actual technical difference is between (A) combiners making security
guarantees assuming the input KEM is IND-CCA2 and (B) combiners making
security guarantees assuming the input KEM is IND-CCA2 and "ciphertext
collision resistant". (Plus further assumptions in B if the goals of
https://eprint.iacr.org/2021/708 and https://eprint.iacr.org/2021/1351
are within scope. The specific A proposals that we're looking at handle
those goals automatically, since they hash the full public keys.)
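
To make the A/B distinction concrete, here is a rough Python sketch of
the two shapes. The field names, ordering, and label are placeholders
for illustration, not the exact encodings used by X-Wing or Chempat-X:

   import hashlib

   def combine_B(ss_pq, ss_ecdh, ct_ecdh, pk_ecdh):
       # Hashes only part of the transcript: the post-quantum KEM's
       # ciphertext and public key are omitted, so the security argument
       # needs "ciphertext collision resistance" on top of IND-CCA2.
       partial = ss_pq + ss_ecdh + ct_ecdh + pk_ecdh
       return hashlib.sha3_256(partial + b"example-label").digest()

   def combine_A(ss_pq, ss_ecdh, ct_pq, ct_ecdh, pk_pq, pk_ecdh):
       # Hashes the full transcript: both shared secrets, both
       # ciphertexts, both public keys. IND-CCA2 of the inputs suffices.
       transcript = ss_pq + ss_ecdh + ct_pq + ct_ecdh + pk_pq + pk_ecdh
       return hashlib.sha3_256(transcript + b"example-label").digest()

(In a real specification the inputs are fixed-length, so plain
concatenation is unambiguous.)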

> And I think it is much easier to understand an integrated thing that is
> unsafe to change than that the component KEMs must be <some-gibberish>
> (oh, and then being used with stuff that fails the requirement).

I agree that what's presented to applications (or to higher-level
protocols such as TLS) should be a full combined KEM reviewed by CFRG.
All of the proposals on the table (original X-Wing; X-Wing with the
modifications that I recommended; Chempat-X) are full combined KEMs.

CFRG approval (or approval by a WG such as TLS) is always for exactly
the specified mechanism. I don't object to systematically saying so in
CFRG RFCs---but I also see no evidence that _implementors who read the
RFCs_ are misunderstanding this in the first place.

Meanwhile there's ample evidence of implementors doing things that specs
_don't_ allow. Paying attention to how this happens often allows it to
be predicted and mitigated. This is better---for the end users, which is
what matters!---than blaming implementors.

Example: there are books and papers presenting variable-width ECC ladders
(for obvious reasons: this works for all scalars without the width as an
extra parameter). Telling implementors to use fixed-width ladders
sometimes works, but what if the implementors aren't listening?

X25519 arranges ECC secret keys to have the leading 1 at a fixed
position. Ed25519 uses double-length nonces. Sure, these protections
aren't guaranteed to be used (implementors without good tests aren't
_forced_ to use the 1; implementors often reduce nonces mod the group
order), but common sense says that they reduce risks, and subsequent
ecosystem observations (https://blog.cr.yp.to/20191024-eddsa.html)
showed that this proactive strategy was successful.
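
For concreteness, here is the X25519 clamping step from RFC 7748 written
out in Python (just the well-known bit manipulation, not a complete
X25519 implementation):

   def clamp_x25519_scalar(k: bytes) -> int:
       # RFC 7748 scalar decoding ("clamping"): force the secret scalar
       # into a fixed shape before the ladder.
       assert len(k) == 32
       b = bytearray(k)
       b[0] &= 248    # clear the 3 low bits: multiple of the cofactor 8
       b[31] &= 127   # clear bit 255
       b[31] |= 64    # set bit 254: the leading 1 at a fixed position
       return int.from_bytes(b, "little")

A ladder whose iteration count depends on the scalar's bit length sees
the same length for every clamped key, so even an implementor who skips
the fixed-width advice gets a fixed iteration count, provided the
clamped key is actually used.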

See https://cr.yp.to/talks.html#2015.01.07 for more examples, starting
with the Sony PS3 disaster, which was predicted by Rivest in 1992: "The
poor user is given enough rope with which to hang himself---something a
standard should not do." Sure, NIST never admitted responsibility for
the disaster (NIST hides behind the standard clearly saying nonces have
to be random), but we can and should do better than that.

Similarly, making sure that combiners hash the full transcript, so that
implementors plugging in KEMs that don't do this are protected, is more
robust than warning implementors not to plug in those KEMs.

> And things are made worse by different protocols tending to do things
> subtly differently even when using the same KEMs.

Certainly having multiple combiners adds to the security-review load.
This argument favors having just one combiner across protocols _and_
having just one combiner across underlying KEMs. What's the contrary
argument?

> None of this is theoretical risk; this has all been seen in practice.
> Unlike unsafe modified versions of X-Wing.

I do not agree that we should be waiting to see X-Wing-derived failures
before taking steps to prevent those failures.

> > In context, this speedup is negligible: about
> > 1% of the cost of communicating that data in the first place, never mind
> > other application costs.
> The cost is not negligible.

Sorry, can you clarify which step you're disagreeing with?

https://cr.yp.to/papers.html#pppqefs includes dollar costs for Internet
communication (roughly 2^-40 dollars/byte) and for CPU cycles (roughly
2^-51 dollars/cycle). Cycle counts for hashing (under 2^4 cycles/byte)
are readily available from https://bench.cr.yp.to. Combining these
numbers produces the 1% figure (2^-47/2^-40). 1% is negligible.
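
Spelling out that arithmetic (Python, but it is just exponent
bookkeeping with the rounded figures above):

   # Coarse estimates quoted above, not new measurements.
   dollars_per_byte_comm = 2**-40   # Internet communication
   dollars_per_cycle     = 2**-51   # CPU time
   cycles_per_byte_hash  = 2**4     # hashing speed

   dollars_per_byte_hash = cycles_per_byte_hash * dollars_per_cycle
   print(dollars_per_byte_hash / dollars_per_byte_comm)   # 2**-7, under 1%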

There was also a question about security tokens, where I laid out
details of a calculation concluding that communication takes 82ms (plus
lower-layer overhead) while this hashing takes 2ms. The CPUs there cost
roughly $1, a smaller fraction of the token's cost than a desktop CPU is
of a desktop computer's cost.

> If somebody starts blindly replacing algorithms, there is a much higher
> risk of an actual security problem caused by choosing bad algorithms
> than by omitting some hashing.

How much higher? Where is that number coming from? Why is "blindly" a
useful model for real-world algorithm choice?

To be clear, I'm also concerned about the risk of people plugging in
KEMs that turn out to be breakable. The basic reason we're applying double
encryption is that, as https://kyberslash.cr.yp.to illustrates, this is
a serious risk even for NIST's selected KEM! But I don't see how that
risk justifies picking a combiner that's unnecessarily fragile and
unnecessarily hard to review.

> No real attacker is looking at how to break IND-CCA. They are looking
> at how to attack actual systems. The value of IND-CCA is that, despite
> the unrealistically strong attack model, most algorithms that fail it
> tend to fail in rather dangerous ways. Which is not at all clear for a
> combiner with insufficient hashing.

What's the source for these statistics? Is the long series of
Bleichenbacher-type vulnerabilities, including

   https://nvd.nist.gov/vuln/detail/CVE-2024-23218

from Apple a week ago, being counted as just one failure?

I agree that the security goals we're aiming for here aren't as
_obviously_ necessary as, e.g., stopping attackers from figuring out
secret keys. But I'd be horrified to see a car manufacturer or plane
manufacturer saying "We don't really have to worry whether our seatbelts
work since they're only for disasters". We should be assuming there's a
disaster and making sure our seatbelts do what they're supposed to do.

---D. J. Bernstein
