> On Nov 24, 2015, at 10:06 AM, Eric Rescorla <e...@rtfm.com> wrote:
> 
> 
> 
> On Tue, Nov 24, 2015 at 9:53 AM, Mike Hamburg <m...@shiftleft.org> wrote:
> 
>> 
>> In general, servers have signature keys, not static DH keys. QUIC bridges 
>> this by
>> having the server generate an offline signature over a static DH key, but 
>> TLS explicitly
>> rejected this as a generic approach because of concerns about the impact of 
>> producing
>> a long-term delegated credential, especially if generic TLS credentials 
>> could be used to
>> do so (see the extensive discussion a while back on the mailing 
>> list as well
>> as [0]). So, the current design requires the server to prove present 
>> possession of the
>> signing key, not just that it possessed it at some point.
>> 
>> It's correct that demonstrating proof of possession of a long-term DH share 
>> is
>> somewhat faster than signatures. There are two potential ways to do this with
>> TLS while retaining the guarantees above:
> 
> Correct-ish. For example, the current implementation of ed448 takes 463k 
> skylake cycles (new cpu, top of the chart, I'm on a phone, sorry) to compute 
> ecdh, which would need to happen twice. But it takes 162kcy to sign and 509k 
> to verify, for a total of 671k vs 926k.  Signing favors the server while 
> double DH favors the client; there are good reasons to go in either direction 
> in this.
> 
> In general, I tend to want to favor the server, I think :)

Actually, I did a quick test and the signatures still look faster when 
optimized.  The hitch is that the dual scalarmul algorithm must be side-channel 
protected, which is pretty expensive.  I did a quick implementation this 
morning with libdecaf on Haswell.  I’m sure it could be done better, but here’s 
a rough bench, not counting point encode and decode.

For Ed448, a protected fixed-base scalarmul (for signing) takes 109kcy and an 
unprotected verification double-scalarmul takes 517kcy, but a protected 
dual-scalarmul takes 690kcy which is more than both together.
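
As a rough check of that arithmetic, here is a minimal sketch that tallies the Ed448 figures above. The dictionary keys and the function name are made up for illustration and are not libdecaf identifiers; the numbers are the kilocycle counts quoted in the previous paragraph.

```python
# Ed448 cycle counts in kilocycles (kcy), copied from the Haswell bench above.
# Key names are illustrative only, not libdecaf API names.
ED448_KCY = {
    "protected_fixed_base": 109,  # fixed-base scalarmul used for signing
    "unprotected_double": 517,    # double-scalarmul used for verification
    "protected_dual": 690,        # side-channel-protected dual scalarmul
}


def compare(bench):
    """Return (sign + verify total, protected dual cost, dual minus total)."""
    signature_total = bench["protected_fixed_base"] + bench["unprotected_double"]
    dual = bench["protected_dual"]
    return signature_total, dual, dual - signature_total


sig_total, dual, extra = compare(ED448_KCY)
print(f"sign + verify: {sig_total} kcy; protected dual: {dual} kcy; extra: {extra} kcy")
```

On these figures, sign plus verify comes to 626kcy, so the protected dual-scalarmul costs about 64kcy more.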

For Ed25519, these take 40kcy, 164kcy and 236kcy respectively; again the dual 
scalarmul is slower.

Note that these numbers come from an internal testing version, so they’re not 
as optimized as the SUPERCOP numbers; as mentioned, they also exclude point 
encode and decode.

So yeah, you’d need ECMQV to get a speed boost from this, or you’d need to care 
more about the client computation and bandwidth than the server computation.
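
To make the client/server split concrete, here is a small sketch using the Skylake ed448 figures quoted earlier in the thread (463kcy per ECDH, 162kcy to sign, 509kcy to verify). It assumes that "twice" means one extra ECDH on each side, which is my reading of the quoted text rather than something stated outright; the function and key names are hypothetical.

```python
# Quoted Skylake ed448 figures, in kilocycles (kcy).
ECDH_KCY = 463
SIGN_KCY = 162
VERIFY_KCY = 509


def per_side_costs():
    """Extra authentication cost per side for each approach.

    Assumption: the double-DH approach adds one ECDH on each side.
    """
    return {
        "signature": {"server": SIGN_KCY, "client": VERIFY_KCY},
        "double_dh": {"server": ECDH_KCY, "client": ECDH_KCY},
    }


costs = per_side_costs()
for name, side in costs.items():
    total = side["server"] + side["client"]
    print(f"{name}: server {side['server']} kcy, client {side['client']} kcy, total {total} kcy")
```

Under that assumption the totals match the quoted 671k vs 926k, with the server paying less under signatures and the client paying less under double DH.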

Cheers,
— Mike