Re: [PHP-DEV] [RFC] Libsodium - Discussion

Stanislav Malyshev Sun, 05 Jun 2016 15:40:51 -0700

Hi!

> This would entail a BC break against all software currently written
> using libsodium.
> Are you certain that deeper namespacing would be worth that trade-off?


This is certainly a concern. But this is solvable - we can have aliases,
for example, that would be compiled only in PECL build.

> Libsodium offers few primitives. For AEAD, you have:
> 
> - ChaCha20-Poly1305
> - AES-256-GCM (but only with modern hardware)
> - ChaCha20-Poly1305 (IETF version with a bigger nonce)
> 
> These are distinct functions in the C API to discourage "Hey, I'll add
> another algorithm and users can access it by changing the string
> that's passed".

I think somehow separating primitives from recommended API would be
useful. Or maybe it's just documenting...

> When all is said and done, crypto_aead_encrypt() and
> crypto_aead_decrypt() will be the CAESAR winner.
> https://competitions.cr.yp.to/caesar.html

That seems to be 1.5 years from now, so what the user would do now?

> To be honest, most of these aren't necessary in PHP anymore. They were
> mostly useful in PHP 5.x before we had random_bytes() and
> random_int().

OK, then I assume we can mark those as something like "use
random_bytes/random_int in PHP 7.x"?

> Alternatively, if we can make bin2hex() and hex2bin() always
> constant-time in PHP

I think you mean time not dependent on argument content, but only on
length?

> These are the facts:
> 
> 1. Index lookups based on the contents cryptographic secrets is a
> recipe for microarchitecture side-channels (e.g. FLUSH+RELOAD).

That's the theory, and even that part of theory only applies to bin2hex,
as far as I can see. It also assumes lookups within 16 bytes are
distinguishable - and not just distinguishable but distinguishable
enough so that you can see them through layers of overhead of PHP, PHP
code, server infrastructure, etc., which sounds doubtful for me on a
modern architecture.
For hex2bin, even that doubtful theory seems to be not applicable.

> 2. The current implementation of all RFC 4648 encoding functions that
> PHP has (which, strangely, doesn't include base32) violates rule 1.
> 
> https://cryptocoding.net/index.php/Coding_rules#Avoid_table_look-ups_indexed_by_secret_data

That again is theory, as above. It says it depends on cache hit/miss.
Line size in most common Intel processor is 64 bytes, on ARM IIRC it's
32 bytes, while hexconvtab is 16 bytes long. So I'm not sure how would
you get any cache hit/miss sensitivity here. Am I missing something?

> Right, it has to be used with care. But the existence of a
> well-written cross-platform memory-zeroing function is almost reason
> enough to vote in favor of libsodium.

The problem is that no amount of care would actually help in a common
scenario. Due to PHP's copy-on-write semantics, it just wouldn't do what
you think it would do, in most cases. The only way it would work if you
ensure no reference to this variable ever exists throughout the code
except for one passed to memzero. And this is usually not easy to
achieve. So I think you overestimate the usefulness of it,
unfortunately. It _might_ work for temporary keys that are created and
destroyed in the same function without storing them anywhere, but in
most scenarios these keys are derived from stored keys anyway.

> That's a good question:
> http://www.daemonology.net/blog/2014-09-06-zeroing-buffers-is-insufficient.html

"Insufficient" is kind of misleading there - it's not merely
"insufficient", it is not adding practically anything, the level of
compromise is the same before and after.

> It's used for incrementing a nonce for, e.g. CTR mode. In little-endian.

You mean treat the whole string as little-endian representation of the
arbitrary-sized integer and increment that integer and return the
result? What if the size of the resulting integer is larger - does it
reallocate or overflow? Should be documented.

> See http://nacl.cr.yp.to

Are we bound by naming decisions of this project? Maybe, then we need
some good explanation of what they actually mean.

>> - crypto_box_keypair_from_secretkey_and_publickey seems to use secret
>> and public key from different people, so the result is not what is
>> usually called a key pair (set of public+private keys belonging to one
>> user) - in fact, I don't know what the result is. Could you explain?
> 
> A keypair is used in some APIs (crypto_box_seal_open comes to mind).
> All that function does is concatenate the two.
> 
> It has nothing to do with key exchange.

That is rather unfortunate reuse of the terminology. Can we use some
other term in PHP API or docs? Also, why concatenating two keys is
useful - are they then used as a single key? Or they are de-concatenated
internally? If the latter, it would be better to pass them as separate
parameters.

> Libsodium is an opinionated cryptography library that doesn't give you
> a lot of levers to pull -- only secure defaults you can't change,
> consisting of carefully-selected cryptography primitives.

OK, I can see this as a valid approach, but then:
- The choices have to be *very* thoroughly documented, otherwise
interoperating with such library is nearly impossible
- The approach should be consistent - i.e. the underlying algorithms
should be clearly segregated.

>> While it can be a boon for simplicity, I do not see how such
>> code could be made to interoperate with code in other languages and
>> other systems.
> 
> That isn't its purpose. Its purpose is to be secure. (It also happens
> to also be extremely fast.)

I would say such approach can have its place, but one has to be very
careful using it for any system that has any chance of interacting with
a wider world. I.e. encrypting internal data - fine, but any external
communication - one then would need to use something with a wider
appeal. But then the question one would ask is - why have two crypto
libraries in one code base? Not an argument against having this lib,
just food for thought.

> https://download.libsodium.org/libsodium/content/public-key_cryptography/authenticated_encryption.html#algorithm-details

This is a 404 for me.

> This is how a stream cipher works:
> 
> 1. You take your key and nonce, and generate a long string of random
> bytes deterministically from your key and nonce.
> 2. To encrypt: You XOR every byte of plaintext with a corresponding
> byte of the keystream.
> 3. To decrypt: You XOR every byte of _ciphertext_ with a corresponding
> byte of the keystream.
> 
> This is unauthenticated encryption; don't use it unless you're
> building something with it and authenticate elsewhere.

OK, so the _xor function does both 1 and 2? That sounds sensible, but
probably needs to be explained in the docs, to avoid misunderstanding.
Also, name like stream_cypher or something like that may be better, I'm
not sure exposing the fact that XOR is being used is that important.

>> - Why one would use crypto_scalarmult? That looks like a random piece of
>> low level API without clear indication why it exists.
> 
> It's used to derive a shared secret from an X25519 secret key and an
> X25519 public key.

Ok, so this is a part of underlying low-level API? That's fine but it
could be nicer if one could somehow follow which functions are low and
high level and how they relate.

>> - What is the difference between crypto_stream and randombytes_buf? Both
>> seem to produce a random string of given length. I assume crypto_stream
>> does more but the docs do not explain.
> 
> crypto_stream is deterministic; randombytes_buf is non-deterministic

Ah, ok, now I see it. So it's the step 1 in crypto_stream_xor above.
That makes total sense, but I think should be explained. Note that for
*you* as the developer it may seem completely obvious and me asking
about it sounds like nitpicking. But you know the overall picture and
the intent of the API, you knew it in your mind before you wrote the
whole thing. I, on the other hand, have only the docs to rely on, and
have to discover that intent from those docs. So I think there should be
more clues to it :)

>> - For crypto_kx, it is not clear what exactly this function does and why
>> the same parameter is passed twice in different places. I mean, if you
>> know the details of how DH key exchange works, it may make sense, but if
>> we want to make it simple for the common user, it needs some better
>> explanations and maybe better parameters.
> 
> This is basically a multi-step crypto_scalarmult in one complete pacakge.

This seems to be leaking details of the implementation somewhat through
the abstraction. If we define the goal of API as to provide abstraction
that allows to move from set of keys to shared secret, then I think the
API should look like:

$shared_secret = crypto_shared_secret($my_public_key, $my_private_key,
$their_public_key);

or something like that. In fact for DH I'm not even sure one needs
$my_public_key? And it may be possible to derive from secret key anyway,
so maybe just ($my_private_key, $their_public_key) - unless common case
is that we have both private and public keys already.
-- 
Stas Malyshev
smalys...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DEV] [RFC] Libsodium - Discussion

Reply via email to