Hi!

> Let's begin discussing the prospect of adding libsodium as a core extension
> in PHP 7.1. I've updated the RFC to explain why this would be a good idea
> and the benefits it offers.
> 
> https://wiki.php.net/rfc/libsodium
> 
> If the subsequent discussion goes smoothly, I would like to open voting on
> June 15.

Some notes based on the docs in https://paragonie.com/book/pecl-libsodium:
TLDR: I think the idea of having something like this in core is very
nice, but I think it needs some tweaks in order to be a solution we can
recommend as core module.

- I would appreciate deeper namespacing. It is rarely that you would use
all sub-modules (such as encryption, both symmetric and asymmetric,
password hashing, decryption, random, etc.) in the same piece of code.
So if I could just "use Sodium\Crypto;" and then use short names it
would be much nicer than having to spell out the whole story.

- Also, I'm not sure why functions like
crypto_aead_chacha20poly1305_encrypt() exist. Shouldn't algorythm be a
parameter to encryption function and not function name? Putting
parameter into function names makes it much harder to be flexible and
configurable. Or, if there's only one, then why not just call it
encrypt() and state the fixed algorithm in the docs?

- The function names in general are kind of all over the place. I.e., to
get a random, you get:
- for string: _buf
- for ranged (I assume unsigned?) integer: _uniform
- for 16-bit unsigned integer: _random16
If you can notice a pattern here, I can't. Also, I don't see why you
have a function for 16-bit int but not for 8-bit or 32-bit. Additionally
weird that _uniform gets int, but has limit different from PHP int.

- Naming/namespacing of constant-time function should make it clear they
are there because they are constant time, not just because somebody
wanted to reimplement bin2hex. It's also unclear why we need memcmp as
separate function from hash_equals().
Also, I'm still not convinced bin2hex in PHP even has a timing problem.
I haven't seen anything but vague generic theoretising in this regard.
Same with hex2bin (in fact, even more since hex2bin doesn't even have
index lookups).

- What exactly memzero does in PHP? Looks like it accepts argument
by-ref, which means if you do something like:

function foo($secret_key) {
// do stuff with key
memzero($secret_key);
}

and then:

$key = get_key(); foo($key);

then $key still has the key. So I'm not sure how that all memory wiping
is still working in practice. I mean, it may work with random keys that
you generate, use once and immediately destroy, but in any other
scenario it's just wasting time. Even with random keys it's iffy since
it assumes your RNG state is either not in memory or is protected.
E.g. in the example at
https://paragonie.com/book/pecl-libsodium/read/09-recipes.md#encrypted-cookies
keys are meticulously erased, but since they are all deterministically
derived from $this->key (and cookie name which is public) which stays in
memory, what exactly is the point?

- increment() needs clearer explanation what it does (e.g. does it
accept any string? What exactly does it do to that string?) and also it
seems to be it would be nicer to just return the value. Or at least have
this option - mutation in place is not always what you want.

- Why version functions are functions and not constants?

- Names like "secretbox" are weird to me - what exactly is "secretbox" -
is it encryption? Decryption? hashing? Is it symmetric or asymmetric? It
would be nicer to use established and more clear terminology. Same with
other names - e.g. crypto_auth - what it does? Does it check the message
for authenticity or generates the signature for somebody else to check?
Turns out the latter, but it's not clear from the name.

- Same with crypto_box() - actually it differs from crypto_secretbox()
by... try to guess what. You would never guess. The former is
asymmetric, the latter is symmetric.

- It's not clear why keypair parameter is string - if it's two keys, how
it is one string? Also, why you need a keypair to encrypt-decrypt if
you'd use only one key in each case (public or private)? What you do if
you only have public key - how would you decrypt a message? Can you
decrypt a message at all if you don't have secret key - and if so, how?

- crypto_box_keypair_from_secretkey_and_publickey seems to use secret
and public key from different people, so the result is not what is
usually called a key pair (set of public+private keys belonging to one
user) - in fact, I don't know what the result is. Could you explain?

- After looking into crypto_sign_open I understands that it separates
the message from attached signature and verifies the signature. But it's
a bit confusing as it doesn't actually open anything.

- It is unclear why you need different functions to generate encryption
keypair and signing keypair.

- The API never seems to either document or allow any choice of the
algorithms. While it can be a boon for simplicity, I do not see how such
code could be made to interoperate with code in other languages and
other systems. I mean, I can generate the signature easily, but how
would I check such signature in, say, Python or Java? The docs give no
clue. It's OK to support only one algorithm if it's considered superior
(though what the users would do if it turns out not to be the case?) but
at least it should be thoroughly documented.

- If crypto_pwhash_str highly recommends using provided constants, why
not make these parameters optional with defaults being recommended
constants?

- crypto_pwhash_scryptsalsa208sha256 looks awfully specific (and long).
Is that the only algorithm supported? If yes, do we need to spell it out
in the name (and not the docs)? If not, why it's not a parameter?

- Is there any support for importing/exporting keys in any of the
established formats? Is the library inter-operable with any of the
existing key/data interchange formats?

- Function names like crypto_sign_ed25519_sk_to_curve25519 seem to not
be very usable to common people. The docs says "Transform crypto_sign
key into crypto_box key" - maybe the name should reflect it?
Names like crypto_stream_xor have a similar problem - what it does is
encrypts stream with a key, according to docs. Looks like it uses XOR
for it, but is it the most important part for the user or the fact it
encrypts? Maybe it should be crypto_stream_encrypt or something? Also,
does it really do just XOR? Because XOR is not really the best idea to
do even if you don't care about authenticity - due to trivial chosen
plaintext attack. Probably it does more than mere XOR, but then I'm not
sure what exactly.

- Why one would use crypto_scalarmult? That looks like a random piece of
low level API without clear indication why it exists.

- What is the difference between crypto_stream and randombytes_buf? Both
seem to produce a random string of given length. I assume crypto_stream
does more but the docs do not explain.

- For crypto_kx, it is not clear what exactly this function does and why
the same parameter is passed twice in different places. I mean, if you
know the details of how DH key exchange works, it may make sense, but if
we want to make it simple for the common user, it needs some better
explanations and maybe better parameters.

- Names like CRYPTO_PWHASH_SCRYPTSALSA208SHA256_OPSLIMIT_INTERACTIVE
look very painful. Yes, people have IDEs and copy-paste, but imaging to
have to type it just once over ssh connection in vim, where copy-paste
is not working. Can't we have something more concise and put the rest in
the docs?

This came out longer than I expected (hopefully somebody made this far?)
and more disorganised, so I provide a quick summary:
- Need better naming, both namespaces and functions
- Need better docs explaining what the functions do and why

The stated goal is "You shouldn't need a Ph.D in Applied Cryptography to
build a secure web application." I fully agree with this goal. I however
feel that current implementation, while making admirable progress
towards this goal, still needs some work to actually achieve it.

-- 
Stas Malyshev
smalys...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to