On Thu, Apr 13, 2017 at 11:22 PM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
> Hi Pieter and all, > > On Thu, Apr 13, 2017 at 5:11 PM, Pieter Hordijk <i...@pieterhordijk.com> > wrote: > > > Is this really something we need in our official docs instead of for > > example > > on a personal blog? > > > > I wrote draft doc patch. > Please verify. > > Index: en/reference/hash/functions/hash-hkdf.xml > =================================================================== > --- en/reference/hash/functions/hash-hkdf.xml (リビジョン 342317) > +++ en/reference/hash/functions/hash-hkdf.xml (作業コピー) > @@ -3,7 +3,7 @@ > <refentry xml:id="function.hash-hkdf" xmlns="http://docbook.org/ns/ > docbook" > xmlns:xlink="http://www.w3.org/1999/xlink"> > <refnamediv> > <refname>hash_hkdf</refname> > - <refpurpose>Generate a HKDF key derivation of a supplied key > input</refpurpose> > + <refpurpose>Derive secure new key from existing key by using > HKDF</refpurpose> > </refnamediv> > <refsect1 role="description"> > &reftitle.description; > @@ -16,6 +16,20 @@ > <methodparam > choice="opt"><type>string</type><parameter>salt</ > parameter><initializer>''</initializer></methodparam> > </methodsynopsis> > > + <para> > + RFC 5869 defines HKDF (HMAC based Key Derivation Function) which > + is general purpose KDF. HKDF could be useful for many PHP > + applications that require temporary keys, such CSRF token, > + pre-signed key for URI, password for password protected > + URI, and so on. > + </para> > + <note> > + <para> > + When info and length > + is not required for your program, more efficient > + <function>hash_hmac</function> could be used instead. > + </para> > + </note> > </refsect1> > <refsect1 role="parameters"> > &reftitle.parameters; > @@ -25,7 +39,7 @@ > <term><parameter>algo</parameter></term> > <listitem> > <para> > - Name of selected hashing algorithm (i.e. "sha256", "sha512", > "haval160,4", etc..) > + Name of selected hashing algorithm (i.e. "sha3-256", "sha3-512", > "sha256", "sha512", "haval160,4", etc..) > See <function>hash_algos</function> for a list of supported > algorithms. > <note> > <para> > @@ -39,7 +53,7 @@ > <term><parameter>ikm</parameter></term> > <listitem> > <para> > - Input keying material (raw binary). Cannot be empty. > + Input keying material. Cannot be empty. > </para> > </listitem> > </varlistentry> > @@ -60,7 +74,8 @@ > <term><parameter>info</parameter></term> > <listitem> > <para> > - Application/context-specific info string. > + Application/context-specific info string. Info is intended for > + public information such as user ID, protocol version, etc. > </para> > </listitem> > </varlistentry> > @@ -71,8 +86,32 @@ > Salt to use during derivation. > </para> > <para> > - While optional, adding random salt significantly improves the > strength of HKDF. > + While optional, adding random salt significantly improves the > + strength of HKDF. Salt could be either secret or > + non-secret. It is used as "Pre Shared Key" in many use cases. > + Strong value is preferred. e.g. Use > <function>random_bytes</function>. > + Optimal salt size is size of used hash algorithm. > </para> > + <warning> > + <para> > + Although salt is the last optional parameter, salt is the > + most important parameter for key security. Omitted salt is > + indication of inappropriate design in most cases. Users must > + set appropriate salt value whenever it is possible. Omit salt > + only when it cannot be used. > + </para> > + <para> > + Strong salt is mandatory and must be kept secret when input > + key is weak, otherwise input key security will not be kept. > + Even when input key is strong, providing strong salt is the > + best practice for the best possible key security. > + </para> > + <para> > + Salt must not be able to be controlled by users. i.e. User > + must not be able to set salt value and get derived key. User > + controlled salt allows input key analysis to attackers. > + </para> > + </warning> > </listitem> > </varlistentry> > </variablelist> > @@ -101,6 +140,99 @@ > &reftitle.examples; > <para> > <example> > + <title>URI specific CSRF token that supports expiration by > <function>hash_hkdf</function></title> > + <programlisting role="php"> > +<![CDATA[ > +<?php > +define('CSRF_TOKEN_EXPIRE', 180); // CSRF token expiration > +define('CSRF_TOKENS', 5); // Last 5 CSRF tokens are valid > + > +/************************************** > + * Implementation note > + * > + * It uses "counter" for CSRF expiration management. > + * "counter" is very low entropy, but input key is strong and > + * CSRF_TOKEN_SEED is short term key. It should be OK. > + * > + * This CSRF token implementation has pros and cons > + * > + * Pros > + * - A CSRF token is valid only for specific URI. > + * - No database is required for URI specific CSRF tokens. > + * - Only CSRF token is required. i.e. No timestamp parameter. > + * - When user is active, a CSRF token is valid upto CSRF_TOKEN_EXPIRE * > CSRF_TOKENS sec. > + * - Even when user had long idle time, CSRF token is valid. > + * - CSRF token will expire eventually. > + * - Invalidating all active CSRF tokens could be done by > unset($_SESSION['CSRF_TOKEN_SEED']). > + * It is recommended to reset CSRF tokens by login/logout event at > least. > + * It may be good idea to invalidate all of older CSRF tokens when idle > time is long. > + * > + * Cons > + * - There could be no CSRF expiration time. > + * > + * Precise CSRF token expiration is easy. Just add timestamp parameter > + * as "info" and check it. > + **************************************/ > + > +session_start(); > +if (empty($_SESSION['CSRF_TOKEN_SEED'])) { > + $_SESSION['CSRF_TOKEN_SEED'] = random_bytes(32); > + $_SESSION['CSRF_TOKEN_COUNT'] = 1; > + $_SESSION['CSRF_TOKEN_EXPIRE'] = time(); > +} > + > + > +function csrf_get_token($uri) { > + // Check expiration > + if ($_SESSION['CSRF_TOKEN_EXPIRE'] + CSRF_TOKEN_EXPIRE < time()) { > + $_SESSION['CSRF_TOKEN_COUNT']++; > + $_SESSION['CSRF_TOKEN_EXPIRE'] = time(); > + } > + // Equivalent(NOT exactly the same) value by using hash_hmac() > + // return hash_hmac('sha3-256', hash_hmac('sha3-256', > $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_COUNT']), $uri); > + return hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0, $uri, > $_SESSION['CSRF_TOKEN_COUNT']); > +} > + > +function csrf_validate_token($csrf_token, $uri) { > + for($i = 0; $i < CSRF_TOKENS; $i++) { > + // Equivalent(NOT exactly the same) value by using hash_hmac() > + // $token = hash_hmac('sha3-256', hash_hmac('sha3-256', > $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_COUNT'] - $i), $uri); > + $token = hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0, > $uri, $_SESSION['CSRF_TOKEN_COUNT'] - $i); > + if (hash_equals($csrf_token, $token)) { > + return TRUE; > + } > + } > + return FALSE; > +} > + > + > +//// Generating CSRF token //// > +// $uri is target URI that browser POSTs form data > +$uri = 'https://example.com/some_form/'; > +$csrf_token = csrf_get_token($uri); > +// embed $csrf_token to your form > + > +//// Validating CSRF token //// > +$csrf_token = $_POST['csrf_token'] ?? ''; > +if (!csrf_validate_token($csrf_token, $_SERVER['REQUEST_URI'])) { > + // Invalid CSRF token > + throw new Exception('CSRF token validation error'); > +} > +// valid request > +?> > +]]> > + </programlisting> > + <para> > + Common CSRF token uses the same token value for a session and all > + URI. This example CSRF token expires and is specific to a > + URI. i.e. CSRF token http://example.com/form_A/ is not valid for > + http://example.com/form_B/ Since token value is computed, no > + database is required. > + </para> > + </example> > + </para> > + <para> > + <example> > <title><function>hash_hkdf</function> example</title> > <programlisting role="php"> > <![CDATA[ > @@ -124,6 +256,30 @@ > </para> > </example> > </para> > + <para> > + <example> > + <title><function>hash_hkdf</function> bad example</title> > + <para> > + Users must not simply extend input key material length. HKDF does > + not add additional entropy automatically. Therefore, weak key > + remains weak unless strong salt is supplied. Following is bad > + example. > + </para> > + <programlisting role="php"> > +<![CDATA[ > +<?php > +$inputKey = get_my_aes128_key(); // AES 128 bit key > + > +// Derive AES 256 key from AES 128 key > +$encryptionKey = hash_hkdf('sha256', $inputKey, 32, 'aes-256-encryption'); > +// Users should not do this. $encryptionKey only has 128 bit > +// entropy while it should have 256 bit entropy. > +// To derive strong AES 256 key, strong enough salt is required. > +?> > +]]> > + </programlisting> > + </example> > + </para> > </refsect1> > > <refsect1 role="seealso"> > @@ -130,6 +286,7 @@ > &reftitle.seealso; > <para> > <simplelist> > + <member><function>hash_hmac</function></member> > <member><function>hash_pbkdf2</function></member> > <member><link xlink:href="&url.rfc;5869">RFC 5869</link></member> > <member><link > xlink:href="&url.git.hub;narfbg/hash_hkdf_compat">userland > implementation</link></member> > Strong -1 on these docs changes. They are wrong and they will confuse users about when and how HKDF should be used. I have no idea where you got the idea that HKDF is supposed to be used for CSRF token generation, but it isn't. I did not check whether your code is correct and secure, but CSRF token generation is certainly not a common or typical application of HKDF and as such should not be present in the documentation. Your "bad example" is actually pretty much the textbook use-case for HKDF. The way you wrote it (get a AES-256 key from an AES-128 key) doesn't make much sense, but the general principle of extracting two keys (for encryption and authentication) from a single key is one of *the* use-cases of HKDF. It is also, contrary to your statement in the documentation snippet, perfectly cryptographically sound. A salt is not required for this case. A salt *may* be beneficial, but for entirely different reasons (as Scott pointed out, for many block cipher modes fixed encryption keys only have a lifetime of around 2^64 encryptions, past which point IV collisions are to be expected -- a salt in key derivation could mitigate this.) Nikita