On Thu, Apr 13, 2017 at 11:22 PM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:

> Hi Pieter and all,
>
> On Thu, Apr 13, 2017 at 5:11 PM, Pieter Hordijk <i...@pieterhordijk.com>
> wrote:
>
> > Is this really something we need in our official docs instead of for
> > example
> > on a personal blog?
> >
>
> I wrote draft doc patch.
> Please verify.
>
> Index: en/reference/hash/functions/hash-hkdf.xml
> ===================================================================
> --- en/reference/hash/functions/hash-hkdf.xml    (リビジョン 342317)
> +++ en/reference/hash/functions/hash-hkdf.xml    (作業コピー)
> @@ -3,7 +3,7 @@
>  <refentry xml:id="function.hash-hkdf" xmlns="http://docbook.org/ns/
> docbook"
> xmlns:xlink="http://www.w3.org/1999/xlink";>
>   <refnamediv>
>    <refname>hash_hkdf</refname>
> -  <refpurpose>Generate a HKDF key derivation of a supplied key
> input</refpurpose>
> +  <refpurpose>Derive secure new key from existing key by using
> HKDF</refpurpose>
>   </refnamediv>
>   <refsect1 role="description">
>    &reftitle.description;
> @@ -16,6 +16,20 @@
>     <methodparam
> choice="opt"><type>string</type><parameter>salt</
> parameter><initializer>''</initializer></methodparam>
>    </methodsynopsis>
>
> +  <para>
> +   RFC 5869 defines HKDF (HMAC based Key Derivation Function) which
> +   is general purpose KDF. HKDF could be useful for many PHP
> +   applications that require temporary keys, such CSRF token,
> +   pre-signed key for URI, password for password protected
> +   URI, and so on.
> +  </para>
> +  <note>
> +    <para>
> +      When info and length
> +      is not required for your program, more efficient
> +      <function>hash_hmac</function> could be used instead.
> +    </para>
> +  </note>
>   </refsect1>
>   <refsect1 role="parameters">
>    &reftitle.parameters;
> @@ -25,7 +39,7 @@
>       <term><parameter>algo</parameter></term>
>       <listitem>
>        <para>
> -       Name of selected hashing algorithm (i.e. "sha256", "sha512",
> "haval160,4", etc..)
> +       Name of selected hashing algorithm (i.e. "sha3-256", "sha3-512",
> "sha256", "sha512", "haval160,4", etc..)
>         See <function>hash_algos</function> for a list of supported
> algorithms.
>         <note>
>          <para>
> @@ -39,7 +53,7 @@
>       <term><parameter>ikm</parameter></term>
>       <listitem>
>        <para>
> -       Input keying material (raw binary). Cannot be empty.
> +       Input keying material. Cannot be empty.
>        </para>
>       </listitem>
>      </varlistentry>
> @@ -60,7 +74,8 @@
>       <term><parameter>info</parameter></term>
>       <listitem>
>        <para>
> -       Application/context-specific info string.
> +       Application/context-specific info string. Info is intended for
> +       public information such as user ID, protocol version, etc.
>        </para>
>       </listitem>
>      </varlistentry>
> @@ -71,8 +86,32 @@
>         Salt to use during derivation.
>        </para>
>        <para>
> -       While optional, adding random salt significantly improves the
> strength of HKDF.
> +        While optional, adding random salt significantly improves the
> +        strength of HKDF. Salt could be either secret or
> +        non-secret. It is used as "Pre Shared Key" in many use cases.
> +        Strong value is preferred. e.g. Use
> <function>random_bytes</function>.
> +        Optimal salt size is size of used hash algorithm.
>        </para>
> +      <warning>
> +        <para>
> +         Although salt is the last optional parameter, salt is the
> +         most important parameter for key security. Omitted salt is
> +         indication of inappropriate design in most cases. Users must
> +         set appropriate salt value whenever it is possible. Omit salt
> +         only when it cannot be used.
> +        </para>
> +        <para>
> +         Strong salt is mandatory and must be kept secret when input
> +         key is weak, otherwise input key security will not be kept.
> +         Even when input key is strong, providing strong salt is the
> +         best practice for the best possible key security.
> +        </para>
> +        <para>
> +         Salt must not be able to be controlled by users. i.e. User
> +         must not be able to set salt value and get derived key. User
> +         controlled salt allows input key analysis to attackers.
> +        </para>
> +      </warning>
>       </listitem>
>      </varlistentry>
>     </variablelist>
> @@ -101,6 +140,99 @@
>    &reftitle.examples;
>    <para>
>     <example>
> +    <title>URI specific CSRF token that supports expiration by
> <function>hash_hkdf</function></title>
> +    <programlisting role="php">
> +<![CDATA[
> +<?php
> +define('CSRF_TOKEN_EXPIRE', 180); // CSRF token expiration
> +define('CSRF_TOKENS', 5); // Last 5 CSRF tokens are valid
> +
> +/**************************************
> + * Implementation note
> + *
> + * It uses "counter" for CSRF expiration management.
> + * "counter" is very low entropy, but input key is strong and
> + * CSRF_TOKEN_SEED is short term key. It should be OK.
> + *
> + * This CSRF token implementation has pros and cons
> + *
> + * Pros
> + *  - A CSRF token is valid only for specific URI.
> + *  - No database is required for URI specific CSRF tokens.
> + *  - Only CSRF token is required. i.e. No timestamp parameter.
> + *  - When user is active, a CSRF token is valid upto CSRF_TOKEN_EXPIRE *
> CSRF_TOKENS sec.
> + *  - Even when user had long idle time, CSRF token is valid.
> + *  - CSRF token will expire eventually.
> + *  - Invalidating all active CSRF tokens could be done by
> unset($_SESSION['CSRF_TOKEN_SEED']).
> + *    It is recommended to reset CSRF tokens by login/logout event at
> least.
> + *    It may be good idea to invalidate all of older CSRF tokens when idle
> time is long.
> + *
> + * Cons
> + *  - There could be no CSRF expiration time.
> + *
> + * Precise CSRF token expiration is easy. Just add timestamp parameter
> + * as "info" and check it.
> + **************************************/
> +
> +session_start();
> +if (empty($_SESSION['CSRF_TOKEN_SEED'])) {
> +    $_SESSION['CSRF_TOKEN_SEED'] = random_bytes(32);
> +    $_SESSION['CSRF_TOKEN_COUNT'] = 1;
> +    $_SESSION['CSRF_TOKEN_EXPIRE'] = time();
> +}
> +
> +
> +function csrf_get_token($uri) {
> +    // Check expiration
> +    if ($_SESSION['CSRF_TOKEN_EXPIRE'] + CSRF_TOKEN_EXPIRE < time()) {
> +        $_SESSION['CSRF_TOKEN_COUNT']++;
> +        $_SESSION['CSRF_TOKEN_EXPIRE'] = time();
> +    }
> +    // Equivalent(NOT exactly the same) value by using hash_hmac()
> +    // return hash_hmac('sha3-256', hash_hmac('sha3-256',
> $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_COUNT']), $uri);
> +    return hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0, $uri,
> $_SESSION['CSRF_TOKEN_COUNT']);
> +}
> +
> +function csrf_validate_token($csrf_token, $uri) {
> +    for($i = 0; $i < CSRF_TOKENS; $i++) {
> +        // Equivalent(NOT exactly the same) value by using hash_hmac()
> +        // $token = hash_hmac('sha3-256', hash_hmac('sha3-256',
> $_SESSION['CSRF_TOKEN_SEED'], $_SESSION['CSRF_TOKEN_COUNT'] - $i), $uri);
> +        $token = hash_hkdf('sha3-256', $_SESSION['CSRF_TOKEN_SEED'], 0,
> $uri, $_SESSION['CSRF_TOKEN_COUNT'] - $i);
> +        if (hash_equals($csrf_token, $token)) {
> +            return TRUE;
> +        }
> +    }
> +    return FALSE;
> +}
> +
> +
> +//// Generating CSRF token ////
> +// $uri is target URI that browser POSTs form data
> +$uri = 'https://example.com/some_form/';
> +$csrf_token = csrf_get_token($uri);
> +// embed $csrf_token to your form
> +
> +//// Validating CSRF token ////
> +$csrf_token = $_POST['csrf_token'] ?? '';
> +if (!csrf_validate_token($csrf_token, $_SERVER['REQUEST_URI'])) {
> +    // Invalid CSRF token
> +    throw new Exception('CSRF token validation error');
> +}
> +// valid request
> +?>
> +]]>
> +    </programlisting>
> +    <para>
> +     Common CSRF token uses the same token value for a session and all
> +     URI. This example CSRF token expires and is specific to a
> +     URI. i.e. CSRF token http://example.com/form_A/ is not valid for
> +     http://example.com/form_B/ Since token value is computed, no
> +     database is required.
> +    </para>
> +   </example>
> +  </para>
> +  <para>
> +   <example>
>      <title><function>hash_hkdf</function> example</title>
>      <programlisting role="php">
>  <![CDATA[
> @@ -124,6 +256,30 @@
>      </para>
>     </example>
>    </para>
> +  <para>
> +   <example>
> +    <title><function>hash_hkdf</function> bad example</title>
> +    <para>
> +     Users must not simply extend input key material length. HKDF does
> +     not add additional entropy automatically.  Therefore, weak key
> +     remains weak unless strong salt is supplied. Following is bad
> +     example.
> +    </para>
> +    <programlisting role="php">
> +<![CDATA[
> +<?php
> +$inputKey = get_my_aes128_key(); // AES 128 bit key
> +
> +// Derive AES 256 key from AES 128 key
> +$encryptionKey = hash_hkdf('sha256', $inputKey, 32, 'aes-256-encryption');
> +// Users should not do this. $encryptionKey only has 128 bit
> +// entropy while it should have 256 bit entropy.
> +// To derive strong AES 256 key, strong enough salt is required.
> +?>
> +]]>
> +    </programlisting>
> +   </example>
> +  </para>
>   </refsect1>
>
>   <refsect1 role="seealso">
> @@ -130,6 +286,7 @@
>    &reftitle.seealso;
>    <para>
>     <simplelist>
> +    <member><function>hash_hmac</function></member>
>      <member><function>hash_pbkdf2</function></member>
>      <member><link xlink:href="&url.rfc;5869">RFC 5869</link></member>
>      <member><link
> xlink:href="&url.git.hub;narfbg/hash_hkdf_compat">userland
> implementation</link></member>
>

Strong -1 on these docs changes. They are wrong and they will confuse users
about when and how HKDF should be used.

I have no idea where you got the idea that HKDF is supposed to be used for
CSRF token generation, but it isn't. I did not check whether your code is
correct and secure, but CSRF token generation is certainly not a common or
typical application of HKDF and as such should not be present in the
documentation.

Your "bad example" is actually pretty much the textbook use-case for HKDF.
The way you wrote it (get a AES-256 key from an AES-128 key) doesn't make
much sense, but the general principle of extracting two keys (for
encryption and authentication) from a single key is one of *the* use-cases
of HKDF. It is also, contrary to your statement in the documentation
snippet, perfectly cryptographically sound. A salt is not required for this
case. A salt *may* be beneficial, but for entirely different reasons (as
Scott pointed out, for many block cipher modes fixed encryption keys only
have a lifetime of around 2^64 encryptions, past which point IV collisions
are to be expected -- a salt in key derivation could mitigate this.)

Nikita

Reply via email to