Hi Nikita, Andrey and all,

My apologies, I misread the earlier mails by reading them far too sloppily.
This time I'll explain the basis of my idea clearly and properly.
This mail is long.

The basis of my idea is:
 - Salt is optional only to accommodate applications where such a value is
not available. (From RFC 5869)
 - Omitting salt could lead to a security disaster, e.g. a password leak.
 - Using the output key together with the salt as the final (combined) key
is common in many use cases.
 - Many HKDF applications in PHP must or should use a salt for a better
implementation.
 - The API should encourage "salt" use through its signature. (From RFC 5869)

Ref: https://tools.ietf.org/html/rfc5869

Before restarting the discussion, here is the rationale for other readers.

In theory, cryptographic hashes are cryptographically secure. Therefore,
the following operations should be considered secure by definition.

$new_key = sha1('some original key' . 'strong salt' . 'some info');
$signature = sha1('some data' . 'strong key');

However, in the real world, people keep coming up with new ideas about
cryptographic hashes, and sometimes they invent efficient ways to attack
them.

HMAC is a well-known and proven method for generating a more secure
signature. Therefore,
$signature = hash_hmac('sha1', 'some data', 'some key');
is secure even when the cryptographic hash has minor defects.
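
As a minimal sketch of how such a signature is typically checked, assuming
a hypothetical shared secret in $key and using sha256 purely as an
illustrative choice: the receiver recomputes the HMAC and compares it in
constant time with hash_equals().

```php
<?php
// Hypothetical shared secret; in a real application it comes from secure storage.
$key  = random_bytes(32);
$data = 'some data';

// Sender side. Note the argument order: hash_hmac(algo, data, key).
$signature = hash_hmac('sha256', $data, $key);

// Receiver side: recompute the HMAC and compare in constant time.
$valid = hash_equals(hash_hmac('sha256', $data, $key), $signature);

// Any change to the data yields a different HMAC, so verification fails.
$tampered = hash_equals(hash_hmac('sha256', $data . 'x', $key), $signature);
```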

HKDF is designed to generate secure new keys, suitable for the required
operations, from an existing key by using HMAC.
HKDF inputs:
 - IKM, the input key, which may be weak or strong
 - salt, some entropy which makes HKDF stronger overall; it may be secret or
non-secret, weak or strong
 - info, which specifies the HKDF context and is usually non-secret, e.g. a
protocol number, algorithm identifiers, user identities, etc.
 - length (L), the output key length

HKDF calculates the output key (OKM) as follows:

Extract step
   PRK = HMAC-Hash(salt, IKM)

This step is designed to always produce a strong PRK, and therefore a
strong output key (OKM). For the OKM to be secure, either IKM or salt must
be strong.

Expand step
   N = ceil(L/HashLen)
   T = T(1) | T(2) | T(3) | ... | T(N)
   OKM = first L octets of T

   where:
   T(0) = empty string (zero length)
   T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)
   T(2) = HMAC-Hash(PRK, T(1) | info | 0x02)
   T(3) = HMAC-Hash(PRK, T(2) | info | 0x03)

Note: OKM is the output keying material, i.e. the return value of HKDF.

This step is designed to produce a derived key of the requested length (L)
from the strong key (PRK) generated by the Extract step. The key context
(info) is also bound to the output in this step.
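
The two steps above can be sketched in plain PHP with hash_hmac(). This is
only an illustration under stated assumptions: hkdf_sketch() is a
hypothetical name, the parameter order copies the current hash_hkdf()
signature, and the RFC's upper limit on L (255 * HashLen) is not checked.

```php
<?php
// Hypothetical helper mirroring the Extract and Expand steps of RFC 5869.
function hkdf_sketch(string $hash, string $ikm, int $length = 0, string $info = '', string $salt = ''): string
{
    $hashLen = strlen(hash($hash, '', true));
    if ($length === 0) {
        $length = $hashLen; // same default as hash_hkdf()
    }
    if ($salt === '') {
        // RFC 5869: a salt that is not provided is HashLen zero octets.
        $salt = str_repeat("\0", $hashLen);
    }

    // Extract step: PRK = HMAC-Hash(salt, IKM).
    // hash_hmac() takes (algo, data, key), so the salt is the HMAC key.
    $prk = hash_hmac($hash, $ikm, $salt, true);

    // Expand step: T(i) = HMAC-Hash(PRK, T(i-1) | info | i), starting from T(0) = ''.
    $t   = '';
    $okm = '';
    for ($i = 1; strlen($okm) < $length; $i++) {
        $t = hash_hmac($hash, $t . $info . chr($i), $prk, true);
        $okm .= $t;
    }

    // OKM = first L octets of T(1) | T(2) | ...
    return substr($okm, 0, $length);
}
```

For the same arguments, the result should match the built-in hash_hkdf(),
which returns raw binary as well.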


Both salt and info are optional, but RFC 5869 describes them differently.

  "HKDF is defined to operate with and without random salt.  This is
   done to accommodate applications where a salt value is not available.
   We stress, however, that the use of salt adds significantly to the
   strength of HKDF, ensuring independence between different uses of the
   hash function, supporting "source-independent" extraction, and
   strengthening the analytical results that back the HKDF design."

This statement implies that salt is an almost mandatory parameter for HKDF
whenever a salt can be used. In contrast, info is described as a purely
optional parameter for the key context.

  "While the 'info' value is optional in the definition of HKDF, it is
   often of great importance in applications.  Its main objective is to
   bind the derived key material to application- and context-specific
   information.  For example, 'info' may contain a protocol number,
   algorithm identifiers, user identities, etc."

As for which parameters are mandatory: a strong salt is mandatory to derive
a cryptographically strong key when the input key is weak, while info and
length are always optional.

In addition, it is common for the salt to be used as part of a combined key.
Salt is mandatory for such applications. There are many authentication
implementations that use
 - a key that does not disclose the original key, because it is hashed
 - a nonce (salt)

Access permission with a timeout is another typical usage; it requires a
timestamp as the salt and as part of the combined key.

HKDF can produce a secure key, one that protects the input key (IKM) and
yields a strong output key (OKM), but this is true only when IKM or salt is
strong. Using a weak IKM together with a weak salt could lead to a password
leak.
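
A minimal sketch of that risk, assuming a hypothetical 4-digit PIN as the
weak IKM and an illustrative info string: without a salt, anyone who sees
the output key and knows the non-secret info can simply enumerate the IKM.

```php
<?php
// Hypothetical weak IKM: a 4-digit PIN, derived without any salt.
$pin = '4821';
$okm = hash_hkdf('sha256', $pin, 32, 'login-token');

// An attacker who sees $okm and knows the info string tries every PIN.
$recovered = null;
for ($i = 0; $i < 10000; $i++) {
    $guess = str_pad((string) $i, 4, '0', STR_PAD_LEFT);
    if (hash_equals($okm, hash_hkdf('sha256', $guess, 32, 'login-token'))) {
        $recovered = $guess; // the weak IKM has leaked
        break;
    }
}

// With a strong salt the attacker does not know, the same enumeration
// cannot be carried out, because the salt is needed to recompute the key.
$salt      = random_bytes(32);
$saltedOkm = hash_hkdf('sha256', $pin, 32, 'login-token', $salt);
```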


For the above reasons, I'm proposing to change
   string hash_hkdf(string $hash, string $ikm [, int $length=0 [, string $info='' [, string $salt='']]])
to
   string hash_hkdf(string $hash, string $ikm, string $salt [, string $info='' [, int $length=0 ]])
     - To omit the salt, pass $salt=NULL. $salt='' raises an exception.
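
To make the proposed behavior concrete, here is a userland sketch built on
the current hash_hkdf(). hkdf_salt_first() is a hypothetical name used only
for illustration, and the choice of InvalidArgumentException is an
assumption, since the proposal only says "exception".

```php
<?php
// Hypothetical wrapper showing the proposed parameter order; not an existing PHP function.
function hkdf_salt_first(string $hash, string $ikm, ?string $salt, string $info = '', int $length = 0): string
{
    if ($salt === '') {
        // An empty salt is treated as a mistake; null must be passed to omit it.
        throw new InvalidArgumentException('Empty salt: pass null to omit the salt explicitly.');
    }

    // Delegate to the current signature: hash_hkdf(algo, ikm, length, info, salt).
    return hash_hkdf($hash, $ikm, $length, $info, $salt ?? '');
}

$ikm = random_bytes(32);

// Typical call: the salt has to be thought about before info and length.
$withSalt = hkdf_salt_first('sha256', $ikm, random_bytes(32), 'encryption');

// Omitting the salt is still possible, but only as a deliberate choice.
$withoutSalt = hkdf_salt_first('sha256', $ikm, null, 'encryption');
```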


On Mon, Jan 16, 2017 at 8:08 PM, Nikita Popov <nikita....@gmail.com> wrote:

> Making the salt required makes no sense to me.
>
> HKDF has a number of different applications:
> a) Derive multiple strong keys from strong keying material. Typical case
> for this is deriving independent encryption and authentication keys from a
> master key. This requires only specification of $length. A salt is neither
> necessary nor useful in this case, because you start with strong
> cryptographic keying material.
>

Very true.

A shorter output key reveals less of the IKM/salt/info state, so a shorter
output key could be said to be more secure, e.g. SHA-512 with a 32-byte
output key. This only applies when IKM and/or salt is strong, though.

If we could assume that IKM is always cryptographically strong, the

Extract step

   PRK = HMAC-Hash(salt, IKM)

is not needed; only

   T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)

is. In short, if the key is strong,

    // hash_hmac() takes (algo, data, key); empty info, raw binary output
    hash_hmac('sha256', "\x01", $key, true)
    substr(hash_hmac('sha256', "\x01", $key, true), 0, $length)

is good enough. Thus, HKDF is not needed.

But in the real world, IKM is not necessarily cryptographically secure.
Therefore, HKDF users must ensure a secure PRK through

   PRK = HMAC-Hash(salt, IKM)

This requirement makes "salt" more important than "length", because a
strong PRK is mandatory for HKDF security, i.e. IKM or salt must always be
strong.

For the same reason as the rule "apply HTML escaping to all variables",
using a random salt whenever possible, regardless of key strength, is
safer. It effectively prevents a weak IKM from leaking through misuse.


> b) Generating per-session (or similar) keys from a (strong cryptographic)
> master key. For this purpose you can specify the $info parameter. again, a
> salt is neither necessary nor useful in this case. (You could probably also
> use $salt instead of $info in this case, but the design of the function
> implies that $info should be used for this purpose.)
>


Assuming

    $_SESSION['master_key'] = random_bytes(80); // Keep this for this session.

and the current hash_hkdf() function signature

    string hash_hkdf(string $hash_function, string $ikm [, int $length [, string $info [, string $salt]]]);

your idea might be

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, session_id());

Although it works, a session ID is not an identity but a key that
identifies the user's connection. Using a key (entropy) value for the info
parameter goes against the info usage recommended by the RFC.

OR

You might be assuming a logged-in session; then the username can be used as
the info parameter.

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, $_SESSION['username']);

The username is a non-secret user identifier, so this matches the usage
recommended by the RFC. However, the username is only available in a
logged-in session, and when the username is changed during the session it
stops working. It also stops working at logout. And most important of all,
this is not a per-session key but a per-user key...

The RFC-compliant implementation in PHP is to use session_id() as a secret
salt.

   $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, '', session_id());

This method is better because it works regardless of authentication or a
changed username. However, this version still has a problem when the
session ID is regenerated. This could be fixed with a better salt.

   $_SESSION['salt'] = session_create_id();

then

   $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, '', $_SESSION['salt']);

This always works for any session. Unlike session_id() used as the salt,
both $new_key and $_SESSION['salt'] could safely be passed in page content.

I have other examples like these which illustrate that users should
consider salt usage.


> c) Extracting strong cryptographic keying material from weak cryptographic
> keying material. Standard example here is extracting strong keys from DH
> g^xy values (which are non-uniform) and similar. This is the usage that
> benefits from a $salt.
>

Even when a strong IKM is used, a low-entropy salt such as a timestamp can
be used as part of a combined key. This would be one of the most common use
cases in PHP.

> Remember that HKDF is an extract-and-expand algorithm, and the extract step
> (which uses the salt) is only necessary if the input keying material is
> weak. We always include the extract step for compatibility with the overall
> HKDF construction (per the RFCs recommendation), but it's essentially just
> an unnecessary operation if you work on strong keying material.
>

I presume my reply to a) is sufficient for this.
A strong (cryptographic) key, e.g. random_bytes(80), can be used in many
cases. Even when the key is supposed to be strong, it would be better to
use a salt to derive an even stronger key. This practice prevents users
from accidentally omitting the salt for weak input keys.
Salt could also be used as part of a combined key.


> The only thing that we may want to discuss is whether we should swap the
> $info and the $salt parameters. This depends on which usage (b or c) we
> consider more likely.



"length" has little relevance to IKM protection or output key (OKM)
security, whereas it is the user's responsibility to ensure a secure PRK
and to protect IKM through the

Extract step
   PRK = HMAC-Hash(salt, IKM)

Per the RFC, the HKDF info parameter is unrelated to protecting IKM and the
output key, while salt carries that responsibility. Therefore, "salt" must
have priority over "info" and "length", IMHO.

Importance with regard to security: salt >>>> info > length

We must also consider how the output key and salt are used in real-world
PHP applications. There are applications that use the output key and salt
as a combined key, e.g. authentication or an access key with expiration.

    // URL access key with expiration
    // 90 sec timeout. Low entropy salt is allowed with strong IKM.
    $expire = time() + 90;
    // $URL is the info (context); $expire is the salt, cast to string for hash_hkdf().
    $key = hash_hkdf('sha256', $_SESSION['strong_master_key'], 0, $URL, (string) $expire);
    // Send $key and $expire as combined key for $URL
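
The receiving side of this combined key could look like the following
sketch. The helper names make_url_key() and check_url_key(), the hex
encoding, and the sample URL are assumptions for illustration; $masterKey
stands in for $_SESSION['strong_master_key'].

```php
<?php
// Hypothetical helpers; names, hex encoding and the 90-second lifetime are illustrative.
function make_url_key(string $masterKey, string $url, int $expire): string
{
    // info = URL (context), salt = expiry timestamp (sent along with the key).
    return bin2hex(hash_hkdf('sha256', $masterKey, 0, $url, (string) $expire));
}

function check_url_key(string $masterKey, string $url, int $expire, string $key, int $now): bool
{
    if ($now > $expire) {
        return false; // the access key has expired
    }
    // Recompute from the submitted URL and expiry, then compare in constant time.
    // A client that alters $expire or $url gets a different key and is rejected.
    return hash_equals(make_url_key($masterKey, $url, $expire), $key);
}

$masterKey = random_bytes(32);
$url       = '/download/report.pdf';
$expire    = time() + 90;
$key       = make_url_key($masterKey, $url, $expire);

$accepted = check_url_key($masterKey, $url, $expire, $key, time());
```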

hash_hkdf() is a key generation function, and the output key and salt are
used as a "combined key" in many use cases. Whether a parameter acts as
part of a key matters. Therefore, the commonly used combined-key parameter
(salt) is better placed right after IKM.

Importance with regard to common use case: salt >> info > length

Although there are HKDF usages without salt, many HKDF applications in PHP
require a salt or are better with one, e.g. the per-session encryption
example above. Developers will build better applications if they consider
how salt could be used. Therefore, salt is better as a required parameter
that is omitted only when a salt cannot be used.

Importance with regard to education: salt >>> info > length
(Users must learn safe salt usage)

"length" is the least important for me. "salt" and "info" have an important
effect on derived key and input key security. These two should have
priority over "length".

Making the most responsible/sensitive parameter (salt), which is currently
optional, a required parameter should not be an issue for users.

If this were C, I would not care and would accept whatever signature. C is
full of pitfalls already. I just don't want to see news like "Passwords are
stolen from PHP app!". hash_hkdf() could easily be misused like this:

hash_hkdf('sha256', $weak_ikm, 9); // We can generate strong key easily, Nice!
hash_hkdf('sha256', $weak_ikm, 9, 'Super User Only'); // Safe key for super user
hash_hkdf('sha256', $strong_ikm, 4); // Secure and nice password for super secret

These are security disasters. If salt is required, users would at least
always think about it. Length should not be shortened unless the user is
absolutely sure.

If any sentences are unclear, please let me know.
Thank you for reading this long mail!

Regards,

P.S. I'll be more careful, but I sometimes become a very sloppy mail reader.
I would appreciate it if you could let me know via private email. Thank you!

--
Yasuo Ohgaki
yohg...@ohgaki.net
