On 19/10/15 21:43, Scott Arciszewski wrote:
(...)
If you have a keyspace of 2^32 possible output sequences, like we do
from rand(), we can say that after 65,536 outputs there is roughly a 50%
probability of finding at least one collision.
It should go without saying, but if users have weak/common password
choices and their salts collide, they will end up with duplicate
bcrypt hashes.
My conclusion is thus: Yes, we do need a CSPRNG here, especially if we
want to encourage the use of the password_* API for any large system.
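(As an aside, the birthday-bound arithmetic behind that 65,536 figure is easy
to check. The sketch below is purely illustrative and uses the standard
approximation, which actually puts the 50% point a little higher, around
77,000 draws, though the order of magnitude is the same.)

    <?php
    // Illustrative sketch (not from the thread): birthday-bound estimate for
    // drawing $n salts uniformly at random from a space of $d possible values.
    function collision_probability($n, $d) {
        // Standard approximation: P(collision) ~= 1 - exp(-n(n-1) / (2d))
        return 1.0 - exp(-$n * ($n - 1) / (2.0 * $d));
    }

    $d = pow(2, 32);  // 4-byte (2^32) salt space
    printf("n = 65536: %.3f\n", collision_probability(65536, $d));  // ~0.39
    printf("n = 77163: %.3f\n", collision_probability(77163, $d));  // ~0.50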
Scott's scenario assumes that:
(a) rand() has the same internal state on each password generation. Given
that it is seeded once per MINIT, that suggests they are using something
like CGI, which is not too suitable for large systems.
(b) You have people sharing the same password *and* colliding in the
same salt bucket.
Supposing that you have an equiprobable 4-byte salt (which is actually
the case we fall back to if (a) holds), and 152 million users (the size
of the Adobe leak, the biggest to date), we would only see collisions on
the top-13 passwords… not even colliding on 1234 (maybe relax that to the
top-16 or top-25 to account for the distribution not being perfectly fair).
Those are passwords in the really, really bad zone, and would be cracked
even with no help from collisions.
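To get a rough feel for the numbers: among k users who picked the *same*
password, the expected number of pairs that also land on the same 4-byte salt
is about k(k-1)/(2*2^32), so a password needs on the order of 100,000 users
before even one colliding pair becomes likely. A small sketch (the user counts
here are hypothetical, not taken from the leak data):

    <?php
    // Expected number of salt-colliding pairs among $k users sharing one
    // password, with a uniform 4-byte salt. Illustrative only.
    function expected_salt_collisions($k, $saltSpace = 4294967296) {
        return $k * ($k - 1) / (2 * $saltSpace);
    }

    printf("%.4f\n", expected_salt_collisions(100000)); // ~1.16 pairs expected
    printf("%.4f\n", expected_salt_collisions(10000));  // ~0.012, i.e. almost surely none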
And that's another point: a salt shared by two users (i.e. finding one
collision) won't speed you up much in terms of cracking their respective
passwords.
Undesirable? Sure. Broken? Not really.
Tom Worster wrote:
I don't follow the logic of this hypothetical. php_rand() is *only*
relevant to this discussion if PHP can't read /dev/random or
CryptGenRandom, in which case we know that the state of php_rand()'s
LCG is not randomized (it's open to tampering) so this statistical
analysis is not possible.
I've verified that password_hash() without /dev/urandom can produce
systematically predictable salts, repeating a sequence of just two
salts. There's nothing statistical involved. Reported in bug 70743.
Seems crypt() is similarly afflicted.
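For what it's worth, the kind of repetition Tom describes is easy to spot from
PHP itself. A minimal sketch, assuming the default bcrypt output format where
the 22 salt characters follow the "$2y$<cost>$" prefix (cost 4 is used only to
keep the loop fast):

    <?php
    // Count distinct salts produced by password_hash(). With a working
    // /dev/urandom this should print 1000; a fallback that repeats a couple
    // of salts (as in bug 70743) would print a much smaller number.
    $seen = array();
    for ($i = 0; $i < 1000; $i++) {
        $hash = password_hash('password', PASSWORD_BCRYPT, array('cost' => 4));
        $salt = substr($hash, 7, 22);  // 22 salt chars after the "$2y$04$" prefix
        $seen[$salt] = true;
    }
    echo count($seen), "\n";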
Only producing two hashes *would* be very worrisome. Note, however, that
you are crippling rand() by always resetting the seed (those two
different outputs appear because it is XORing with the same sequence).
It is also reading uninitialized memory in line 155, which probably
accounts for the salts being different every time, and for the first run
differing because of a change in allocations.
Whereas in real-world usage, even if a new PHP instance were used every
time, you would need a GENERATE_SEED() collision. It initializes the
seed using the timestamp, PID, microseconds and thread ID (plus any
state change caused if different PHP scripts were run).
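Just to illustrate the kind of mixing involved (this is NOT the actual
GENERATE_SEED() macro, which lives in the C sources and differs in detail),
something along these lines:

    <?php
    // Rough PHP-level illustration of why two separate processes started at
    // different times, with different PIDs, are unlikely to share a seed.
    // The real macro also folds in the thread ID on ZTS builds (omitted here).
    function illustrative_seed() {
        list($usec, $sec) = explode(' ', microtime());
        // Mix wall-clock time, process ID and the sub-second component.
        return ((int) $sec * getmypid()) ^ (int) ($usec * 1000000);
    }
    echo illustrative_seed(), "\n";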
You have a very good point in that it would fail only if /dev/urandom
does not exist at all. My initial concern came from thinking that it
could fail in a low-entropy case, but that won't happen on Linux
(urandom always works), and even an OS where urandom can block (e.g.
FreeBSD) wouldn't make it use the fallback code. It seems the fallback
would only be used if (a) you don't have a /dev/urandom at all, or (b)
it isn't the special file it is supposed to be (e.g. it's a regular
file). In which case you are running a broken OS :)
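Those two conditions are straightforward to express; a rough PHP-level sketch
of the same checks (the real test of course happens in the C implementation):

    <?php
    // Illustrative: the two failure conditions described above --
    // /dev/urandom missing entirely, or present but not a character device.
    $path = '/dev/urandom';
    if (!file_exists($path)) {
        echo "$path does not exist -- the rand() fallback would kick in\n";
    } elseif (filetype($path) !== 'char') {
        echo "$path is not a character device -- the fallback would kick in\n";
    } else {
        echo "$path looks sane; the CSPRNG path is taken\n";
    }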
So, changing my previous opinion, it would be acceptable to drop the
fallback (I'd prefer that to be done in the point release, though).
Best regards