I ran mt_rand() through dieharder and it appears to perform well. I put
the results here:
https://gist.github.com/tom--/a12175047578b3ae9ef8
On 2/19/16 8:39 PM, Andrea Faulds wrote:
> PHP's implementation of the Mersenne Twister algorithm is buggy, so it
> doesn't produce the same output as in other languages. But the buggy
> algorithm produces sufficiently random sequences of apparently the same
> quality as the proper algorithm.
I don't think it's safe to say that mt_rand() has the *same* qualities
as MT19937. mt_rand()'s output has been tested using the available
randomness testers and seems ok. But randomness testing is tricky and
shows only that an RNG probably passes those specific tests, not that it
has, for example, 623-dimensional equidistribution.
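To illustrate why passing a test says little beyond that test, here is a minimal Python sketch. The helper name chi_square_uniform and the toy "generator" (a plain counter modulo 10) are hypothetical, chosen only for illustration. The counter passes a single-value frequency test perfectly, yet fails as soon as consecutive pairs are examined.

```python
def chi_square_uniform(samples, buckets):
    """Chi-square statistic of samples (ints in [0, buckets)) against a uniform distribution."""
    expected = len(samples) / buckets
    counts = [0] * buckets
    for s in samples:
        counts[s] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

# A blatantly non-random "generator": 0, 1, 2, ..., 9, 0, 1, ...
counter = [i % 10 for i in range(10000)]

# Frequency test: every digit appears exactly 1000 times, so the statistic is 0.
single_stat = chi_square_uniform(counter, 10)

# Serial test on overlapping pairs: only 10 of the 100 possible pairs ever occur.
pairs = [10 * a + b for a, b in zip(counter, counter[1:])]
pair_stat = chi_square_uniform(pairs, 100)

print(single_stat, pair_stat)
```

The first statistic is a perfect score and the second is enormous: a generator can look flawless to one test and be trivially predictable to another.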
> So we *could* simply consider this as a
> documentation issue if we wanted to. I'm not saying that's the right
> course of action, though.
mt_rand() is really weird.
- It's a unique RNG, not described or studied in the literature, that by
some fluke(*) appears to work, in a manner of speaking.
- Its output is 31 bits wide.
- Its scaling to a given [min, max] range is crazy.
It's so weird that I would suggest documenting its problems and leaving it
alone.
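A minimal Python sketch of why the range scaling is a problem, assuming it is a floating-point multiply of the kind done by the RAND_RANGE macro in the PHP source of the time. The function names below are illustrative and are not PHP APIs.

```python
from collections import Counter

def scale_like_rand_range(n, lo, hi, tmax=2**31 - 1):
    """Map n in [0, tmax] onto [lo, hi] by floating-point scaling.

    Assumption: this mirrors lo + (hi - lo + 1.0) * (n / (tmax + 1.0)),
    truncated to an integer. It is a sketch, not PHP's actual source.
    """
    return lo + int((hi - lo + 1.0) * (n / (tmax + 1.0)))

def scale_counts(lo, hi, tmax):
    """Count how often each output occurs over every possible input."""
    hits = Counter(scale_like_rand_range(n, lo, hi, tmax) for n in range(tmax + 1))
    return dict(sorted(hits.items()))

# Toy 3-bit generator (inputs 0..7) scaled to [0, 4]: outputs are not equally likely.
print(scale_counts(0, 4, tmax=7))

# 31-bit input scaled to a 32-bit range: only even values are reachable.
print([scale_like_rand_range(n, 0, 2**32 - 1) for n in (0, 1, 2, 3)])
```

In the toy case 0, 1, and 3 each come up twice while 2 and 4 come up once. In the second case half of the requested range can never be returned, because 31 bits of input are stretched over 32 bits of output.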
Users who don't need to reseed and regenerate a sequence can use
random_bytes() and random_int(). Those who *do* need to reseed but
don't specifically need MT19937 are probably adequately served by mt_rand().
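The two use cases can be sketched in Python, used here only as a stand-in: its random module plays the role of a seedable generator like mt_srand()/mt_rand(), and its secrets module plays the role of random_bytes()/random_int(). The helper name sequence is hypothetical.

```python
import random
import secrets

def sequence(seed, count=5):
    """Return the first `count` 32-bit outputs of a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]

# Use case 1: reseed and regenerate. The same seed replays the same sequence,
# which is what simulations and reproducible test data need.
first = sequence(1234)
again = sequence(1234)
print(first == again)

# Use case 2: no replay needed. Draw from the OS CSPRNG, which cannot be seeded.
token = secrets.token_bytes(16)   # 16 unpredictable bytes
roll = secrets.randbelow(6) + 1   # unbiased integer in [1, 6]
print(len(token), roll)
```

Only the first use case depends on the generator's algorithm at all, and even then it usually needs repeatability rather than one particular algorithm.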
PHP is an unlikely language for the typical programs that specifically
need MT19937. I doubt we would sort out anyone's existing problems by
fixing it. If I'm wrong and there is indeed a need for this kind of RNG
then I'd rather see an API that supports more than just this one generator.
So I don't think PHP should feel obliged to provide a correct MT19937,
although it should correctly document what it does provide, as the
hash extension does.
Tom
(*) At one level it astonishes me that the buggy mt_rand() works at all
as an RNG, given that its algorithm presumably was never actually
designed. But the fact that it passes the standard statistical tests
makes me wonder if it is MT19937 in disguise. I tried to figure out if
its output is a function of MT19937's, perhaps a bit permutation, for
example, but didn't get far.
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php