Hi Nikita,
Nikita Popov wrote:
On Thu, Feb 18, 2016 at 11:45 PM, Zeev Suraski <z...@zend.com> wrote:
* rand(), the first function anyone will try, uses a potentially horrible
libc RNG
The manual could really be better here. It mentions that rand()'s
maximum output might be quite small on platforms like Windows, whereas
it ought to recommend mt_rand() in most cases.
* It was recently noticed that the mt_rand() implementation contains a
typo and our output differs from the original well-researched algorithm. As
yet it is unclear what that typo does to the quality of the output.
I'm not a statistician, so take what I say with a grain of salt. But
from an amateur analysis, it seems to have fairly uniform output:
https://www.reddit.com/r/lolphp/comments/46fxi8/typofixing_commit_in_mersenne_twister_rng_code_is/d0552tb
* mt_getrandmax() is 2^31-1 even on 64-bit machines and numbers are scaled
using floating point multiplication. That means if you tell mt_rand() to
generate 64-bit random numbers by specifying the range, only a tiny
fraction of numbers can actually be hit. I also strongly suspect that the
floating point scaling is inherently non-uniform even for smaller ranges.
Perhaps we could produce an E_NOTICE if you give a range that's too large?
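To make the scaling concern concrete, here is a toy sketch in Python (not PHP's actual code). It assumes the range mapping is roughly min + floor((max - min + 1) * n / (tmax + 1)), and it uses a made-up 3-bit generator (tmax = 7) so that every raw output can be enumerated. The names scale and distribution are hypothetical.

```python
from collections import Counter

def scale(n, lo, hi, tmax):
    """Map a raw output n in [0, tmax] onto [lo, hi] by floating-point scaling.

    Assumed formula, for illustration only.
    """
    return lo + int((float(hi) - lo + 1.0) * (n / (tmax + 1.0)))

def distribution(lo, hi, tmax):
    """Count how often each result occurs over all possible raw outputs."""
    return Counter(scale(n, lo, hi, tmax) for n in range(tmax + 1))

# Range smaller than the generator's: 8 raw values squeezed into 6 buckets.
small = distribution(0, 5, 7)
print(sorted(small.items()))
# [(0, 2), (1, 1), (2, 1), (3, 2), (4, 1), (5, 1)]

# Range larger than the generator's: 8 raw values stretched over 16 slots.
wide = distribution(0, 15, 7)
print(len(wide), "of 16 values reachable")
# 8 of 16 values reachable
```

In the small range, 0 and 3 come up twice as often as the other values; in the wide range, only the even numbers are ever produced. With a 31-bit generator stretched over a 64-bit range the same effect applies at scale: at most 2^31 of the 2^64 possible values can come out.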
* Functions like array_rand() or shuffle() use rand() and not mt_rand(),
so if you're on Windows and your array is larger than some 30k elements the
output will likely be severely biased.
If we fix this, we should do it at the same time as we fix mt_rand()'s
typo, since both could potentially break code (realistically just unit
tests) relying on deterministic output.
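A rough Python sketch of why a 15-bit rand() cannot cover a large array, assuming RAND_MAX is 32767 (as it commonly is on Windows) and a hypothetical array of 40000 elements. Whatever mapping is used to turn the raw value into an index, 32768 inputs cannot reach 40000 outputs; the two mappings below (scaling and modulo) are only examples, not necessarily what array_rand() or shuffle() do.

```python
RAND_MAX = 32767   # assumed 15-bit libc rand()
N = 40000          # hypothetical array size, larger than RAND_MAX + 1

raw = range(RAND_MAX + 1)

# Index chosen by floating-point scaling of the raw value.
by_scaling = {int(N * (n / (RAND_MAX + 1.0))) for n in raw}

# Index chosen by taking the raw value modulo the array size.
by_modulo = {n % N for n in raw}

unreachable_scaling = N - len(by_scaling)
unreachable_modulo = N - len(by_modulo)
print(unreachable_scaling, unreachable_modulo)  # 7232 7232
```

Either way, 7232 of the 40000 positions can never be picked in a single draw, so any selection or shuffle built on such a draw is biased.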
Even though changing our PRNG implementations will break seed sequences, I
think the time has come to clean up this mess for 7.1. (We might also want
to consider aliasing rand() and mt_rand() to an entirely new algorithm, not
MT19937. Nowadays PRNGs are available that both have better statistical
properties and are faster than MT.)
If we did that, we could change it to always use a 64-bit value
internally (including on 32-bit systems through emulation). That way
some of our scaling woes would disappear.
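As a sketch of what mapping a 64-bit value onto a range could look like without floating-point scaling, here is one common technique, rejection sampling, in Python. This is an assumption about how it might be done, not something proposed in this thread; uniform_in_range and next_u64 are hypothetical names, and Python's random.Random merely stands in for whatever 64-bit generator would be used.

```python
import random

def uniform_in_range(next_u64, lo, hi):
    """Return an integer in [lo, hi], uniform if next_u64() is uniform on [0, 2**64)."""
    span = hi - lo + 1
    if not 1 <= span <= 1 << 64:
        raise ValueError("range must contain between 1 and 2**64 values")
    # Largest multiple of span that fits in 2**64. Raw values at or above
    # this limit would over-represent the low residues, so they are redrawn.
    limit = (1 << 64) - ((1 << 64) % span)
    while True:
        x = next_u64()
        if x < limit:
            return lo + x % span

# Stand-in 64-bit source; the seed is arbitrary.
rng = random.Random(2016)
rolls = [uniform_in_range(lambda: rng.getrandbits(64), 1, 6) for _ in range(10)]
print(rolls)
```

Because every accepted raw value falls into a block of exactly span consecutive integers, each result in the range is equally likely, and the redraw is rare unless the range is close to 2^64 in size.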
On a different note, I don't think that philosophical discussions on the
topic of how much we ought to be deprecating will be very productive --
this is one of those topics people tend to be very stubborn about ;) Some
people value stability above everything else, and for others the number one
evil in PHP is our reluctance to get rid of old ---crap--- cruft. It would
be nice if we could let voting decide that question, and keep this thread
focused on specific issues and suggestions. I.e. on one hand suggestions for
things that we may want to deprecate, together with reasoning for why we
should do it.
I do think we should at least have a good case made for deprecating each
item. If something is merely a redundant alias or not very useful, then
there's not much case for getting rid of it, because there's little
benefit and the large disadvantage of breaking existing code. On the
other hand, if it's harmful in some way, then there's a greater case for
deprecation.
Thanks!
--
Andrea Faulds
https://ajf.me/
--
PHP Internals - PHP Runtime Development Mailing List