On 8 Sep 2017, at 8:31, Solar Designer wrote:
On Fri, Sep 08, 2017 at 07:56:23AM -0400, Tom Worster wrote:
From: Nikita Popov <nikita....@gmail.com>
Sorry for the long delay. I've just applied
https://github.com/php/php-src/commit/fd07302024bc47082b13b32217147fd39d1e9e61
to the 7.2 branch.
Davey, Joe, do we want to take action here for 7.1? It's a pretty
severe
bias, but fixing it is going to change seed sequences. I think at
this
point we're too far in the 7.1 cycle to apply this kind of change.
I think it is very unlikely that anyone has PHP software that relies
on
predictable output given a 64-bit seed. And, yes, the bias is bad so
I
would not worry about fixing it asap.
This sounds confused. There's no 64-bit seed - PHP's mt_srand() only
supports 32-bit seeds. Then you say "the bias is bad" and at the same
time "would not worry about fixing it asap", which look inconsistent.
The original problem I reported applies to 64-bit builds of PHP -
which
is probably most builds these days - when mt_rand() is invoked with a
range that fits in 32 bits - which again is the typical case for the
use
of ranges. However, the bias can be large only for large ranges (yet
not exceeding 32 bits). For typically used small ranges, the bias is
small. Also, fixing the bug doesn't fully change the sequence of
generated random numbers - for typically used small ranges, the
probability that the fix changes a random number to another (for the
same seed) is small. So the sequences will change, but not fully.
I'm
not sure if this is good or bad, as sometimes complete failure of
something that worked for someone before is preferable; I merely point
out what will actually happen.
Later in the discussion, Nikita pointed out an extra problem (also
causing biases) that affected the rarely-used 64-bit ranges.
Similarly,
fixing it doesn't fully change the sequence of generated random
numbers -
again, for typically used small ranges (this time relative to the
64-bit space), the probability that the fix changes a random number to
another (for the same seed) is small.
Another detail is that these fixes make 32- and 64-bit builds of PHP
consistent, which isn't the case for 7.1.x now. So retaining the bugs
in 7.1.x for consistent behavior doesn't exactly achieve that - it
does
for consistency within 7.1.x series, but not across 32- vs. 64-bit
builds. Fixing the bugs would achieve the latter, but break the
former.
I have no strong preference here. I merely point out the confusion
and
try to correct it.
Yes, I was confused. I meant to talk about large ranges but even so your
summary is an education so thank you.
My input is to offer an opinion on the relative importance of
considerations.
Fixing the bias would be an urgent priority because I think I a lot of
programs are written assuming a uniform distribution.
While I broadly agree with what you describe as "typical", it might be
hard for a user to know how big a problem the bias is in their
situation. Fixing the bias eliminates that doubt and the handwaving
about what is typical.
Programs that exploit the predictable property are specialized
(comparing different monte carlo experiments based on the same
pseudorandom input is the only example I know) and I think much less
common in PHP. (Note: these programs are also likely to need unbiased
stats.) I doubt that something will fail (i.e. break as in BC break) due
to inconsistency within 7.1.x but the change might cause some extra work
or faulty experimental conclusions. If I were an experimenter dealing
with this change I'd rather rerun the cases I ran on the buggy RNG than
continue with a known bad RNG. And I'd rather do this sooner than later.
I think we serve this specialized community better (if it exists at
all!?) fixing it in 7.1, which also helps make these users aware of the
bug. Everyone else is probably either unaffected by the fix or their
programs will behave better.
Tom
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php