On Wed, Aug 23, 2017 at 10:05 PM, Solar Designer <so...@openwall.com> wrote:

> On Thu, Aug 17, 2017 at 03:18:30PM +0200, Solar Designer wrote:
> > On Thu, Aug 17, 2017 at 12:57:56AM +0200, Nikita Popov wrote:
> > > On Thu, Aug 17, 2017 at 12:02 AM, Solar Designer <so...@openwall.com>
> wrote:
> > > > One difference I didn't notice at first: the currently committed code
> > > > does only one php_mt_rand() call per loop iteration when it's
> skipping
> > > > "numbers over the limit", even in the 64-bit case (thus, it replaces
> 32
> > > > bits of the value at a time).  Your 64-bit version (and my revision
> of
> > > > it) does two calls per iteration (replacing the whole number).
> > > >
> > > > Arguably, the currently committed code is smarter and more efficient
> in
> > > > that respect, whereas yours is cleaner.  Regardless, it's an extra
> > > > change that will affect the generated random numbers in some cases
> where
> > > > we could avoid making that change (it's not fixing any bug).
> > > >
> > > > I think it's preferable not to unnecessarily change what numbers are
> > > > generated.
> >
> > > Good point. However, I think that this optimization is not actually
> > > correct. For example, let's take umax = 0x1_0000_0001, which is the
> > > smallest value for which this codepath is taken. In this case limit
> would
> > > be 0xffff_ffff_ffff_fffe, so we only resample if the value is exactly
> > > 0xffff_ffff_ffff_ffff. So if the resampling codepath is taken in this
> case,
> > > and we only generate one new 32-bit value, the top word of the result
> will
> > > be fixed to 0xffff_ffff. (A very small bias in this case, but there's
> > > probably more significant cases.)
> >
> > Great point.  More generally, we can't reuse the same random number for
> > decision-making (to skip it) and as part of the next (not so) random
> > number, without introducing bias.  So we shouldn't, and we should in
> > fact move away from the old "smarter" behavior as a bug fix.  Thank you!
>
> So I guess both the bug I reported and this one Nikita found are going
> to get fixed soon?  For 7.2.0 maybe?
>
> Meanwhile, I released php_mt_seed 4.0 yesterday with support for latest
> PHP's mt_rand(), as well as with support for PHP 5.2.0 and below (as it
> happens, all the way to 3.0.7, although that's overkill).  Near the end
> of its documentation, I included a lengthy section entitled "PHP version
> curiosities", which I include below in this e-mail in case any of this
> is useful for PHP's own documentation.  It starts from PHP 3.0.6, but
> then actually spends half of the text on PHP 7.1.0+.  Here we go:
>
> ---
> While php_mt_seed supports 3 major revisions of PHP's mt_rand()
> algorithm and that sort of covers PHP 3.0.7 through 7.1.0+ (up to the
> latest as of this writing and probably beyond), the reality is somewhat
> trickier than that.  From older versions to newer:
>
> As a mere historical curiosity, php_mt_seed is in fact able to crack
> seeds of PHP 3.0.6, which is the very first version that introduced
> mt_rand(), but only as long as no range was passed into mt_rand().  That
> version had broken support for ranges, and indeed there's no point in
> supporting that short-lived breakage in php_mt_seed now.  With this
> detail, php_mt_seed has some support for all mt_rand() capable versions
> of PHP released so far.
>
> Then there's PHP 3.0.7 through 5.2.0, where Mersenne Twister's state
> initialization is with multiples of 69069.  This enables our stateless
> implementation to quickly jump to the state array element needed to
> compute the first mt_rand() output by using a precomputed value for
> 69069 raised to the power 396 (mod 2**32), which is MT's M-1.  Another
> curiosity of those versions, which we take advantage of too, is that
> they treat adjacent even and odd seeds the same, so the effective seed
> space is 31-bit.
>
> PHP 3.0.6 to 4.1.2 used a default seed of 4357 (and thus also 4356) if
> mt_srand() was not called.  PHP 4.2.0 changed that to automatic seeding
> using system time and PHP process ID (still predictable and now also
> leaky, but no longer a constant), but there was "Bug #25007 rand &
> mt_rand seed RNG every call" until 4.3.3, which presumably affected how
> cracked seeds could (not) be used.
>
> PHP 5.2.1 changed MT state initialization to MT authors' new recommended
> algorithm, which is no longer linear so we have to compute the first 397
> state elements (out of 624) even though in the simplest case we only
> need (and only store) the first and last one of those (or we could use a
> time-memory trade-off, which we currently don't).
>
> PHP 5.2.1 also introduced a bug into its implementation of MT (use of a
> wrong variable, whereas pre-5.2.1 code was correct in that respect).
> This bug lets us skip a few operations for every other seed, which we
> do, although this optimization is so minor that we could as well not
> bother.  PHP 7.1.0 fixed this bug (reverting to pre-5.2.1 code in that
> respect, so we use the same logic for pre-5.2.1 and 7.1.0+ there).
>
> In PHP versions from 3.0.7 to 7.0.x, if mt_rand() was called with its
> optional output range specified, a 31-bit (0 to 2147483647
> <%28214%29%20748-3647>) MT PRNG
> output was scaled to that range using floating-point math.  This meant
> that if a range wider than 31-bit was requested on a 64-bit build of
> PHP, some values would never occur.  This also meant that even for most
> ranges smaller than 31-bit a bias was introduced (some output values
> became more likely than others), as compared to MT's raw output (which
> was relatively unbiased).
>
> PHP 7.1.0 tried to fix those biases by dropping the floating-point math
> and instead mapping the raw 32-bit MT PRNG outputs to the target range
> using integer modulo division.  To avoid inherent bias when the target
> range isn't a whole power of 2 of possible integer values, a loop was
> introduced to skip raw 32-bit PRNG outputs (until a suitable one is
> seen) that would result in such bias.  A bug in that code was found and
> reported due to work on php_mt_seed.  As it turned out, the loop only
> works right in 32-bit builds of PHP, and is ineffective on 64-bit
> (except with 64-bit ranges, see below).  Luckily, this actually makes
> things simpler for php_mt_seed, and currently php_mt_seed fully supports
> the behavior of 64-bit builds only (for ranges up to 0 to 2147483646
> <%28214%29%20748-3646>).
>
> There's currently no intent to add to php_mt_seed the complication of
> bias-avoidance of 32-bit builds of PHP 7.1.0+, as well as of 64-bit
> builds of future versions where the bug will presumably get fixed.  What
> this means in practice is that for 32-bit builds of PHP and future
> versions of PHP, php_mt_seed may occasionally find wrong and miss
> correct seeds for mt_rand() invoked with a range, but the probability of
> this happening is very low except for very wide ranges that are not a
> whole power of 2 of possible integer values.  For example, mt_rand(0,
> 61) or mt_rand(111, 222) are very unlikely to trigger the problem,
> mt_rand(0, 255) can't trigger the problem, whereas mt_rand(1000000000,
> 2000000000) is somewhat likely to trigger it.  Such likely problematic
> ranges are probably rarely used and are of little relevance to uses of
> php_mt_seed.  Also, supporting this buggy vs. correct behavior would
> require treating 32- and 64-bit builds of PHP separately and reporting
> on them differently.
>
> PHP 7.1.0 also tried to introduce proper support for 64-bit ranges in
> 64-bit builds.  It generates two raw 32-bit PRNG outputs to derive one
> mt_rand() output when the target range spans more than a 32-bit space.
> Unfortunately, the implementation is buggy in a way where it'd introduce
> biases into such mt_rand() outputs.  The bug will presumably get fixed
> as well, but regardless there's currently no intent to support wider
> than 31-bit ranges in php_mt_seed.  This is obscure functionality
> (arguably, originally an accidental misfeature, which the PHP developers
> didn't really have to make official) that is only available on 64-bit
> builds of PHP.  Currently, php_mt_seed does not allow specifying larger
> than 31-bit integers on its command line (it will report an error when a
> larger value is specified).
>
> Prior to PHP 7.1.0, mt_rand(0, 2147483647 <%28214%29%20748-3647>) was
> equivalent to mt_rand()
> without a range, and php_mt_seed still assumes so.  This assumption is
> no longer valid for PHP 7.1.0+, which means that when searching for
> seeds for PHP 7.1.0+ for mt_rand() called with a range specified, you
> can specify at most a range one smaller than that, thus "0 2147483646
> <%28214%29%20748-3646>"
> being the maximum that php_mt_seed supports for those versions.  This
> minor limitation shouldn't matter in practice, except that you might
> need to be aware you can continue to specify a range of "0 2147483647
> <%28214%29%20748-3647>"
> to indicate that no range was passed into mt_rand().
>
> PHP 7.1.0 also aliased rand() to mt_rand() and srand() to mt_srand().
> This means that on one hand you can use php_mt_seed to crack rand()
> seeds for PHP 7.1.0+ (since those are also mt_rand() seeds), but on the
> other hand this cross-seeding and cross-consumption of random numbers
> can affect which attacks work or don't work, and exactly how, against
> specific applications that make use of both sets of PHP functions.
>
> PHP 7.1.0 also introduced MT_RAND_PHP as optional second parameter to
> mt_srand().  When specified, it correctly enables behavior identical to
> that of PHP versions 5.2.1 to 7.0.x.  Thus, seeds that php_mt_seed
> reports as valid for 5.2.1 to 7.0.x are always also valid for 7.1.0+
> with MT_RAND_PHP, and conversely seeds that php_mt_seed reports as valid
> for 7.1.0+ are often invalid for 7.1.0+ with MT_RAND_PHP (except when
> the same seeds are also valid for 5.2.1 to 7.0.x, which is common).
> ---
>
> Alexander
>

Sorry for the long delay. I've just applied
https://github.com/php/php-src/commit/fd07302024bc47082b13b32217147fd39d1e9e61
to the 7.2 branch.

Davey, Joe, do we want to take action here for 7.1? It's a pretty severe
bias, but fixing it is going to change seed sequences. I think at this
point we're too far in the 7.1 cycle to apply this kind of change.

Nikita

Reply via email to