On Wed, Aug 23, 2017 at 10:05 PM, Solar Designer <so...@openwall.com> wrote:
> On Thu, Aug 17, 2017 at 03:18:30PM +0200, Solar Designer wrote: > > On Thu, Aug 17, 2017 at 12:57:56AM +0200, Nikita Popov wrote: > > > On Thu, Aug 17, 2017 at 12:02 AM, Solar Designer <so...@openwall.com> > wrote: > > > > One difference I didn't notice at first: the currently committed code > > > > does only one php_mt_rand() call per loop iteration when it's > skipping > > > > "numbers over the limit", even in the 64-bit case (thus, it replaces > 32 > > > > bits of the value at a time). Your 64-bit version (and my revision > of > > > > it) does two calls per iteration (replacing the whole number). > > > > > > > > Arguably, the currently committed code is smarter and more efficient > in > > > > that respect, whereas yours is cleaner. Regardless, it's an extra > > > > change that will affect the generated random numbers in some cases > where > > > > we could avoid making that change (it's not fixing any bug). > > > > > > > > I think it's preferable not to unnecessarily change what numbers are > > > > generated. > > > > > Good point. However, I think that this optimization is not actually > > > correct. For example, let's take umax = 0x1_0000_0001, which is the > > > smallest value for which this codepath is taken. In this case limit > would > > > be 0xffff_ffff_ffff_fffe, so we only resample if the value is exactly > > > 0xffff_ffff_ffff_ffff. So if the resampling codepath is taken in this > case, > > > and we only generate one new 32-bit value, the top word of the result > will > > > be fixed to 0xffff_ffff. (A very small bias in this case, but there's > > > probably more significant cases.) > > > > Great point. More generally, we can't reuse the same random number for > > decision-making (to skip it) and as part of the next (not so) random > > number, without introducing bias. So we shouldn't, and we should in > > fact move away from the old "smarter" behavior as a bug fix. Thank you! > > So I guess both the bug I reported and this one Nikita found are going > to get fixed soon? For 7.2.0 maybe? > > Meanwhile, I released php_mt_seed 4.0 yesterday with support for latest > PHP's mt_rand(), as well as with support for PHP 5.2.0 and below (as it > happens, all the way to 3.0.7, although that's overkill). Near the end > of its documentation, I included a lengthy section entitled "PHP version > curiosities", which I include below in this e-mail in case any of this > is useful for PHP's own documentation. It starts from PHP 3.0.6, but > then actually spends half of the text on PHP 7.1.0+. Here we go: > > --- > While php_mt_seed supports 3 major revisions of PHP's mt_rand() > algorithm and that sort of covers PHP 3.0.7 through 7.1.0+ (up to the > latest as of this writing and probably beyond), the reality is somewhat > trickier than that. From older versions to newer: > > As a mere historical curiosity, php_mt_seed is in fact able to crack > seeds of PHP 3.0.6, which is the very first version that introduced > mt_rand(), but only as long as no range was passed into mt_rand(). That > version had broken support for ranges, and indeed there's no point in > supporting that short-lived breakage in php_mt_seed now. With this > detail, php_mt_seed has some support for all mt_rand() capable versions > of PHP released so far. > > Then there's PHP 3.0.7 through 5.2.0, where Mersenne Twister's state > initialization is with multiples of 69069. This enables our stateless > implementation to quickly jump to the state array element needed to > compute the first mt_rand() output by using a precomputed value for > 69069 raised to the power 396 (mod 2**32), which is MT's M-1. Another > curiosity of those versions, which we take advantage of too, is that > they treat adjacent even and odd seeds the same, so the effective seed > space is 31-bit. > > PHP 3.0.6 to 4.1.2 used a default seed of 4357 (and thus also 4356) if > mt_srand() was not called. PHP 4.2.0 changed that to automatic seeding > using system time and PHP process ID (still predictable and now also > leaky, but no longer a constant), but there was "Bug #25007 rand & > mt_rand seed RNG every call" until 4.3.3, which presumably affected how > cracked seeds could (not) be used. > > PHP 5.2.1 changed MT state initialization to MT authors' new recommended > algorithm, which is no longer linear so we have to compute the first 397 > state elements (out of 624) even though in the simplest case we only > need (and only store) the first and last one of those (or we could use a > time-memory trade-off, which we currently don't). > > PHP 5.2.1 also introduced a bug into its implementation of MT (use of a > wrong variable, whereas pre-5.2.1 code was correct in that respect). > This bug lets us skip a few operations for every other seed, which we > do, although this optimization is so minor that we could as well not > bother. PHP 7.1.0 fixed this bug (reverting to pre-5.2.1 code in that > respect, so we use the same logic for pre-5.2.1 and 7.1.0+ there). > > In PHP versions from 3.0.7 to 7.0.x, if mt_rand() was called with its > optional output range specified, a 31-bit (0 to 2147483647 > <%28214%29%20748-3647>) MT PRNG > output was scaled to that range using floating-point math. This meant > that if a range wider than 31-bit was requested on a 64-bit build of > PHP, some values would never occur. This also meant that even for most > ranges smaller than 31-bit a bias was introduced (some output values > became more likely than others), as compared to MT's raw output (which > was relatively unbiased). > > PHP 7.1.0 tried to fix those biases by dropping the floating-point math > and instead mapping the raw 32-bit MT PRNG outputs to the target range > using integer modulo division. To avoid inherent bias when the target > range isn't a whole power of 2 of possible integer values, a loop was > introduced to skip raw 32-bit PRNG outputs (until a suitable one is > seen) that would result in such bias. A bug in that code was found and > reported due to work on php_mt_seed. As it turned out, the loop only > works right in 32-bit builds of PHP, and is ineffective on 64-bit > (except with 64-bit ranges, see below). Luckily, this actually makes > things simpler for php_mt_seed, and currently php_mt_seed fully supports > the behavior of 64-bit builds only (for ranges up to 0 to 2147483646 > <%28214%29%20748-3646>). > > There's currently no intent to add to php_mt_seed the complication of > bias-avoidance of 32-bit builds of PHP 7.1.0+, as well as of 64-bit > builds of future versions where the bug will presumably get fixed. What > this means in practice is that for 32-bit builds of PHP and future > versions of PHP, php_mt_seed may occasionally find wrong and miss > correct seeds for mt_rand() invoked with a range, but the probability of > this happening is very low except for very wide ranges that are not a > whole power of 2 of possible integer values. For example, mt_rand(0, > 61) or mt_rand(111, 222) are very unlikely to trigger the problem, > mt_rand(0, 255) can't trigger the problem, whereas mt_rand(1000000000, > 2000000000) is somewhat likely to trigger it. Such likely problematic > ranges are probably rarely used and are of little relevance to uses of > php_mt_seed. Also, supporting this buggy vs. correct behavior would > require treating 32- and 64-bit builds of PHP separately and reporting > on them differently. > > PHP 7.1.0 also tried to introduce proper support for 64-bit ranges in > 64-bit builds. It generates two raw 32-bit PRNG outputs to derive one > mt_rand() output when the target range spans more than a 32-bit space. > Unfortunately, the implementation is buggy in a way where it'd introduce > biases into such mt_rand() outputs. The bug will presumably get fixed > as well, but regardless there's currently no intent to support wider > than 31-bit ranges in php_mt_seed. This is obscure functionality > (arguably, originally an accidental misfeature, which the PHP developers > didn't really have to make official) that is only available on 64-bit > builds of PHP. Currently, php_mt_seed does not allow specifying larger > than 31-bit integers on its command line (it will report an error when a > larger value is specified). > > Prior to PHP 7.1.0, mt_rand(0, 2147483647 <%28214%29%20748-3647>) was > equivalent to mt_rand() > without a range, and php_mt_seed still assumes so. This assumption is > no longer valid for PHP 7.1.0+, which means that when searching for > seeds for PHP 7.1.0+ for mt_rand() called with a range specified, you > can specify at most a range one smaller than that, thus "0 2147483646 > <%28214%29%20748-3646>" > being the maximum that php_mt_seed supports for those versions. This > minor limitation shouldn't matter in practice, except that you might > need to be aware you can continue to specify a range of "0 2147483647 > <%28214%29%20748-3647>" > to indicate that no range was passed into mt_rand(). > > PHP 7.1.0 also aliased rand() to mt_rand() and srand() to mt_srand(). > This means that on one hand you can use php_mt_seed to crack rand() > seeds for PHP 7.1.0+ (since those are also mt_rand() seeds), but on the > other hand this cross-seeding and cross-consumption of random numbers > can affect which attacks work or don't work, and exactly how, against > specific applications that make use of both sets of PHP functions. > > PHP 7.1.0 also introduced MT_RAND_PHP as optional second parameter to > mt_srand(). When specified, it correctly enables behavior identical to > that of PHP versions 5.2.1 to 7.0.x. Thus, seeds that php_mt_seed > reports as valid for 5.2.1 to 7.0.x are always also valid for 7.1.0+ > with MT_RAND_PHP, and conversely seeds that php_mt_seed reports as valid > for 7.1.0+ are often invalid for 7.1.0+ with MT_RAND_PHP (except when > the same seeds are also valid for 5.2.1 to 7.0.x, which is common). > --- > > Alexander > Sorry for the long delay. I've just applied https://github.com/php/php-src/commit/fd07302024bc47082b13b32217147fd39d1e9e61 to the 7.2 branch. Davey, Joe, do we want to take action here for 7.1? It's a pretty severe bias, but fixing it is going to change seed sequences. I think at this point we're too far in the 7.1 cycle to apply this kind of change. Nikita