On Thu, Aug 17, 2017 at 12:57:56AM +0200, Nikita Popov wrote: > On Thu, Aug 17, 2017 at 12:02 AM, Solar Designer <so...@openwall.com> wrote: > > One difference I didn't notice at first: the currently committed code > > does only one php_mt_rand() call per loop iteration when it's skipping > > "numbers over the limit", even in the 64-bit case (thus, it replaces 32 > > bits of the value at a time). Your 64-bit version (and my revision of > > it) does two calls per iteration (replacing the whole number). > > > > Arguably, the currently committed code is smarter and more efficient in > > that respect, whereas yours is cleaner. Regardless, it's an extra > > change that will affect the generated random numbers in some cases where > > we could avoid making that change (it's not fixing any bug). > > > > I think it's preferable not to unnecessarily change what numbers are > > generated.
> Good point. However, I think that this optimization is not actually > correct. For example, let's take umax = 0x1_0000_0001, which is the > smallest value for which this codepath is taken. In this case limit would > be 0xffff_ffff_ffff_fffe, so we only resample if the value is exactly > 0xffff_ffff_ffff_ffff. So if the resampling codepath is taken in this case, > and we only generate one new 32-bit value, the top word of the result will > be fixed to 0xffff_ffff. (A very small bias in this case, but there's > probably more significant cases.) Great point. More generally, we can't reuse the same random number for decision-making (to skip it) and as part of the next (not so) random number, without introducing bias. So we shouldn't, and we should in fact move away from the old "smarter" behavior as a bug fix. Thank you! Alexander -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php