On Thu, Aug 17, 2017 at 12:02 AM, Solar Designer <so...@openwall.com> wrote:

> On Wed, Aug 16, 2017 at 11:41:55PM +0200, Solar Designer wrote:
> > On Wed, Aug 16, 2017 at 10:06:02PM +0200, Nikita Popov wrote:
> > > I'd suggest to split the 32-bit and 64-bit code codepaths entirely, as
> the
> > > interleaved #ifs are somewhat hard to follow. Something like
> > > https://gist.github.com/nikic/64e7ec58ebb6121d350fb80927a65082 (not
> > > thoroughly tested).
> >
> > This looks good to me - especially how you reduced the nesting of if's
> > by special-casing the "Powers of two are not biased" return.  With this
> > change, you can as well drop the "Special case where no modulus is
> > required", as it'd happen to be handled the same by your new return.
> > OTOH, that optimization might require an extra comment on its own.
> >
> > Here's what this might look like (totally untested):
> >
> > https://gist.github.com/solardiz/5e3d313bbee2c1ce6e200e433b750bef
>
> One difference I didn't notice at first: the currently committed code
> does only one php_mt_rand() call per loop iteration when it's skipping
> "numbers over the limit", even in the 64-bit case (thus, it replaces 32
> bits of the value at a time).  Your 64-bit version (and my revision of
> it) does two calls per iteration (replacing the whole number).
>
> Arguably, the currently committed code is smarter and more efficient in
> that respect, whereas yours is cleaner.  Regardless, it's an extra
> change that will affect the generated random numbers in some cases where
> we could avoid making that change (it's not fixing any bug).
>
> I think it's preferable not to unnecessarily change what numbers are
> generated.
>
> Alexander
>

Good point. However, I think that this optimization is not actually
correct. For example, let's take umax = 0x1_0000_0001, which is the
smallest value for which this codepath is taken. In this case limit would
be 0xffff_ffff_ffff_fffe, so we only resample if the value is exactly
0xffff_ffff_ffff_ffff. So if the resampling codepath is taken in this case,
and we only generate one new 32-bit value, the top word of the result will
be fixed to 0xffff_ffff. (A very small bias in this case, but there's
probably more significant cases.)

Nikita

Reply via email to