On Fri, Jun 10, 2022 at 19:42, Tim Düsterhus <t...@bastelstu.be> wrote:

> Hi
>
> On 6/10/22 12:02, Go Kudo wrote:
> >> It has a single generate(): string method that generates random numbers
> >> as a binary string. This string must be non-empty and attempting to return
> >> an empty string will result in a RuntimeException.
> >> If you implement a random number generator in PHP, the generated numbers
> >> must be converted to binary using the pack() function, and the values must
> >> be little-endian.
>
> Thanks, that looks good to me.
>
> > * class Random\Randomizer
> >
> >> The same current PHP algorithm is used to generate random numbers within
> >> the specified range in Randomizer::getInt(). This is necessary for backward
> >> compatibility.
> >> It also provides a guarantee of consistency in the future.
> >
> > However, I understand that perhaps this fix will not satisfy your request.
> > I just do not have a good understanding of your intentions due to my poor
> > English....
>
> Don't worry. I understand that using a foreign language can sometimes be
> hard - I am not a native speaker of English either and I suspect that
> this also applies to many other participants.
>
> > I am considering the following possibilities regarding your intentions:
> >
> > 1. Should be stated in detail so that consistency of results is maintained
> > in the future.
> > 2. Current PHP ranged random number generation algorithm has room for
> > improvement and should be examined further
> > 3. Consistency of results is difficult to maintain in the future (Maybe
> > incorrect)
> >
> > In case 1, I think the following statement would satisfy the requirement.
> >
> >> The values generated by the seedable Engine guarantee future
> >> reproducibility of results, and the Randomizer uses the results to process
> >> them, so if the results generated by the Engine are identical, the
> >> Randomizer's results will also be consistent.
> >
> > Although the consistency of the Randomizer results is not mentioned here,
> > it would be a clear BC Break if the results were to change after the
> > Randomizer is officially implemented, so I do think it is sufficient that
> > an RFC be created and voted on as necessary.
>
> If I understand you correctly, then (1) is what I am looking for: It
> should be spelled out explicitly what behavior the user may rely on and
> what should be considered an implementation detail.
>
> Something like the following would work for me:
>
> ----
>
> The engines implement a specific well-defined random number generator.
> For a given seed it is guaranteed that they return the same sequence as
> the reference implementation.
>
> For the Randomizer it is considered a breaking change if the observable
> behavior of the methods changes. For a given seeded engine and identical
> method parameters the following must hold:
>
> - The number of calls to the Engine's ->generate() method remains the same.
> - The return value remains the same for a given result retrieved from
> ->generate().
>
> Any changes to the Randomizer that violate these guarantees require a
> separate RFC.
>
> ----
>
> > In case 2, I also thought about it along the way. Nikita also taught me
> > about better algorithms: https://externals.io/message/115918#115982
> >
> > But, I also think that the current PHP implementation is good enough, and
> > there is no need to change it to the point of breaking compatibility.
> >
> > I think the current global scope MT implementation is very troublesome for
> > some use cases and should first be able to be drop-in-replaceable with this
> > implementation.
> >
> > In case 3, I think it is necessary to guarantee consistency at least once
> > at the language level, even though this may change in the future. I have
> > already indicated the need for this in the RFC.
>
> Can you comment on whether the Randomizer behaves identically on both 32
> and 64 bit PHP and also on little endian and big endian architectures?
> As an example: Will ->getInt() calculate the same "uniform distribution"
> on all bitnesses? If not I consider that a bug.
>
> The engines *should* behave identically, because of the "little endian
> string" return value.
>
> If that's already the case then something like the following should be
> added to the RFC guarantees:
>
> - The implementation will guarantee that the same results are returned
> independent of the processor architecture (little endian / big endian)
> and integer bit lengths (32 / 64).
>
> > Most of all, I do not believe you intend this to be the case. (And this
> > sentence is not intended to offend you either. Please don't misunderstand
> > me.)
> >
>
>
> Best regards
> Tim Düsterhus
>

It does not depend on endianness, but does depend on the number of bits.
This is because the new algorithm generates 64-bit values.
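
To illustrate why endianness is not a factor, here is a minimal sketch. It is
written in Python only as neutral pseudocode, and decode_le_u64 and the sample
bytes are hypothetical names and values, not part of the RFC or php-src. As
long as the engine's byte string is defined to be little-endian and is decoded
with an explicit byte order, the host's native byte order never enters the
calculation.

```python
import struct


def decode_le_u64(raw: bytes) -> int:
    """Interpret the first 8 bytes as an unsigned little-endian integer.

    The byte order is stated explicitly, so the result is the same on
    little-endian and big-endian hosts.
    """
    return int.from_bytes(raw[:8], "little")


# Hypothetical engine output: a 64-bit value packed with an explicit
# little-endian format ("<"), mirroring the rule quoted from the RFC.
sample_value = 0x0123456789ABCDEF
raw = struct.pack("<Q", sample_value)

# Round trip: decoding with the same explicit byte order restores the value.
assert decode_le_u64(raw) == sample_value
```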

When using a 64-bit RNG in a 32-bit environment, Engine::generate() returns
the same binary string as in a 64-bit environment, but the Randomizer
methods return different values. This is because zend_long is only 32 bits
wide in a 32-bit environment, so the uint64_t value is truncated.
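
A rough model of that truncation, again in Python as neutral pseudocode.
as_signed is a hypothetical helper and the sample bytes are made up. The
sketch assumes the truncation simply keeps the low bits and reinterprets them
as a two's-complement integer; the actual Randomizer code paths may differ in
detail.

```python
def as_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits and reinterpret them as two's complement."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


# Hypothetical 8 bytes returned by an engine (identical on every platform).
raw = bytes.fromhex("efcdab8967452301")
u64 = int.from_bytes(raw, "little")

on_64bit = as_signed(u64, 64)  # models a 64-bit wide integer
on_32bit = as_signed(u64, 32)  # models a 32-bit wide integer (assumption)

# Same engine bytes, different integer once the width differs.
assert on_64bit != on_32bit
```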

To keep the results the same in 32-bit / 64-bit environments, only the
lower 32 bits of the 64-bit value should be used. However, this leads to
reduced randomness and does not seem appropriate considering that most
environments running PHP are 64-bit.
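
A small sketch of this trade-off (Python as neutral pseudocode; low32 and the
sample byte strings are hypothetical). Masking to the lower 32 bits gives a
value that fits on both platforms, but the upper four bytes of the engine
output no longer influence the result.

```python
MASK32 = 0xFFFFFFFF


def low32(raw: bytes) -> int:
    """Decode up to 8 little-endian bytes and keep only the lower 32 bits."""
    return int.from_bytes(raw[:8], "little") & MASK32


# Two hypothetical engine outputs that differ only in their upper 4 bytes.
a = bytes.fromhex("efcdab89" + "67452301")
b = bytes.fromhex("efcdab89" + "00000000")

# Both collapse to the same 32-bit value: the upper half is discarded.
assert a != b
assert low32(a) == low32(b)
```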

I have created a PoC that performs all internal operations as 64-bit
operations, so that 32-bit environments produce the same results as 64-bit
ones, although execution in 32-bit environments will be less efficient.
(Note that Randomizer::getInt() with no argument is still incompatible.)

https://github.com/php/php-src/commit/dbed218bfcd45e713fa3df2c88a4c2efce9f0651

Another idea I had was to throw an exception when trying to use a
64-bit RNG in a 32-bit environment.

Regards,
Go Kudo
