On Tue, May 18, 2021 at 6:19 PM Go Kudo <zeriyo...@gmail.com> wrote: > Hello internals. > > I have created a draft of the RFC. > > https://wiki.php.net/rfc/rng_extension >
Hello Go Kudo, First of all, thank you for the (re)work. I think the API looks better, but I still have some remarks/questions (roughly in order of appearance, with some overlap): - How many bytes is `Source::next()` supposed to generate? Is it even intended to be called directly (vs `Randomizer::generateBytes(int $length)`)? - The stubs in "Proposal" should show `__construct()` signatures for the classes implementing `Source`. - About the `MT19937PHP` variant: The PHP manual for `mt_srand()` describes the `MT_RAND_PHP` mode like this: "Uses an *incorrect* Mersenne Twister implementation which was used as the default up till PHP 7.1.0. This mode is available for backward compatibility."; do we really need/want to port it to this new extension? - I think you said in the past that `XorShift128Plus` should be the preferred/default implementation, and `MT19937` is only provided for compatibility/migration? But if so, that's not clear at all in the RFC. - `PlatformProvided` feels like the "odd one out" here: its constructor [assuming same as previous RFC] doesn't take a seed, and it isn't serializable; I understand why, but the main point of the RFC is "RNG reproducibility", which this class cannot achieve. Also, we already have `random_bytes()`/`random_int()`, and you're proposing to change `shuffle()`/`str_shuffle()`/`array_rand()`'s internal RNG from `mt_rand()`-like to `random_bytes()`-like, so what reason would there be to use this class? only `generateFloat()`? but couldn't that be a new global function `random_float()`? All in all, wouldn't it be simpler to drop `PlatformProvided`, and have all `Source` implementations take a seed on construction and be serializable? - Technical, about `Randomizer::generateFloat()`: Is the returned float guaranteed to be finite (i.e. never NAN nor +/-INF)? And may it ever return -0.0 (negative zero)? and even if not, is 0.0 twice as likely as other floats? Also, no `generateFloat(float $min, float $max)` variant like for `generateInt()`? - **About main usage of the API:** To get e.g. $n (say 3) pseudo-random integers between $min (say 0) and $max (say 100) using the "best" implementation (which we have to know/guess is XorShift128+, isn't it?), we must: *first* choose a $seed (`time()`, maybe?), *then* do `$source = new XorShift128Plus($seed);` *and* `$rng = new Randomizer($source);` [even if $source (and $seed) could be inlined, that's still several "steps"], *then* we can finally call `$rng->generateInt($min, $max);` $n times. Is that correct? If so, it may make sense from the implementer side, but from the user side it doesn't look like the most friendly. Also, that lets the possibility of constructing a second Randomizer using the same $source (or `$rng->getSource()`), which could be undesirable for reproducibility, isn't it? Wouldn't it be better to "merge" Randomizer into the interface/classes? Actually I see that's the case in your other project ext/orng ( https://github.com/zeriyoshi/php-ext-orng/blob/master/rng/rnginterface.stub.php and https://github.com/zeriyoshi/php-ext-orng/blob/master/rng/xorshift128plus.stub.php ), why the different approach for ext/rng? Maybe even better (and already suggested), have Randomizer constructor take an *optional* algorithm (string, default XorShift128Plus::class?) and an *optional* seed (int, default time()?) and construct the source internally (and don't expose it)? So we could do just `$rng = new Randomizer();` to rely on the defaults (then call `$rng->generateInt($min, $max);` $n times). - Finally, I think the vote should be split into 3 parts: the new classes, the deprecations (which itself could be split between [mt_]srand and [mt_]rand), and the change of internal RNG for shuffle/str_shuffle/array_rand. Best regards, -- Guilliam Xavier