Sorry, I'm late to reply.

> How much more "costly" would it be to define a class (implementing
RandomNumberGenerator) and use its (full) name as the algo identifier?

If the Random class always accepts an instance of the
RandomNumberGenerator, it will be necessary to provide a class that
implements the RandomNumberGenerator whenever you try to implement a new
RNG.
The current approach of specifying the algorithm by string does not require
the implementation of a class and is more developer friendly.

However, I may be the only one who writes an extension that adds RNG in the
first place, so the additional cost is well worth it.

> (already pointed out by Jordi) new Random(new CustomRNG(/*custom args*/),
$seed) is "illogical" (the passed $seed won't be used) but
"possible" (even if your current implementation throws a ValueError when
$seed is non-null);

I thought about it carefully after that, and this is indeed a problem.
It is difficult to find the problem easily by static analysis or other
methods.

> 64-bit problems

I feel like I'm making too big a deal out of this. This kind of
incompatibility happens some time in userland.
In conclusion, I see no problem with the implicit truncation behavior.



Indeed, the current implementation appears to have several problems. So,
how about the following implementation?

```php
interface RandomNumberGenerator
{
    public function generate(): int;
}

final class Random
{
    private RandomNumberGenerator $rng;
    public function __construct(?RandomNumberGenerator $rng = null)
    {
        $this->rng = $rng ?? new XorShift128Plus(random_int(PHP_INT_MIN,
PHP_INT_MAX));
    }
    public function nextInt(): int {}
    public function getInt(int $min, int $max): int {}
    public function getBytes(int $length): string {}
    public function shuffleArray(array $array): array {}
    public function shuffleString(string $string): string {}
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
}

class XorShift128PlusNumberGenerator implements RandomNumberGenerator
{
    public function generate(): int {}
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
}

class MT19937NumberGenerator implements RandomNumberGenerator
{
    public function generate(): int {}
    public function __serialize(): array {}
    public function __unserialize(array $data): void {}
}

class SecureNumberGenerator implements RandomNumberGenerator
{
    public function generate(): int {}
}
```

This is an approach similar to Nikita's `createDefault`. If the constructor
has a null argument, it uses the default, XorShift128+, internally.

Also, whether the Random class is serializable or clonable depends on the
instance of RandomNumberGenerator being used. This means that when the
Random class clone is called, the `$rng` member will be implicitly cloned.

How about this?

Regards,
Go Kudo

2021年6月10日(木) 1:11 Guilliam Xavier <guilliam.xav...@gmail.com>:

>
> On Wed, Jun 9, 2021 at 3:16 PM Go Kudo <zeriyo...@gmail.com> wrote:
>
>> Thanks Guilliam.
>>
>> > I'm not sure you really addressed
>> https://externals.io/message/114561#114680 ?
>>
>> I thought I had solved this problem by implementing the
>> `RandomNumberGenerator` interface.
>>
>> Accepting strings and objects as arguments is certainly complicated, but
>> I thought it met the necessary requirements.
>> From the aspect of static analysis, there seems to be no problem.
>>
>> Reverting to a full object-based approach would increase the cost of
>> providing new RNGs from extensions. I think the string approach is superior
>> in this respect.
>>
>
> How much more "costly" would it be to define a class (implementing
> RandomNumberGenerator) and use its (full) name as the algo identifier?
>
>
>> Nikita's `createDefault()` approach is certainly good, but I think it is
>> still difficult for beginners to understand.
>>
>> I also thought about it for a while after that, and I think that the
>> current implementation is the most appropriate for now. What do you think?
>>
>
> I still agree with Nikita that there's a discrepancy between e.g. `new
> Random(RANDOM_XXX, $seed)` and `new Random(new CustomRNG(/*custom
> args*/))`, which also poses additional questions, e.g.:
>
> - (already pointed out by Jordi) `new Random(new CustomRNG(/*custom
> args*/), $seed)` is "illogical" (the passed $seed won't be used) but
> "possible" (even if your current implementation throws a ValueError when
> $seed is non-null);
>
> - This opens the possibility of "leaking" the RNG "source" (even if no
> `public function getRNG()`), e.g.:
>
> ```
> $rng = new CustomRNG(/*custom args*/);
> $r1 = new Random($rng);
> $r2 = new Random($rng);
> var_dump($r1->nextInt() === $r2->nextInt()); // false (not true), I
> suppose?
> ```
>
> or:
>
> ```
> $rng = new CustomRNG(/*custom args*/);
> $r = new Random($rng);
> var_dump($r->nextInt()); // 1st of sequence
> $rng->generate();
> var_dump($r->nextInt()); // 3rd of sequence (not 2nd), I suppose?
> ```
>
> (possibly in less obvious ways), which may be considered bad or not (but
> still a discrepancy vs extension-provided algos).
>
> Again, taking a class name and optional extra args (or an $options array,
> like password_hash(), maybe even including 'seed' only for the algos that
> need one) to construct, rather than an already constructed object, might be
> better?
>
> But that would also "split" the constructor args from the class
> (increasing the risk of "mismatch"), and make using anonymous classes less
> easy, and maybe we haven't considered all the uses cases (e.g. what about
> `Random::getAlgoInfo(CustomRNG::class)`?)... So that might also be worse :/
>
> It would be great to have more insights! (if nobody else speaks up then
> let's keep it as is of course)
>
>
>> > Couldn't the Random class simply implement `public function __clone():
>> void` (that would internally do like `$this->rng = clone $this->rng;`)?
>>
>> Implicitly cloning in areas where the user cannot interfere may cause
>> problems when using objects that cannot be cloned.
>> However, this may be overthinking it. The reason is that a
>> `RandomNumberGenerator` that cannot be cloned should be implemented as algo
>> in the first place.
>> If there are no objections, I will change the implementation.
>>
>
> Ah, true, I hadn't thought about non-clonable implementations... But
> that's already the case for `__serialize()` and non-serializable
> implementations, right? (But let's wait a bit for possible objections,
> indeed)
>
>
>> > Indeed, a decision has to be made. If you throw an exception, we
>> wouldn't have the "issue" of different results on 32- vs 64-bit machines,
>> but XorShift128+ and Secure would be unusable on a 32-bit machine, right?
>>
>> I don't see any problem with the current implicit truncation behavior for
>> this. Even if a user switches to a 64-bit environment, compatibility can be
>> maintained by explicitly truncating the generated values. In addition, most
>> of the other methods only use 32bit internally.
>> If you are concerned about this, you should probably be concerned about
>> the incompatibility with MT_RAND_PHP.
>> That problem can also be compensated for with an extension that adds algo.
>>
>
> I thought you were "struggling with [this implicit truncation] behavior
> when calling nextInt() in a 32-bit environment using a 64-bit RNG [...]
> which means that the code loses compatibility with the result of running on
> a 64-bit machine"? And you asked if throwing an exception would be
> preferable?
> Anyway, I personally don't care about 32-bit (but other people may).
>
>
>> Regards,
>> Go Kudo
>>
>
> Regards,
>
> --
> Guilliam Xavier
>

Reply via email to