Re: [PHP-DEV] [RFC] [Under Discussion] Random Extension 4.0

Tim Düsterhus Fri, 18 Feb 2022 02:46:25 -0800

Hi

On 2/18/22 07:31, Go Kudo wrote:

> I have been looking into output buffering, but don't know the right way to
> do it. The buffering works fine if all RNG generation widths are static,
> but if they are dynamic so complicated.

I believe the primary issue here is that the engines are expected to
return a uint64_t instead of a buffer with raw bytes. This requires
you to perform many conversions between the uint64_t and the raw buffer:

When calling Randomizer::getBytes() for a custom engine, the following
needs to happen:

- The Engine returns a byte string.
- This byte string is then internally converted into a uint64_t.
- When calling Randomizer::getBytes(), this uint64_t needs to be
  converted back to a byte string.
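
As a rough illustration of that round trip, here is a minimal sketch in C. The helper names bytes_to_u64() and u64_to_bytes() are hypothetical, and little-endian packing is an assumption made only for this example; it is not taken from the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 2: pack up to 8 bytes returned by the engine into a uint64_t.
 * Little-endian order is assumed for illustration only. */
static uint64_t bytes_to_u64(const unsigned char *bytes, size_t len)
{
    assert(len <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < len; i++) {
        v |= (uint64_t)bytes[i] << (8 * i);
    }
    return v;
}

/* Step 3: unpack the integer into raw bytes again for getBytes(). */
static void u64_to_bytes(uint64_t v, unsigned char *out, size_t len)
{
    assert(len <= 8);
    for (size_t i = 0; i < len; i++) {
        out[i] = (unsigned char)((v >> (8 * i)) & 0xff);
    }
}
```

Both helpers do work that cancels out: the bytes that leave u64_to_bytes() are the same bytes that entered bytes_to_u64().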

To avoid those conversions without sacrificing too much performance, it
might be possible to return a struct that contains a single 4- or 8-byte
array:


    struct four_bytes {
        unsigned char val[4];
    };

    struct four_bytes r;
    r.val[0] = (result >> 0) & 0xff;
    r.val[1] = (result >> 8) & 0xff;
    r.val[2] = (result >> 16) & 0xff;
    r.val[3] = (result >> 24) & 0xff;

    return r;

.val can be treated as a byte string, but it does not require dynamic
allocation. By doing that, the internal engines (e.g. Xoshiro) would be
consistent with the userland engines.
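
To sketch how this could fit together: the example below has an engine return such a struct and a getBytes()-style helper that simply concatenates the raw bytes, with no integer round trip in between. The names toy_engine, toy_generate() and get_bytes() are hypothetical, and Marsaglia's xorshift32 is used only as a small stand-in generator; none of this is the actual extension code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct four_bytes {
    unsigned char val[4];
};

/* Hypothetical toy engine (xorshift32), a stand-in for a real engine. */
struct toy_engine {
    uint32_t state;
};

/* Advance the engine and return its output as 4 raw bytes (little-endian). */
static struct four_bytes toy_generate(struct toy_engine *e)
{
    assert(e->state != 0); /* xorshift32 must not be seeded with 0 */

    uint32_t x = e->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    e->state = x;

    struct four_bytes r;
    r.val[0] = (x >> 0) & 0xff;
    r.val[1] = (x >> 8) & 0xff;
    r.val[2] = (x >> 16) & 0xff;
    r.val[3] = (x >> 24) & 0xff;
    return r;
}

/* Fill out[0..len) by concatenating engine outputs. If len is not a
 * multiple of 4, the unused tail of the last output is discarded. */
static void get_bytes(struct toy_engine *e, unsigned char *out, size_t len)
{
    while (len > 0) {
        struct four_bytes r = toy_generate(e);
        size_t n = len < sizeof r.val ? len : sizeof r.val;
        memcpy(out, r.val, n);
        out += n;
        len -= n;
    }
}
```

The struct is returned by value, so nothing is allocated on the heap, and get_bytes() only ever copies bytes.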

> It is possible to solve this problem by allowing generate() itself to
> specify the size it wants, but this would significantly slow down
> performance.


I don't think it's a good idea to add a size parameter to generate().

> I've looked at the sample code, but do you really need support for
> Randomizer? Engine::generate() can output dynamic binaries up to 64 bits.
> You can use Engine directly, instead of Randomizer::getBytes().
>
> What exactly is the situation where buffering by Randomizer is needed?

*I* don't need anything. I'm just trying to think of use-cases and
edge-cases. Basically: What would a user attempt to do and what would
their expectations be?

I'm not saying that this buffering *must* be implemented, but this is
something we need to think about. Because changing the behavior later is
pretty much impossible, as users might rely on a specific behavior for
their seeded sequences. The behavior might also need to be part of the
documentation.

Basically what we need to think about is what guarantees we give. As an
example:

1. Calling Engine::generate() with the same seed results in the same
   sequence (This guarantee we give, and it is useful).
2. Calling Randomizer::getInt() with the same seeded engine results in
   the same numbers for the same parameters (I think this also is useful).
3. Calling Randomizer::getBytes() with the same seeded engine results in
   the same byte sequence (This is something we are currently discussing).
4. Calling Randomizer::getBytes() simply concatenates the raw bytes
   retrieved by the Engine (This ties into (3)).
5. Calling Randomizer::shuffleArray() with the same seeded engine
   results in the same result for the same array (This one is more
   debatable, because then we must maintain the exact same shuffleArray()
   implementation forever).
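
To make the difference behind (3) and (4) concrete, here is a small sketch in C. Everything in it is hypothetical: counting_engine is not a real RNG, it just emits the bytes 0, 1, 2, ... four at a time so the effect is easy to see, and the two get_bytes_*() helpers are only illustrations of "discard leftovers" versus "buffer leftovers".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct four_bytes {
    unsigned char val[4];
};

/* Hypothetical stand-in engine: emits 0, 1, 2, ... four bytes per call. */
struct counting_engine {
    unsigned char next;
};

static struct four_bytes counting_generate(struct counting_engine *e)
{
    struct four_bytes r;
    for (size_t i = 0; i < sizeof r.val; i++) {
        r.val[i] = e->next++;
    }
    return r;
}

/* Hypothetical randomizer that can keep the unused tail of an output. */
struct randomizer {
    struct counting_engine engine;
    struct four_bytes buf;
    size_t buffered; /* number of unread bytes at the end of buf.val */
};

/* Unbuffered: each call starts with a fresh output, leftovers are dropped. */
static void get_bytes_unbuffered(struct randomizer *r, unsigned char *out, size_t len)
{
    assert(r != NULL);
    while (len > 0) {
        struct four_bytes b = counting_generate(&r->engine);
        size_t n = len < sizeof b.val ? len : sizeof b.val;
        memcpy(out, b.val, n);
        out += n;
        len -= n;
    }
}

/* Buffered: leftovers from the previous call are consumed first. */
static void get_bytes_buffered(struct randomizer *r, unsigned char *out, size_t len)
{
    assert(r != NULL);
    while (len > 0) {
        if (r->buffered == 0) {
            r->buf = counting_generate(&r->engine);
            r->buffered = sizeof r->buf.val;
        }
        size_t n = len < r->buffered ? len : r->buffered;
        memcpy(out, r->buf.val + (sizeof r->buf.val - r->buffered), n);
        r->buffered -= n;
        out += n;
        len -= n;
    }
}
```

With this toy engine, two unbuffered 2-byte requests yield the bytes 0, 1 and then 4, 5, while two buffered 2-byte requests yield 0, 1 and then 2, 3, which equals a single 4-byte request. Whichever behavior is chosen becomes part of what users with seeded engines rely on.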

All these guarantees should be properly documented within the RFC. The
RFC template (https://wiki.php.net/rfc/template) says:

> Remember that the RFC contents should be easily reusable in the PHP
> Documentation.

So by thinking about this now and putting it in the RFC, the
explanations can easily be copied into the documentation if the RFC
passes the vote.

One should not need to look into the implementation to understand how
the Engines and the Randomizer are supposed to work.

> Also worried that buffering will cut off random numbers at arbitrary sizes.
> It may cause bias in the generated results.

If there's bias in specific bits or bytes of the generated number then
getBytes(32) will already be biased even without buffering, as the raw
bytes are what's of interest here. It does not matter if they are at the
1st or 4th position (for a 32-bit engine).


Best regards
Tim Düsterhus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
