Re: [PHP-DEV] RFC [Discussion]: Randomizer Additions

Tim Düsterhus Sat, 05 Nov 2022 09:00:43 -0700

Hi

On 11/5/22 16:34, Go Kudo wrote:

> I am skeptical only about getFloat(). The use cases are limited and seem
> somewhat excessive. Do you have examples of how this is supported in other
> languages?

Yes, unfortunately getFloat() became pretty complex, but that is because "generating random floats correctly" is pretty complicated, due to how floats work.

The getFloat() method as proposed implements the γ-section algorithm as published in: Drawing Random Floating-Point Numbers from an Interval. Frédéric Goualard, ACM Trans. Model. Comput. Simul., 32:3, 2022. https://doi.org/10.1145/3503512

This publication is just 7 months old and explains how the implementation in every other programming language is broken in one way or another and proposes the γ-section algorithm as a not-broken algorithm.

As floats are not uniformly dense and do not allow representing all values, it is very easy to introduce a bias or generate incorrect values.
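To make the "not uniformly dense" part concrete, here is a minimal sketch that measures the distance from a float to the next larger one. The helper gap_above() is a hypothetical name of mine, not part of the RFC, and it assumes a 64-bit PHP build and a positive, finite input:

```php
<?php
// Hypothetical helper: distance from $x to the next larger double.
// Assumes 64-bit PHP and a positive, finite $x, where incrementing the
// raw bit pattern by one yields the next representable value.
function gap_above(float $x): float
{
    $bits = unpack('q', pack('d', $x))[1];
    $next = unpack('d', pack('q', $bits + 1))[1];
    return $next - $x;
}

foreach ([0.001, 1.0, 1000.0] as $x) {
    printf("gap above %g: %.3e\n", $x, gap_above($x));
}

// "Not all values are representable": 2^53 + 1 has no double of its own.
var_dump((float) (2 ** 53 + 1) === (float) (2 ** 53)); // bool(true)
```

The gap grows by a factor of two with each power of two, so an interval such as [0, 1000) contains far more floats near 0 than near 1000.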


An example taken from the publication:

php > $r = new Random\Randomizer();

We generate a random float in [0, 1) (allowing 0, but not 1), by dividing a random int between 0 and 2^53 - 1 by 2^53. This is effectively what ->nextFloat() does. This creates a uniformly distributed float with as many different values as possible, because a double (the underlying representation) has 53 bits of precision.

The nextFloat() method is often the only thing that is available in other languages, e.g. JavaScript with Math.random() [1].

php > $f = $r->getInt(0, (2**53 - 1)) / (2**53);
php > var_dump($f);
float(0.6942225382038698)

Now we want to turn this into a random float in [3.5, 4.5) (not allowing 4.5), because that's what we need. The formula used here is also the one given in MDN for JavaScript's Math.random():

php > $min = 3.5;
php > $max = 4.5;
php > var_dump($min + ($max - $min) * $f);
float(4.19422253820387)

The simple formula appears to do the correct thing and it would be correct if floats could represent all values. But what happens if the random integer is 2^53 - 1 (i.e. the maximum integer we allowed to be generated)?

php > $f = (2**53 - 1) / (2**53);
php > var_dump($f);
float(0.9999999999999999)
php > var_dump($min + ($max - $min) * $f);
float(4.5)

In this case the result was rounded to 4.5, because the exact result was not representable. Now an invalid value was generated!
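The arithmetic behind that rounding can be checked directly. This is only a sketch of the reasoning, with variable names of my own choosing: the exact result would be 3.5 + (1 - 2^-53) = 4.5 - 2^-53, but doubles in [4, 8) are 2^-50 apart, so the nearest representable value is 4.5 itself.

```php
<?php
// The largest input from the example is exactly 1 - 2^-53.
$f = (2 ** 53 - 1) / (2 ** 53);
var_dump($f === 1.0 - 2 ** -53);   // bool(true)

// The exact result 4.5 - 2^-53 is not a double and rounds up to 4.5.
$tooClose = 4.5 - 2 ** -53;
var_dump($tooClose === 4.5);       // bool(true)

// The closest double below 4.5 is a full 2^-50 away.
$neighbour = 4.5 - 2 ** -50;
var_dump($neighbour < 4.5);        // bool(true)
```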

Likewise if you generate a random float between 0 and 1000 with this method, some values will appear more often than others due to rounding and the changing density of floats for each power of two.
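A rough way to see this without any randomness is to feed the naive formula a run of consecutive integers and count the distinct results. The helper naive_scale() is a hypothetical name for the formula from the example above, and the sketch assumes 64-bit PHP:

```php
<?php
// Hypothetical helper: the naive "min + (max - min) * f" formula.
function naive_scale(int $k, float $min, float $max): float
{
    $f = $k / (2 ** 53); // exact: the divisor is a power of two
    return $min + ($max - $min) * $f;
}

$inputs = 1000;
$seen   = [];

// Walk the 1000 largest integers the generator could return.
for ($i = 1; $i <= $inputs; $i++) {
    $v = naive_scale(2 ** 53 - $i, 0.0, 1000.0);
    // Key by the raw bytes so that no float-to-string rounding interferes.
    $seen[bin2hex(pack('E', $v))] = true;
}

$distinct = count($seen);
printf("%d inputs, %d distinct outputs\n", $inputs, $distinct);
```

Fewer distinct outputs than inputs means that some floats near 1000 are produced by two different integers and are therefore twice as likely as their neighbours.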

With the γ-section algorithm by Prof. Goualard all these issues are eliminated and that's what is used for getFloat(). The getFloat() method supports all 4 possible boundary combinations to ensure that users have a safe solution for all possible use cases, so that they don't need to build an unsafe solution in userland.
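For illustration, this is roughly how the four boundary combinations look in use. The sketch assumes the API as it later shipped in PHP 8.3 (a Random\IntervalBoundary enum passed as the third argument); the names in the RFC draft under discussion here may differ:

```php
<?php
// Assumption: PHP >= 8.3, where getFloat() and IntervalBoundary exist.
$r = new \Random\Randomizer();

// [3.5, 4.5): the default, 4.5 is never returned.
$closedOpen = $r->getFloat(3.5, 4.5);

// [3.5, 4.5]: both ends may be returned.
$closedClosed = $r->getFloat(3.5, 4.5, \Random\IntervalBoundary::ClosedClosed);

// (3.5, 4.5]: 3.5 is never returned.
$openClosed = $r->getFloat(3.5, 4.5, \Random\IntervalBoundary::OpenClosed);

// (3.5, 4.5): neither end is returned.
$openOpen = $r->getFloat(3.5, 4.5, \Random\IntervalBoundary::OpenOpen);

var_dump($closedOpen, $closedClosed, $openClosed, $openOpen);
```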


Best regards
Tim Düsterhus

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
