> On 14 Mar 2019, at 18:22, Gilles Sadowski <[email protected]> wrote:
>
> Hello.
>
> Le jeu. 14 mars 2019 à 12:58, Alex Herbert <[email protected]> a écrit
> :
>>
>> I've just added a Jira task to add the new XorShiRo generaters to the
>> user guide.
>>
>> I note that the order of the RNGs in the *Quality* section does not
>> match the order of the RandomSource enum, nor is it alphabetical. The
>> list order is defined in
>> org.apache.commons.rng.examples.stress.GeneratorsList. Nearly all the
>> IntProviders are first with the exception of MWC_256 and KISS.
>>
>> So:
>>
>> - Should the new XoShiRo generators be added to the end?
>
> +1
> [To minimize source code changes.]
Agreed.
>
>> - Should all the *Quality* results be reordered to match the
>> RandomSource enum?
>
> +0
> [Unless some other orders make sense: from fast to slow, from good
> to bad, …]
Since the order in the rng.apt file does not have to match the underlying
GeneratorsList or RandomSource enum then sorting/ordering the user guide table
is the easiest option.
If we use total failures for BigCrush this makes the order:
RNG identifier Dieharder BigCrush
JDK 11, 12, 13 74, 72, 75
WELL_512_A 0, 0, 0 7, 6, 6
WELL_1024_A 0, 0, 0 4, 4, 5
WELL_44497_A 0, 0, 0 2, 3, 3
WELL_19937_A 0, 0, 0 3, 2, 3
WELL_19937_C 0, 1, 0 2, 2, 3
MT_64 0, 0, 1 3, 2, 3
MT 0, 1, 0 3, 2, 2
WELL_44497_B 0, , 0 2, 2, 2
KISS 0, 0, 0 1, 2, 0
ISAAC 0, 0, 1 0, 1, 0
TWO_CMRES 1, 1, 1 0, 0, 1
SPLIT_MIX_64 0, 0, 0 2, 0, 0
XOR_SHIFT_1024_S 0, 0, 0 2, 0, 0
MWC_256 0, 0, 0 0, 0, 0
I am not sure I like this. It appears more of a mess to my eye. At least in the
current order the WELL family is grouped together.
Maybe the only thing I would change is to put the MT and MT_64 together.
>
>> - Should results be alphabetical?
>
> -0
> [The names are not especially telling…]
Agreed. It was logical but not useful. I’ve just done it in my spreadsheet copy
and it also looks a mess to my eye.
>
>> - Should the MWC_256 and KISS be moved so the generators are listed for
>> IntProvider then LongProvider?
>
> ?
Since the end-user does not have to know if the RNG is a based on making int or
long then this suggestion is ruled out.
>
>> Looking at the history for RandomSource enum and the GeneratorsList it
>> seems the order has always been the same in both, although not between
>> them. If new generators are added to the end of the RandomSource enum
>> then perhaps all the other lists used in the test suites and
>> benchmarking suites should match the same order.
>
> The order in "RandomSource" is immaterial to the te user (i.e. an application
> should certainly not rely on that order).
>
>> I would vote for requiring that new generators are added to the end of
>> the RandomSource enum.
>
> +1
>
>> All other lists through the code should match
>> this order.
>
> What when the list split in two ("Int" vs "Long"), then combined for running
> a test/benchmark? Results are not going to be sorted. At first sight, I
> don't
> think it's worth making this sort of requirements (unless everything can be
> automated to provide the sorted output for free).
Keeping the order the same through the codebase was an idea just for
maintenance. There is very similar functionality here:
[test] org.apache.commons.rng.core.ProvidersList
[test] org.apache.commons.rng.simple.ProvidersList
org.apache.commons.rng.examples.stress.GeneratorsList
org.apache.commons.rng.examples.jmh.ConstructionPerformance$Sources
org.apache.commons.rng.examples.jmh.GenerationPerformance$Sources
org.apache.commons.rng.examples.jmh.distribution.SamplersPerformance$Sources
The first two build a list with a seed. Core uses a random seed from
SecureRandom. Simple uses a fixed seed. It still tests randomness using copied
code from the tests in core. However it does have a note: "check whether it is
possible to have a single implementation.” As an aside I think this test for
randomness should stay but the generator should be built with a null seed.
Currently it tests empty seeds and an empty seed array is not processed
specially by the ProviderBuilder so this actually tests self-seeding of the
generator. Self-Seeding is already testing in the core module. I suggest the
ProviderBuilder be updated to detect empty seed arrays and do something better,
like handle an empty array as if it were null. The test in simple should also
add test for randomness using a null seed, effectively testing the
RandomSource.create method creates a provider that works given no seed, i.e.
like testEmptyIntArraySeed but with a null seed:
@Test
public void testNullSeed() {
// … Do the checkNextIntegerInRange test
}
I also note that there is this:
@Ignore@Test
public void testZeroIntArraySeed()
Currently the XorShift1024Star generator would fail this. This would also be
failed by all the new XoShiRo generators. I think this should be fixed and the
test reinstated. Sooner or later someone will try
RandomSource.create(RandomSource.XYZ, new long[50]) and get a bad generator. I
am not sure how to fix it yet[1] but I would like to try.
The third list does not have any seeds. This arguably could be generated
automatically using the RandomSource enum for all generators that do not have
multiple constructor arguments (i.e. all but TWO_CMRES_SELECT).
The other lists are built using JMH annotations from strings. These cannot be
auto-generated.
This makes only 2 key lists to maintain: one in core and one in simple. These
should be updated when a new RNG is added just to get tests working. The order
does not matter. However I think the test in simple should add an assertion
that the list of providers is the expected length given the RandomSource enum.
Currently this would be in the static initialisation for the ProvidersList:
if (LIST.size() != RandomSource.values().length) {
throw new RuntimeException("Providers list is not the expected size");
}
This would make PRs for a new generator fail if the functionality in simple has
not be added to the tests but was added to RandomSource.
The third list could be rearranged to match the RandomSource enum and so would
be autogenerated. That would mean rearranging the results files from Dieharder
and BigCrush. I’m happy to do this with some scripted git mv commands.
The JMH lists I suppose would be updated when the benchmarks need to be redone
for new generators. This is manual labour.
Summary of my proposals:
- Keep the order in the user guide as is but move MT next to MT_64
- Auto generate the GeneratorsList from the RandomSource enum
- Update all the file names and the user guide to point to the renamed files
for the new order
- New generators can go anywhere in the user guide table if related to others,
or at the end
- Assert that the ProviderList in [simple] matches RandomSource in size
- Add test that a null seed passed to RandomSource.create can build a generator
that passes the test for randomness
- Update the ProviderBuilder to handle zero filled seeds for generators that
cannot handle them
- Update the ProviderBuilder to handle an empty array as if it were a null seed
[1] The new XoShiRo generators break when the entire seed is zero. However
depending on the first sequence iteration they can handle a lot of zeros. For
one generator I seem to recall that only the final long of the seed needs to
have a single bit set in the least significant X bits (I cannot recall X). If
bits are set above the least significant X then even though the seed is not
zero the generator breaks (the first step shifted all the bits using <<
(64-X)). I tried to build a method to test for bad seeds but then it was
decided that broken generators were OK as they match the reference
implementation. However we should do better for the [simple] module and try to
make it work, at least as far as possible.
>
> Regards,
> Gilles
>
>>
>>
>> The order in PR #30 for the XorShift1024StarPhi generator adds to the
>> RandomSource enum directly after XOR_SHIFT_1024_S.
>>
>> The order in PR #20 for the new XorShiRo generators adds to the
>> RandomSource enum at the end in alphabetical order for the IntProviders
>> then again for the new LongProviders.
>>
>> These can be fixed depending on the order decision when the PRs are merged.
>>
>>
>> Alex
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>