> On 16 May 2019, at 15:33, Gilles Sadowski <gillese...@gmail.com> wrote:
>
> Hi.
>
> On Thu, 16 May 2019 at 16:04, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>
>>> On 16 May 2019, at 14:42, Gilles Sadowski <gillese...@gmail.com> wrote:
>>>
>>> Hello.
>>>
>>> On Thu, 16 May 2019 at 12:06, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>>>
>>>> I have run the stress test using the new application. The new application
>>>> has two major changes over the previous application:
>>>>
>>>> 1. It detects the platform byte order and sends the bits in the correct
>>>>    order to be read by a C application.
>>>> 2. The bridge to TestU01 has been updated to use all the input int values;
>>>>    previously it was using every other int value.
>>>>
>>>> So we can expect differences from both test suites, Dieharder and TestU01
>>>> BigCrush.
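The byte-order point in item 1 can be illustrated with a minimal Java sketch. It is not the stress test application's code: the class name ByteOrderSketch and the method toBytes are hypothetical, and the sketch only assumes that the consumer is a C program reading raw 32-bit integers from a byte stream.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration; not the stress test application's code. */
class ByteOrderSketch {
    /** Encodes each int as 4 bytes using the requested byte order. */
    static byte[] toBytes(int[] values, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * 4).order(order);
        for (int value : values) {
            buffer.putInt(value);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        int[] values = {0x01020304};
        // A C program reading uint32_t from the stream sees 0x01020304 only
        // if the bytes arrive in the platform's native order.
        ByteOrder nativeOrder = ByteOrder.nativeOrder();
        byte[] bytes = toBytes(values, nativeOrder);
        System.out.println(nativeOrder + ": first byte = " + bytes[0]);
    }
}
```

On a little-endian machine the first byte written is the low byte of the int; a writer that always uses big-endian order (the default of java.nio.ByteBuffer and java.io.DataOutputStream) would hand the C side byte-reversed values on such a machine.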
>>>>
>>>> For reference here are the old results (from the user guide, reordered to
>>>> the RandomSource enum order):
>>>>
>>>> RNG                    Dieharder     TestU01 (BigCrush)
>>>> JDK                    11, 12, 13    74, 72, 75
>>>> WELL_512_A             0, 0, 0       7, 6, 6
>>>> WELL_1024_A            0, 0, 0       4, 4, 5
>>>> WELL_19937_A           0, 0, 0       3, 2, 3
>>>> WELL_19937_C           0, 1, 0       2, 2, 3
>>>> WELL_44497_A           0, 0, 0       2, 3, 3
>>>> WELL_44497_B           0, 0, 0       2, 2, 2
>>>> MT                     0, 1, 0       3, 2, 2
>>>> ISAAC                  0, 0, 1       0, 1, 0
>>>> SPLIT_MIX_64           0, 0, 0       2, 0, 0
>>>> XOR_SHIFT_1024_S       0, 0, 0       2, 0, 0
>>>> TWO_CMRES              1, 1, 1       0, 0, 1
>>>> MT_64                  0, 0, 1       3, 2, 3
>>>> MWC_256                0, 0, 0       0, 0, 0
>>>> KISS                   0, 0, 0       1, 2, 0
>>>>
>>>> Here are the new results:
>>>>
>>>> RNG                    Dieharder     TestU01 (BigCrush)
>>>> JDK                    4,4,4,4,4     74,72,74,73,74
>>>> WELL_512_A             0,0,0,0,0     7,6,6,6,6
>>>> WELL_1024_A            0,0,0,0,0     4,4,5,4,4
>>>> WELL_19937_A           0,1,0,0,1     3,3,2,2,2
>>>> WELL_19937_C           0,0,0,0,0     2,2,3,2,2
>>>> WELL_44497_A           0,0,0,0,0     2,2,2,2,3
>>>> WELL_44497_B           0,0,0,0,0     2,3,2,2,2
>>>> MT                     0,0,0,0,0     2,3,2,2,2
>>>> ISAAC                  0,0,0,0,0     0,1,2,0,0
>>>> SPLIT_MIX_64           0,0,0,0,0     1,0,0,0,0
>>>> XOR_SHIFT_1024_S       0,0,0,0,0     0,0,0,0,0
>>>> TWO_CMRES              2,2,2,2,2     4,3,3,5,4
>>>> MT_64                  0,0,0,0,0     2,3,2,2,2
>>>> MWC_256                0,1,0,0,0     0,0,0,2,0
>>>> KISS                   0,0,0,0,0     0,0,0,0,0
>>>> XOR_SHIFT_1024_S_PHI   0,0,0,0,0     0,0,0,0,0
>>>> XO_RO_SHI_RO_64_S      0,0,0,0,0     1,1,2,1,3
>>>> XO_RO_SHI_RO_64_SS     0,0,0,0,0     0,0,0,0,0
>>>> XO_SHI_RO_128_PLUS     0,0,1,0,0     1,2,2,1,1
>>>> XO_SHI_RO_128_SS       0,0,0,1,0     0,1,0,0,0
>>>> XO_RO_SHI_RO_128_PLUS  0,0,0,0,0     0,1,0,0,0
>>>> XO_RO_SHI_RO_128_SS    0,0,0,0,0     1,0,1,0,0
>>>> XO_SHI_RO_256_PLUS     0,1,0,0,0     0,0,0,0,0
>>>> XO_SHI_RO_256_SS       0,0,0,0,0     0,1,0,2,1
>>>> XO_SHI_RO_512_PLUS     0,0,0,0,1     0,0,0,2,2
>>>> XO_SHI_RO_512_SS       0,0,0,0,0     0,1,0,1,0
>>>>
>>>> (Note: All of the single fails except one under Dieharder are for the
>>>> flawed diehard_sums test. I include it here for direct comparison with old
>>>> results. I would recommend we strip this from the new results for the user
>>>> guide.)
>>>>
>>>> I ran them 3 times. Then because the results were different (mainly for
>>>> the JDK generator for Dieharder) I double checked everything and ran
>>>> another 2. Results are still the same. Dieharder is much better for the
>>>> JDK than previously. It systematically fails:
>>>>
>>>> diehard_opso:0
>>>> diehard_oqso:0
>>>> diehard_dna:0
>>>> dab_bytedistrib:0
>>>>
>>>> The TWO_CMRES generator is now worse as it is systematically failing:
>>>>
>>>> diehard_oqso:0
>>>> diehard_dna:0
>>>>
>>>> The results from BigCrush are similar for JDK and all the others except
>>>> TWO_CMRES. This is now failing a few more tests. It systematically fails:
>>>>
>>>> 1  SerialOver, r = 0
>>>> 41 Permutation, t = 5
>>>> 42 Permutation, t = 7
>>>>
>>>> To check the JDK results for Dieharder I ran it 5 times using the wrong
>>>> platform byte order (i.e. what the previous test application was doing).
>>>>
>>>> Old results: 11, 12, 13
>>>> New results: 11,16,14,14,15
>>>>
>>>> So this matches up. If the JDK output is byte reversed it is a poor
>>>> generator.
>>>>
>>>> A few sources I have read indicate that BigCrush favours the upper bits of
>>>> a generator. A test should therefore run the generator bit reversed
>>>> through the test application.
>>>> Here are the full forward and backward results ignoring the Diehard sums
>>>> test:
>>>>
>>>> RNG                    Bit-reversed  Dieharder       TestU01 (BigCrush)
>>>> JDK                    false         4,4,4,4,4       74,72,74,73,74
>>>> JDK                    true          42,42,43,49,49  35,34,35,36,36
>>>> WELL_512_A             false         0,0,0,0,0       7,6,6,6,6
>>>> WELL_512_A             true          0,0,1,0,0       7,6,6,7,6
>>>> WELL_1024_A            false         0,0,0,0,0       4,4,5,4,4
>>>> WELL_1024_A            true          0,0,0,0,0       4,4,4,4,4
>>>> WELL_19937_A           false         0,1,0,0,0       3,3,2,2,2
>>>> WELL_19937_A           true          0,0,0,0,0       3,2,2,2,3
>>>> WELL_19937_C           false         0,0,0,0,0       2,2,3,2,2
>>>> WELL_19937_C           true          0,0,0,0,0       3,2,2,3,2
>>>> WELL_44497_A           false         0,0,0,0,0       2,2,2,2,3
>>>> WELL_44497_A           true          0,0,0,0,0       3,3,3,2,2
>>>> WELL_44497_B           false         0,0,0,0,0       2,3,2,2,2
>>>> WELL_44497_B           true          0,0,0,0,0       2,2,2,2,3
>>>> MT                     false         0,0,0,0,0       2,3,2,2,2
>>>> MT                     true          0,0,0,0,0       2,2,3,3,3
>>>> ISAAC                  false         0,0,0,0,0       0,1,2,0,0
>>>> ISAAC                  true          0,0,0,0,0       0,0,0,0,0
>>>> SPLIT_MIX_64           false         0,0,0,0,0       1,0,0,0,0
>>>> SPLIT_MIX_64           true          0,0,0,0,0       0,1,0,0,0
>>>> XOR_SHIFT_1024_S       false         0,0,0,0,0       0,0,0,0,0
>>>> XOR_SHIFT_1024_S       true          0,0,0,0,0       0,0,1,0,0
>>>> TWO_CMRES              false         2,2,2,2,2       4,3,3,5,4
>>>> TWO_CMRES              true          7,5,5,7,6       4,3,4,4,4
>>>> MT_64                  false         0,0,0,0,0       2,3,2,2,2
>>>> MT_64                  true          0,0,0,0,0       2,2,2,2,2
>>>> MWC_256                false         0,0,0,0,0       0,0,0,2,0
>>>> MWC_256                true          0,0,0,0,0       1,0,0,0,0
>>>> KISS                   false         0,0,0,0,0       0,0,0,0,0
>>>> KISS                   true          0,0,0,0,0       0,0,1,0,1
>>>> XOR_SHIFT_1024_S_PHI   false         0,0,0,0,0       0,0,0,0,0
>>>> XOR_SHIFT_1024_S_PHI   true          0,0,0,0,0       0,0,2,0,0
>>>> XO_RO_SHI_RO_64_S      false         0,0,0,0,0       1,1,2,1,3
>>>> XO_RO_SHI_RO_64_S      true          0,0,0,0,0       2,2,2,2,2
>>>> XO_RO_SHI_RO_64_SS     false         0,0,0,0,0       0,0,0,0,0
>>>> XO_RO_SHI_RO_64_SS     true          0,0,0,0,0       1,0,0,0,0
>>>> XO_SHI_RO_128_PLUS     false         0,0,0,0,0       1,2,2,1,1
>>>> XO_SHI_RO_128_PLUS     true          0,0,0,0,0       2,2,2,2,2
>>>> XO_SHI_RO_128_SS       false         0,0,0,0,0       0,1,0,0,0
>>>> XO_SHI_RO_128_SS       true          0,0,0,0,0       0,0,0,0,0
>>>> XO_RO_SHI_RO_128_PLUS  false         0,0,0,0,0       0,1,0,0,0
>>>> XO_RO_SHI_RO_128_PLUS  true          0,0,0,0,0       2,1,1,1,2
>>>> XO_RO_SHI_RO_128_SS    false         0,0,0,0,0       1,0,1,0,0
>>>> XO_RO_SHI_RO_128_SS    true          0,0,0,0,0       0,0,2,0,0
>>>> XO_SHI_RO_256_PLUS     false         0,0,0,0,0       0,0,0,0,0
>>>> XO_SHI_RO_256_PLUS     true          0,0,0,0,0       0,0,0,0,0
>>>> XO_SHI_RO_256_SS       false         0,0,0,0,0       0,1,0,2,1
>>>> XO_SHI_RO_256_SS       true          0,0,0,0,0       0,1,1,1,2
>>>> XO_SHI_RO_512_PLUS     false         0,0,0,0,0       0,0,0,2,2
>>>> XO_SHI_RO_512_PLUS     true          0,0,0,0,0       1,0,0,0,1
>>>> XO_SHI_RO_512_SS       false         0,0,0,0,0       0,1,0,1,0
>>>> XO_SHI_RO_512_SS       true          0,0,0,0,0       0,1,1,0,0
>>>>
>>>> So bit reversed the JDK is terrible at Dieharder. It actually improves for
>>>> BigCrush from terrible to less terrible. TWO_CMRES is a bit worse when
>>>> bit-reversed at Dieharder but no different at BigCrush (it was already
>>>> systematically failing 3 tests).
>>>
>>> Is it the same version of "BigCrush"? I'm surprised that TWO_CMRES
>>> has many more failures (bit-reversed or not).
>>
>> I was surprised by that as well. I thought each sub-cycle generator within
>> TWO_CMRES could almost pass BigCrush. So when combined the generator should
>> easily pass it. Here is the version, same as all my previous usage:
>>
>> Version: TestU01 1.2.3
>>
>> I may investigate this further using the tests that systematically fail.
>>
>>>>
>>>> All the other generators have similar results when bit reversed. So adding
>>>> the bit-reversed results to the user guide does not appear worthwhile. I
>>>> will archive these and they can be added later if required, for example to
>>>> show a good generator against a bad one. This will only be relevant if the
>>>> library adds reference implementations of bad generators.
>>>
>>> It's on Abhishek's TODO list (e.g. "LCG").
>>
>> I'll leave it until it is needed. For now it just adds a load of extra data
>> with little merit to the user guide.
>
> I mean that we'll have bad generators added to the library; but I agree
> that the bit-reversed results are not useful since users of the library
> would never see the wrong values. It was the side-effect of a bug in
> the testing code.

It is a different way to test the generator. For certain usages it is
important to know that the lower-order bits are not poor. But perhaps a better
use of time, and of space in the user guide, is to add results for PractRand
instead.

>>>> Currently only the JDK is a bad generator.
>>>>
>>>> Next:
>>>>
>>>> I have added a 'results' command to the stress test application that can
>>>> generate these results tables. It requires some header information not
>>>> found in the old results files so only works with the new results. It can
>>>> generate the APT table directly for the user guide. It will be useful
>>>> going forward when more generators are added to update the results.
>>>>
>>>> The new results are named using the test suite (dh_ or tu_), optionally
>>>> the bit-reversed flag (r_), the enum ordinal and the trial run:
>>>>
>>>> dh_1_1   = Dieharder for JDK trial 1
>>>> tu_1_1   = BigCrush for JDK trial 1
>>>> dh_r_2_3 = Dieharder bit reversed for WELL_512_A trial 3
>>>>
>>>> I propose to:
>>>>
>>>> - Delete all the old results and add these new ones using a new directory
>>>>   structure. All results can reside in a single directory.
>>>> - Ignore for now the bit-reversed results.
>>>> - Delete the old stress test code. The new code supersedes all
>>>>   functionality of the old version.
>>>> - Commit the new 'results' command when I have confirmed the APT table is
>>>>   correctly generated.
>>>
>>> +1
>>
>> OK.
>>
>>>>
>>>> Questions:
>>>>
>>>> 1. Do we stick to using 3 trials or update to 5 (because I have the
>>>>    results)?
>>>
>>> +1
>>
>> +1 to which? I assume sticking to 3 trials.
>
> Fine with 5 trials. :-)
>
>>>> 2. Do we remove the diehard_sums test result?
>>>>
>>>> I would recommend removing diehard_sums. It pollutes the results for most
>>>> generators with a spurious fail that should be ignored. So I think we
>>>> should ignore it.
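The result naming scheme described above (dh_ or tu_, an optional r_, a generator number and a trial number) can be decoded with a short Java sketch. This is not the 'results' command: the class ResultName is hypothetical. The three examples in the message suggest that generator numbers start at 1 (JDK = 1, WELL_512_A = 2); that reading is an assumption, so the sketch reports the number as-is rather than mapping it to an enum constant.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of the stress test application. */
class ResultName {
    // Suite prefix, optional bit-reversed flag, generator number, trial number.
    private static final Pattern NAME =
        Pattern.compile("(dh|tu)_(r_)?(\\d+)_(\\d+)");

    final String suite;
    final boolean bitReversed;
    final int generator;
    final int trial;

    private ResultName(String suite, boolean bitReversed, int generator, int trial) {
        this.suite = suite;
        this.bitReversed = bitReversed;
        this.generator = generator;
        this.trial = trial;
    }

    /** Parses a name such as "dh_r_2_3"; rejects anything else. */
    static ResultName parse(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a result name: " + name);
        }
        String suite = m.group(1).equals("dh") ? "Dieharder" : "BigCrush";
        return new ResultName(suite, m.group(2) != null,
            Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4)));
    }

    @Override
    public String toString() {
        return suite + (bitReversed ? " bit reversed" : "")
            + ", generator " + generator + ", trial " + trial;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"dh_1_1", "tu_1_1", "dh_r_2_3"}) {
            System.out.println(name + " = " + parse(name));
        }
    }
}
```

Using Matcher.matches() rather than find() means a name with extra leading or trailing characters is rejected instead of being partially parsed.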
>>>
>>> +0 (as you wish)
>>
>> The Dieharder web page and documentation indicate that this test should not
>> be used.
>
> Yes; I mentioned it on the "Commons RNG" web page.
> The result was there, just as it is output by "DieHarder" (it could be
> construed that DieHarder should skip the flawed test in the first place...).
>
>> So adding it to the results is incorrect. I will document it as such. I'll
>> also update the 'results' command to ignore the test by default so you
>> explicitly have to request it is included. This should prevent future
>> updates to the user guide from including it by mistake.
>
> Quite fine too.
>
> Gilles
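For readers following the bit-reversed versus byte-reversed distinction in this thread, here is a minimal Java sketch of the two transformations. The class BitReversalSketch and its methods are hypothetical and only wrap a generic int source; they are not taken from the stress test application. Integer.reverse and Integer.reverseBytes are standard JDK methods.

```java
import java.util.function.IntSupplier;

/** Hypothetical illustration of the two transformations discussed. */
class BitReversalSketch {
    /** Deliberate test: reverse all 32 bits so the low bits become the high bits. */
    static IntSupplier bitReversed(IntSupplier source) {
        return () -> Integer.reverse(source.getAsInt());
    }

    /** Effect of writing with the wrong byte order: the 4 bytes are swapped. */
    static IntSupplier byteReversed(IntSupplier source) {
        return () -> Integer.reverseBytes(source.getAsInt());
    }

    public static void main(String[] args) {
        IntSupplier constant = () -> 0x00000001;
        // Bit reversal moves bit 0 to bit 31.
        System.out.printf("%08x%n", bitReversed(constant).getAsInt());  // 80000000
        // Byte reversal moves the lowest byte to the highest byte.
        System.out.printf("%08x%n", byteReversed(constant).getAsInt()); // 01000000
    }
}
```

Bit reversal presents the lowest-order bits of each output where a test suite that favours upper bits will examine them, while byte reversal only reorders whole bytes and leaves the bit order within each byte unchanged.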