> On 16 May 2019, at 14:42, Gilles Sadowski <gillese...@gmail.com> wrote:
>
> Hello.
>
> On Thu, 16 May 2019 at 12:06, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>
>> I have run the stress test using the new application. The new application
>> has two major changes over the previous application:
>>
>> 1. It detects the platform byte-order and sends the bits in the correct
>> order to be read by a C application
>> 2. The bridge to TestU01 has been updated to use all the input int values;
>> previously it used only every other int value
>>
>> So we can expect differences from both test suites, Dieharder and TestU01
>> BigCrush.
>>
>> For reference here are the old results (from the user guide, reordered to
>> the RandomSource enum order):
>>
>> RNG                    Dieharder    TestU01 (BigCrush)
>> JDK                    11, 12, 13   74, 72, 75
>> WELL_512_A             0, 0, 0      7, 6, 6
>> WELL_1024_A            0, 0, 0      4, 4, 5
>> WELL_19937_A           0, 0, 0      3, 2, 3
>> WELL_19937_C           0, 1, 0      2, 2, 3
>> WELL_44497_A           0, 0, 0      2, 3, 3
>> WELL_44497_B           0, 0, 0      2, 2, 2
>> MT                     0, 1, 0      3, 2, 2
>> ISAAC                  0, 0, 1      0, 1, 0
>> SPLIT_MIX_64           0, 0, 0      2, 0, 0
>> XOR_SHIFT_1024_S       0, 0, 0      2, 0, 0
>> TWO_CMRES              1, 1, 1      0, 0, 1
>> MT_64                  0, 0, 1      3, 2, 3
>> MWC_256                0, 0, 0      0, 0, 0
>> KISS                   0, 0, 0      1, 2, 0
>>
>> Here are the new results:
>>
>> RNG                    Dieharder    TestU01 (BigCrush)
>> JDK                    4,4,4,4,4    74,72,74,73,74
>> WELL_512_A             0,0,0,0,0    7,6,6,6,6
>> WELL_1024_A            0,0,0,0,0    4,4,5,4,4
>> WELL_19937_A           0,1,0,0,1    3,3,2,2,2
>> WELL_19937_C           0,0,0,0,0    2,2,3,2,2
>> WELL_44497_A           0,0,0,0,0    2,2,2,2,3
>> WELL_44497_B           0,0,0,0,0    2,3,2,2,2
>> MT                     0,0,0,0,0    2,3,2,2,2
>> ISAAC                  0,0,0,0,0    0,1,2,0,0
>> SPLIT_MIX_64           0,0,0,0,0    1,0,0,0,0
>> XOR_SHIFT_1024_S       0,0,0,0,0    0,0,0,0,0
>> TWO_CMRES              2,2,2,2,2    4,3,3,5,4
>> MT_64                  0,0,0,0,0    2,3,2,2,2
>> MWC_256                0,1,0,0,0    0,0,0,2,0
>> KISS                   0,0,0,0,0    0,0,0,0,0
>> XOR_SHIFT_1024_S_PHI   0,0,0,0,0    0,0,0,0,0
>> XO_RO_SHI_RO_64_S      0,0,0,0,0    1,1,2,1,3
>> XO_RO_SHI_RO_64_SS     0,0,0,0,0    0,0,0,0,0
>> XO_SHI_RO_128_PLUS     0,0,1,0,0    1,2,2,1,1
>> XO_SHI_RO_128_SS       0,0,0,1,0    0,1,0,0,0
>> XO_RO_SHI_RO_128_PLUS  0,0,0,0,0    0,1,0,0,0
>> XO_RO_SHI_RO_128_SS    0,0,0,0,0    1,0,1,0,0
>> XO_SHI_RO_256_PLUS     0,1,0,0,0    0,0,0,0,0
>> XO_SHI_RO_256_SS       0,0,0,0,0    0,1,0,2,1
>> XO_SHI_RO_512_PLUS     0,0,0,0,1    0,0,0,2,2
>> XO_SHI_RO_512_SS       0,0,0,0,0    0,1,0,1,0
>>
>> (Note: All of the single fails except one under Dieharder are for the flawed
>> diehard_sums test. I include it here for direct comparison with old results.
>> I would recommend we strip this from the new results for the user guide.)
>>
>> I ran them 3 times. Then because the results were different (mainly for the
>> JDK generator for Dieharder) I double checked everything and ran another 2.
>> Results are still the same. The JDK now fares much better under Dieharder
>> than previously. It systematically fails:
>>
>> diehard_opso:0
>> diehard_oqso:0
>> diehard_dna:0
>> dab_bytedistrib:0
>>
>> The TWO_CMRES generator is now worse as it is systematically failing:
>>
>> diehard_oqso:0
>> diehard_dna:0
>>
>> The results from BigCrush are similar for JDK and all the others except
>> TWO_CMRES. This is now failing a few more tests. It systematically fails:
>>
>> 1 SerialOver, r = 0
>> 41 Permutation, t = 5
>> 42 Permutation, t = 7
>>
>> To check the JDK results for Dieharder I ran it 5 times using the wrong
>> platform byte order (i.e. what the previous test application was doing).
>>
>> Old results: 11, 12, 13
>> New results: 11,16,14,14,15
>>
>> So this matches up. If the JDK output is byte reversed it is a poor
>> generator.
>>
>> A few sources I have read indicate that BigCrush favours the upper bits of a
>> generator. A test should therefore run the generator bit reversed through
>> the test application. Here are the full forward and backward results
>> ignoring the Diehard sums test:
>>
>> RNG                    Bit-reversed  Dieharder       TestU01 (BigCrush)
>> JDK                    false         4,4,4,4,4       74,72,74,73,74
>> JDK                    true          42,42,43,49,49  35,34,35,36,36
>> WELL_512_A             false         0,0,0,0,0       7,6,6,6,6
>> WELL_512_A             true          0,0,1,0,0       7,6,6,7,6
>> WELL_1024_A            false         0,0,0,0,0       4,4,5,4,4
>> WELL_1024_A            true          0,0,0,0,0       4,4,4,4,4
>> WELL_19937_A           false         0,1,0,0,0       3,3,2,2,2
>> WELL_19937_A           true          0,0,0,0,0       3,2,2,2,3
>> WELL_19937_C           false         0,0,0,0,0       2,2,3,2,2
>> WELL_19937_C           true          0,0,0,0,0       3,2,2,3,2
>> WELL_44497_A           false         0,0,0,0,0       2,2,2,2,3
>> WELL_44497_A           true          0,0,0,0,0       3,3,3,2,2
>> WELL_44497_B           false         0,0,0,0,0       2,3,2,2,2
>> WELL_44497_B           true          0,0,0,0,0       2,2,2,2,3
>> MT                     false         0,0,0,0,0       2,3,2,2,2
>> MT                     true          0,0,0,0,0       2,2,3,3,3
>> ISAAC                  false         0,0,0,0,0       0,1,2,0,0
>> ISAAC                  true          0,0,0,0,0       0,0,0,0,0
>> SPLIT_MIX_64           false         0,0,0,0,0       1,0,0,0,0
>> SPLIT_MIX_64           true          0,0,0,0,0       0,1,0,0,0
>> XOR_SHIFT_1024_S       false         0,0,0,0,0       0,0,0,0,0
>> XOR_SHIFT_1024_S       true          0,0,0,0,0       0,0,1,0,0
>> TWO_CMRES              false         2,2,2,2,2       4,3,3,5,4
>> TWO_CMRES              true          7,5,5,7,6       4,3,4,4,4
>> MT_64                  false         0,0,0,0,0       2,3,2,2,2
>> MT_64                  true          0,0,0,0,0       2,2,2,2,2
>> MWC_256                false         0,0,0,0,0       0,0,0,2,0
>> MWC_256                true          0,0,0,0,0       1,0,0,0,0
>> KISS                   false         0,0,0,0,0       0,0,0,0,0
>> KISS                   true          0,0,0,0,0       0,0,1,0,1
>> XOR_SHIFT_1024_S_PHI   false         0,0,0,0,0       0,0,0,0,0
>> XOR_SHIFT_1024_S_PHI   true          0,0,0,0,0       0,0,2,0,0
>> XO_RO_SHI_RO_64_S      false         0,0,0,0,0       1,1,2,1,3
>> XO_RO_SHI_RO_64_S      true          0,0,0,0,0       2,2,2,2,2
>> XO_RO_SHI_RO_64_SS     false         0,0,0,0,0       0,0,0,0,0
>> XO_RO_SHI_RO_64_SS     true          0,0,0,0,0       1,0,0,0,0
>> XO_SHI_RO_128_PLUS     false         0,0,0,0,0       1,2,2,1,1
>> XO_SHI_RO_128_PLUS     true          0,0,0,0,0       2,2,2,2,2
>> XO_SHI_RO_128_SS       false         0,0,0,0,0       0,1,0,0,0
>> XO_SHI_RO_128_SS       true          0,0,0,0,0       0,0,0,0,0
>> XO_RO_SHI_RO_128_PLUS  false         0,0,0,0,0       0,1,0,0,0
>> XO_RO_SHI_RO_128_PLUS  true          0,0,0,0,0       2,1,1,1,2
>> XO_RO_SHI_RO_128_SS    false         0,0,0,0,0       1,0,1,0,0
>> XO_RO_SHI_RO_128_SS    true          0,0,0,0,0       0,0,2,0,0
>> XO_SHI_RO_256_PLUS     false         0,0,0,0,0       0,0,0,0,0
>> XO_SHI_RO_256_PLUS     true          0,0,0,0,0       0,0,0,0,0
>> XO_SHI_RO_256_SS       false         0,0,0,0,0       0,1,0,2,1
>> XO_SHI_RO_256_SS       true          0,0,0,0,0       0,1,1,1,2
>> XO_SHI_RO_512_PLUS     false         0,0,0,0,0       0,0,0,2,2
>> XO_SHI_RO_512_PLUS     true          0,0,0,0,0       1,0,0,0,1
>> XO_SHI_RO_512_SS       false         0,0,0,0,0       0,1,0,1,0
>> XO_SHI_RO_512_SS       true          0,0,0,0,0       0,1,1,0,0
>>
>> So bit reversed the JDK is terrible at Dieharder. It actually improves for
>> BigCrush from terrible to less terrible. TWO_CMRES is a bit worse when
>> bit-reversed at Dieharder but no different at BigCrush (it was already
>> systematically failing 3 tests).
>
> Is it the same version of "BigCrush"? I'm surprised that TWO_CMRES
> has many more failures (bit-reversed or not).

I was surprised by that as well. I thought each sub-cycle generator within
TWO_CMRES could almost pass BigCrush. So when combined the generator should
easily pass it.

Here is the version, same as all my previous usage:

Version: TestU01 1.2.3

I may investigate this further using the tests that systematically fail.

>
>>
>> All the other generators have similar results when bit reversed. So adding
>> the bit-reversed results to the user-guide does not appear worthwhile. I
>> will archive these and they can be added later if required, for example to
>> show a good generator against a bad one. This will only be relevant if the
>> library adds reference implementations of bad generators.
>
> It's on Abhishek's TODO list (e.g. "LCG").

I'll leave it until it is needed. For now it just adds a load of extra data
with little merit to the user guide.

>
>> Currently only the JDK is a bad generator.
>>
>> Next:
>>
>> I have added a 'results' command to the stress test application that can
>> generate these results tables. It requires some header information not found
>> in the old results files so only works with the new results. It can generate
>> the APT table directly for the user guide. It will be useful going forward
>> when more generators are added to update the results.
>>
>> The new results are named using the test suite (dh_ or tu_), optionally the
>> bit-reversed flag (r_), the enum ordinal and the trial run:
>>
>> dh_1_1 = Dieharder for JDK trial 1
>> tu_1_1 = BigCrush for JDK trial 1
>> dh_r_2_3 = Dieharder bit reversed for WELL_512_A trial 3
>>
>> I propose to:
>>
>> - Delete all the old results and add these new ones using a new directory
>> structure. All results can reside in a single directory.
>> - Ignore for now the bit-reversed results.
>> - Delete the old stress test code. The new code supersedes all functionality
>> of the old version.
>> - Commit the new 'results' command when I have confirmed the APT table is
>> correctly generated.
>
> +1

OK.

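To make the naming scheme quoted above concrete, here is a minimal sketch of how such a name could be split back into its parts. The class ResultFileName and its field names are hypothetical and are not taken from the stress test application; the pattern only encodes what the three example names show (suite prefix, optional r_ flag, generator number, trial number).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the stress test application. */
final class ResultFileName {
    // Suite prefix, optional bit-reversed flag, generator number, trial number.
    private static final Pattern NAME = Pattern.compile("(dh|tu)_(r_)?(\\d+)_(\\d+)");

    final String suite;
    final boolean bitReversed;
    final int generator;
    final int trial;

    private ResultFileName(String suite, boolean bitReversed, int generator, int trial) {
        this.suite = suite;
        this.bitReversed = bitReversed;
        this.generator = generator;
        this.trial = trial;
    }

    /** Parses names such as dh_1_1 or dh_r_2_3; rejects anything else. */
    static ResultFileName parse(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a result file name: " + name);
        }
        return new ResultFileName(
            "dh".equals(m.group(1)) ? "Dieharder" : "BigCrush",
            m.group(2) != null,
            Integer.parseInt(m.group(3)),
            Integer.parseInt(m.group(4)));
    }

    @Override
    public String toString() {
        return suite + (bitReversed ? " bit reversed" : "")
            + ", generator " + generator + ", trial " + trial;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"dh_1_1", "tu_1_1", "dh_r_2_3"}) {
            System.out.println(name + " = " + parse(name));
        }
    }
}
```

The generator number is kept as the plain integer found in the name. How it maps onto RandomSource is left to the caller, since the examples only show that 1 is JDK and 2 is WELL_512_A.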
>
>>
>> Questions:
>>
>> 1. Do we stick to using 3 trials or update to 5 (because I have the results)?
>
> +1

+1 to which? I assume sticking to 3 trials.

>
>> 2. Do we remove the diehard_sums test result?
>>
>> I would recommend removing diehard_sums. It pollutes the results for most
>> generators with a spurious fail that should be ignored. So I think we should
>> ignore it.
>
> +0 (as you wish)

The Dieharder web page and documentation indicate that this test should not be
used. So adding it to the results is incorrect. I will document it as such.

I'll also update the 'results' command to ignore the test by default so you
explicitly have to request that it is included. This should prevent future
updates to the user guide from including it by mistake.

>
> Gilles
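
The "ignore by default, include on request" behaviour described for diehard_sums could look roughly like the sketch below. It is an illustration under stated assumptions, not the 'results' command itself: the class FailureCount, the includeSums flag, and the sample list of failed test names are all made up for the example, and the names are assumed to follow the test:number form seen in the lists above (e.g. diehard_opso:0).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; the real 'results' command may work differently. */
final class FailureCount {
    // Test that is left out of the count unless explicitly requested.
    static final String IGNORED_BY_DEFAULT = "diehard_sums";

    /** Counts failed tests, skipping diehard_sums unless includeSums is true. */
    static int count(List<String> failedTests, boolean includeSums) {
        int n = 0;
        for (String test : failedTests) {
            if (includeSums || !test.startsWith(IGNORED_BY_DEFAULT)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Illustrative input only, not taken from a real results file.
        List<String> failed = Arrays.asList("diehard_opso:0", "diehard_sums:0", "diehard_dna:0");
        System.out.println(count(failed, false)); // prints 2
        System.out.println(count(failed, true));  // prints 3
    }
}
```

Making exclusion the default means a table regenerated for the user guide cannot pick up the spurious fail unless someone asks for it.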