>> [...]
> > I was thinking of a utility for taking the output (of several runs) of
> > the test suites and generating the table in "apt" format (with correct
> > links).
>
> I previously did a lot of this when testing the cache of the int and
> boolean values. I could not find the scripts on local machines until I
> realised I committed them to that topic branch.
>
> Q. How is your Perl? I use it because I know it but I find the syntax
> ugly. I should switch to Python but Perl is so good for parsing text files.

I have also contemplated learning Python many times, but everything
was more readily available in Perl. :-}

> This one renames results files to the same names as the order in master.
> I used it to rename results files from a partial run since they all
> start at 1.
>
> https://github.com/aherbert/commons-rng/blob/improvement-RNG-57-stress/commons-rng-examples/examples-stress/rename.pl
>
> This sort of idea may be useful when the number of generators gets large
> and a rerun of only a few of them is needed (or even just new ones).
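
A minimal sketch of the renaming idea described above (this is not the linked
rename.pl). It assumes that each results file is named with a prefix followed
by a 1-based index, and that the generator order used in master is known. The
function name, the "dh_" prefix and the generator lists are hypothetical.

```python
def plan_renames(master_order, partial_order, prefix="dh_"):
    """Map result-file names of a partial run onto the numbering of the full run.

    master_order:  generator names in the order used for the full run.
    partial_order: generator names in the order used for the partial run
                   (its files are numbered from 1).
    Returns a list of (old_name, new_name) pairs; nothing is renamed on disk.
    """
    # 1-based position of every generator in the full run.
    master_index = {name: i for i, name in enumerate(master_order, start=1)}
    renames = []
    for i, name in enumerate(partial_order, start=1):
        if name not in master_index:
            raise KeyError(f"{name} is not in the master list")
        renames.append((f"{prefix}{i}", f"{prefix}{master_index[name]}"))
    return renames


if __name__ == "__main__":
    master = ["JDK", "MT", "WELL_512_A", "ISAAC"]
    for old, new in plan_renames(master, ["ISAAC", "MT"]):
        print(old, "->", new)
```

Since an old name may coincide with another file's new name, it is probably
safer to copy the files into a separate directory than to rename them in place.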
>
> This one tabulates failures and systematic failures in the results for
> Jira posts:
>
> https://github.com/aherbert/commons-rng/blob/improvement-RNG-57-stress/commons-rng-examples/examples-stress/tabulate_results.pl
>
> There is some weird tabs vs spaces problem in that version but it works.
> Here's what it does on the current master. Note the first table matches
> the userguide and the script could easily be adapted to create the
> rng.apt table. The second table identifies unique tests for dieharder
> using the name and ntup parameter (n-tuples), and for BigCrush it is the
> test ID and name. It reports tests that always fail.
>
> > rng-stress-tabulate-results.pl src/site/resources/txt/userguide/stress/*/*/*
>
> || RNG identifier || Dieharder   || TestU01 (BigCrush) ||
> | JDK | 11 12 13 |74 72 75 |
> | MT | 0 0 0 |3 2 2 |
> | WELL_512_A | 0 0 0 |7 6 6 |
> | WELL_1024_A | 0 0 0 |4 4 5 |
> | WELL_19937_A | 0 0 0 |3 2 2 |
> | WELL_19937_C | 0 0 0 |2 2 3 |
> | WELL_44497_A | 0 0 0 |2 3 3 |
> | WELL_44497_B | 0 0 0 |2 2 2 |
> | ISAAC | 0 0 0 |0 1 0 |
> | MT_64 | 0 0 1 |3 2 3 |
> | SPLIT_MIX_64 | 0 0 0 |2 0 0 |
> | XOR_SHIFT_1024_S | 0 0 0 |2 0 0 |
> | TWO_CMRES | 1 1 1 |0 0 1 |
> | MWC_256 | 0 0 0 |0 0 0 |
> | KISS | 0 0 0 |1 2 0 |
>
> || RNG identifier || Test suite || Systematic failures || Fails ||
> | JDK | Dieharder | diehard_oqso:0 | 3/3 |
> | JDK | Dieharder | diehard_dna:0 | 3/3 |
> | JDK | Dieharder | rgb_minimum_distance:3 | 3/3 |
> | JDK | Dieharder | rgb_minimum_distance:4 | 3/3 |
> | JDK | Dieharder | rgb_minimum_distance:5 | 3/3 |
> | JDK | Dieharder | rgb_lagged_sum:15 | 3/3 |
> | JDK | Dieharder | rgb_lagged_sum:31 | 3/3 |
> | JDK | Dieharder | dab_bytedistrib:0 | 3/3 |
> | JDK | Dieharder | dab_filltree:32 | 3/3 |
> | JDK | TestU01 (BigCrush) | 1:SerialOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 3:CollisionOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 5:CollisionOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 7:CollisionOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 9:CollisionOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 11:CollisionOver | 3/3 |
> | JDK | TestU01 (BigCrush) | 13:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 14:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 15:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 16:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 17:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 18:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 19:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 20:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 21:BirthdaySpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 22:ClosePairs | 3/3 |
> | JDK | TestU01 (BigCrush) | 23:ClosePairs | 3/3 |
> | JDK | TestU01 (BigCrush) | 24:ClosePairs | 3/3 |
> | JDK | TestU01 (BigCrush) | 25:ClosePairs | 3/3 |
> | JDK | TestU01 (BigCrush) | 26:SimpPoker | 3/3 |
> | JDK | TestU01 (BigCrush) | 28:SimpPoker | 3/3 |
> | JDK | TestU01 (BigCrush) | 30:CouponCollector | 3/3 |
> | JDK | TestU01 (BigCrush) | 31:CouponCollector | 3/3 |
> | JDK | TestU01 (BigCrush) | 34:Gap | 3/3 |
> | JDK | TestU01 (BigCrush) | 36:Gap | 3/3 |
> | JDK | TestU01 (BigCrush) | 40:Permutation | 3/3 |
> | JDK | TestU01 (BigCrush) | 41:Permutation | 3/3 |
> | JDK | TestU01 (BigCrush) | 42:Permutation | 3/3 |
> | JDK | TestU01 (BigCrush) | 43:Permutation | 3/3 |
> | JDK | TestU01 (BigCrush) | 44:CollisionPermut | 3/3 |
> | JDK | TestU01 (BigCrush) | 45:CollisionPermut | 3/3 |
> | JDK | TestU01 (BigCrush) | 46:MaxOft | 3/3 |
> | JDK | TestU01 (BigCrush) | 47:MaxOft | 3/3 |
> | JDK | TestU01 (BigCrush) | 48:MaxOft | 3/3 |
> | JDK | TestU01 (BigCrush) | 49:MaxOft | 3/3 |
> | JDK | TestU01 (BigCrush) | 50:SampleProd | 3/3 |
> | JDK | TestU01 (BigCrush) | 51:SampleProd | 3/3 |
> | JDK | TestU01 (BigCrush) | 52:SampleProd | 3/3 |
> | JDK | TestU01 (BigCrush) | 57:AppearanceSpacings | 3/3 |
> | JDK | TestU01 (BigCrush) | 59:WeightDistrib | 3/3 |
> | JDK | TestU01 (BigCrush) | 62:WeightDistrib | 3/3 |
> | JDK | TestU01 (BigCrush) | 63:WeightDistrib | 3/3 |
> | JDK | TestU01 (BigCrush) | 65:SumCollector | 3/3 |
> | JDK | TestU01 (BigCrush) | 74:RandomWalk1 | 3/3 |
> | JDK | TestU01 (BigCrush) | 84:Fourier3 | 3/3 |
> | JDK | TestU01 (BigCrush) | 86:LongestHeadRun | 3/3 |
> | JDK | TestU01 (BigCrush) | 90:HammingWeight2 | 3/3 |
> | JDK | TestU01 (BigCrush) | 95:HammingIndep | 3/3 |
> | JDK | TestU01 (BigCrush) | 97:HammingIndep | 3/3 |
> | JDK | TestU01 (BigCrush) | 99:HammingIndep | 3/3 |
> | JDK | TestU01 (BigCrush) | 101:Run | 3/3 |
> | MT | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | MT | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 68:MatrixRank | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 69:MatrixRank | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 70:MatrixRank | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 71:MatrixRank | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_512_A | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_1024_A | TestU01 (BigCrush) | 70:MatrixRank | 3/3 |
> | WELL_1024_A | TestU01 (BigCrush) | 71:MatrixRank | 3/3 |
> | WELL_1024_A | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_1024_A | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_19937_A | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_19937_A | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_19937_C | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_19937_C | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_44497_A | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_44497_A | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | WELL_44497_B | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | WELL_44497_B | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | MT_64 | TestU01 (BigCrush) | 80:LinearComp | 3/3 |
> | MT_64 | TestU01 (BigCrush) | 81:LinearComp | 3/3 |
> | TWO_CMRES | Dieharder | diehard_dna:0 | 3/3 |
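
The "systematic failures" idea in the second table can be sketched as follows.
This is only an illustration, not the logic of tabulate_results.pl: it assumes
the failed tests of each run have already been extracted as identifiers such as
"diehard_dna:0" (name:ntup) or "80:LinearComp" (ID:name). The function name is
hypothetical.

```python
from collections import Counter


def systematic_failures(runs):
    """Return the tests that failed in every run, with a "fails/runs" label.

    runs: one list of failed test identifiers per run.
    """
    counts = Counter()
    for failed in runs:
        # dict.fromkeys() drops duplicates but keeps the first-seen order,
        # so a test is counted at most once per run.
        for test in dict.fromkeys(failed):
            counts[test] += 1
    total = len(runs)
    return [(test, f"{count}/{total}")
            for test, count in counts.items() if count == total]


if __name__ == "__main__":
    runs = [
        ["80:LinearComp", "81:LinearComp", "3:CollisionOver"],
        ["80:LinearComp", "81:LinearComp"],
        ["81:LinearComp", "80:LinearComp", "22:ClosePairs"],
    ]
    for test, fails in systematic_failures(runs):
        print(f"| {test} | {fails} |")
```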
>
> These scripts (or variants of them) could be added to the main source but:
>
> - The Perl should be tidied up and more comments added. Perl is not easy
> to maintain if you don't know it.

Only because of TMTOWTDI, which does not necessarily imply unmaintainable code.
Whether a source is obfuscated or not is the author's choice; Perl written
with the OO syntax is no less maintainable than Java code.

> - Perhaps there is another place for them?
>
> I am happy to not clutter the source repo and keep them somewhere for
> devs. But where? You tell me. Maybe in the examples-stress module they
> are OK as that one is not officially part of the library.

Unless a script is robust and readily usable by a newbie (i.e. anyone
but the author), there is the risk that it becomes cruft.
What you did is great work but I doubt that many would be interested
in detailed tables of the failures, once one can easily compare the number
of failures.
If more is needed, there is no shortcut to reading the doc of the test suites
themselves and other resources on the web...

So, to summarize, what I think is interesting is to make it easy to rerun
the test suites and update the site (e.g. for the "current" JDK):
 * a document that mentions the external tools requirement
 * a standalone application for running "RandomStressTester"
 * a script that collects results from the above run, formats them into the
   "quality" table ready to be pasted into the "rng.apt" file, and copies the
   output files to their appropriate place (*not* assuming a git repository)
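
As a rough sketch of the formatting step in the last item: the function below
turns per-generator failure counts into rows laid out like the first table
quoted above. The exact markup expected by "rng.apt" (including the links) is
not shown in this thread, so the row layout is an assumption that would need
adapting; the function name and the input structure are hypothetical.

```python
def quality_table(results):
    """Format failure counts as table rows.

    results: {rng_id: (dieharder_counts, bigcrush_counts)}, where each
             element is a list with one failure count per run.
    """
    lines = ["|| RNG identifier || Dieharder || TestU01 (BigCrush) ||"]
    for rng, (dieharder, bigcrush) in results.items():
        dh = " ".join(str(n) for n in dieharder)
        bc = " ".join(str(n) for n in bigcrush)
        lines.append(f"| {rng} | {dh} | {bc} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(quality_table({
        "JDK": ([11, 12, 13], [74, 72, 75]),
        "MT": ([0, 0, 0], [3, 2, 2]),
    }))
```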

It would also be great to be able to easily start the many benchmarks
and similarly collect the results into the user guide tables.

> >> [...]
> >> You then have a utility for dumping output of any random source to file
> >> in a variety of formats.
> >>
> >> Although long output is not needed for the test suites it is useful for
> >> native long generators.
> >>
> >> WDYT?
> > Looks good!
>
> OK. I will work on the raw data dumper as a Jira ticket. It is
> encapsulated work that does not really affect anything else.
>
>
> DieHarder has finished!
>
> I think my stupidity is what caused previous crashes. I was running the
> stress test within the source tree and possibly a git checkout onto
> another branch makes some of the directory paths stale, killing any
> processes linked to those paths. I'll not do that again.

Hence, the "standalone" application is the right choice it seems.

>
> FYI: Here are the old results with incorrect byte order:
>
> XorShiftSerialComposite : 24, 25, 23 : 134.1 +/- 16.1
> XorShiftXorComposite : 88, 105, 89 : 396.2 +/- 9.9
> SplitXorComposite : 0, 0, 0 : 90.8 +/- 21.9
>
> Here are the new results with correct byte order:
>
> XorShiftSerialComposite : 13, 15, 10 : 105.5 +/- 1.8
> XorShiftXorComposite : 57, 57, 57 : 102.9 +/- 1.5
> SplitXorComposite : 0, 0, 0 : 99.9 +/- 3.2
>
> So interestingly passing the correct byte order lowers the number of
> failures. There are still lots.
>
> And BigCrush (with the fix for passing the correct byte order):
>
> XorShiftSerialComposite : 40, 39, 39 : 608.2 +/- 3.9
> XorShiftXorComposite : 54, 53, 53 : 646.8 +/- 10.9
> SplitXorComposite : 0, 0, 0 : 625.8 +/- 0.2

Curious to know whether it is also affected by the byte ordering.
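
For reference, a small illustration of why the byte order matters when 32-bit
values are passed as raw bytes to an external program: if writer and reader
disagree, the reader sees every value with its bytes reversed. The helper name
is hypothetical, and which order each test suite expects is not stated in this
thread.

```python
import struct


def int_to_bytes(value, little_endian):
    """Pack a 32-bit unsigned value in the requested byte order."""
    fmt = "<I" if little_endian else ">I"
    return struct.pack(fmt, value & 0xFFFFFFFF)


if __name__ == "__main__":
    value = 0x01020304
    big = int_to_bytes(value, little_endian=False)
    little = int_to_bytes(value, little_endian=True)
    print(big.hex(), little.hex())
    # Reading big-endian bytes as little-endian yields a different number.
    misread = struct.unpack("<I", big)[0]
    print(hex(misread))
```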

>
> So I think this means we deprecate XOR_SHIFT_1024_S.
>
> I can rebase the PRs for the new generators and merge them in.
>
>
> For reference here are the systematic failures (in case you like to read
> about the tests that are failing):
>
> [...]

Hmm, I'm among those happy enough to be able to choose a
generator better than "java.util.Random". ;-)

Best regards,
Gilles
