> On 19 Mar 2019, at 10:35, Gilles Sadowski <gillese...@gmail.com> wrote:
>
>>> [...]
>>>> So leave the testing to just ints and document on the user guide that is
>>>> what we are testing.
>>>
>>> +1
>>
>> OK. That seems simplest.
>>
>> Given all the stress tests will be rerun shall I go ahead and reorder the
>> existing files, user guide .apt file and the GeneratorsList to be in the
>> order of the RandomSource enum?
>
> We could wait for the new results before updating the site.

I was going to rearrange it all and test that all the links in the local site are OK. I have this scripted but have not yet run it. When new results are ready they can be written over the existing ones.

Either way I am fine. So let’s leave it until the new results have been done and then check the site.

I will update the GeneratorsList to be autogenerated from the RandomSource enum.

>
>>
>>
>> Big/Little Endian for Dieharder:
>>
>> I’ve spent some time looking at the source code for Dieharder. It reads
>> binary file data using this (taken from libdieharder/rng_file_input_raw.c):
>>
>> unsigned int iret;
>> // ...
>> fread(&iret,sizeof(uint),1,state->fp);
>>
>> So it reads single unsigned integers using fread().
>>
>> Given that it is possible to run die harder using numbers from ascii and
>> binary input files I set up a test. I created them using a RNG with the same
>> seed with the standard output from a DataOutputStream and the byte reversed
>> output using Integer.reverseBytes. Here’s what happens:
>>
>>> dieharder -g 201 -d 0 -f raw.bin.rev
>> diehard_birthdays| 0| 100| 100|0.89220858| PASSED
>>> dieharder -g 202 -d 0 -f raw.txt
>> diehard_birthdays| 0| 100| 100|0.89220858| PASSED
>>
>>> dieharder -g 201 -d 0 -f raw.bin
>> diehard_birthdays| 0| 100| 100|0.30776452| PASSED
>>> dieharder -g 202 -d 0 -f raw.txt.rev
>> diehard_birthdays| 0| 100| 100|0.30776452| PASSED
>>
>>> cat raw.bin | dieharder -g 200 -d 0
>> diehard_birthdays| 0| 100| 100|0.30776452| PASSED
>>
>>
>> Note the reversed byte sequence (.rev suffix) is required to get the same
>> results from the binary (.bin) file as from the text (.txt) file.
>>
>> So the binary read of Dieharder is using the little endian representation,
>> as was required for TestU01.
>>
>> I had modified the stdin2testu01.c bridge to detect if the system was little
>> endian and then correct the input data by reversing the bytes.
>> It may be a better idea to write a test c program to detect the endianness
>> of the system for reference. Then update the stress test benchmark to have
>> an argument for little or big endian output when piping the int data to the
>> command line program.
>>
>> I think it is important to get the endianness of the data correct. At least
>> for Dieharder it runs tests using tuples of bits from the data which can
>> span multiple bytes. For example the sts_serial test (-d 102) uses
>> overlapping n-tuples of bits with n from 1 to 16. Other tests using non
>> overlapping tuples such as rgb_bitdist (-d 200) use n 1 to 12.
>>
>> Reversing the bytes in the Java code is the easiest option.
>
> +1
> [With an option flag for selecting whether the output should be BE or LE.]
>

OK. I will consolidate all this and update the stress_test.md instructions to make it clear that endianness needs to be considered.

Should I add the raw data dumper to the source base? This runs a named RandomSource for a given number of iterations with a provided seed and outputs 4 files: Dieharder text format and raw binary, with standard order and byte reversed. It may be useful if debugging the output of RNGs ever needs to be done again.

Alex
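
For reference, the BE/LE output option discussed above could look something like the following. This is only a minimal sketch and not the actual stress test code: the class name IntByteOrderWriter and its methods are made up for illustration, and using java.nio.ByteOrder as the option type is an assumption. It relies on DataOutputStream.writeInt always writing the high byte first, so little endian output is obtained with Integer.reverseBytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteOrder;

/** Hypothetical helper (not part of Commons RNG): writes ints in a selectable byte order. */
public class IntByteOrderWriter {

    /**
     * Writes each int as 4 bytes to the stream.
     * DataOutputStream.writeInt always emits big endian, so for little endian
     * output the bytes of each value are reversed before writing.
     */
    public static void writeInts(OutputStream out, int[] values, ByteOrder order)
            throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        boolean reverse = order == ByteOrder.LITTLE_ENDIAN;
        for (int v : values) {
            data.writeInt(reverse ? Integer.reverseBytes(v) : v);
        }
        data.flush();
    }

    /** Convenience wrapper returning the written bytes, useful for inspection. */
    public static byte[] toBytes(int[] values, ByteOrder order) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeInts(bytes, values, order);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A value whose four bytes are all different makes the order visible.
        int[] sample = {0x01020304};
        ByteOrder[] orders = {ByteOrder.BIG_ENDIAN, ByteOrder.LITTLE_ENDIAN};
        for (ByteOrder order : orders) {
            StringBuilder sb = new StringBuilder(order + ":");
            for (byte b : toBytes(sample, order)) {
                sb.append(String.format(" %02x", b));
            }
            System.out.println(sb);
        }
    }
}
```

For the value 0x01020304 the big endian bytes are 01 02 03 04 and the little endian bytes are 04 03 02 01. In a real run the OutputStream would be the pipe to the test program (or a file) rather than a byte array.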