On Sun, 17 Mar 2019 at 01:01, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
>
> > On 16 Mar 2019, at 23:10, Alex Herbert <alex.d.herb...@gmail.com> wrote:
> >
> >
> >
> >> On 16 Mar 2019, at 02:54, Gilles Sadowski <gillese...@gmail.com> wrote:
> >>> This is read by dieharder which directly reads from stdin. This worked to 
> >>> collect all the generated bits and the serial and xor composites failed 
> >>> the test suite.
> >>>
> >>> It is also read by the stdin2testu01.c program to pass to TestU01.
> >>>
> >>> What is happening is that the stdin2testu01.c is reading 64-bits using an 
> >>> unsigned long.
> >>
> >> I don't remember why I wrote that, but as you pointed out, it now looks
> >> like a plain bug.
> >
> > It may be more complicated again...
> >
> > I’ve had a play around with the data being pushed through to the testU01 
> > library using the c bridge. I wanted to check that the int value that is 
> > generated by the RNG is passed through to the c program. So I wrote a 
> > simple BridgeTester class to do this. It writes all the int values to a 
> > data file (for reference) then passes them to the c executable with the 
> > same method as the RandomStressTester. I then modified the stdin2testu01.c 
> > program to have an extra hidden debug mode where all the data is just 
> > written to stdout.
> >
> > I found the data file written from Java did not match the data that the C 
> > program had. A bit more digging found that the problem was that Java writes 
> > the values in big-endian byte order, whereas the C program reads them in 
> > little-endian order. This is true on my Linux and Mac OSX platforms. So the 
> > raw bytes read from stdin are in the wrong order.
> >
> > When I updated the program to self detect endianness and swap the byte 
> > order of each set of 4 bytes from the stdin then the data in the c program 
> > matched the original.
> >
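The detect-and-swap step described above could look roughly like the following. This is only a minimal sketch, not the code that was committed to stdin2testu01.c; the function names (is_little_endian, swap32, read_be_uint32) are made up for illustration, and it assumes the Java side emits each int as 4 bytes, most significant byte first.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the host stores the least
 * significant byte first, 0 otherwise. */
int is_little_endian(void) {
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper: reverses the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Reads one 32-bit value that was written in big-endian order.
 * Returns 1 on success, 0 on EOF or a short read. */
int read_be_uint32(FILE *in, uint32_t *out) {
    uint32_t raw;
    if (fread(&raw, sizeof raw, 1, in) != 1) {
        return 0;
    }
    /* Only swap when the host byte order differs from the stream. */
    *out = is_little_endian() ? swap32(raw) : raw;
    return 1;
}
```

Reading into a fixed-width uint32_t rather than an unsigned long also avoids depending on the platform's size of long.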
> > Since it was non destructive to the module I added all this to master. You 
> > can see this working by rebuilding the c bridge and running the new profile 
> > to test it:
> >
> > > cd commons-rng-examples/examples-stress
> > > gcc src/main/c/stdin2testu01.c -o stdin2testu01 -ltestu01 -ltestu01probdist -ltestu01mylib -lm
> > > mvn test -P bridge
> >
> > You should see two files:
> >
> > target/bridge.data
> > target/bridge.out
> >
> > These should have the same contents. The .data file is written by the java 
> > program, and the .out file is the stdout captured from the c program with 
> > its view of the data.
> >
> > This should fix running TestU01.
> >
> > BUT I’ve not had time to determine how Dieharder is reading the stdin. 
> > Given it is a c library it may be reading it using little endian as well. 
> > I’ll look into that next.
> >
> > Composite update:
> >
> > For some reason all my BigCrush simulations crashed. It could be a RAM 
> > issue. The runs did take longer than expected but I did not monitor memory 
> > usage. I’ve started them again but using only the serial composite. I think 
> > the xor one is really broken.
> >
> > FYI. Using the new bridge code with 3 runs of SmallCrush finds [6, 6, 6] / 
> > 15 failed tests for the serial composite and [9, 9, 10] / 15 for the xor 
> > composite.
> >
> > I’m expecting BigCrush to fail a lot. I’m now more interested in seeing if 
> > it will complete.
> >
> > Alex
> >
>
>
> PS. Thinking about it, the endianness might not matter. Ideally the test 
> suite will detect non-random bits whether they sit in the least or the most 
> significant byte of the 32 bits, i.e. it should always find a problem. I am 
> not sure that this is the case: I have read that some generators can pass 
> BigCrush but fail if the bits are reversed (not the bytes but the bits). I’m 
> happy to assume that endianness is not an issue.
>
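To make the bit-versus-byte distinction above concrete, here is a small sketch of a 32-bit bit reversal, as opposed to the byte swap used for endianness. It is an illustration only and not part of the stress test code; the name reverse_bits32 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: reverses the order of all 32 bits, so that
 * bit 0 becomes bit 31, bit 1 becomes bit 30, and so on. */
uint32_t reverse_bits32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* swap adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* swap bit pairs */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* swap nibbles */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* swap bytes */
    return (x >> 16) | (x << 16);                             /* swap 16-bit halves */
}
```

A byte swap only reorders the four bytes and keeps the bit order inside each byte, so the two transformations present different bit sequences to the test suite.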
> It was a good exercise in checking whether the bridge was working, though.
>
> One actual issue is that we are testing long providers using the long to 
> create 2 int values. Should we test using a series of the upper 32 bits and 
> then a series of the lower 32 bits?

Is that useful since the test now sees the integers as they are produced (i.e. 2
values per long)?

Gilles
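
For reference, the alternative raised in the question (one series built from the upper 32 bits and another from the lower 32 bits of each long) could be sketched as below. This is a hypothetical illustration in C, not code from the examples-stress module; split_series and its parameters are invented names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: splits n 64-bit values into two separate
 * 32-bit series, one holding the upper halves and one holding the
 * lower halves. The caller provides both output arrays of length n. */
void split_series(const uint64_t *src, size_t n,
                  uint32_t *upper, uint32_t *lower) {
    for (size_t i = 0; i < n; i++) {
        upper[i] = (uint32_t)(src[i] >> 32);          /* high 32 bits */
        lower[i] = (uint32_t)(src[i] & 0xFFFFFFFFu);  /* low 32 bits */
    }
}
```

Each series could then be fed to the test suite on its own, in contrast to the current approach where both halves of every long appear in a single stream.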

> I may set an unused workstation on this task to see what happens.
>
> Alex

