> On 6 Mar 2019, at 21:42, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>> On 6 Mar 2019, at 21:24, Gilles Sadowski <gillese...@gmail.com> wrote:
>>
>> Hello.
>>
>> On Wed, 6 Mar 2019 at 21:49, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>>>
>>>> On 6 Mar 2019, at 17:11, Gilles Sadowski <gillese...@gmail.com> wrote:
>>>>
>>>> Do the two variants produce uncorrelated sequences?
>>>
>>> I will test this when I branch a new PR for just this code.
>>
>> IMHO, it's strange that there would be 2 sources of randomness in a single
>> implementation.
>> Concretely: If one needs a fast "int" provider, and a fast "long" provider,
>> I'd consider the simpler solution of using 2 different providers.
>
> I think this has crossed wires somewhere. I was talking about the variant of
> the XorShift1024Star algorithm and whether XorShift1024Star should be
> deprecated in favour of XorShift1024StarPhi.
>
> The variant of the SplitMix64 algorithm for producing ints was tested in a
> benchmark that I am prepared to throw away. The results are in the Jira
> ticket. The way the SplittableRandom creates an int is slightly slower than
> the method used in [RNG] SplitMix64 which divides the long in half. This
> ticket can be closed as done and I’ll add a comment that no speed improvement
> was found.
>
> I agree that this variant algorithm should have been in a new provider. It
> would produce a different output of bytes since the bit shift in the second
> step is different. But I’m not going to add this algorithm so it does not
> matter.
>
> However I will test if XorShift1024Star and XorShift1024StarPhi are
> correlated just for completeness.
>
I ran 100 repeats of a correlation test, each on 50 longs from
XorShift1024Star and XorShift1024StarPhi, with a new seed each time:

SummaryStatistics:
n: 100
min: -0.30893547071559685
max: 0.37616626218398586
sum: 3.300079237520435
mean: 0.033000792375204355
geometric mean: NaN
variance: 0.022258533475114764
population variance: 0.022035948140363616
second moment: 2.2035948140363617
sum of squares: 2.312500043775496
standard deviation: 0.14919294043323486
sum of logs: NaN

Note that the two algorithms are the same except for the final step, where the
multiplier is used to scale the final output long:

    return state[index] * multiplier;

So if the output were a double, the correlation would be 1. But this is a long
generator, so the long arithmetic overflows and wraps (often to negative
values) on large multiplications. The result is that the mean correlation is
close to 0.

A single repeat using 1,000,000 numbers has a correlation of 0.002.

Am I missing something here with this type of test?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
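
The effect described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the code that produced the numbers in this message: the class and method names are hypothetical, the shared state words are drawn from java.util.SplittableRandom as a stand-in for the XorShift1024 state update, and the two multiplier constants are assumed to be those of the published xorshift1024* variants and should be checked against the [RNG] sources.

```java
import java.util.SplittableRandom;

/** Hypothetical sketch: correlation of two outputs that share a state but differ in multiplier. */
final class MultiplierCorrelationSketch {
    // Assumed multipliers of the two xorshift1024* variants; verify against the real sources.
    static final long MULTIPLIER = 1181783497276652981L;
    static final long MULTIPLIER_PHI = 0x9e3779b97f4a7c13L;

    /** Pearson correlation coefficient of two equal-length samples. */
    static double pearson(double[] x, double[] y) {
        final int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0;
        double sxx = 0;
        double syy = 0;
        for (int i = 0; i < n; i++) {
            final double dx = x[i] - meanX;
            final double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /**
     * Correlates state * m1 against state * m2 over n shared state words.
     * The state words come from SplittableRandom (an assumption made to keep
     * the sketch short), not from the XorShift1024 state update.
     */
    static double correlation(long seed, int n, long m1, long m2) {
        final SplittableRandom stateSource = new SplittableRandom(seed);
        final double[] x = new double[n];
        final double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            final long state = stateSource.nextLong();
            // 64-bit multiplication wraps on overflow, which breaks the linear relationship.
            x[i] = (double) (state * m1);
            y[i] = (double) (state * m2);
        }
        return pearson(x, y);
    }

    public static void main(String[] args) {
        System.out.println("different multipliers: "
            + correlation(42L, 1_000_000, MULTIPLIER, MULTIPLIER_PHI));
        System.out.println("same multiplier:       "
            + correlation(42L, 1_000_000, MULTIPLIER, MULTIPLIER));
    }
}
```

With the same multiplier on both sides the coefficient is 1, and with different multipliers the wrapped products should give a coefficient near 0. As a rough guide, the sample correlation of independent samples of size n has a standard deviation of about 1/sqrt(n - 1), which is about 0.14 for n = 50 and is in line with the 0.149 reported over the 100 repeats.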