Hello.

On Mon, 10 Jun 2019 at 17:17, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>> P.S. Thinking of releasing 1.3?
> >> Not yet. I think there are a few outstanding items that work together
> >> for the multi-threaded focus of the new code and the new generators:
> > Sure but some of them could be postponed, if just to RERO.
> >
> >> - RNG-98: LongJumpable (easy)
> >>
> >> - RNG-102: SharedStateSampler (lots of easy work)
> >>
> >> - RNG-106: XorShiRo generators require non-zero input seeds
> >>
> >> (I'm still thinking about the best way to do this. The Jira ticket
> >> suggests a speed test to at least know the implications of different
> >> ideas.)
> > This is only when using the "SeedFactory" (?). [Otherwise, it's the
> > user's responsibility to construct an appropriate seed.]
> >
> > Couldn't we just check that the output of the internal generator is not
> > all zero (and call it again if it is)?
>
> Yes. The worst case scenario is a 1 in 2^64 collision rate with zero.
> All other generators have larger state sizes. So this would be fine. An
> alternative would be to set a single bit to non-zero. This throws away 1
> bit of randomness from the seed and will always work without any
> recursion. But it makes the seed worse. The ideas are in the header for
> this Jira ticket:
>
> https://issues.apache.org/jira/browse/RNG-106
>
> I'll fix this soon.
>
> The other item I did not mention is the outcome from RNG-104. This seems
> to indicate that using System.identityHashCode(new Object()) is not as
> good a mixer as a ThreadLocal random generator, both for speed and also
> quality. I'm currently testing Well44497b ^ SplitMix in BigCrush but I
> think this should replace the identity hash code method.
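For illustration only, the two options for RNG-106 discussed above (retry while the seed is all zero, or force a single bit) could be sketched as follows. This is not the code in Commons RNG: the class `NonZeroSeeds`, its method names and the use of a `LongSupplier` as the seed source are all hypothetical, chosen to keep the sketch self-contained.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper, not part of Commons RNG. */
final class NonZeroSeeds {
    private NonZeroSeeds() {}

    /** Option 1: regenerate the whole seed while every element is zero. */
    static long[] createWithRetry(LongSupplier source, int size) {
        checkSize(size);
        final long[] seed = new long[size];
        do {
            for (int i = 0; i < size; i++) {
                seed[i] = source.getAsLong();
            }
            // For size 1 a retry is expected about once in 2^64 attempts.
        } while (isAllZero(seed));
        return seed;
    }

    /** Option 2: force one bit to be set. Never loops, but discards 1 bit of randomness. */
    static long[] createWithForcedBit(LongSupplier source, int size) {
        checkSize(size);
        final long[] seed = new long[size];
        for (int i = 0; i < size; i++) {
            seed[i] = source.getAsLong();
        }
        seed[0] |= 1L; // the lowest bit of the first element is always set
        return seed;
    }

    /** Returns true if every element of the seed is zero. */
    static boolean isAllZero(long[] seed) {
        for (final long s : seed) {
            if (s != 0) {
                return false;
            }
        }
        return true;
    }

    private static void checkSize(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("Seed size must be positive: " + size);
        }
    }
}
```

The retry version keeps every bit of the seed random at the cost of a branch and an extremely rare extra pass; the forced-bit version has a fixed cost but, as noted above, makes the seed slightly worse.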
Didn't you also suggest using XOR_SHIFT_1024_PHI (given the large enough
period, better speed score on BigCrush)?

> It also shows that using a synchronized block on each call to the
> generator is slow. Seed arrays can be built 2x faster using 8 calls to
> the generator per synchronisation when single threaded. When
> multi-threaded it is much better. I'm still testing to find a good
> estimate of the optimum block size for all scenarios.

+1

>
> >
> >> There are also outstanding items I've partially looked at:
> >>
> >> - RNG-90: Improve nextInt(int) and nextLong(long) for powers of 2
> >>
> >> I paused testing this as I moved on to other things. The easy fix is to
> >> copy the JDK SplittableRandom implementation. But it requires a
> >> generator with good quality lower bits. It would have to be worked
> >> around for generators that have low period lower bits. So this requires
> >> digesting all the results of BigCrush to determine which generators can
> >> use the new method and which should not change. Then there is the
> >> decision on how to do it.
> > A second-order improvement IMHO.
> Which is why I moved on. Note that the new approach is much faster for
> powers of 2.
> >
> >> - RNG-95: DiscreteUniformSampler
> >>
> >> I have code that computes a discrete uniform sample using multiply and
> >> not the modulus algorithm used in nextInt(int). However I cannot find
> >> anywhere that uses the method so currently I am the author. I cannot
> >> imagine no-one has done this before
> > Interesting...
> >
> >> but to be on the safe side it may be
> >> better to put it in as an alternative DiscreteSampler, e.g.
> >> FastDiscreteUniformSampler and leave the current DiscreteUniformSampler
> >> to default to using nextInt(int).
> > I'm wary of this naming after the "FastMath" experience.
> > Perhaps safer is to put it in a feature branch, until you are sure that
> > it can replace the current implementation.
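To make the "multiply instead of modulus" idea for RNG-95 concrete, here is a minimal sketch of one well-known multiply-and-shift mapping with rejection (often attributed to Lemire). It is an assumption that this resembles the approach meant above; it is not the RNG-95 code, and the class `MultiplyBoundedInt` and the `IntSupplier` bit source are hypothetical.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch, not the RNG-95 implementation. */
final class MultiplyBoundedInt {
    private MultiplyBoundedInt() {}

    /**
     * Returns a value in [0, n) from a source of uniform 32-bit integers.
     * The upper 32 bits of (unsigned bits * n) are the sample; the lower
     * 32 bits are only inspected to reject the small biased region.
     */
    static int nextInt(IntSupplier bits, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        // Unsigned 32-bit value times n fits in a positive long (below 2^63).
        long m = (bits.getAsInt() & 0xffffffffL) * n;
        long low = m & 0xffffffffL;
        if (low < n) {
            // Only now pay for a division: threshold = 2^32 mod n.
            final long threshold = (1L << 32) % n;
            while (low < threshold) {
                m = (bits.getAsInt() & 0xffffffffL) * n;
                low = m & 0xffffffffL;
            }
        }
        return (int) (m >>> 32);
    }
}
```

In this sketch the common path is one multiplication and one shift, and the modulus is computed only when the low 32 bits fall below n. For a power of 2 the threshold is zero, so nothing is ever rejected.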
> I couldn't think of a name describing the method. It is related to the
> discrete Weyl sequence so perhaps WeylDiscreteUniformSampler. It's a
> work in progress...
> >
> >
> >> Speed tests show it is faster, and can be over 2-fold faster when the
> >> rejection algorithm in nextInt(int) is worst case.
> >>
> >> - RNG-100: GuideTableDiscreteSampler
> >>
> >> All done but should be rebased and put in a PR
> >>
> >> - RNG-99: AliasMethodDiscreteSampler
> >>
> >> Also done but it is very hard to find a probability distribution where
> >> it is better than GuideTableDiscreteSampler. It could be added as a
> >> reference implementation.
> > +1
> >
> >> - RNG-XX: Use GuideTableDiscreteSampler behind
> >> DiscreteProbabilityCollectionSampler<T>
> >>
> >> It will be faster and remove the binarySearch method from that class.
> > +1
> >
> >> I also thought we wait until end of GSoC and so new generators could
> >> also be included.
> > I'd rather not wait.
> > There are many improvements and new features (thanks to you!)
> > that warrant a release.
> OK. I'll get on with fixing the "must haves".

:-)

> > And I also think that releasing his GSoC work would be a nice
> > achievement task for Abhishek (after he has witnessed
> > how it worked out for 1.3).
> >

Regards,
Gilles
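For readers unfamiliar with the guide-table method behind RNG-100, the following is a minimal sketch of the general technique: a cumulative probability table plus a table of starting indices, so that the search over the cumulative table needs only a few steps on average instead of a binary search. It is not the GuideTableDiscreteSampler code. The class `GuideTableSketch`, the choice of a guide table the same size as the distribution, and the `DoubleSupplier` assumed to return values in [0, 1) are all illustrative assumptions.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical sketch of a guide-table sampler, not the Commons RNG class. */
final class GuideTableSketch {
    /** cumulative[i] = P(X <= i), normalised so the last entry is 1.0. */
    private final double[] cumulative;
    /** guide[k] = smallest i with cumulative[i] > k / guide.length. */
    private final int[] guide;

    GuideTableSketch(double[] probabilities) {
        final int n = probabilities.length;
        double sum = 0;
        for (final double p : probabilities) {
            if (!(p >= 0)) { // also rejects NaN
                throw new IllegalArgumentException("Invalid probability: " + p);
            }
            sum += p;
        }
        if (!(sum > 0) || Double.isInfinite(sum)) {
            throw new IllegalArgumentException("Invalid probability sum: " + sum);
        }
        cumulative = new double[n];
        double running = 0;
        for (int i = 0; i < n; i++) {
            running += probabilities[i];
            // Same summation order as above, so the last entry is exactly 1.0.
            cumulative[i] = running / sum;
        }
        guide = new int[n];
        int i = 0;
        for (int k = 0; k < n; k++) {
            final double lower = (double) k / n;
            while (cumulative[i] <= lower) {
                i++;
            }
            guide[k] = i;
        }
    }

    /** Draws one index; the supplier is assumed to return values in [0, 1). */
    int sample(DoubleSupplier uniform) {
        final double u = uniform.getAsDouble();
        // Jump close to the answer, then scan forward a few steps.
        int i = guide[Math.min((int) (u * guide.length), guide.length - 1)];
        while (u >= cumulative[i]) {
            i++;
        }
        return i;
    }
}
```

A possible use would be `new GuideTableSketch(new double[] {0.2, 0.3, 0.5}).sample(Math::random)`, which returns 0, 1 or 2 with roughly those probabilities.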