Hello.

On Mon, 10 Jun 2019 at 17:17, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>> P.S. Thinking of releasing 1.3?
> >> Not yet. I think there are a few outstanding items that work together
> >> for the multi-threaded focus of the new code and the new generators:
> > Sure but some of them could be postponed, if just to RERO.
> >
> >> - RNG-98: LongJumpable (easy)
> >>
> >> - RNG-102: SharedStateSampler (lots of easy work)
> >>
> >> - RNG-106: XorShiRo generators require non-zero input seeds
> >>
> >> (I'm still thinking about the best way to do this. The Jira ticket
> >> suggests a speed test to at least know the implications of different 
> >> ideas.)
> > This is only when using the "SeedFactory" (?).  [Otherwise, it's the
> > user's responsibility to construct an appropriate seed.]
> >
> > Couldn't we just check that the output of the internal generator is not
> > all zero (and call it again if it is)?
>
> Yes. The worst case scenario is a 1 in 2^64 collision rate with zero.
> All other generators have larger state sizes. So this would be fine. An
> alternative would be to set a single bit to non-zero. This throws away 1
> bit of randomness from the seed and will always work without any
> recursion. But it makes the seed worse. The ideas are in the header for
> this Jira ticket:
>
> https://issues.apache.org/jira/browse/RNG-106
>
> I'll fix this soon.
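
For the record, a minimal sketch of the two options above. It assumes a
hypothetical LongSupplier as the internal seed source; the class and method
names are made up for illustration and this is not the SeedFactory code.

```java
import java.util.function.LongSupplier;

/** Illustration only: hypothetical names, not the Commons RNG API. */
final class NonZeroSeed {
    private NonZeroSeed() {}

    /** Option 1: generate the seed again while every element is zero. */
    static long[] createRetry(LongSupplier source, int size) {
        if (size <= 0) {
            // An empty seed is trivially "all zero" and would never be accepted.
            throw new IllegalArgumentException("size must be positive");
        }
        final long[] seed = new long[size];
        do {
            for (int i = 0; i < size; i++) {
                seed[i] = source.getAsLong();
            }
        } while (isAllZero(seed));
        return seed;
    }

    /** Option 2: force one bit on. Never retries but gives up 1 bit of randomness. */
    static long[] createForceBit(LongSupplier source, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        final long[] seed = new long[size];
        for (int i = 0; i < size; i++) {
            seed[i] = source.getAsLong();
        }
        seed[0] |= 1L;
        return seed;
    }

    private static boolean isAllZero(long[] seed) {
        for (final long value : seed) {
            if (value != 0) {
                return false;
            }
        }
        return true;
    }
}
```

Option 1 keeps all the bits of the source and only costs a second pass in
the (very unlikely) all-zero case.
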
>
> The other item I did not mention is the outcome from RNG-104. This seems to
> indicate that using System.identityHashCode(new Object()) is not as good
> a mixer as a ThreadLocal random generator, both for speed and also
> quality. I'm currently testing Well44497b ^ SplitMix in BigCrush but I
> think this should replace the identity hash code method.

Didn't you also suggest using XOR_SHIFT_1024_PHI (given the
large enough period, better speed score on BigCrush)?

> It also shows that using a synchronized block on each call to the
> generator is slow. Seed arrays can be built 2x faster using 8 calls to
> the generator per synchronisation when single threaded. When
> multi-threaded it is much better. I'm still testing to find a good
> estimate of the optimum block size for all scenarios.

+1
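
To make the idea concrete, here is a rough sketch of filling a seed array
with several calls per synchronisation. The block size of 8 is taken from
your numbers above; the class name, the lock and the LongSupplier source
are assumptions for illustration, not the current code.

```java
import java.util.function.LongSupplier;

/** Illustration only: hypothetical names, not the Commons RNG API. */
final class BlockSeedFiller {
    /** Calls to the source per synchronized block (8, as in the test quoted above). */
    private static final int BLOCK_SIZE = 8;

    /** Assumed to be a generator that is not thread-safe. */
    private final LongSupplier source;
    private final Object lock = new Object();

    BlockSeedFiller(LongSupplier source) {
        this.source = source;
    }

    /** Fills the array in blocks so the lock is taken once per block, not once per value. */
    long[] createLongArray(int n) {
        final long[] seed = new long[n];
        for (int start = 0; start < n; start += BLOCK_SIZE) {
            final int end = Math.min(n, start + BLOCK_SIZE);
            synchronized (lock) {
                for (int i = start; i < end; i++) {
                    seed[i] = source.getAsLong();
                }
            }
        }
        return seed;
    }
}
```

Releasing the lock between blocks lets other threads make progress while a
large seed is being built; the best block size is what remains to be measured.
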

> >
> >> There are also outstanding items I've partially looked at:
> >>
> >> - RNG-90: Improve nextInt(int) and nextLong(long) for powers of 2
> >>
> >> I paused testing this as I moved on to other things. The easy fix is to
> >> copy the JDK SplittableRandom implementation. But it requires a
> >> generator with good quality lower bits. It would have to be worked
> >> around for generators that have low-period lower bits. So this requires
> >> digesting all the results of BigCrush to determine which generators can
> >> use the new method and which should not change. Then there is the
> >> decision on how to do it.
> > A second-order improvement IMHO.
> Which is why I moved on. Note that the new approach is much faster for
> powers of 2.
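
A sketch of what that special case could look like, assuming an IntSupplier
stands in for the generator (hypothetical names, not the BaseProvider code).
The mask on the low bits is the approach used by SplittableRandom; the
fallback is the usual modulus with rejection.

```java
import java.util.function.IntSupplier;

/** Illustration only: hypothetical names, not the Commons RNG API. */
final class PowerOfTwoBound {
    private PowerOfTwoBound() {}

    /** Returns a value in [0, n). */
    static int nextInt(IntSupplier rng, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        final int m = n - 1;
        if ((n & m) == 0) {
            // Power of 2: a single mask, no modulus and no rejection.
            // This uses the LOW bits, so it requires good quality lower bits.
            return rng.getAsInt() & m;
        }
        // General case: modulus with rejection of the incomplete last interval.
        int bits;
        int val;
        do {
            bits = rng.getAsInt() >>> 1;
            val = bits % n;
        } while (bits - val + m < 0);
        return val;
    }
}
```
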
> >
> >> - RNG-95: DiscreteUniformSampler
> >>
> >> I have code that computes a discrete uniform sample using multiply and
> >> not the modulus algorithm used in nextInt(int). However I cannot find
> >> anywhere that uses the method so currently I am the author. I cannot
> >> imagine no-one has done this before
> > Interesting...
> >
> >> but to be on the safe side it may be
> >> better to put it in as an alternative DiscreteSampler, e.g.
> >> FastDiscreteUniformSampler and leave the current DiscreteUniformSampler
> >> to default to using nextInt(int).
> > I'm wary of this naming after the "FastMath" experience.
> > Perhaps safer is to put it in a feature branch, until you are sure that
> > it can replace the current implementation.
> I couldn't think of a name describing the method. It is related to the
> discrete Weyl sequence so perhaps WeylDiscreteUniformSampler. It's a
> work in progress...
> >
> >> Speed tests show it is faster, and can be over 2-fold faster when the
> >> rejection algorithm in nextInt(int) is worst case.
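
For comparison, a multiply-based method that has been published is Lemire's
multiply-and-shift with rejection. I do not know whether it matches the code
mentioned above, so take the sketch below as an illustration of that
published technique only (hypothetical names, IntSupplier standing in for
the generator).

```java
import java.util.function.IntSupplier;

/** Illustration only: Lemire-style bounded integer, not the RNG-95 code. */
final class MultiplyBound {
    private MultiplyBound() {}

    /** Returns a value in [0, n) using a 32x32-bit multiply instead of a modulus per sample. */
    static int nextInt(IntSupplier rng, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // Treat the 32 random bits as unsigned; the product is below 2^63.
        long m = (rng.getAsInt() & 0xffffffffL) * n;
        long low = m & 0xffffffffL;
        if (low < n) {
            // Only here is a modulus needed: threshold = 2^32 mod n.
            final long threshold = (1L << 32) % n;
            while (low < threshold) {
                m = (rng.getAsInt() & 0xffffffffL) * n;
                low = m & 0xffffffffL;
            }
        }
        // The upper 32 bits of the product are the sample.
        return (int) (m >>> 32);
    }
}
```

The sample comes from the upper bits of the product, so this variant does
not depend on the quality of the lower bits of the generator.
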
> >>
> >> - RNG-100: GuideTableDiscreteSampler
> >>
> >> All done but should be rebased and put in a PR
> >>
> >> - RNG-99: AliasMethodDiscreteSampler
> >>
> >> Also done but it is very hard to find a probability distribution where
> >> it is better than GuideTableDiscreteSampler. It could be added as a
> >> reference implementation.
> > +1
> >
> >> - RNG-XX: Use GuideTableDiscreteSampler behind
> >> DiscreteProbabilityCollectionSampler<T>
> >>
> >> It will be faster and remove the binarySearch method from that class.
> > +1
> >
> >> I also thought we could wait until the end of GSoC so that new
> >> generators could also be included.
> > I'd rather not wait.
> > There are many improvements and new features (thanks to you!)
> > that warrant a release.
> OK. I'll get on with fixing the "must haves".

:-)

> > And I also think that releasing his GSoC work would be a nice
> > achievement task for Abhishek (after he has seen how it
> > worked out for 1.3).
> >

Regards,
Gilles
