Re: [rng] Releasing 1.3

Gilles Sadowski Fri, 30 Aug 2019 07:44:26 -0700

Hello.

On Fri, 30 Aug 2019 at 15:29, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
>
> On 30/08/2019 14:22, Gilles Sadowski wrote:
> > Hi.
> >
> > On Fri, 30 Aug 2019 at 15:11, Abhishek Dhadwal <dhadwal1...@gmail.com> wrote:
> >> Hello,
> >>
> >> What would be the deadline for the release ?
> > There is no deadline, but the sooner the better.
> > The many new features deserve a release.
> >
> >> If it’s not too early I could resolve RNG-111
> >> (https://issues.apache.org/jira/browse/RNG-111) over the next few days. I
> >> couldn’t work on it before due to back-to-back college project evaluations
> >> and my mid-term examinations (which finish tomorrow).
> > Great.
>
> My take on this...
>
> There is nothing outstanding.
>
> It would be nice to get the JSF generator (RNG-111) put in as it is a
> complement to the SFC generator I just added (RNG-112). Let's give
> Abhishek some time to finish this work.


+1

> All the speed tests will need to be updated for the site. I have been
> considering whether we should have more 'real' benchmarks. Currently we
> benchmark a single call to each method in UniformRandomProvider using
> the pattern recommended by JMH. I've built these benchmarks to require
> post-processing to subtract the JMH overhead. So the tables for the
> user guide can either contain the raw data or, as is currently done,
> the relative scores after post-processing to subtract the JMH baseline.

No strong preference.
But I don't think that the user guide needs to expose details such as
baseline subtraction.
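
For reference, the post-processing described above amounts to subtracting
the baseline from every raw score and then dividing by the net score of a
reference generator. A minimal sketch follows; the class name, the method
name and the idea of passing the reference generator by name are
assumptions made for illustration and are not taken from the Commons RNG
build or from JMH.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Commons RNG or JMH). */
final class BaselineScores {
    private BaselineScores() {}

    /**
     * Subtracts the baseline from each raw score and expresses the result
     * relative to the baseline-corrected score of the reference entry.
     *
     * @param raw       raw time per operation, keyed by generator name
     * @param baseline  time per operation of the empty (baseline) benchmark
     * @param reference key of the entry that should score exactly 1.0
     */
    static Map<String, Double> relative(Map<String, Double> raw,
                                        double baseline,
                                        String reference) {
        Double refRaw = raw.get(reference);
        if (refRaw == null) {
            throw new IllegalArgumentException("Unknown reference: " + reference);
        }
        double refNet = refRaw - baseline;
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            out.put(e.getKey(), (e.getValue() - baseline) / refNet);
        }
        return out;
    }
}
```

A table built this way shows only the relative column, so the baseline
never has to appear in the user guide.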

>
> However it is unclear if the current benchmark represents real-world
> observable differences. A candidate for a 'real world' test for
> nextInt() is a shuffle benchmark. This can be done using an actual array
> or just by generating indices to swap.

How is this more real world than the above (i.e. generating *one* int)?

> When using an actual array it is
> more realistic but relative performance is size dependent. When the
> array gets large the method is slowed by cache misses during the actual
> swap of the data. So the shuffle benchmark results would have to be
> reported for different sizes.

A perhaps better illustration would be to report the time it takes to
compute pi to a certain accuracy (cf. class "ComputePi" in module
"commons-rng-examples").
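
To make the idea concrete, here is a minimal Monte Carlo sketch of a pi
estimate. It is not the "ComputePi" class itself: the class name, the use
of a DoubleSupplier and the choice of java.util.SplittableRandom as the
source of uniform deviates are assumptions, made so that the example runs
without the Commons RNG jars.

```java
import java.util.SplittableRandom;
import java.util.function.DoubleSupplier;

/** Hypothetical illustration; not the ComputePi example class. */
final class PiEstimate {
    private PiEstimate() {}

    /**
     * Estimates pi from the fraction of random points of the unit square
     * that fall inside the quarter disc of radius 1.
     *
     * @param uniform01 source of doubles in [0, 1)
     * @param samples   number of points to generate (two doubles each)
     */
    static double estimate(DoubleSupplier uniform01, long samples) {
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = uniform01.getAsDouble();
            double y = uniform01.getAsDouble();
            if (x * x + y * y < 1) {
                inside++;
            }
        }
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom(42L);
        System.out.println(estimate(rng::nextDouble, 1_000_000L));
    }
}
```

A benchmark would time estimate(...) for a fixed number of samples per
generator, since the sample count is what controls the accuracy reached.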

> The other method to target would be nextLong(). I cannot think of a real
> use case for this so instead we could target nextDouble(). I tried a
> benchmark using a SmallMeanPoissonSampler which calls nextDouble() n+1
> times for a mean of n. This is a good starting point for an algorithm that
> puts the random numbers to use. However it may be replicating what is
> already listed for the comparison of the different normalised Gaussian
> samplers. Currently the user guide has a table comparing the 3 different
> Gaussian samplers and then another table for the Marsaglia normalised
> Gaussian sampler. (Note: There is some redundancy here.) Since all the
> Gaussian samplers use nextDouble()/nextLong() then the output of this
> table is a rough guide to the relative speed of the RNG on 64-bit output.
>
> I would suggest dropping the table showing the Marsaglia normalised
> Gaussian sampler (it is redundant given the comparison of different
> Gaussian samplers)

+1
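
As background for the Poisson remark in the quoted text: a
multiplication-based small-mean Poisson algorithm (in the style usually
attributed to Knuth) consumes one more uniform deviate than the value it
returns, hence about mean + 1 calls on average. The sketch below is an
illustration under that assumption and is not a copy of
SmallMeanPoissonSampler; the class name and the DoubleSupplier parameter
are hypothetical.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical illustration; not the Commons RNG sampler. */
final class SmallMeanPoissonSketch {
    private SmallMeanPoissonSketch() {}

    /**
     * Returns a Poisson deviate by multiplying uniform deviates until the
     * product drops below exp(-mean).
     *
     * @param uniform01 source of doubles in [0, 1)
     * @param mean      small positive mean (exp(-mean) must not underflow)
     */
    static int sample(DoubleSupplier uniform01, double mean) {
        double p0 = Math.exp(-mean);
        if (!(mean > 0) || p0 == 0) {
            throw new IllegalArgumentException("Mean out of range: " + mean);
        }
        int n = 0;
        double r = uniform01.getAsDouble();
        // Each further iteration costs exactly one more uniform deviate.
        while (r >= p0) {
            r *= uniform01.getAsDouble();
            n++;
        }
        return n;
    }
}
```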

> and adding a shuffle benchmark for different array sizes.

-0
(cf. above)
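
For readers following the thread, the two shuffle variants under
discussion (a real array versus index generation only) could look like the
sketch below. It is a plain Fisher-Yates shuffle written for illustration;
the class name and the IntUnaryOperator parameter (expected to behave like
nextInt(n), returning a value in [0, n)) are assumptions and not code from
the Commons RNG benchmarks.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical illustration of the two shuffle benchmark variants. */
final class ShuffleSketch {
    private ShuffleSketch() {}

    /** Fisher-Yates shuffle of a real array: RNG cost plus memory traffic. */
    static void shuffle(int[] data, IntUnaryOperator nextInt) {
        for (int i = data.length - 1; i > 0; i--) {
            int j = nextInt.applyAsInt(i + 1); // index in [0, i]
            int tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }

    /**
     * Index-only variant: the same sequence of RNG calls but no array, so
     * the timing does not depend on cache behaviour. The sum is returned
     * so that the generated indices are actually consumed.
     */
    static long indicesOnly(int length, IntUnaryOperator nextInt) {
        long sum = 0;
        for (int i = length - 1; i > 0; i--) {
            sum += nextInt.applyAsInt(i + 1);
        }
        return sum;
    }
}
```

With java.util.SplittableRandom, for example, the generator can be passed
as rng::nextInt.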

>
> More on RNG-32 below...
>
> >
> > Best,
> > Gilles
> >
> >> Regards,
> >> Abhishek
> >>
> >> Sent from Mail for Windows 10
> >>
> >> From: Gilles Sadowski
> >> Sent: 30 August 2019 17:22
> >> To: Commons Developers List
> >> Subject: Re: [rng] Releasing 1.3
> >>
> >> Hi.
> >>
> >> On Mon, 10 Jun 2019 at 17:17, Alex Herbert <alex.d.herb...@gmail.com> wrote:
> >>>
> >>> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>>>>> P.S. Thinking of releasing 1.3?
> >>>>> Not yet. I think there are a few outstanding items [...]
> >> Status?
> >>
> >> In particular could we resolve
> >>     https://issues.apache.org/jira/projects/RNG/issues/RNG-32
> >> and all its sub-tasks following the work done through GSoC?
>
> This could be rounded up. Each ticket needs some finishing details.
>
>
> RNG-16: We did test LCGs that outperform the 48-bit LCG of
> java.util.Random. Any 64-bit LCG which returns the upper 32 bits should
> be an OK generator. The ones we tested using increments 1 (Musl) and
> 1442695040888963407 (from Knuth) perform as:
>
> ---------
> DieHarder
> ---------
> KnuthShiftLCG : 1, 0, 0, 0, 0
> MuslShiftLCG : 0, 0, 0, 0, 0
> -------
> TestU01
> -------
> MuslShiftLCG : 15, 16, 15, 14, 15
> KnuthShiftLCG : 14, 11, 16, 17, 14
>
> They are bad on BigCrush but will probably be the fastest 32-bit
> generators in the library if a user wants a very fast 32-bit generator
> for simple random stuff (e.g. generating filenames).

If doing IO afterwards, generation time will be fairly insignificant.

> A single example
> would at least be a reference point and a comparison for the PCG
> generators that use a 64-bit LCG and then permute the output. It is
> possible to create one using the abstract class for the PCG generators:
>
> public class LcgShift32 extends AbstractPcg6432 {
>      public LcgShift32(long[] seed) {
>          super(seed);
>      }
>
>      @Override
>      protected int transform(long x) {
>          return (int)(x >>> 32);
>      }
> }

Is this faster than generating a "long" with a good 64-bit generator
and using it whole to get two "int"s?
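
The two alternatives being compared can be sketched without the PCG base
class as follows. The increment is the Knuth value quoted above; the
multiplier 6364136223846793005 is not stated in this thread and is assumed
here (it is the constant normally paired with that increment). Class and
method names are hypothetical.

```java
/** Hypothetical stand-alone 64-bit LCG returning the upper 32 bits. */
final class Lcg64Shift32 {
    /** Assumed multiplier; not given in the thread. */
    private static final long MULTIPLIER = 6364136223846793005L;
    /** Increment quoted in the thread (from Knuth). */
    private static final long INCREMENT = 1442695040888963407L;

    private long state;

    Lcg64Shift32(long seed) {
        state = seed;
    }

    /** Advances the LCG once and returns the upper 32 bits of the state. */
    int nextInt() {
        state = state * MULTIPLIER + INCREMENT;
        return (int) (state >>> 32);
    }
}

/** Splits one 64-bit output into two 32-bit outputs. */
final class TwoIntsFromLong {
    private TwoIntsFromLong() {}

    static int high(long value) {
        return (int) (value >>> 32);
    }

    static int low(long value) {
        return (int) value;
    }
}
```

The first costs one multiply-add per int; the second costs one call to the
64-bit generator per two ints, so which is faster depends on the 64-bit
generator used and would have to be measured.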

> RNG-84: The PCG ticket should be resolved. I have not had time to work
> on the K-dimensionally distributed variants that should support the
> Jumpable interface. This can go on a new ticket when I get round to it.

Sure.

>
> RNG-17: We never got as far as testing a Lagged Fibonacci generator. The
> reference example int versions only output 24 bits so were not deemed
> suitable. There was a reference example that used 48-bit floating point
> values. However, adding this would not fit the model of the
> source32/64 packages. I think this ticket should be left open with links
> to the reference implementations and a task to test them using DieHarder
> and BigCrush from the native C implementation. If results are good then
> a double-based LFG could be added to a new source64F package.

Are there any modern generators based on "double"?

Regards,
Gilles
