Le mar. 30 juil. 2019 à 22:07, Alex Herbert <alex.d.herb...@gmail.com> a écrit :
>
>
>
> > On 30 Jul 2019, at 19:28, Gilles Sadowski <gillese...@gmail.com> wrote:
> >
> > Hi.
> >
> > Le mar. 30 juil. 2019 à 15:38, Alex Herbert <alex.d.herb...@gmail.com 
> > <mailto:alex.d.herb...@gmail.com>> a écrit :
> >>
> >> On 30/07/2019 10:56, Gilles Sadowski wrote:
> >>> Hello.
> >>>
> >>> Le lun. 10 juin 2019 à 17:17, Alex Herbert <alex.d.herb...@gmail.com> a 
> >>> écrit :
> >>>>
> >>>> On 10/06/2019 15:31, Gilles Sadowski wrote:
> >>>>>>> P.S. Thinking of releasing 1.3?
> >>>>>> Not yet. I think there are a few outstanding items [...]
> >>> Anything missing?
> >>
> >> - RNG-110: The PR for SharedSharedDiscrete/ContinuousSampler should have
> >> a review [1]. I've left this while we finished GSoC phase 2 but it is 
> >> ready.
> >>
> >> I added factory methods for all samplers. For existing samplers this is
> >> just for consistency. Some however use internal delegates and the
> >> factory method can return the delegate directly which is an advantage.
> >>
> >> One issue to look at is how I handled GaussianSampler and
> >> LogNormalSampler. The samplers can only be shared state samplers if the
> >> input NormalizedGaussianSampler is a shared state sampler. I handled
> >> this with documentation. But this means a downstream user may be passed
> >> a SharedStateContinuousSampler, use it as such and receive an exception
> >> if it was created incorrectly.
> >
> > Could the library create it "incorrectly”?
>
> No. All the NormalizedGaussianSamplers in the library are suitable to pass to 
> the GaussianSampler and LogNormalSampler.
>
> >
> >>
> >> The alternative is two factory methods which must have different names
> >> due to type erasure:
> >>
> >> public static ContinuousSampler of(NormalizedGaussianSampler gaussian,
> >> double scale, double shape);
> >>
> >> public static
> >>     <T extends NormalizedGaussianSampler &
> >> SharedStateSampler<ContinuousSampler>>
> >>     SharedStateContinuousSampler
> >>     ofSharedState(T normalized,
> >>                   double mean,
> >>                   double standardDeviation) {
> >
> > Not nice, at first sight.
> >
> >> So the options are:
> >>
> >> - As current but has the pitfall of throwing exceptions if you do create
> >> a one with something that does not share state (i.e. not a sampler in
> >> the library).
> >
> > IIUC, it answers my question above.
> > I would not consider too much that the interfaces defined in our "client 
> > API"
> > module could be implemented by external codes.
> > Even within "Commons", we do not use other components' API…
>
> OK. User beware documentation. If using only our library it will not be an 
> issue.
>
> So I take it that you are fine with the PR?

Yes.
I didn't look at all the details but is that check
---CUT---
        if (!(source.normalized instanceof SharedStateSampler<?>)) {
            throw new UnsupportedOperationException("The underlying
sampler is not a SharedStateSampler");
        }
---CUT---
necessary?

> Just rereading it now and I’m not totally sold on the factory constructors 
> using .of(…). Here’s Joshua Bloch on the matter [1]:
>
> —
> Here are some common names for static factory methods:
>
>         • valueOf—Returns an instance that has, loosely speaking, the same 
> value as its parameters. Such static factories are effectively 
> type-conversion methods.
>
>         • of—A concise alternative to valueOf, popularized by EnumSet (Item 
> 32).
>
>         • getInstance—Returns an instance that is described by the parameters 
> but cannot be said to have the same value. In the case of a singleton, 
> getInstance takes no parameters and returns the sole instance.
>
>         • newInstance—Like getInstance, except that newInstance guarantees 
> that each instance returned is distinct from all others.
>
>         • getType—Like getInstance, but used when the factory method is in a 
> different class. Type indicates the type of object returned by the factory 
> method.
>
>         • newType—Like newInstance, but used when the factory method is in a 
> different class. Type indicates the type of object returned by the factory 
> method.
> —
>
> The ’newType’ may be appropriate as the factory method is returning an 
> instance of an interface not defined in the class. This is required for some 
> implementations such as the PoissonSampler which returns either a SmallMean 
> or LargeMeanPoissonSampler. So the Bloch naming could be ’newSampler’. But 
> then we have over verbose code using ‘Sampler' twice:
>
> new PoissonSampler(rng, mean)
>
> PoissonSampler.of(rng, mean)
>
> PoissonSampler.newSampler(rng, mean)
>
> IMO the later is too verbose. If we state that the sampler is entirely 
> defined by the input arguments to ‘of’ then it does satisfy "has, loosely 
> speaking, the same value as its parameters”.
>
> So ‘of’ is OK. The only other words I like are ‘from’ or ‘with’:
>
> PoissonSampler.of(rng, mean)
> PoissonSampler.from(rng, mean)
> PoissonSampler.with(rng, mean)
>

"with" seems the most appropriate but has the disadvantage of not
being in Bloch's list.
I would then stick with "of" because "from" looks like the arguments
will be transformed
(and it also is not in the list...).

Gilles

> WDYT?
>
> [1] http://www.informit.com/articles/article.aspx?p=1216151 
> <http://www.informit.com/articles/article.aspx?p=1216151>
> >
> >> - Another factory method to explicitly create a SharedStateSampler using
> >> a normalised Gaussian SharedStateSampler.
> >>
> >>
> >> A few things that are 90% done:
> >>
> >> - RNG-85: MiddleSquareWeylSequence generator
> >>
> >> This is simple code and now the modifications have been made to the
> >> ProviderBuilder it is possible to pass in a good quality increment for
> >> the Weyl sequence. I have code to build the increment that can be added
> >> to the SeedFactory. I did this months ago so will have to find it and
> >> create the PR.
> >>
> >> - RNG-95: DiscreteUniformSampler
> >>
> >> I now have a reference for the alternative algorithm for choosing int
> >> values from an interval. The code is done but should go after RNG-110 as
> >> the code uses 5 internal delegates for different algorithms. This would
> >> be optimised by the changes in RNG-110.
> >>
> >> - RNG-109: DiscreteProbabilityCollectionSampler to use an internal
> >> DiscreteSampler
> >>
> >> I have to create a benchmark to compare the AliasMethodSampler against
> >> the GuideTableSampler to see which is more suitable for a generic
> >> probability distribution. This should not take long.
> >>
> >> - RNG-94: RotateRotateMultiplyXorMultiplyXor
> >>
> >> Simple code that is based on the same idea of using an output hash
> >> function on a Weyl sequence like SplitMix. It is slightly slower but the
> >> hash function is better and more robust to low complexity increments. So
> >> we can add it using a seeded increment for the Weyl sequence. This would
> >> take a day to add the two hash function variants.
> >>
> >
> > Everything ready is fine to add before the next release, but equally
> > fine to add after it (and do another release in 2 months if wanted)
> > given the host of new features already implemented. :-)
>
> I’ll put what I have together in the coming week or so.
>
> >
> > Best,
> > Gilles
> >
> >> Maybe for later:
> >>
> >> - RNG-90: Improve nextInt(int)
> >>
> >> This could use the same algorithm as RNG-95. I have not done the testing
> >> yet. It also can be done for nextLong(long) which requires a 64-bit
> >> product multiplication to be computed as a 128-bit result. I have code
> >> for this but no performance tests.
> >>
> >> Not done but...
> >>
> >> The PCG family has extended generators: K-dimensionally equidistributed
> >> or Cryptographic. These have a much larger period and the
> >> equidistributed ones can be Jumpable.
> >>
> >>
> >> [1] https://github.com/apache/commons-rng/pull/58
> >>
> >>
> >>
> >>>
> >>> Regards,
> >>> Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to