Le ven. 19 juil. 2019 à 16:47, Alex Herbert <alex.d.herb...@gmail.com> a écrit :
>
> On 19/07/2019 15:09, Gilles Sadowski wrote:
> > Hi.
> >
> > Le ven. 19 juil. 2019 à 15:31, Alex Herbert <alex.d.herb...@gmail.com> a 
> > écrit :
> >> On 19/07/2019 14:15, Gilles Sadowski wrote:
> >>> Hello.
> >>>
> >>> Le ven. 19 juil. 2019 à 14:27, Alex Herbert <alex.d.herb...@gmail.com> a 
> >>> écrit :
> >>>> One principle reason for SharedStateDiscreteSampler
> >>>> and SharedStateContinuousSampler is to simplify the current design 
> >>>> approach
> >>>> for samplers that use internal delegates to provide optimised algorithms.
> >>>> However this creates an inefficient outer class that is just wrapping the
> >>>> implementation.
> >>>>
> >>>> The idea was to simplify the implementation of the SharedStateSampler<R>
> >>>> interface for these Discrete/Continuous Samplers. However it has wider
> >>>> application to moving away from public constructors to factory methods. 
> >>>> The
> >>>> SharedStateDiscreteSampler interface would allow the samplers package to
> >>>> move to a factory constructor model where samplers have no public
> >>>> constructor. This allows the factory method to choose the best
> >>>> implementation. This idea is recommended by Joshua Block in Effective 
> >>>> Java
> >>>> (see [1]).
> >>>>
> >>>> For example:
> >>>>
> >>>> UniformRandomProvider rng = ...;
> >>>> double mean = ...;
> >>>> PoissonSampler sampler = new PoissonSampler(rng, mean);
> >>>>
> >>>> vs (with some potential names):
> >>>>
> >>>> SharedStateDiscreteSampler sampler =
> >>>> PoissonSampler.newSharedStateDiscreteSampler(rng, mean);
> >>>> SharedStateDiscreteSampler sampler = PoissonSampler.newInstance(rng, 
> >>>> mean);
> >>>> SharedStateDiscreteSampler sampler = PoissonSampler.newSampler(rng, 
> >>>> mean);
> >>>> SharedStateDiscreteSampler sampler = 
> >>>> PoissonSampler.newPoissonSampler(rng,
> >>>> mean);
> >>>> SharedStateDiscreteSampler sampler = PoissonSampler.of(rng, mean);
> >>>>
> >>>> Currently there are some unreleased classes in v1.3 that use various
> >>>> construction methods with different approaches to fitting within the
> >>>> current design of returning an instance that supports DiscreteSampler and
> >>>> SharedStateSampler<R> and using specialised algorithms:
> >>>>
> >>>> AliasMethodDiscreteSampler
> >>>> MarsagliaTsangWangDiscreteSampler
> >>>> GuideTableDiscreteSampler
> >>>> GeometricSampler
> >>>>
> >>>> The PoissonSamplerCache could also benefit from this as it returns a
> >>>> DiscreteSampler even though the returned object is now also a
> >>>> SharedStateSampler<R>.
> >>>>
> >>>> I would advocate introducing these interfaces and switching to a factory
> >>>> method design for unreleased code. This has no binary compatibility 
> >>>> issues.
> >>>>
> >>>> Current released code that would also benefit from this design are those
> >>>> with internal delegates:
> >>>>
> >>>> PoissonSampler
> >>>> AhrensDieterMarsagliaTsangGammaSampler
> >>>> LargeMeanPoissonSampler
> >>>> ChengBetaSampler (no delegate but it chooses an algorithm)
> >>>>
> >>>> An example of factory code would be
> >>>>
> >>>> double alpha;
> >>>> double theta
> >>>> SharedStateContinuousSampler sampler =
> >>>> AhrensDieterMarsagliaTsangGammaSampler.of(rng, alpha, theta);
> >>>> SharedStateContinuousSampler sampler =
> >>>> AhrensDieterMarsagliaTsangGammaSampler.newGammaSampler(rng, alpha, 
> >>>> theta);
> >>>>
> >>>> Since the class name is so verbose in this case the 'of' method name does
> >>>> not appear out of place.
> >>>>
> >>>> Note that as the SharedStateSampler<R> interface is implemented by all
> >>>> (non-deprecated) distribution samplers a factory method could be added to
> >>>> every sampler. This would pave the way for removal of public constructors
> >>>> in a 2.0 release (whenever that happens).
> >>>>
> >>>> Overall the proposal is to:
> >>>>
> >>>> - Create SharedStateDiscreteSampler and SharedStateContinuousSampler
> >>>> - Simplify the implementation of SharedStateSampler<R>
> >>>> - Move to factory constructors for unreleased samplers
> >>>> - Add factory constructors to existing samplers to return optimised
> >>>> implementations
> >>> +1 (using "of" as the method name).
> >> OK. I will create a ticket.
> >>
> >> Note that when I did SharedStateSampler I avoided this implementation to
> >> minimise changes. However I recently revisited the change to the
> >> DiscreteUniformSampler to use a new faster sampling method for a uniform
> >> range (RNG-95). The result was an implementation with 5 internal classes
> >> to handle ranges:
> >>
> >> [n,n]
> > ?
> Edge case: It is currently possible to create a DiscreteUniformSampler
> where lower == upper as the bounds are inclusive. In this case the
> sample is fixed so I created a specialisation.
> >
> >> [n,m]
> >> [0,n]
> >>
> >> And specialisations for powers of 2. It works and is much faster than
> >> the current sampler but was not very clean. I'll update this when the
> >> new structure is in place.
> >>
> >> I am happy with the name 'of' for most samplers since the sampler name
> >> contains all the information. What method name would you recommend for a
> >> sampler that can sample from different distributions? This is the class
> >> and existing methods:
> >>
> >> MarsagliaTsangWangDiscreteSampler.createBinomialDistribution
> >> MarsagliaTsangWangDiscreteSampler.createPoissonDistribution
> >> MarsagliaTsangWangDiscreteSampler.createDiscreteDistribution
> >>
> >> Leave it using 'create' or change to:
> >>
> >> - 'new'
> >> - 'for'
> >> - 'of'
> >>
> >> In this case 'of' still seems OK.
> >>
> >> MarsagliaTsangWangDiscreteSampler.ofBinomialDistribution
> >> MarsagliaTsangWangDiscreteSampler.ofPoissonDistribution
> >> MarsagliaTsangWangDiscreteSampler.ofDiscreteDistribution
> > In my understanding of the convention, what follows "of" (suffix
> > and/or method's arguments) is the input while in this case, the
> > suffix (e.g. "BinomialDistribution") is the output.
> Input -> Output
>
> Generator+BinomialDistribution -> BinomialSampler
> Generator+PoissonDistribution -> PoissonSampler
> Generator+ProbabilityDistribution -> DiscreteSampler
>
> Given that the arguments are two-fold the suffix would get very verbose.
> So I prefer the subclass factories as below.
>
> >
> > How about:
> > ---CUT---
> > public abstract class MarsagliaTsangWangDiscreteSampler {
> >      // ...
> >
> >      public static class Binomial {
> >          private Binomial() {}
> >
> >          public static MarsagliaTsangWangDiscreteSampler
> > of(UniformRandomProvider rng,
> >
> >                 int trials,
> >
> >                 double probabilityOfSuccess) {
> >              // ...
> >          }
> >      }
> >
> >      public static class Poisson {
> >          // ...
> >      }
> >
> >      public static class Discrete {
> >          // ...
> >      }
> > }
> > ---CUT---
> > ?
> >
> > [A more self-describing name may be in order for the "DiscreteDistribution"
> > created from a user-defined list of probability.]
>
> In CM4 a similar class is EnumeratedIntegerDistribution. This requires
> you specify the integer random values for each probability. Here the
> values are assumed to start at 0 and increment. Note the return type is
> proposed to be SharedStateDiscreteSampler giving you:
>
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.Binomial.of
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.Poisson.of
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.Enumerated.of

Yes, "Enumerated" of course...

>
> More verbose version:
>
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.BinomialDistribution.of
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.PoissonDistribution.of
> SharedStateDiscreteSampler sampler = 
> MarsagliaTsangWangDiscreteSampler.EnumeratedDistribution.of

Given the context, I think that it'll do without "Distribution". :-)

Best,
Gilles

>
> I shall have a think while doing the conversion.
>
> >
> > Gilles
> >
> >> Perhaps I'll just make all the changes and the names can be assessed
> >> when the PR is ready. There may be others that I have missed that will
> >> turn up when changing the code.
> >>
> >> Alex
> >>
> >>
> >>> Gilles
> >>>
> >>>> Thoughts on this are welcomed.
> >>>>
> >>>> Alex
> >>>>
> >>>> [1] https://www.baeldung.com/java-constructors-vs-static-factory-methods
> >>>>
> >>>> On Fri, 19 Jul 2019 at 10:35, Alex Herbert <alex.d.herb...@gmail.com> 
> >>>> wrote:
> >>>>
> >>>>> This interface has been added in v1.3:
> >>>>>
> >>>>> /**
> >>>>>    * Applies to samplers that can share state between instances. 
> >>>>> Samplers
> >>>>> can be created with a
> >>>>>    * new source of randomness that sample from the same state.
> >>>>>    *
> >>>>>    * @param <R> Type of the sampler.
> >>>>>    * @since 1.3
> >>>>>    */
> >>>>> public interface SharedStateSampler<R> {
> >>>>>       /**
> >>>>>        * Create a new instance of the sampler with the same underlying 
> >>>>> state
> >>>>> using the given
> >>>>>        * uniform random provider as the source of randomness.
> >>>>>        *
> >>>>>        * @param rng Generator of uniformly distributed random numbers.
> >>>>>        * @return the sampler
> >>>>>        */
> >>>>>       R withUniformRandomProvider(UniformRandomProvider rng);
> >>>>> }
> >>>>>
> >>>>> All of the DiscreteSampler and ContinuousSampler classes have been 
> >>>>> updated
> >>>>> to implement SharedStateSampler.
> >>>>>
> >>>>> This is the equivalent of:
> >>>>>
> >>>>> /**
> >>>>>    * Sampler that generates values of type {@code int} and can create 
> >>>>> new
> >>>>> instances to sample
> >>>>>    * from the same state with a given source of randomness.
> >>>>>    *
> >>>>>    * @since 1.3
> >>>>>    */
> >>>>> public interface SharedStateDiscreteSampler
> >>>>>       extends DiscreteSampler,
> >>>>> SharedStateSampler<SharedStateDiscreteSampler> {
> >>>>>       // Marker interface
> >>>>> }
> >>>>>
> >>>>>
> >>>>> If this combined marker interface is added to the library it would
> >>>>> simplify a lot of the code for samplers which have internal delegates to
> >>>>> avoid casting. With this interface they can hold a
> >>>>> SharedStateDiscreteSampler rather than a DiscreteSampler delegate and
> >>>>> directly use it to create new delegates.
> >>>>>
> >>>>> For example PoissonSampler code which wraps a specific implementation:
> >>>>>
> >>>>> /**
> >>>>>    * @param rng Generator of uniformly distributed random numbers.
> >>>>>    * @param source Source to copy.
> >>>>>    */
> >>>>> private PoissonSampler(UniformRandomProvider rng,
> >>>>>           PoissonSampler source) {
> >>>>>       super(null);
> >>>>>
> >>>>>       if (source.poissonSamplerDelegate instanceof 
> >>>>> SmallMeanPoissonSampler) {
> >>>>>           poissonSamplerDelegate =
> >>>>>
> >>>>> ((SmallMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng);
> >>>>>       } else {
> >>>>>           poissonSamplerDelegate =
> >>>>>
> >>>>> ((LargeMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng);
> >>>>>       }
> >>>>> }
> >>>>>
> >>>>> Becomes:
> >>>>>
> >>>>> private PoissonSampler(UniformRandomProvider rng,
> >>>>>           PoissonSampler source) {
> >>>>>       super(null);
> >>>>>       poissonSamplerDelegate =
> >>>>> source.poissonSamplerDelegate.withUniformRandomProvider(rng);
> >>>>> }
> >>>>>
> >>>>>
> >>>>> I propose to add:
> >>>>>
> >>>>> public interface SharedStateDiscreteSampler
> >>>>>       extends DiscreteSampler,
> >>>>> SharedStateSampler<SharedStateDiscreteSampler> {
> >>>>>       // Marker interface
> >>>>> }
> >>>>>
> >>>>> public interface SharedStateContinuousSampler
> >>>>>       extends ContinuousSampler,
> >>>>> SharedStateSampler<SharedStateContinuousSampler> {
> >>>>>       // Marker interface
> >>>>> }
> >>>>>
> >>>>>
> >>>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to