Le ven. 19 juil. 2019 à 16:47, Alex Herbert <alex.d.herb...@gmail.com> a écrit : > > On 19/07/2019 15:09, Gilles Sadowski wrote: > > Hi. > > > > Le ven. 19 juil. 2019 à 15:31, Alex Herbert <alex.d.herb...@gmail.com> a > > écrit : > >> On 19/07/2019 14:15, Gilles Sadowski wrote: > >>> Hello. > >>> > >>> Le ven. 19 juil. 2019 à 14:27, Alex Herbert <alex.d.herb...@gmail.com> a > >>> écrit : > >>>> One principle reason for SharedStateDiscreteSampler > >>>> and SharedStateContinuousSampler is to simplify the current design > >>>> approach > >>>> for samplers that use internal delegates to provide optimised algorithms. > >>>> However this creates an inefficient outer class that is just wrapping the > >>>> implementation. > >>>> > >>>> The idea was to simplify the implementation of the SharedStateSampler<R> > >>>> interface for these Discrete/Continuous Samplers. However it has wider > >>>> application to moving away from public constructors to factory methods. > >>>> The > >>>> SharedStateDiscreteSampler interface would allow the samplers package to > >>>> move to a factory constructor model where samplers have no public > >>>> constructor. This allows the factory method to choose the best > >>>> implementation. This idea is recommended by Joshua Block in Effective > >>>> Java > >>>> (see [1]). > >>>> > >>>> For example: > >>>> > >>>> UniformRandomProvider rng = ...; > >>>> double mean = ...; > >>>> PoissonSampler sampler = new PoissonSampler(rng, mean); > >>>> > >>>> vs (with some potential names): > >>>> > >>>> SharedStateDiscreteSampler sampler = > >>>> PoissonSampler.newSharedStateDiscreteSampler(rng, mean); > >>>> SharedStateDiscreteSampler sampler = PoissonSampler.newInstance(rng, > >>>> mean); > >>>> SharedStateDiscreteSampler sampler = PoissonSampler.newSampler(rng, > >>>> mean); > >>>> SharedStateDiscreteSampler sampler = > >>>> PoissonSampler.newPoissonSampler(rng, > >>>> mean); > >>>> SharedStateDiscreteSampler sampler = PoissonSampler.of(rng, mean); > >>>> > >>>> Currently there are some unreleased classes in v1.3 that use various > >>>> construction methods with different approaches to fitting within the > >>>> current design of returning an instance that supports DiscreteSampler and > >>>> SharedStateSampler<R> and using specialised algorithms: > >>>> > >>>> AliasMethodDiscreteSampler > >>>> MarsagliaTsangWangDiscreteSampler > >>>> GuideTableDiscreteSampler > >>>> GeometricSampler > >>>> > >>>> The PoissonSamplerCache could also benefit from this as it returns a > >>>> DiscreteSampler even though the returned object is now also a > >>>> SharedStateSampler<R>. > >>>> > >>>> I would advocate introducing these interfaces and switching to a factory > >>>> method design for unreleased code. This has no binary compatibility > >>>> issues. > >>>> > >>>> Current released code that would also benefit from this design are those > >>>> with internal delegates: > >>>> > >>>> PoissonSampler > >>>> AhrensDieterMarsagliaTsangGammaSampler > >>>> LargeMeanPoissonSampler > >>>> ChengBetaSampler (no delegate but it chooses an algorithm) > >>>> > >>>> An example of factory code would be > >>>> > >>>> double alpha; > >>>> double theta > >>>> SharedStateContinuousSampler sampler = > >>>> AhrensDieterMarsagliaTsangGammaSampler.of(rng, alpha, theta); > >>>> SharedStateContinuousSampler sampler = > >>>> AhrensDieterMarsagliaTsangGammaSampler.newGammaSampler(rng, alpha, > >>>> theta); > >>>> > >>>> Since the class name is so verbose in this case the 'of' method name does > >>>> not appear out of place. > >>>> > >>>> Note that as the SharedStateSampler<R> interface is implemented by all > >>>> (non-deprecated) distribution samplers a factory method could be added to > >>>> every sampler. This would pave the way for removal of public constructors > >>>> in a 2.0 release (whenever that happens). > >>>> > >>>> Overall the proposal is to: > >>>> > >>>> - Create SharedStateDiscreteSampler and SharedStateContinuousSampler > >>>> - Simplify the implementation of SharedStateSampler<R> > >>>> - Move to factory constructors for unreleased samplers > >>>> - Add factory constructors to existing samplers to return optimised > >>>> implementations > >>> +1 (using "of" as the method name). > >> OK. I will create a ticket. > >> > >> Note that when I did SharedStateSampler I avoided this implementation to > >> minimise changes. However I recently revisited the change to the > >> DiscreteUniformSampler to use a new faster sampling method for a uniform > >> range (RNG-95). The result was an implementation with 5 internal classes > >> to handle ranges: > >> > >> [n,n] > > ? > Edge case: It is currently possible to create a DiscreteUniformSampler > where lower == upper as the bounds are inclusive. In this case the > sample is fixed so I created a specialisation. > > > >> [n,m] > >> [0,n] > >> > >> And specialisations for powers of 2. It works and is much faster than > >> the current sampler but was not very clean. I'll update this when the > >> new structure is in place. > >> > >> I am happy with the name 'of' for most samplers since the sampler name > >> contains all the information. What method name would you recommend for a > >> sampler that can sample from different distributions? This is the class > >> and existing methods: > >> > >> MarsagliaTsangWangDiscreteSampler.createBinomialDistribution > >> MarsagliaTsangWangDiscreteSampler.createPoissonDistribution > >> MarsagliaTsangWangDiscreteSampler.createDiscreteDistribution > >> > >> Leave it using 'create' or change to: > >> > >> - 'new' > >> - 'for' > >> - 'of' > >> > >> In this case 'of' still seems OK. > >> > >> MarsagliaTsangWangDiscreteSampler.ofBinomialDistribution > >> MarsagliaTsangWangDiscreteSampler.ofPoissonDistribution > >> MarsagliaTsangWangDiscreteSampler.ofDiscreteDistribution > > In my understanding of the convention, what follows "of" (suffix > > and/or method's arguments) is the input while in this case, the > > suffix (e.g. "BinomialDistribution") is the output. > Input -> Output > > Generator+BinomialDistribution -> BinomialSampler > Generator+PoissonDistribution -> PoissonSampler > Generator+ProbabilityDistribution -> DiscreteSampler > > Given that the arguments are two-fold the suffix would get very verbose. > So I prefer the subclass factories as below. > > > > > How about: > > ---CUT--- > > public abstract class MarsagliaTsangWangDiscreteSampler { > > // ... > > > > public static class Binomial { > > private Binomial() {} > > > > public static MarsagliaTsangWangDiscreteSampler > > of(UniformRandomProvider rng, > > > > int trials, > > > > double probabilityOfSuccess) { > > // ... > > } > > } > > > > public static class Poisson { > > // ... > > } > > > > public static class Discrete { > > // ... > > } > > } > > ---CUT--- > > ? > > > > [A more self-describing name may be in order for the "DiscreteDistribution" > > created from a user-defined list of probability.] > > In CM4 a similar class is EnumeratedIntegerDistribution. This requires > you specify the integer random values for each probability. Here the > values are assumed to start at 0 and increment. Note the return type is > proposed to be SharedStateDiscreteSampler giving you: > > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.Binomial.of > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.Poisson.of > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.Enumerated.of
Yes, "Enumerated" of course... > > More verbose version: > > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.BinomialDistribution.of > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.PoissonDistribution.of > SharedStateDiscreteSampler sampler = > MarsagliaTsangWangDiscreteSampler.EnumeratedDistribution.of Given the context, I think that it'll do without "Distribution". :-) Best, Gilles > > I shall have a think while doing the conversion. > > > > > Gilles > > > >> Perhaps I'll just make all the changes and the names can be assessed > >> when the PR is ready. There may be others that I have missed that will > >> turn up when changing the code. > >> > >> Alex > >> > >> > >>> Gilles > >>> > >>>> Thoughts on this are welcomed. > >>>> > >>>> Alex > >>>> > >>>> [1] https://www.baeldung.com/java-constructors-vs-static-factory-methods > >>>> > >>>> On Fri, 19 Jul 2019 at 10:35, Alex Herbert <alex.d.herb...@gmail.com> > >>>> wrote: > >>>> > >>>>> This interface has been added in v1.3: > >>>>> > >>>>> /** > >>>>> * Applies to samplers that can share state between instances. > >>>>> Samplers > >>>>> can be created with a > >>>>> * new source of randomness that sample from the same state. > >>>>> * > >>>>> * @param <R> Type of the sampler. > >>>>> * @since 1.3 > >>>>> */ > >>>>> public interface SharedStateSampler<R> { > >>>>> /** > >>>>> * Create a new instance of the sampler with the same underlying > >>>>> state > >>>>> using the given > >>>>> * uniform random provider as the source of randomness. > >>>>> * > >>>>> * @param rng Generator of uniformly distributed random numbers. > >>>>> * @return the sampler > >>>>> */ > >>>>> R withUniformRandomProvider(UniformRandomProvider rng); > >>>>> } > >>>>> > >>>>> All of the DiscreteSampler and ContinuousSampler classes have been > >>>>> updated > >>>>> to implement SharedStateSampler. > >>>>> > >>>>> This is the equivalent of: > >>>>> > >>>>> /** > >>>>> * Sampler that generates values of type {@code int} and can create > >>>>> new > >>>>> instances to sample > >>>>> * from the same state with a given source of randomness. > >>>>> * > >>>>> * @since 1.3 > >>>>> */ > >>>>> public interface SharedStateDiscreteSampler > >>>>> extends DiscreteSampler, > >>>>> SharedStateSampler<SharedStateDiscreteSampler> { > >>>>> // Marker interface > >>>>> } > >>>>> > >>>>> > >>>>> If this combined marker interface is added to the library it would > >>>>> simplify a lot of the code for samplers which have internal delegates to > >>>>> avoid casting. With this interface they can hold a > >>>>> SharedStateDiscreteSampler rather than a DiscreteSampler delegate and > >>>>> directly use it to create new delegates. > >>>>> > >>>>> For example PoissonSampler code which wraps a specific implementation: > >>>>> > >>>>> /** > >>>>> * @param rng Generator of uniformly distributed random numbers. > >>>>> * @param source Source to copy. > >>>>> */ > >>>>> private PoissonSampler(UniformRandomProvider rng, > >>>>> PoissonSampler source) { > >>>>> super(null); > >>>>> > >>>>> if (source.poissonSamplerDelegate instanceof > >>>>> SmallMeanPoissonSampler) { > >>>>> poissonSamplerDelegate = > >>>>> > >>>>> ((SmallMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng); > >>>>> } else { > >>>>> poissonSamplerDelegate = > >>>>> > >>>>> ((LargeMeanPoissonSampler)source.poissonSamplerDelegate).withUniformRandomProvider(rng); > >>>>> } > >>>>> } > >>>>> > >>>>> Becomes: > >>>>> > >>>>> private PoissonSampler(UniformRandomProvider rng, > >>>>> PoissonSampler source) { > >>>>> super(null); > >>>>> poissonSamplerDelegate = > >>>>> source.poissonSamplerDelegate.withUniformRandomProvider(rng); > >>>>> } > >>>>> > >>>>> > >>>>> I propose to add: > >>>>> > >>>>> public interface SharedStateDiscreteSampler > >>>>> extends DiscreteSampler, > >>>>> SharedStateSampler<SharedStateDiscreteSampler> { > >>>>> // Marker interface > >>>>> } > >>>>> > >>>>> public interface SharedStateContinuousSampler > >>>>> extends ContinuousSampler, > >>>>> SharedStateSampler<SharedStateContinuousSampler> { > >>>>> // Marker interface > >>>>> } > >>>>> > >>>>> > >>>>> --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org