Re: [Math] Separating Sampling from Distributions

Thomas Neidhart Thu, 09 Oct 2014 07:58:22 -0700

On Thu, Oct 9, 2014 at 4:43 PM, Ole Ersoy <[email protected]> wrote:


>
>
> On 10/09/2014 08:43 AM, Gilles wrote:
>
>> On Thu, 09 Oct 2014 08:12:06 -0500, Ole Ersoy wrote:
>>
>>> Hello,
>>>
>>> Just sharing a few more thoughts on this WRT:
>>> https://issues.apache.org/jira/browse/MATH-1124
>>>
>>> (1) The issues currently are:
>>> You have to inject an RNG when using the constructor lengthening
>>> instantiation time and possibly increasing memory usage without
>>> benefit.
>>>
>>> (2)
>>> The design of the distribution is heavier than it needs to be.  For
>>> example if you subclass AbstractIntegerDistribution the code I pasted
>>> below, which I believe is used only in sampling, is included.  As a
>>> result of this anything that uses the distributions become heavier and
>>> more complicated than need be, including:
>>> - test code
>>> - subclasses
>>> - composites
>>> - etc.
>>>
>>> SAMPLING ONLY CODE IN AbstractIntegerDistribution
>>>
>>
>> I'm not sure for "AbstractIntegerDistribution", but
>> "inverseCumulativeProbability"
>> from "AbstractRealDistribution" is used in some classes in package
>> "o.a.c.m.stat".
>>
>
> I have not looked at this in detail, but is it possible that
> "o.a.c.m.stat" could be served in a more "Lightweight" way if this method
> were on a separate class in perhaps a o.a.c.m.montecarlo package?
>
> To me it seems more ideal to have something like
> AbstractIntegerDistributionSampler and AbstractRealDistributionSampler
> that own all the sampling related code.  I also prefer composition over
> inheritance when possible, but would have to look closer at how that might
> work.
>
>
>>  [...]
>>>
>>>
>>> Side Note:
>>> There is a logProbability method that just computes the log of a
>>> probability.  If someone needs to do this can't they just do
>>> FastMath.log(probability) directly?
>>>
>>
>> I think that, for some distribution, the computation of log(p) is more
>> accurate. This was a feature request from not long ago.
>>
>
> Just pasting the method below:
>
>     public double logProbability(int x) {
>         return FastMath.log(probability(x));
>     }
>
> So it retrieves the probability of the index and uses FastMath to compute
> the log.  We could do:
>
> FastMath(distribution.probability(x));
>
> It just seems a bit over engineered to have something this trivial in the
> API, unless I'm missing something.
>

This is the default implementation. For some distributions there is a more
efficient way to compute the logProbability directly, and for these
distributions the method is overridden (e.g. GeometricDistribution,
BinomialDistribution, ...).
Otherwise one would have to call probability(x) first, which in many cases
involves calling exp(...) and then compute the log of it, which is just a
waste of cpu.

Thomas

Re: [Math] Separating Sampling from Distributions

Reply via email to