On Thu, Oct 9, 2014 at 12:08 PM, Gilles <gil...@harfang.homelinux.org>
wrote:

> Hi.
>
> On Wed, 08 Oct 2014 21:45:30 +0200, Luc Maisonobe wrote:
>
>> Le 08/10/2014 19:32, Gilles a écrit :
>>
>>> > [...]
>>>>
>>>
>>> What do you think of the solution proposed in
>>>   https://issues.apache.org/jira/browse/MATH-1158
>>> ?
>>>
>>> [No hidden RNG, no default RNG, no hidden performance
>>> cost or delayed initialization, similar API, ...]
>>>
>>
>> I may have missed something but it does not really separate the sampling
>> from the distribution:
>>
>
> This is what Phil wrote (in the reply to same post):
>
>   > That is basically going back to not having sampling in the
>   > distribution classes.
>
> Doesn't this show that we don't seem to talk about the same things?
> How are we supposed to make progress when one summarizes a proposal
> with a blunt statement that misrepresents my attempt to go in the
> direction which you both advocated in an earlier post.
>
> [It's this kind of nonsense which I tried to avoid in comments for
> MATH-1154.]
>
> That said, I'm more inclined to agree with you: the sampling is still
> available from a distribution instance (and it _should_ be so: the
> previous discussion on this subject had led everyone to recognize that
> it made no sense, mathematically and programmatically, to have large
> swaths of duplicated code, as was the case before "sample()" was
> introduced).
>
> [I'm certainly not the one to want to go back to that previous state.
> And this proposal is nowhere near to it.]
>
>  the Sampler instance is an inner class that calls
>> the distribution sample(Rng) method. So this method still exists in the
>> distribution and can be called directly from there (apart from being
>> protected).
>>
>
> Well, that is exactly the point: the "sample(RNG)" takes that RNG
> argument and that's what everyone seemed to agree on (unless I'm
> mistaken).
> It is protected to enforce API separation: the distribution class provides
> the necessary hooks (to the method and fields) needed to define (override)
> the behaviour of "sample", while the caller will access the sampling
> through
> a dedicated, simple, class (implementing the "Sampler" interface).
>
>
>> The good thing is the generator is visible. I think we all agree the
>> generator must be visible, and it already is in the existing constructors.
>>
>
> That's the other point: some people have asked why there is a RNG
> argument although most of their usage do not need one.
> In my proposal, there is indeed no more RNG in the distribution object;
> only a user that needs it (for sampling, of course) will have to provide
> one to the factory method.
>
>
>> The difference I see between this solution and clearly specifying in
>> documentation what happens when the distribution is built without a
>> generator is that with your solution users can first build the
>> distribution and later on deciding they need to sample from it and then
>> create the sampler as an afterthought. I'm not sure this is worth the
>> change, but I may be mistaken here, I don't use sampling on
>> distributions so may well miss a classical use case.
>>
>
> That proposal would
> 1. fix the performance issue reported in MATH-1154,
>

this proposal would only fix it in 4.0, as we can not remove the rng
references in the distribution classes immediately.


> 2. fix that issue without resorting to the so called "smelly
>    workaround",
>

there is nothing smelly about it, and if you want to fix this issue in 3.4
there is not much choice (apart from the lazy initialization proposal).


> 3. simplify the "RealDistribution" interface namely by
>    removing the "reseedRandomGenerator" method (whose presence
>    has always been a flaw, IMO): the RNG is and stays under the
>    caller's explicit responsibility,
> 4. simplify the concrete distribution classes, by removing all
>    the constructors that take a RNG argument,
> 5. make the distribution classes thread-safe, by making the
>    "devilishly hard" problem (of ensuring RNG thread-safety) go
>    away.
> Moreover:
> 6. An application layer that only needs sampling could pass a
>    "Sampler" object: hiding the implicit distribution could make
>    the application cleaner (self-documented).
> 7. The modification can be backwards compatible, and user can
>    upgrade easily.
>

with the other points I agree, but I would make the sample(RandomGenerator)
method public, so that users have the choice to directly call it rather
than create a Sampler for this distribution.

Thomas

Reply via email to