Re: [rng] Split and Jump functions

Alex Herbert Sun, 28 Apr 2019 08:03:26 -0700


> On 28 Apr 2019, at 00:59, Bernd Eckenfels <e...@zusammenkunft.net> wrote:
> 
> Hello,
> 
> Just a question, I am unclear on the terminology: is "jump" (did I miss the 
> discussion leading to it?) something invented here? It sounds to me like this 
> is a generator where the state can be cloned and it is "seekable". It 
> probably makes sense to have those two dimensions separated anyway.

Hi Bernd, thanks for the input.

This thread started with the definition: 
Jump:

To create a new instance of the generator that is deterministically based on 
the state of the current instance but advanced a set number of iterations.

However, it is not required to create a new instance at the same time as 
jumping. You are correct that these are two functionalities: 

1. Jump forward in the sequence
2. Copy

However the two are coupled. Having jump on its own is useless (why move 
forward in the sequence without using it?). So a copy needs to be created 
somewhere before/after the jump.

The idea of a jump is to create a series of generators at different points 
in the sequence. The generators can be used for parallel computations and are 
guaranteed not to overlap in their output, provided each one produces no more 
outputs than the jump length.
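
To make the non-overlap idea concrete, here is a minimal sketch. It uses a toy 
SplitMix64-style generator whose state is a plain counter, so a jump is a 
single addition. ToyRng and its configurable jump length are made up for 
illustration and are not Commons RNG code; the jump length of 4 is only there 
so the overlap point can be checked quickly.

```java
/** Demo: where two generators one jump apart start to overlap. */
class JumpOverlapDemo {
    public static void main(String[] args) {
        ToyRng parent = new ToyRng(42L, 4); // tiny jump length so it can be checked
        ToyRng first = parent.copy();
        parent.jump();
        ToyRng second = parent.copy();

        // The first generator owns the next 4 outputs...
        for (int i = 0; i < 4; i++) {
            first.nextLong();
        }
        // ...after which it runs into the block owned by the second one.
        System.out.println(first.nextLong() == second.nextLong());
    }
}

/** Toy SplitMix64-style generator: an assumption for illustration, not library code. */
class ToyRng {
    private static final long GAMMA = 0x9e3779b97f4a7c15L;
    private final long jumpLength;
    private long state;

    ToyRng(long seed, long jumpLength) {
        this.state = seed;
        this.jumpLength = jumpLength;
    }

    long nextLong() {
        long z = (state += GAMMA);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Advance the state as if nextLong() had been called jumpLength times. */
    void jump() {
        state += jumpLength * GAMMA;
    }

    /** Copy the current state into a new instance. */
    ToyRng copy() {
        return new ToyRng(state, jumpLength);
    }
}
```

With a real jump length of 2^64 or more, no practical computation would ever 
reach the point where the two streams meet.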

FYI, the generators that support this have jump sizes of 2^64, 2^96, 2^128, 
2^192, 2^256 and 2^512. So this is a lot of output sequence to jump.

Copy on its own works, but for what purpose? If you want a second generator, at 
the moment you just create a new one (with a different seed). Duplicating 
generators is prone to pitfalls where simulations are not as random as you 
intend. For a special use case where you wish to run multiple simulations with 
the same generator you can use the Restorable interface to save the state of 
one and re-create it in other instances.

The current thread came to the choice of:

>>> So the options are (in all cases returning the copy):
>>> 
>>> 1. createAndJumpCopy
>>> 2. copyAndJumpParent
>>> 3. jumpParentAndCopy
>>> 4. jump and copy separately

Jump and copy separately was ruled out to discourage misuse of copy. 

The current suggestion is 1. Create a copy and jump that ahead. The current 
instance is not affected.

I now consider this to be weaker for a variety of use cases than 2. This copies 
the current state for use and then jumps the parent ahead. So this alters the 
state of the parent generator.

Note that all other methods of a generator alter its state. So having jump 
alter its state is reasonable.
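
A rough sketch of the difference between options 1 and 2, again on a toy 
counter-based generator. ToyJumpable, the two method names (taken from the 
option list above) and the jump length of 2^32 are illustrative assumptions, 
not the proposed Commons RNG API.

```java
/** Demo: option 1 (createAndJumpCopy) versus option 2 (copyAndJumpParent). */
class JumpOptionsDemo {
    public static void main(String[] args) {
        ToyJumpable parent = new ToyJumpable(1L);

        // Option 1 leaves the parent alone, so repeated calls on the parent
        // return children in the same state; a series needs chained calls.
        long a = parent.createAndJumpCopy().nextLong();
        long b = parent.createAndJumpCopy().nextLong();
        System.out.println("option 1, same first output: " + (a == b));

        // Option 2 moves the parent, so each call returns a different child.
        long c = parent.copyAndJumpParent().nextLong();
        long d = parent.copyAndJumpParent().nextLong();
        System.out.println("option 2, same first output: " + (c == d));
    }
}

/** Toy SplitMix64-style generator: an assumption for illustration, not library code. */
class ToyJumpable {
    private static final long GAMMA = 0x9e3779b97f4a7c15L;
    private static final long JUMP_LENGTH = 1L << 32; // illustrative value
    private long state;

    ToyJumpable(long seed) {
        this.state = seed;
    }

    long nextLong() {
        long z = (state += GAMMA);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    private void advance() {
        state += JUMP_LENGTH * GAMMA;
    }

    /** Option 1: copy, jump the copy, return it. This instance is unchanged. */
    ToyJumpable createAndJumpCopy() {
        ToyJumpable copy = new ToyJumpable(state);
        copy.advance();
        return copy;
    }

    /** Option 2: copy, jump this instance, return the copy of the old state. */
    ToyJumpable copyAndJumpParent() {
        ToyJumpable copy = new ToyJumpable(state);
        advance();
        return copy;
    }
}
```

With option 2 the caller only needs to hold the parent and can ask for another 
child at any time, which is what makes it attractive for work lists of unknown 
size.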

The most flexible API is to separate jump and copy into two methods. We can 
still support helper functions that take in a Jumpable generator and create a 
jump series of generators for parallel work. Separating jump and copy allows 
the functionality to be used in a larger number of ways than any other 
interface that attempts to combine jump and copy.

I am fine with having separate jump and copy. If so the copy method, being part 
of the Jumpable interface, will be functionally coupled with the jump method 
and should be described in Javadoc with the intended purpose to use it to copy 
the parent state either before or after a jump into a child generator.

As a precursor this API is very flexible:

interface JumpableUniformRandomProvider extends UniformRandomProvider {
    /** Jump and return the same instance. */
    JumpableUniformRandomProvider jump();
    /** Copy the instance. */
    JumpableUniformRandomProvider copy();
}

Returning the same instance in jump() allows method chaining such as either:

rng.jump().copy();
rng.copy().jump();

The potential pitfall is that a user may do this:

UniformRandomProvider rng1 = rng.copy().jump();
UniformRandomProvider rng2 = rng.copy().jump();

Where rng1 and rng2 will be in the same state, one jump ahead of the parent. Or:

UniformRandomProvider rng1 = rng.jump();
UniformRandomProvider rng2 = rng.jump();

Where rng, rng1 and rng2 are the same instance, two jumps ahead of the start 
point.
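
Both pitfalls can be reproduced with a small stand-in. ChainRng is a 
hypothetical toy class whose jump() returns the same instance, as in the 
chaining API above; the jump length of 2^32 is an arbitrary illustrative value.

```java
/** Demo: the two chaining pitfalls, then the intended copy-then-jump usage. */
class ChainingPitfallDemo {
    public static void main(String[] args) {
        ChainRng rng = new ChainRng(1L);

        // Pitfall 1: both copies start from the same parent state.
        ChainRng rng1 = rng.copy().jump();
        ChainRng rng2 = rng.copy().jump();
        System.out.println("same stream: " + (rng1.nextLong() == rng2.nextLong()));

        // Pitfall 2: jump() returns 'this', so no new generator is created.
        ChainRng rng3 = rng.jump();
        ChainRng rng4 = rng.jump();
        System.out.println("same instance: " + (rng3 == rng4 && rng4 == rng));

        // Intended use: take a copy, then move the parent on before the next copy.
        ChainRng childA = rng.copy();
        ChainRng childB = rng.jump().copy();
        System.out.println("distinct streams: " + (childA.nextLong() != childB.nextLong()));
    }
}

/** Toy SplitMix64-style generator: an assumption for illustration, not library code. */
class ChainRng {
    private static final long GAMMA = 0x9e3779b97f4a7c15L;
    private static final long JUMP_LENGTH = 1L << 32; // illustrative value
    private long state;

    ChainRng(long seed) {
        this.state = seed;
    }

    long nextLong() {
        long z = (state += GAMMA);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Jump this instance ahead and return it to allow chaining. */
    ChainRng jump() {
        state += JUMP_LENGTH * GAMMA;
        return this;
    }

    /** Copy the current state into a new instance. */
    ChainRng copy() {
        return new ChainRng(state);
    }
}
```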

I think our purpose is to provide an API for the generators that can jump and 
that is not too restrictive given the use cases we have so far thought up. 
There may be other ideas of use cases that cannot be done with the coupled 
functionality of:

interface JumpableUniformRandomProvider extends UniformRandomProvider {
    /**
     * Copy the instance, then jump ahead. Return the copy of the previous
     * state.
     */
    JumpableUniformRandomProvider jump();
}

or:

interface JumpableUniformRandomProvider extends UniformRandomProvider {
    /**
     * Copy the instance, then jump the copy ahead. Return the copy. The
     * current instance is not affected.
     */
    JumpableUniformRandomProvider jump();
}

So the split functions, without allowing method chaining, would be:

interface JumpableUniformRandomProvider extends UniformRandomProvider {
    /** Jump the current instance ahead. */
    void jump();
    /**
     * Copy the instance. This is intended to be used either before or after a
     * call to jump() to create a series of generators.
     */
    JumpableUniformRandomProvider copy();
}
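
As a sketch of how a helper could sit on top of the split jump()/copy() 
methods: the two interfaces below are cut-down local stand-ins that mirror the 
proposal in this mail (not the released Commons RNG interfaces), and 
createJumpSeries, ToyProvider and the jump length of 2^32 are hypothetical.

```java
/** Demo of a hypothetical helper built on the split jump()/copy() proposal. */
class SplitApiDemo {
    /** Hypothetical helper: n generators, each one jump apart. Advances 'rng'. */
    static UniformRandomProvider[] createJumpSeries(JumpableUniformRandomProvider rng, int n) {
        UniformRandomProvider[] series = new UniformRandomProvider[n];
        for (int i = 0; i < n; i++) {
            series[i] = rng.copy(); // leave a copy behind at the current point
            rng.jump();             // then move the parent one jump ahead
        }
        return series;
    }

    public static void main(String[] args) {
        JumpableUniformRandomProvider rng = new ToyProvider(123L);
        UniformRandomProvider[] series1 = createJumpSeries(rng, 4);
        // A second call continues from where the first series ended.
        UniformRandomProvider[] series2 = createJumpSeries(rng, 4);
        System.out.println(series1.length + series2.length);
    }
}

/** Local stand-in for the real interface, reduced to one method. */
interface UniformRandomProvider {
    long nextLong();
}

/** Local stand-in mirroring the split proposal in this mail. */
interface JumpableUniformRandomProvider extends UniformRandomProvider {
    void jump();

    JumpableUniformRandomProvider copy();
}

/** Toy SplitMix64-style generator: an assumption for illustration, not library code. */
class ToyProvider implements JumpableUniformRandomProvider {
    private static final long GAMMA = 0x9e3779b97f4a7c15L;
    private static final long JUMP_LENGTH = 1L << 32; // illustrative value
    private long state;

    ToyProvider(long seed) {
        this.state = seed;
    }

    @Override
    public long nextLong() {
        long z = (state += GAMMA);
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    @Override
    public void jump() {
        state += JUMP_LENGTH * GAMMA;
    }

    @Override
    public JumpableUniformRandomProvider copy() {
        return new ToyProvider(state);
    }
}
```

Returning the children as UniformRandomProvider hides jump() and copy() from 
the code that consumes them, so a child cannot be jumped by accident.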

WDYT?

Alex

> 
> Regards
> Bernd
> 
> 
> --
> http://bernd.eckenfels.net
> 
> ________________________________
> From: Gilles Sadowski <gillese...@gmail.com>
> Sent: Sunday, April 28, 2019 12:34 AM
> To: Commons Developers List
> Subject: Re: [rng] Split and Jump functions
> 
> Hello.
> 
>> 
>> 
>>> On 27 Apr 2019, at 14:49, Gilles Sadowski <gillese...@gmail.com> wrote:
>>> 
>>> Hi.
>>> 
>>> On Sat, 27 Apr 2019 at 15:05, Alex Herbert <alex.d.herb...@gmail.com> 
>>> wrote:
>>>> 
>>>> I have created RNG-97 and RNG-98 for Jump and LongJump.
>>>> 
>>>> Please take a look and comment.
>>>> 
>>>> The documentation highlights the implementation detail that a jump or long 
>>>> jump creates a copy that is far ahead. The original generator is not 
>>>> affected.
>>>> 
>>>> The use case is thus:
>>>> 
>>>> rng1 = …;
>>>> rng2 = rng1.jump();
>>>> rng3 = rng2.jump();
>>>> rng4 = rng3.jump();
>>>> 
>>>> As opposed to:
>>>> 
>>>> rng1 = …;
>>>> rng2 = rng1.jump();
>>>> rng3 = rng1.jump();
>>>> rng4 = rng1.jump();
>>>> 
>>>> Where rng1 will be advanced each time leaving behind a copy generator.
>>>> 
>>>> In either case it will be an overlap problem if any of the children are 
>>>> then used for jumping. So as long as the documentation is clear then this 
>>>> is OK. The helper method to create a jump series (or long jump series) in 
>>>> RandomSource seems the best way to avoid incorrect usage.
>>> 
>>> +1
>>> 
>>> I think that the default should be to prevent a "jump" on the returned
>>> instances.
>>> An overload could be defined with a parameter (e.g. "allowFurtherJump") but 
>>> I'd
>>> leave it out until it is requested based on an actual use-case.
>> 
>> I presume you are talking about the helper method in RandomSource.
>> 
>> However it does open the possibility of this instead:
>> 
>> interface JumpableUniformRandomProvider {
>>     UniformRandomProvider jump();
>> }
>> 
>> This only works if the state is modified for the current instance to allow 
>> chaining jumps.
>> 
>> Having typed all this up into a summary for the two tickets I feel that they 
>> implement the idea in the wrong way. I think the jump should advance the 
>> state of the current generator. This is the master generator created and 
>> used in the high level code that controls the number of jumps that are 
>> required. The returned copy should be a copy of where the generator was. The 
>> copy should not be used for further jumps. In this way the interface for 
>> jump could be made to return a UniformRandomProvider.
>> 
>> When done like that the jumpable RNG is the only thing you need to hold a 
>> reference to. And you can later decide (perhaps dynamically) if you need to 
>> do some more jumps to get another series. Each call to jump moves the master 
>> along and leaves behind a RNG that can be used for a set number of cycles 
>> (the jump length). So you can do:
>> 
>> JumpableUniformRandomProvider rng = …;
>> 
>> UniformRandomProvider[] series1 = RandomSource.createJumpSeries(rng);
>> // Do work with series1 and then maybe
>> UniformRandomProvider[] series2 = RandomSource.createJumpSeries(rng);
>> // Do work with series2, etc
>> UniformRandomProvider[] series3 = RandomSource.createJumpSeries(rng);
>> 
>> Or
>> 
>> JumpableUniformRandomProvider masterRng = …;
>> 
>> ExecutorService executor = Executors.newCachedThreadPool();
>> ArrayList<Future<Result>> futures = new ArrayList<>();
>> for (Input input : inputs) {
>>     // Each task gets the generator left behind by the jump
>>     final UniformRandomProvider rng = masterRng.jump();
>>     futures.add(executor.submit(new Callable<Result>() {
>>         @Override
>>         public Result call() {
>>             // Do something random with rng, then
>>             return new Result(...);
>>         }
>>     }));
>> }
>> 
>> The latter example uses 'inputs' as something whose size is perhaps not 
>> known, such as an Iterable; likewise in Java 8 it could be written to 
>> consume a Stream.
> 
> That's a convincing example!
> 
>> Similarly the LongJumpableUniformRandomProvider interface can return a 
>> JumpableUniformRandomProvider so preventing the result from being used for 
>> another long jump but it can be used for (short) jumps.
>> 
>> Have a think on use cases but my feeling is that the interface is more 
>> powerful if you do advance the state and leave copies behind, rather than 
>> creating future copies which must be chained together to create a series.
> 
> OK to change the perspective. ;-)
> 
> Gilles
> 
>> 
>> Alex
>> 
> 
