> On 10 Apr 2019, at 18:52, Gilles Sadowski <gillese...@gmail.com> wrote:
>
> Hi.
>
>>
>> On 10/04/2019 18:58, Gilles Sadowski wrote:
>>> Hi.
>>>
>>>> [... long quote skipped where I think we largely agree on the
>>>> conclusions ...]
>>>> So do we have a working idea to:
>>>>
>>>> - Add interface 'JumpableUniformRandomProvider'
>>> Do we need to add "createJumpable" factory methods in "RandomSource"
>>> methods or is there a way to avoid the duplication?
>>>
>>> As mentioned in an earlier post, it would be cleaner/nicer (?) to add
>>> methods
>>> UniformRandomProvider jump();
>>> boolean isJumpable();
>>> to "UniformRandomProvider".
>>> This would require dropping support of Java 6 and 7 and perhaps a good
>>> reason to do so (?) ...
>>
>> And move to V2.0 with Java 8 giving the opportunity to clean up other
>> deprecated stuff.
>
> No!
> Changing major version version would entail package names change
> (thus forcing user codes updates)
> We don't need this since there isn't any breaking change if we add
> default methods.
My bad. It is obvious when justified like that.
>
>>
>> Would the the default implementation be to throw an
>> UnsupportedOperationException?
>
> Yes.
>>
>> I'm +0 on this.
>>
>> I'm not against it but do think the UniformRandomProvider interface
>> could become quite cluttered with these extra methods that would be in
>> the minority (jump, longJump, isJumpable, isLongJumpable are not
>> generally available). It would also allow methods/classes that currently
>> use simple methods from UniformRandomProvider to have access to call
>> jump on the generator and spawn lots of sub generators. I think this is
>> bad. These methods should be written to explicitly require a Jumpable
>> instance.
>
> From this POV, I certainly agree.
>
>> My approach would have been to leave RandomSource as is and then state
>> that the returned generator can be tested to see if it is Jumpable using
>> instanceof. Someone who is writing code to use a Jumpable RNG should be
>> fine with that since they would have to know a priory what RandomSource
>> to create to get a Jumpable.
>
> If it makes more sense, I'm fine letting the user write the "ugly" code. ;-)
Not adding a dedicated method would mean everyone has to do this:
JumpableUniformRandomProvider rng = (JumpableUniformRandomProvider)
RandomSource.create(…)
But adding a mirror methods:
JumpableUniformRandomProvider RandomSource::createJumpable(…)
LongJumpableUniformRandomProvider RandomSource::createLongJumpable(…)
Does seem to add clutter to RandomSource. If we leave it out for now it can be
added in the future.
Is there scope for needing the Jumpable to be detected through the API? E.g.
add:
boolean RandomSource::isJumpable(RandomSource)
We would just need to maintain an EnumSet for Jumpable and likewise for
LongJumpable. Again more clutter to the RandomSource interface but perhaps less
so than a mirror create method that would throw IllegalArgumentException for
any RandomSource specified that is not Jumpable.
>
>> I would add a method to mimic RandomSource.unrestorable as
>> RandomSource.unjumpable. Or since they both would be doing the same
>> thing a new method RandomSource.restrict to 'restrict' functionality to
>> just the data generation methods in UniformRandomProvider. This restrict
>> method can be called by RandomSource.unrestorable and make that deprecated.
>
> Looks neat.
>
>>>
>>>> - Add interface 'LongJumpable... extends JumpableUniformRandomProvider'
>>> Same question...
>>>
>>>> - Test if a SplitMix variant with a self generated Weyl sequence can
>>>> pass tests of uniformity. This would just require a seed of long[2], one
>>>> for the state and one to use to derive the Weyl sequence increment.
>>> Is the new seed length a temporary workaround for the test,
>>> to be replaced with a new "SPLIT_MIX_64_K" provider, as
>>> mentioned in your previous message, if the test passes?
>>>
>>> Gilles
>>
>> I was hoping to avoid creating a duplicate class. But actually that
>> would be fine and easier for testing. The implementation is trivial anyway.
>
> Why duplicate?
> IIUC, shouldn't the existing "core" class define an additional constructor
> that accepts the "K" argument. Then the current "SPLIT_MIX_64"
> would use the default increment.
> [Same as with "TWO_CMRES" and "TWO_CMRES_SELECT", no?.]
OK. That was where I should have gone to start with. I’ll do that.
>
>> I've just finished an initial implementation of the MSWS RNG that uses a
>> self-generated Weyl sequence. It works. So the idea for this can be
>> applied to SPLIT_MIX_64_K by using the same Weyl sequence generation in
>> both. Perhaps moving the code to create the Weyl sequence increment to
>> NumberFactory.
>
> +1
OK. So I created a Jira ticket for the SPLIT_MIX_64_K. I’m not sure when I’ll
get around to doing this though. It seems less important than other tasks. It
may even be redundant if the only reason is to increase the state space for the
generator. Each generator would still only have a period of 2^64. You can just
create a lot of them with a weak assurance they will not overlap and that the
uniformity is statistically the same. Since there are of the order of 2^63
variants we are not going to be able to test this and would have to rely on the
theory behind it.
If we add Jumpable to SplitMix with jumps of 2^32 and 2^48 then you can create
either 2^32 rngs with no overlap for the first 2^32 output numbers or 2^16 rngs
that can each be jumped 2^16 times with no overlap for the first 2^32 output
numbers. That is a lot for a small parallel situation and does have the
assurance of no overlap. Any parallel usage where longer sequences are expected
to be used can use one of the XorShiRo generators.
>
> Regards,
> Gilles
>
>>>
>>>> Alex
>>>>
>>>>
>>>>> Regards,
>>>>> Gilles
>>>>>
>>>>>> Alex
>>>>>>
>>>>>>
>>>>>> [1] https://en.wikipedia.org/wiki/Weyl_sequence
>>>>>>
>>>>>> [2] The Jira ticket RNG-85 had a note about the seeding algorithm for
>>>>>> the generator being GPL. There are alternatives based on the
>>>>>> SplittableRandom seeding method that could be used instead to
>>>>>> create the
>>>>>> increment for the Weyl sequence. I've speed tested the provided
>>>>>> algorithm and it is about 85x slower than the one used in
>>>>>> SplittableRandom. Since that algorithm has an issue with the unsigned
>>>>>> shift not being modelled by the Binomial distribution an updated
>>>>>> algorithm could be used that would be novel so avoid the Oracle or GPL
>>>>>> licences.
>>>>>>
>>>>>>> Best,
>>>>>>> Gilles
>>>>>>>
>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/aappleby/smhasher
>>>>>>>>>>
>>>>>>>>>> [2] Using Long.bitCount(long ^ (long >>> 1)) to count transitions
>>>>>>>>>>
>>>>>>>>>> [3] The SplitMix64 is a simple linear series and thus can be
>>>>>>>>>> jumped in
>>>>>>>>>> any power of 2 up to the maximum for a long (which causes sequence
>>>>>>>>>> wrapping). It can actually be jumped any number of iterations using
>>>>>>>>>> BigInteger arithmetic but jumping in powers of 2 can be implemented
>>>>>>>>>> using long arithmetic where the rollover bits beyond 64 are
>>>>>>>>>> naturally
>>>>>>>>>> discarded:
>>>>>>>>>>
>>>>>>>>>> long jumps = 1234567;
>>>>>>>>>>
>>>>>>>>>> long increment = 0x9e3779b97f4a7c15L;
>>>>>>>>>>
>>>>>>>>>> state = BigInteger.valueOf(state)
>>>>>>>>>>
>>>>>>>>>> .add(BigInteger.valueOf(increment).multiply(BigInteger.valueOf(jumps)))
>>>>>>>>>>
>>>>>>>>>> .longValue(); // narrowing primitive conversion
>>>>>>>>>>
>>>>>>>>>> int jumpPower = 32;
>>>>>>>>>>
>>>>>>>>>> state = BigInteger.valueOf(state)
>>>>>>>>>>
>>>>>>>>>> .add(BigInteger.valueOf(increment).shiftLeft(jumpPower))
>>>>>>>>>>
>>>>>>>>>> .longValue(); // narrowing primitive conversion
>>>>>>>>>>
>>>>>>>>>> // Same as above by discarding overflow bits
>>>>>>>>>>
>>>>>>>>>> state = state + (increment << jumpPower);
>>>>>>>>>>
>>>>>>>>>> This is based on my understanding of BigInteger and signed/unsigned
>>>>>>>>>> arithmetic and should be verified in tests.
>>>>>>>>>>
>>>>>>>>>> [4] Given the period of the SplitMix is 2^64, and the period
>>>>>>>>>> may be less
>>>>>>>>>> than this in practice it may be better to only support jumps of
>>>>>>>>>> e.g.
>>>>>>>>>> 2^32 in a manner similar to those provided for the XorShiRo
>>>>>>>>>> generators
>>>>>>>>>> where the state can be advanced a factor of the period, or just not
>>>>>>>>>> supports jumps. I can see the utility in jumping more than
>>>>>>>>>> Integer.MAX_VALUE (guaranteed unique outputs for the maximum
>>>>>>>>>> array size)
>>>>>>>>>> but less than 2^32 is tending towards not very many random
>>>>>>>>>> numbers from
>>>>>>>>>> the original instance before sequence overlap with the jumped
>>>>>>>>>> instance.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org