Can I get a review of a PR that changes the ProviderBuilder [1]?

The aim is to move the creation of the seed and the RNG into the RandomSourceInternal.

This allows for customisation of the seed creation on a per-RNG basis. The reason is that some new generators in the library for v1.3 require the seed is non-zero. For example this is a realistic possibility when the seed is an array of ints of size 2.

The code change has required updating the routines using the SeedConverter interface and adding a Seed2ArrayConverter interface:

public interface Seed2ArrayConverter<IN, OUT> extends SeedConverter<IN, OUT> {
    /**
     * Converts seed from input type to output type. The output type is 
expected to be an array.
     *
     * @param seed Original seed value.
     * @param outputSize Output size.
     * @return the converted seed value.
     */
    OUT convert(IN seed, int outputSize);
}

I've moved conversion to a new enum class for seed creation and conversion. Previously the ProviderBuilder used maps to store all the conversions. These are now explicitly written as conversion methods in the enum. The amount of code is the same and the conversions are the same. However use of the enum for conversion has removed the need to support the Seed2ArrayConverter interface in all the converters. This has deprecated some methods and classes. I've not yet marked them as so in the PR.

i.e.

private static final Map<Class<?>, SeedConverter<long[], ?>> CONV_LONG_ARRAY =
    new ConcurrentHashMap<Class<?>, SeedConverter<long[], ?>>();

does not have to become:

private static final Map<Class<?>, Seed2ArrayConverter<long[], ?>> 
CONV_LONG_ARRAY =
    new ConcurrentHashMap<Class<?>, Seed2ArrayConverter<long[], ?>>();

with the corresponding conversion as:

nativeSeed = CONV_LONG_ARRAY.get(source.getSeed()).convert((long[]) seed, 
source.getSeedSize());


Currently conversions of arrays to arrays ignore the seed size. Creation of a new seed is limited to a maximum of 128. This matches the previous functionality. It could be changed to create the full length seed. The impact would be more work done within synchronized blocks in the SeedFactory. I would expect the generation to be slower but the seed quality will be higher.

Only creation of arrays from int/long seeds will use the seed size. This does not operate within a synchronized block.


Note that RNG constructor is obtained using reflection. Previously was done on each invocation. However now that the method is within the RandomSourceInternal enum caching the constructor is a natural modification since it is always the same. I have not done this in the constructor for the RandomSourceInternal enum as:

- it adds overhead when building all the enums (e.g. RandomSourceInternal.values())

- it may throw lots of different types of exception. Rather than catching them all in the constructor for the enum they can now be thrown during creation of the RNG instance so the user gets the appropriate stack trace.


The change to create an array using the correct size has performance implications (see [2]):

- For small array sizes the creation is faster

- For large array sizes built using an int/long the creation is marginally slower (but the seed should be better as it uses the SplitMix algorithm rather than the self-seeding strategy of the BaseProvider [3])


Caching the constructor for use with reflection has improved performance.


[1] https://github.com/apache/commons-rng/pull/46

[2] https://issues.apache.org/jira/browse/RNG-75

[3] This could be tested by creating a new generator that implements the self-seeding strategy as its output and running it through the stress test applications.


Reply via email to