Hi all,

As those subscribed to the commit list may have noticed, I have added several 
new pseudo-random number generators to commons-math.

I should not have add such a feature without discussing on the list before. It 
was a wrong move and I apologize about it.
So I would like to start a discussion about this now.

Up to 2.1, we have few pseudo random number generators. We have an interface 
RandomGenerator implemented by three classes:
 - JDKRandomGenerator that extends the JDK provided generator
 - AbstractRandomGenerator as a helper for users generators
 - BitStreamGenerator which in turn is extended only by MersenneTwister

The JDK provided generator is a simple one that can be used only for very 
simple needs. The Mersenne Twister is on the other hand a fast generator with 
good properties well suited for Monte-Carlo simulation. It is equidistributed 
for generating vectors up to dimension 623 and has a huge period: 2^19937 - 1.

Since Mersenne-Twister inception in 1997, some new generators have been 
created, retaining the good properties of Mersenne twister but removing some of 
its (few) drawbacks. The main one is that if initialized with a bits pool 
containing lots of zeroes, the pool will take a very long time time to 
stabilize with a roughly balanced number of zeros and ones.

I would like to add such generators (well, I already did but can withdraw my 
commit). The ones I want to add are the WELL generators (Well Equidistributed 
Long period Linear) created by François Panneton, Pierre L'Ecuyer and Makoto 
Matsumoto. They are described in their 2006 paper: Improved
 * Long-Period Generators Based on Linear Recurrences Modulo 2, ransactions on 
Mathematical Software, 32, 1 (2006) which is available at 
<http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf>.

The paper describes several generators from one family. The most interesting 
ones are WELL1024 (a small one), WELL19937 and WELL44497 (two large ones with 
huge periods that are both Mersenne primes). The two large ones exist in a 
basic version that is not Maximally Equidistributed (i.e. equidistributed in 
all possible dimensions and number of bits up to some threshold) and also in an 
extended version that add some tempering to the output to get an ME generator.

The paper does not put any restriction on the algorithm. The reference 
implementation in C on the other hand is limited to non-commercial use only. In 
my implementation, I did not refer to this C implementation at all. I only 
compiled it on my personal computer to generate reference values and copied the 
values in the test cases. So this implementation is completely independent (and 
in fact the code is rather different since I used an abstract class for the 
global algorithm and one derived class for each specific generator. I rely on 
the JVM optimizer to do all the inlining and constants simplification that they 
did manually in the reference implementation.

Another difference with the reference implementation is that our 
BitStreamGenerator abstract class requires the generator to provide only bits 
blocks and not double, and it computes the double by itself. This computation 
uses the full IEEE754 width for the doubles, i.e. it uses 53 bits. The 
generators are guaranteed to be equidistributed on large vectors of 32 bits (up 
to dimension 1390 for the 44497 version, since 1390 * 32 < 44497). I guess this 
should work well also using chunks of 53 bits (but of course with smaller 
dimensions), but it is not mathematically proven.

So what do other think about this ?
Should this really be included (in which case I will open a JIRA issue and 
resolve it immediately) or should it be removed from commons-math ? Is the 
independent implementation and the off-line reference data generation 
sufficient to say this code is unencumbered ? Is the 53 bits use for double 
generation a good thing or should we reduce it to 32 bits to stay within the 
proven behavior ?

Once again, sorry for this too late discussion, I really should have started it 
before.
Luc

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to