----- "Phil Steitz" <phil.ste...@gmail.com> a écrit : > On 9/20/10 6:30 AM, luc.maison...@free.fr wrote: > > Hi all, > > > > As those subscribed to the commit list may have noticed, I have > > added several new pseudo-random number generators to > > commons-math. > > > > I should not have add such a feature without discussing on the > > list before. It was a wrong move and I apologize about it. So I > > would like to start a discussion about this now. > > Chalk it up to network latency ;) > > > > Up to 2.1, we have few pseudo random number generators. We have > > an interface RandomGenerator implemented by three classes: - > > JDKRandomGenerator that extends the JDK provided generator - > > AbstractRandomGenerator as a helper for users generators - > > BitStreamGenerator which in turn is extended only by > > MersenneTwister > > > > The JDK provided generator is a simple one that can be used only > > for very simple needs. The Mersenne Twister is on the other hand > > a fast generator with good properties well suited for Monte-Carlo > > simulation. It is equidistributed for generating vectors up to > > dimension 623 and has a huge period: 2^19937 - 1. > > > > Since Mersenne-Twister inception in 1997, some new generators > > have been created, retaining the good properties of Mersenne > > twister but removing some of its (few) drawbacks. The main one is > > that if initialized with a bits pool containing lots of zeroes, > > the pool will take a very long time time to stabilize with a > > roughly balanced number of zeros and ones. > > > > I would like to add such generators (well, I already did but can > > withdraw my commit). The ones I want to add are the WELL > > generators (Well Equidistributed Long period Linear) created by > > François Panneton, Pierre L'Ecuyer and Makoto Matsumoto. They are > > described in their 2006 paper: Improved * Long-Period Generators > > Based on Linear Recurrences Modulo 2, ransactions on Mathematical > > Software, 32, 1 (2006) which is available > > at<http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf>. > > > > The paper describes several generators from one family. The most > > interesting ones are WELL1024 (a small one), WELL19937 and > > WELL44497 (two large ones with huge periods that are both > > Mersenne primes). The two large ones exist in a basic version > > that is not Maximally Equidistributed (i.e. equidistributed in > > all possible dimensions and number of bits up to some threshold) > > and also in an extended version that add some tempering to the > > output to get an ME generator. > > > > The paper does not put any restriction on the algorithm. The > > reference implementation in C on the other hand is limited to > > non-commercial use only. In my implementation, I did not refer to > > this C implementation at all. I only compiled it on my personal > > computer to generate reference values and copied the values in > > the test cases. So this implementation is completely independent > > (and in fact the code is rather different since I used an > > abstract class for the global algorithm and one derived class for > > each specific generator. I rely on the JVM optimizer to do all > > the inlining and constants simplification that they did manually > > in the reference implementation. > > Sounds OK, but we should verify somehow that the algorithm itself is > not patented. Does the C source say anything about that? Might be > good to ask about this on legal-discuss or ask one of the authors. > I know this could apply to any of the algorithms we implement, but > the ones that are >100 yrs old are a little safer ;)
OK, I'll write to the authors off-list and ask them. I also needed to ask them a question about some of the constants since there seem to be some typos in the tables and I want to be sure to use the ones that have the good properties we need. Luc > > > > > Another difference with the reference implementation is that our > > BitStreamGenerator abstract class requires the generator to > > provide only bits blocks and not double, and it computes the > > double by itself. This computation uses the full IEEE754 width > > for the doubles, i.e. it uses 53 bits. The generators are > > guaranteed to be equidistributed on large vectors of 32 bits (up > > to dimension 1390 for the 44497 version, since 1390 * 32< > > 44497). I guess this should work well also using chunks of 53 > > bits (but of course with smaller dimensions), but it is not > > mathematically proven. > > > > So what do other think about this ? Should this really be > > included (in which case I will open a JIRA issue and resolve it > > immediately) or should it be removed from commons-math ? Is the > > independent implementation and the off-line reference data > > generation sufficient to say this code is unencumbered ? Is the > > 53 bits use for double generation a good thing or should we > > reduce it to 32 bits to stay within the proven behavior ? > > > > +1 for including assuming all is OK with IP > > Phil > > > > > Once again, sorry for this too late discussion, I really should > > have started it before. Luc > > > > > --------------------------------------------------------------------- > > > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org