Re: [Rd] Which function can change RNG state?

Paul Gilbert Sun, 08 Feb 2015 21:04:02 -0800


On 02/08/2015 09:33 AM, Dirk Eddelbuettel wrote:


On 7 February 2015 at 19:52, otoomet wrote:
| random numbers.   For instance, can I be sure that
| set.seed(0); print(runif(1)); print(rnorm(1))
| will always print the same numbers, also in the future version of R?  There

Yes, pretty much.

This is nearly correct. The user could change the uniform or normal generator, since there are options other than the defaults, which would mean the result would be different. And obviously, if they changed the print precision, the printed result may be truncated differently.
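A minimal sketch of the point about non-default generators, using only base R's RNGkind() and set.seed(). The choice of "Box-Muller" as the alternative normal generator is just for illustration; any non-default normal.kind should show the same effect.

```r
# Draw with whatever generators are currently the defaults.
set.seed(0)
default_draw <- c(runif(1), rnorm(1))

# Remember the current kinds so they can be restored afterwards.
old_kinds <- RNGkind()

# Switch only the normal generator (illustrative choice), same seed.
RNGkind(normal.kind = "Box-Muller")
set.seed(0)
other_draw <- c(runif(1), rnorm(1))

# Restore the original generators.
do.call(RNGkind, as.list(old_kinds))

# The uniform draw uses the same uniform generator in both runs, so it matches;
# the normal draw is produced by a different algorithm, so it should not.
print(identical(default_draw[1], other_draw[1]))
print(identical(default_draw[2], other_draw[2]))
```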

I think you could prepare for future versions of R by saving information about the generators you are using. The precedent has already been set (R-1.7.0) that the default could change if there is a good reason. A good reason might be that the RNG is found not to be so good relative to others that become available. But I think the old generator would continue to be available, so people can reproduce old results. (Package setRNG has some utilities to help save and reset, but there is nothing especially difficult or fancy, just a few details that need to be remembered.)
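A rough sketch of what "saving information about the generators" could look like in base R. The helper names save_rng() and restore_rng() are hypothetical and are not the setRNG interface; they only record RNGkind(), .Random.seed and the R version, and put the first two back.

```r
# Hypothetical helper: record the generator kinds, the seed and the R version.
save_rng <- function() {
  # .Random.seed does not exist until the RNG has been used or seeded.
  if (!exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    invisible(runif(1))
  }
  list(kind = RNGkind(),
       seed = get(".Random.seed", envir = globalenv()),
       r_version = R.version.string)
}

# Hypothetical helper: put the recorded kinds and seed back.
restore_rng <- function(state) {
  # Set the kinds first (this reseeds), then overwrite the seed itself.
  do.call(RNGkind, as.list(state$kind))
  assign(".Random.seed", state$seed, envir = globalenv())
  invisible(state)
}

# Usage: the same state should give the same draws.
set.seed(42)
state <- save_rng()
first <- rnorm(3)
restore_rng(state)
second <- rnorm(3)
print(identical(first, second))
```

Storing the returned list next to the simulation results (for example with saveRDS()) keeps a record of which generators produced them.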


I've been lurking here over fifteen years, and while I am getting old and
forgetful I can remember exactly one such change where behaviour was changed,
and (one of the) generators was altered---if memory serves, in the early
days of R 1.*. [Goes digging...] Yes, see `help(RNGkind)` which
details that R 1.7.0 made a change when "Buggy Kinderman-Ramage" was added as
the old value, and "Kinderman-Ramage" was repaired.  There once was a similar
fix in the very early days of the Mersenne-Twister which is why the GNU GSL
has two variants with suffixes _1998 and _1999.

I seem to recall a bit of change around R-0.49, but old and forgetful would cover this too. For me, a bigger change was an unadvertised change in Splus - they compiled against a different math library at some point. This changed the lower bits in results, mostly insignificant, but accumulated simulation results could amount to something fairly important. The amount of time I spent trying to find why results would not reproduce was one of my main motivations for starting to use R.


So your issue seems like pilot error to me:  don't attach the parallel package
if you do not plan to work in parallel.  But "do if you do", and see its fine
vignette on how it provides you reproducibility for multiple RNG streams.

In general, you can very much trust R (and R Core) in these matters.

Dirk


On 02/08/2015 09:40 AM, Gábor Csárdi wrote:
> On Sat, Feb 7, 2015 at
> I don't know if there is intention to keep this reproducible across R
> versions, but it is already not reproducible across platforms (with
> the same R version):
>
> http://stackoverflow.com/questions/21212326/floating-point-arithmetic-and-reproducibility

The situation is better in some respects, and worse in others, than what is described on stackoverflow. I think the point is made pretty well there that you should not be trying to reproduce results beyond machine precision. My experience is that you can usually compare within a fuzz of 1e-14, even across platforms. (The package setRNG on CRAN has a function random.number.test() which is run in the package's tests/ and makes uniform and normal comparisons to 1e-14. It has passed checks on all R platforms since 2004. Actually, the checks have been done since about 1995, but they were part of package dse earlier.) If you accumulate lots of lower order parts (e.g. sum(simulated - true) in a long Monte Carlo) then the fuzz may need to get much larger, especially when comparing across platforms. And you will have trouble with numerically unstable calculations. Once upon a time I was annoyed by this, but then I realized that it was better not to do unstable calculations.
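To make the "fuzz" idea concrete, here is a small sketch. The helper within_fuzz() is hypothetical (it is not setRNG's random.number.test()), and the one-ulp perturbation is only a stand-in for the kind of low-order-bit difference a different math library might produce.

```r
# Hypothetical helper: TRUE if every element agrees within an absolute fuzz.
within_fuzz <- function(x, y, fuzz = 1e-14) {
  all(abs(x - y) <= fuzz)
}

set.seed(123)
u <- runif(5)

# Simulated platform difference: nudge the lowest bits of each value.
u_other <- u * (1 + .Machine$double.eps)

# Exact comparison is too strict for this kind of difference ...
print(identical(u, u_other))

# ... while a comparison within the fuzz still treats the results as the same.
print(within_fuzz(u, u_other))
```

For accumulated quantities the fuzz argument would have to be loosened, as described above.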

In addition to results not being reproducible beyond machine precision across R versions and across platforms, you cannot really be guaranteed reproducibility even on the same platform and same version of R. You may get different results if you upgrade the OS and there has been a change in the math libraries. In my experience this happens rather often. I don't think there is any specific 32 vs 64 bit issue, but math libraries sometimes do things a bit differently on different processors (e.g. processor bug fixes), so you can occasionally get differences with everything the same except the hardware.



On 02/07/2015 10:52 PM, otoomet wrote:
> It turned out that this is because package "parallel", buried deep
> in my dependencies, calls runif() during its initialization and
> in this way changes the random number sequence.

Guessing a bit about what you are saying: 1/ you set the random seed, 2/ you did some things which included loading package parallel, 3/ you ran some things for which you expected to get results comparable to some previous run when you did 1/ and 2/ in the reverse order.
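The ordering guessed at above can be mimicked in a few lines. In this sketch a bare runif(1) call stands in for whatever a package might draw while it is being loaded; that stand-in is an assumption, not a statement about what any particular package version does.

```r
# Baseline: seed, then draw immediately.
set.seed(0)
baseline <- rnorm(1)

# Order A: seed first, then something draws a random number, then the real work.
set.seed(0)
invisible(runif(1))   # stand-in for a draw made during package loading
order_a <- rnorm(1)

# Order B: the stray draw happens first, then seed, then the real work.
invisible(runif(1))   # same stand-in, but before the seed is set
set.seed(0)
order_b <- rnorm(1)

print(identical(order_b, baseline))  # stray draw before set.seed() is harmless
print(identical(order_a, baseline))  # stray draw after set.seed() shifts the sequence
```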

If I understand this correctly, I suggest you always do everything exactly the same after you set the seed. There are lots of things that could generate random numbers without you really knowing. Thus, it is usually better to set the seed immediately before you start doing anything where you want the seed to have a known state. (There is an even better suggestion in the somewhat dated vignette with package setRNG.)
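One way to find out whether some piece of code quietly uses the generator is to compare .Random.seed before and after running it. The helper changes_rng_state() below is a hypothetical name for a minimal sketch of that check; it relies only on base R.

```r
# Hypothetical helper: does evaluating `expr` alter the global RNG state?
changes_rng_state <- function(expr) {
  # Make sure .Random.seed exists before taking the "before" snapshot.
  if (!exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    invisible(runif(1))
  }
  before <- get(".Random.seed", envir = globalenv())
  force(expr)  # arguments are lazy, so the expression is evaluated here
  after <- get(".Random.seed", envir = globalenv())
  !identical(before, after)
}

print(changes_rng_state(runif(1)))  # draws a number, so the state changes
print(changes_rng_state(sqrt(2)))   # pure arithmetic, state untouched
```

The same call can wrap a library() or loadNamespace() call for a package that is not yet loaded, to see whether loading it moves the state.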

Finally, if you do intend to use parallel sometimes, then you have additional considerations. You would like to get the same results no matter how many machines you are using. This may place some constraints on the generators you use, since not all are equally easy to use in parallel. So if you are hoping to get the same results in parallel as you get on a single machine, then you had better start out using generators on the single machine that you will be able to use in parallel.
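As a sketch of a generator that is easy to use in parallel, the parallel package documents "L'Ecuyer-CMRG" together with nextRNGStream(). The helper draw_from() and the idea of tying one stream to each task (rather than to each machine) are assumptions made for this illustration, and everything here runs in a single process.

```r
library(parallel)

old_kinds <- RNGkind()
RNGkind("L'Ecuyer-CMRG")
set.seed(2015)

# One independent stream per task, derived from the seeded state.
stream1 <- get(".Random.seed", envir = globalenv())
stream2 <- nextRNGStream(stream1)

# Hypothetical helper: draw n uniforms starting from a given stream state.
draw_from <- function(stream, n) {
  assign(".Random.seed", stream, envir = globalenv())
  runif(n)
}

# Run the two tasks in one order, then in the other.
a1 <- draw_from(stream1, 3)
a2 <- draw_from(stream2, 3)
b2 <- draw_from(stream2, 3)
b1 <- draw_from(stream1, 3)

# Each task sees the same numbers regardless of the order it ran in.
print(identical(a1, b1))
print(identical(a2, b2))

# Restore the generators that were in use before.
do.call(RNGkind, as.list(old_kinds))
```

Because each task's numbers depend only on its own stream, the results should not depend on how the tasks are distributed over workers.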


Paul

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
