Hello,

I am trying to update a package for the upcoming release of R, and my unit tests are affected by the change in sample(). I understand that to reproduce the old sampling, I need to set sample.kind="Rounding" in RNGkind() or set.seed(). But I am confused by the behavior of the sample.kind argument of set.seed(), as it doesn't seem to change my results.

In particular, I was trying to understand what happens to the global environment if you call set.seed() within a function. So I set up a test as follows:

###Test set.seed
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
    cat("RNG at beginning\n")
    print(RNGkind())
    # RNGkind(sample.kind=sample.kind)
    # cat("RNG at after set\n")
    # print(RNGkind())
    set.seed(23,sample.kind=sample.kind)
    cat("RNG at after set seed\n")
    print(RNGkind())
    sample(1:400000,size=n,replace=TRUE)
}
RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

However, it didn't do anything:

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
Warning message:
In set.seed(23, sample.kind = sample.kind) :
  non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] TRUE

If I run the same test with calls to RNGkind() instead, however, it does change the method (and, in answer to my original question, it appears to change the method globally, which is unfortunate for what I am trying to do).

###Test RNGkind
f<-function(n,sample.kind){ #="Rounding" or "Rejection"
    cat("RNG at beginning\n")
    print(RNGkind())
    RNGkind(sample.kind=sample.kind)
    cat("RNG at after set\n")
    print(RNGkind())
    set.seed(23)
    cat("RNG at after set seed\n")
    print(RNGkind())
    sample(1:400000,size=n,replace=TRUE)
}
RNGkind(sample.kind="Rejection")
print(RNGkind())
n<-1000000
y<-f(n,"Rounding")
print(RNGkind())
y2<-f(n,"Rejection")
print(RNGkind())
all(y==y2)

> RNGkind(sample.kind="Rejection")
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> n<-1000000
> y<-f(n,"Rounding")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rounding"
Warning message:
In RNGkind(sample.kind = sample.kind) :
  non-uniform 'Rounding' sampler used
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rounding"
> y2<-f(n,"Rejection")
RNG at beginning
[1] "Mersenne-Twister" "Inversion"        "Rounding"
RNG at after set
[1] "Mersenne-Twister" "Inversion"        "Rejection"
RNG at after set seed
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> print(RNGkind())
[1] "Mersenne-Twister" "Inversion"        "Rejection"
> all(y==y2)
[1] FALSE

So clearly I should use RNGkind() to change the method, but what is the sample.kind argument actually doing in set.seed()?
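
Since RNGkind() changes the method globally, one way to limit the side effect might be to record the previous setting and put it back when the function exits. The following is only a sketch under assumptions: sample_with_kind is a made-up helper name, and it relies on RNGkind() invisibly returning the kinds that were in force before the call, with sample.kind as the third element in R >= 3.6.0. Note that set.seed() inside the function still overwrites the global .Random.seed, so only the sampling method is restored, not the generator state.

```r
# Hypothetical helper (not part of base R): draw a sample under a given
# sample.kind and restore the caller's sample.kind afterwards.
sample_with_kind <- function(n, sample.kind, seed = 23) {
  # RNGkind() invisibly returns the previous kinds; element 3 is sample.kind.
  # suppressWarnings() hides the "non-uniform 'Rounding' sampler used" warning.
  old <- suppressWarnings(RNGkind(sample.kind = sample.kind))
  # Put the previous sample.kind back however the function exits.
  on.exit(suppressWarnings(RNGkind(sample.kind = old[3])), add = TRUE)
  set.seed(seed)
  sample(1:400000, size = n, replace = TRUE)
}

before <- RNGkind()
y  <- sample_with_kind(1000, "Rounding")
y2 <- sample_with_kind(1000, "Rejection")
after <- RNGkind()

# The global sample.kind should be the same before and after the calls.
print(identical(before, after))
```

Only sample.kind is passed back to RNGkind() on exit; also passing kind would re-initialise the generator, which is not wanted here.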

Thanks,
Elizabeth Purdom

> sessionInfo()
R version 3.6.0 alpha (2019-04-09 r76363)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] BiocManager_1.30.4 compiler_3.6.0     tools_3.6.0

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.