The binary tree algorithm does not need additional scrambling. I have
written the R code for the algorithm in the last answer at:
https://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement/46807110#46807110
However, the algorithm will probably be outperformed by hash t
Thank you for your answer. Certainly, hash table must be faster than
binary tree.
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Splus used a similar method for sampling from "bigdata" objects. One
problem was that sample() is used both for creating a sample and for
scrambling the order of a vector. Scrambling the order of a big vector
wastes time. It would be nice to be able to tell sample() that we don't
care about the
> From: "Pavel S. Ruzankin"
> Let us consider the current uniform sampling without replacement
> algorithm. It resides in function do_sample in
> https://svn.r-project.org/R/trunk/src/main/random.c
> Its complexity is obviously O(n), where the sample is selected from
> 1...n, since the algorith
See also:
P. Gupta, G. P. Bhattacharjee. (1984) An efficient algorithm for random
sampling without replacement. International Journal of Computer
Mathematics 16:4, pages 201-209.
http://dx.doi.org/10.1080/00207168408803438
Teuhola, J. and Nevalainen, O. 1982. Two efficient algorithms for random
If somebody is interested I can write the code. But somebody else has to
add the code for handling int / long int / double cases, since I do not
have enough experience in that.
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinf
Let us consider the current uniform sampling without replacement
algorithm. It resides in function do_sample in
https://svn.r-project.org/R/trunk/src/main/random.c
Its complexity is obviously O(n), where the sample is selected from
1...n, since the algorithm has to create a vector of length n. S