On Wed, Nov 3, 2010 at 11:07 AM, Henrik Bengtsson <h...@biostat.ucsf.edu> wrote:
> On Wed, Nov 3, 2010 at 11:02 AM, Henrique Dallazuanna <www...@gmail.com>
> wrote:
>> Doesn't the resample function in the example section of the sample help
>> page do this?
>
> Yes, I just noticed that one [at the very end of the example in
> help("sample")]. So, maybe resample() should be a function available
> in R?

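For reference, the resample() helper mentioned above is, as far as I
recall, defined in the example section of help("sample") essentially as
in the sketch below. It draws positions with sample.int() and then
indexes into 'x', so a length-one numeric vector is never reinterpreted
as 1:x. The calls after the definition are only illustrative and are
not taken from the help page:

```r
# Index-based resampling: sample positions, then subset 'x'.
# (Definition as I recall it from the examples in help("sample").)
resample <- function(x, ...) x[sample.int(length(x), ...)]

# A length-one vector is returned as is, whatever its value.
resample(10, size = 1)    # 10
resample(1.2, size = 1)   # 1.2

# 'size' and 'replace' are still honoured when length(x) == 1.
resample(10, size = 3, replace = TRUE)   # 10 10 10
resample(10, size = 0)                   # numeric(0)
```

Because the special case disappears entirely, no separate branch for
length(x) == 1 is needed.
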
So for completeness, this has also been discussed in the R-devel thread
'[patch] add is.set parameter to sample()' started on 2010-03-23, cf.
http://www.mail-archive.com/r-devel@r-project.org/msg19998.html.

/Henrik

>
> /Henrik
>
>>
>> On Wed, Nov 3, 2010 at 3:54 PM, Henrik Bengtsson
>> <h...@biostat.ucsf.edu> wrote:
>>
>>> Hi, consider this one as an FYI, or a seed for further discussion.
>>>
>>> I am aware that many traps with sample() have been reported over the
>>> years. I know that these are also documented in help("sample"). Still
>>> I got bitten by this while writing
>>>
>>>   sample(units, size=length(units));
>>>
>>> where 'units' is an index (positive integer) vector. It works in all
>>> cases as expected (=I expect) except for length(units) == 1. I know,
>>> it is well known. However, it made me wonder whether it is possible
>>> to use sample() to draw a single value from a set containing only one
>>> value. I don't think so, unless you draw from a value that is <= 1.
>>>
>>> For instance, you can sample from c(10,10) by doing:
>>>
>>> > sample(rep(10, times=2), size=2);
>>> [1] 10 10
>>>
>>> but you cannot sample from c(10) by doing:
>>>
>>> > sample(rep(10, times=1), size=1);
>>> [1] 9
>>>
>>> unless you sample from a value <= 1, e.g.
>>>
>>> > sample(rep(0.31, times=1), size=1);
>>> [1] 0.31
>>>
>>> > sample(rep(-10, times=1), size=1);
>>> [1] -10
>>>
>>> Note also the related issue of sampling from a double vector of length 1,
>>> e.g.
>>>
>>> > sample(rep(1.2, times=2), size=2);
>>> [1] 1.2 1.2
>>> > sample(rep(1.2, times=1), size=1);
>>> [1] 1
>>>
>>> In the latter case 1.2 is coerced to an integer.
>>>
>>> All of the above makes sense when one studies the code of sample(), but
>>> sample() is indeed dangerous, e.g. imagine how many bootstrap
>>> estimates out there are quietly incorrect.
>>>
>>>
>>> In order to cover all cases of length(units), including one, a solution is:
>>>
>>> sampleFrom <- function(x, size=length(x), ...) {
>>>   n <- length(x);
>>>   if (n == 1L) {
>>>     # Single element: return it as is ('size' and '...' are ignored)
>>>     res <- x;
>>>   } else {
>>>     # Two or more elements: sample() treats 'x' as a set of values
>>>     res <- sample(x, size=size, ...);
>>>   }
>>>   res;
>>> } # sampleFrom()
>>>
>>> > sampleFrom(rep(10, times=2), size=2);
>>> [1] 10 10
>>>
>>> > sampleFrom(rep(10, times=1), size=1);
>>> [1] 10
>>>
>>> > sampleFrom(rep(0.31, times=1), size=1);
>>> [1] 0.31
>>>
>>> > sampleFrom(rep(-10, times=1), size=1);
>>> [1] -10
>>>
>>> > sampleFrom(rep(1.2, times=2), size=2);
>>> [1] 1.2 1.2
>>>
>>> > sampleFrom(rep(1.2, times=1), size=1);
>>> [1] 1.2
>>>
>>>
>>> I want to add sampleFrom() to the wishlist of functions to be
>>> available in default R. Alternatively, one can add an argument
>>> 'sampleFrom=FALSE' to the existing sample() function. Eventually such
>>> an argument can be made TRUE by default.
>>>
>>> /Henrik
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
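
To make the bootstrap concern from the original post concrete, here is
a minimal sketch of how a group containing a single observation can go
wrong. The helper name bootstrapMeans(), the default B = 1000 and the
example data are hypothetical and chosen only for illustration; the
helper draws positions with sample.int(), which is one possible remedy
and not necessarily what anyone in this thread proposed:

```r
# Hypothetical helper: bootstrap replicates of the mean of 'x', drawing
# positions with sample.int() so that length(x) == 1 needs no special case.
bootstrapMeans <- function(x, B = 1000L) {
  n <- length(x)
  vapply(seq_len(B), function(b) {
    idx <- sample.int(n, size = n, replace = TRUE)
    mean(x[idx])
  }, FUN.VALUE = numeric(1))
}

units <- 10   # a group that happens to contain a single observation

# Naive version: sample(units, ...) silently draws from 1:10 instead of
# from the one-element set {10}.
naive <- replicate(1000L,
                   mean(sample(units, size = length(units), replace = TRUE)))

safe <- bootstrapMeans(units)

range(naive)  # somewhere within 1..10, not the constant 10
range(safe)   # 10 10
```

No error or warning is raised in the naive case, which is why such a
mistake can go unnoticed.
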