My apologies to Eik - I didn't see his solution and essentially replicated it. Sorry for the noise.
Dennis

On Thu, Feb 10, 2011 at 3:46 PM, Dennis Murphy <djmu...@gmail.com> wrote:

> Hi:
>
> On Thu, Feb 10, 2011 at 10:50 AM, Hui Du <hui...@dataventures.com> wrote:
>
>> Hi all,
>>
>> I have a dataset. Each time I want to sample N(i) elements from it, and I
>> want to repeat the sampling M times. N(i) varies from time to time. For
>> example,
>>
>> dataset = 1:50;
>> a = list();
>>
>> M = 1000;
>>
>> I want to do something like
>>
>> for(i in 1:M)
>> {
>>     a[[i]] = sample(dataset, sample(length(dataset), 1))
>> }
>
> For this specific example, isn't it the same as
>
> a <- sample(dataset, M, replace = TRUE)
>
> with
> tabulate(sample(a, 1000, replace = TRUE))
>  [1] 22 22 17 22 20 12 19 23 26 22 22 22 13 16 14 23 15 27 25 21 23 16 15 22 24
> [26] 19 23 27 20 19 19 16 14 21 16 23 16 27 15 18 21 26 14 22 15 25 28 14 20 19
>
> representing the corresponding table of counts? This doesn't seem to me to
> be the same as sampling N(i) elements from a (where I presume i represents
> an iteration number) and then sampling from that M times. Here's an example
> of that construct:
>
> # Sample m elements from x and resample from the subvector M times
> sfun <- function(x, m, M = 1000) {
>     if(m > length(x)) stop('m must not exceed length(x)')
>     idx <- sample(1:length(x), m)
>     sample(x[idx], M, replace = TRUE)
> }
> # Vector of the number of subelements to sample (your m)
> mvec <- c(10, 20, 15, 30, 18)
>
> # Fix M = 1000 and use lapply() to generate each replicated set of
> # subsamples from a
>
> ll <- lapply(mvec, function(x) sfun(a, x, 1000))
> # length is right ( = length(mvec))
> > length(ll)
> [1] 5
> # length of each sample from the subvector is right
> > sapply(ll, length)
> [1] 1000 1000 1000 1000 1000
> # Generate frequency tables for the sets of resampled subvectors
> > sapply(ll, table)
> [[1]]
>
>   5   6  14  19  20  26  28  29  41  50
> 122  91  93 111 102  99  91 105  92  94
>
> [[2]]
>
>  1  2  3  5  7 12 13 15 25 27 31 32 33 35 37 39 41 44 45 47
> 62 55 65 47 65 44 39 44 49 48 47 42 54 48 42 51 58 48 42 50
>
> [[3]]
>
>  2  7  9 10 17 18 20 22 24 32 33 42 44 46 50
> 71 82 64 63 65 76 54 62 72 57 64 74 63 62 71
>
> [[4]]
>
>  3  6  7  8  9 10 11 12 14 15 16 17 20 21 22 24 25 28 33 35 40 41 42 43 44 45
> 31 35 35 35 37 37 38 33 33 38 43 25 37 39 37 25 42 36 32 26 32 38 21 34 31 36
> 46 48 49 50
> 27 29 31 27
>
> [[5]]
>
>  2  5  6  7  8 11 12 14 16 19 21 22 26 28 38 40 42 48
> 53 56 54 57 55 47 45 53 57 62 66 65 51 58 57 57 50 57
>
> I have no idea if this is what you had in mind. If not, please try again
> and be more careful about explaining what you need.
>
> HTH,
> Dennis
>
>> But my question is whether there is a more elegant solution for this; for
>> example, can I do the repeated sampling without bothering with a loop?
>>
>> Many thanks.
>>
>> HXD
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
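
Regarding the original question about avoiding the explicit loop: here is a minimal sketch, assuming the goal is M independent subsamples drawn without replacement, with each size N(i) uniform on 1..length(dataset) as in the posted for loop. The name sizes is introduced here for illustration only, and set.seed() is there just to make the sketch reproducible. lapply() still iterates internally, so this is a tidier spelling rather than a different algorithm.

```r
set.seed(1)                # assumption: fixed seed, only for reproducibility
dataset <- 1:50
M <- 1000

# Draw all M subsample sizes N(i) at once, each uniform on 1..length(dataset)
sizes <- sample(length(dataset), M, replace = TRUE)

# One subsample (without replacement) per size; the result is a list of length M
a <- lapply(sizes, function(n) sample(dataset, n))

# Sanity checks: M subsamples, each of the requested size
length(a) == M
all(lengths(a) == sizes)
```

One caveat: if dataset were a single number k >= 1, sample(dataset, n) would sample from 1:k instead of from dataset itself, so this sketch is only safe for vectors of length greater than one.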