My apologies to Eik - I didn't see his solution and essentially replicated
it. Sorry for the noise.

Dennis

On Thu, Feb 10, 2011 at 3:46 PM, Dennis Murphy <djmu...@gmail.com> wrote:

> Hi:
>
> On Thu, Feb 10, 2011 at 10:50 AM, Hui Du <hui...@dataventures.com> wrote:
>
>>
>> Hi all,
>>
>> I have a dataset. Each time, I want to sample N(i) elements from it, and
>> I want to repeat the sampling M times. N(i) varies from one draw to the
>> next. For example,
>>
>>     dataset = 1:50;
>>     a = list();
>>
>>     M = 1000;
>>
>> I want to do something like
>>
>>     for (i in 1:M) {
>>         a[[i]] = sample(dataset, sample(length(dataset), 1))
>>     }
>>
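One way to write the loop above without an explicit for() is to draw all M sizes first and then lapply() over them. This is only a sketch: the name `sizes` and the seed are assumptions added for illustration, not part of the original question.

```r
# Sketch: M subsets of random size, without an explicit for loop.
set.seed(1)                       # arbitrary seed, only for reproducibility
dataset <- 1:50
M <- 1000

# N(i) for each of the M draws, uniform on 1..length(dataset)
sizes <- sample(length(dataset), M, replace = TRUE)

# One sample without replacement per size; the result is a list of length M
a <- lapply(sizes, function(n) sample(dataset, n))
```

Each element a[[i]] then has length sizes[i], which matches what the loop produces one iteration at a time.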
>
> For this specific example, isn't it the same as
>
> a <- sample(dataset, M, replace = TRUE)
>
> with
> tabulate(sample(a, 1000, replace = TRUE))
>  [1] 22 22 17 22 20 12 19 23 26 22 22 22 13 16 14 23 15 27 25 21 23 16 15 22 24
> [26] 19 23 27 20 19 19 16 14 21 16 23 16 27 15 18 21 26 14 22 15 25 28 14 20 19
>
> representing the corresponding table of counts? This doesn't seem to me to
> be the same as sampling N(i) elements from a (where I presume i represents
> an iteration number) and then sampling from that M times.  Here's an example
> of that construct:
>
> # Sample m elements from x and resample from the subvector M times
> sfun <- function(x, m, M = 1000) {
>     if (m > length(x)) stop('m must not exceed length(x)')
>     idx <- sample(seq_along(x), m)
>     sample(x[idx], M, replace = TRUE)
> }
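One caveat with sfun() as written: when m is 1, x[idx] is a single number, and sample() treats a single number n >= 1 as shorthand for 1:n, so the resample can return values that were never drawn. A minimal sketch of a variant that resamples positions instead of values follows. The name `sfun_safe` is hypothetical and not from the thread.

```r
# Hypothetical variant of sfun(); the name sfun_safe is not from the thread.
sfun_safe <- function(x, m, M = 1000) {
    if (m > length(x)) stop("m must not exceed length(x)")
    sub <- x[sample.int(length(x), m)]        # m elements, without replacement
    sub[sample.int(m, M, replace = TRUE)]     # resample positions, not values
}

set.seed(1)  # arbitrary seed, only for reproducibility
res <- sfun_safe(c(10, 20, 30), m = 1, M = 5)
length(unique(res))  # one distinct value, taken from c(10, 20, 30)
```

For m > 1 this behaves like sfun(); the only difference is the single-element case.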
> # Vector of the number of subelements to sample (your m)
> mvec <- c(10, 20, 15, 30, 18)
>
> # Fix M = 1000 and use lapply() to generate each replicated set of
> # subsamples from a
> > ll <- lapply(mvec, function(x) sfun(a, x, 1000))
> # length is right ( = length(mvec))
> > length(ll)
> [1] 5
> # the length of each sample from the subvector is right
> > sapply(ll, length)
> [1] 1000 1000 1000 1000 1000
> # Generate frequency tables for the sets of resampled subvectors
> > sapply(ll, table)
> [[1]]
>
>   5   6  14  19  20  26  28  29  41  50
> 122  91  93 111 102  99  91 105  92  94
>
> [[2]]
>
>  1  2  3  5  7 12 13 15 25 27 31 32 33 35 37 39 41 44 45 47
> 62 55 65 47 65 44 39 44 49 48 47 42 54 48 42 51 58 48 42 50
>
> [[3]]
>
>  2  7  9 10 17 18 20 22 24 32 33 42 44 46 50
> 71 82 64 63 65 76 54 62 72 57 64 74 63 62 71
>
> [[4]]
>
>  3  6  7  8  9 10 11 12 14 15 16 17 20 21 22 24 25 28 33 35 40 41 42 43 44 45
> 31 35 35 35 37 37 38 33 33 38 43 25 37 39 37 25 42 36 32 26 32 38 21 34 31 36
> 46 48 49 50
> 27 29 31 27
>
> [[5]]
>
>  2  5  6  7  8 11 12 14 16 19 21 22 26 28 38 40 42 48
> 53 56 54 57 55 47 45 53 57 62 66 65 51 58 57 57 50 57
>
> I have no idea if this is what you had in mind. If not, please try again
> and be more careful about explaining what you need.
>
> HTH,
> Dennis
>
>> But my question is whether there is a more elegant solution for this.
>> For example, can I do the repeated sampling without writing a loop?
>>
>>                Many thanks.
>>
>>
>> HXD
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

