Tom,
 
I don't think the question was stupid (silly depends on what you consider 
silly), but you are right that the distinction probably does not matter that 
much.
 
If you go back to basic probability, remember that there are n! possible
permutations (n is the total number of observations).  The number of
combinations (no repeats of the same groupings) is n!/(m1! * m2! * ... * mk!),
where m1-mk are the numbers of observations in each group.  To find the p-value
for the exhaustive permutation test you would count how many of the
permutations give a statistic as extreme or more extreme than the observed one
and divide that by the total number of permutations.  For the combinations you
would do the same type of division, but each combination corresponds to
m1! * ... * mk! permutations with the same statistic, so both the numerator and
the denominator of the permutation-test fraction carry that same factor; it
cancels, giving exactly the same answer as the permutation test.
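To make the counting concrete, here is a small sketch in R (the data are made up for illustration) of an exhaustive two-sample permutation test that enumerates only the distinct group assignments with combn() rather than all n! orderings:

```r
## Hypothetical toy data: two groups of sizes m1 = 3 and m2 = 4
x <- c(1.2, 3.4, 2.2)
y <- c(5.1, 4.8, 6.0, 5.5)
pooled <- c(x, y)

## observed test statistic: absolute difference of group means
obs <- abs(mean(x) - mean(y))

## every distinct way to pick which 3 of the 7 values form group 1
## (the "combinations" above); combn() returns them as matrix columns
idx <- combn(length(pooled), length(x))
stats <- apply(idx, 2, function(i) abs(mean(pooled[i]) - mean(pooled[-i])))

## p-value: proportion of assignments as extreme or more extreme
p.val <- mean(stats >= obs)
p.val
```

With 7 observations split 3/4 there are choose(7, 3) = 35 distinct assignments versus 7! = 5040 raw orderings, and both enumerations give the same p-value for the cancellation reason described above.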
 
So in the exhaustive case you get exactly the same answer, but for large n it
is more practical to sample from the set of permutations or combinations
(usually with replacement) and calculate the proportion that have a statistic
as extreme or more extreme; basic sampling theory tells us that this proportion
will approximate the true p-value.  If you are only doing a small number of
permutations/combinations, then the combinations would probably give a better
estimate (so the original question is not completely silly).  However, with the
speed of modern computers it is fairly easy and fast to take a large number of
permutations, so this is not really an issue.  It is probably faster to take a
couple thousand extra permutations than to come up with a good algorithm for
sampling only the distinct combinations.
 
So my advice is: don't worry about the "mirror samples", just use the sample()
function, but do a large number of random permutations.
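A minimal sketch of that advice, with made-up data: shuffle the pooled observations with sample(), split them back into groups of the original sizes, and take the proportion of shuffled statistics at least as extreme as the observed one.  Some shuffles will repeat "mirror" rearrangements, but as argued above that does not bias the estimate.

```r
## Hypothetical two-group data for illustration
set.seed(42)
g1 <- rnorm(20, mean = 0)
g2 <- rnorm(20, mean = 1)
pooled <- c(g1, g2)
n1 <- length(g1)

## observed test statistic
obs <- abs(mean(g1) - mean(g2))

## Monte Carlo randomization: shuffle all labels B times
B <- 10000
perm.stats <- replicate(B, {
  shuffled <- sample(pooled)   # random permutation of the pooled data
  abs(mean(shuffled[1:n1]) - mean(shuffled[-(1:n1)]))
})

## approximate p-value; the +1 in numerator and denominator is a common
## correction so the estimate is never exactly zero
p.val <- (sum(perm.stats >= obs) + 1) / (B + 1)
p.val
```

With B in the thousands this runs in well under a second on a modern machine, which is the practical point above.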
 
Hope this helps,

________________________________

From: [EMAIL PROTECTED] on behalf of Tom Backer Johnsen
Sent: Fri 1/11/2008 10:57 AM
To: [EMAIL PROTECTED]
Subject: Re: [R] Randomization tests, grouped data



Tom Backer Johnsen wrote:
> The other day I was looking into one of the classics in resampling,
> Eugene Edgington's "Randomization Tests".  This type of test is simple
> to do in R for things like a simple correlation; the sample()
> function is perfect for the purpose.
>
> However, things are more complex if you have grouped data, like a
> one-way ANOVA.  The reason is that you have to avoid the consideration
> of what Edgington calls "mirror samples", shuffles that only move data
> within the groups.  After all, one only wants to consider changes
> between groups.  In that case (I think) the sample() function is too
> general.
>
> Is there something that can handle this in R?

After a few hours thinking on and off about the problem, I suspect
that the question may be stupid or silly (or both).  If that is the
case, I would very much like to know why.

Again,

Tom

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




