Have you considered limiting yourself to the unique combinations rather than every possible permutation? E.g. the permutations 1,2 | 3,4 and 2,1 | 4,3 give redundant results. The combn function (with the FUN) argument may be of help.
-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 > -----Original Message----- > From: Matthew Finkbeiner [mailto:matthew.finkbei...@gmail.com] On > Behalf Of Matthew Finkbeiner > Sent: Monday, April 26, 2010 4:09 PM > To: Greg Snow > Cc: matthew.finkbei...@mq.edu.au; r-help@r-project.org > Subject: Re: [R] multiple paired t-tests without loops > > Yes, I suspect that I will end up using a sampling approach, but I'd > like to use an exact test if it's at all feasible. > > Here are two samples of data from 3 subjects: > Sample Subj C1 C2 > 44 1 0.0093 0.0077 > 44 2 0.0089 0.0069 > 44 3 0.051 0.0432 > 44 4 0.014 0.0147 > 44 5 0.0161 0.0117 > 45 1 0.0103 0.0086 > 45 2 0.0099 0.0078 > 45 3 0.0542 0.0458 > 45 4 0.0154 0.0163 > 45 5 0.0175 0.0129 > > > and then here is the script I've pieced together from things I've found > on the web (sorry for not citing the snippets!). any pointers on how > to > speed it up would be greatly appreciated. > > #---------------------------------- > # Utility function > # that returns binary representation of 1:(2^n) X SubjN > binary.v <- > function(n) > { > x <- 1:(2^n) > mx <- max(x) > digits <- floor(log2(mx)) > ans <- 0:(digits-1); lx <- length(x) > x <- matrix(rep(x,rep(digits, lx)),ncol=lx) > x <- (x %/% 2^ans) %% 2 > } > > library(plyr) > > > #first some global variables > TotalSubjects <- 5 > TotalSamples <- 2 > StartSample <- 44 > EndSample <- ((StartSample + TotalSamples)-1) > maxTs <- NULL > obsTs <- NULL > > > > > #create index array that drives the permuations for all samples > ind <- binary.v(TotalSubjects) > > #transpose ind so that the first 2^N items correspond to S1, > #the second 2^N correspond to S2 and so on... > transind <- t(ind) > > #get data file that is organized first by sample then by subj (e.g. > sample1 subject1 > # sample1 subject 2 ... sample 1 subject N) > #sampledatafile <- file.choose() > > samples <- read.table(sampledatafile, header=T) > > #this is the progress bar > pb <- txtProgressBar(min = StartSample, max = EndSample, style = 3) > setTxtProgressBar(pb, 1) > > start.t <- proc.time() > > #begin loop that analyzes data sample by sample > for (s in StartSample:EndSample) { > > S <- samples[samples$Sample==s,] #pick up data for current sample > > #reproduce data frame rows once for each permutation to be done > expanddata <- S[rep(1:nrow(S), each = 2^TotalSubjects),] > > > #create new array to hold the flipped (permuted) data > permdata = expanddata > > #permute the data > permdata[transind==1,3] <- expanddata[transind==1,4] #Cnd1 <- Cnd2 > permdata[transind==1,4] <- expanddata[transind==1,3] #Cnd2 <- Cnd1 > > #create permutation # as a factor in dataframe > PermN <- rep(rep(1:2^TotalSubjects, TotalSubjects),2) > > #create Sample# as a factor > Sample <- rep(permdata[,1],2) #Sample# is in the 1st Column > > #create subject IDs as a factor > Subj <- rep(permdata[,2],2) #Subject ID is in the 2nd Column > > #stack the permutated data > StackedPermData <- stack(permdata[,3:4]) > > #bind all the factors together > StackedPermData <- as.data.frame(cbind(Sample, Subj, PermN, > StackedPermData)) > > > #sort by perm > sortedstack <- > as.data.frame(StackedPermData[order(StackedPermData$PermN, > StackedPermData$Sample),]) > > > #clear up some memory > rm(expanddata, permdata, StackedPermData) > > #pull out data 1 perm at a time > res<-ddply(sortedstack, c("Sample", "PermN"), function(.data){ > > # Type combinations by Class > combs<-t(combn(sort(unique(.data[,5])),2)) > > # Applying the t-test for them > aaply(combs,1, function(.r){ > x1<-.data[.data[,5]==.r[1],4] # select first column > x2<-.data[.data[,5]==.r[2],4] # select first column > > tvalue <- t.test(x1,x2, paired = T) > > res <- c(tvalue$statistic,tvalue$parameter,tvalue$p.value) > names(res) <- c('stat','df','pvalue') > res > } > ) > } > ) > > # update progress bar > setTxtProgressBar(pb, s) > > #get max T vals > maxTs <- c(maxTs, tapply (res$stat, list (res$Sample), max)) > > #get observed T vals > obsTs <- c(obsTs, res$stat[length(res$stat)]) > > #here we need to save res to a binary file > > > } > #close out the progress bar > close(pb) > > end.t <- proc.time() - start.t > print(end.t) > > #get cutoffs > #these are the 2-tailed t-vals that maintain experimentwise error at > the > 0.05 level > lowerT <- quantile(maxTs, .025) > upperT <- quantile(maxTs, .975) > > > > > > > > > > > > On 4/27/2010 6:53 AM, Greg Snow wrote: > > The usual way to speed up permutation testing is to sample from the > > set of possible permutations rather than looking at all possible > > ones. > > > > If you show some code then we may be able to find some inefficiencies > > for you, but there is not general solution, poorly written uses of > > apply will be slower than well written for loops. In some cases > > rewriting critical pieces in C or fortran will help quite a bit, but > > we need to see what you are already doing to know if that will help > > or not. > > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.