On 03/02/2011 11:38 PM, Anthony Dick wrote:
> Hello all,
>
> I am re-posting my previous question with a simpler, more transparent,
> commented code.
>
> I have been ramming my head against this problem, and I wondered if
> anyone could lend a hand. I want to make parallel a bootstrap of a
> linear mixed model on my 8-core mac. Below is the process that I want to
> make parallel (namely, the boot.out<-boot(dat.res,boot.fun, R = nboot)
> command). This is an extension to lmer of the bootstrapping linear
> models example in Venables and Ripley. Please excuse my rather terrible
> programming skills. I am always open to suggestions. Below the example I
> describe what methods I have tried.
>
> library(boot)
> library(lme4)
> dat<-read.table("http://www2.fiu.edu/~adick/downloads/toy2.dat", header = T)
> nboot<-1000 # number of bootstraps
> attach(dat)
> x<-dat[,2] # IV number 1
> y<-dat[,4] # DV
> z<-dat[,3] # IV number 2
> subj<-dat[,1] # random factor
> boot.fun<-function(data,i) { # function to resample residuals
> d<-data
> d$y<- d$fitted+d$res[i] # populate new y values based on resampled residuals
> as.numeric(coef(update(m2.fit,data=d))[1][[1]][1,c(1:4)])
> # update the linear model and output the coefficients
> }
> fit<-lmer(y~x*z + (1|(subj))) # the linear model
> dat.res<-data.frame(y,x,z,subj, res=resid(fit), fitted=fitted(fit)) # add residuals and fitted values to dat
> boot.out<-boot(dat.res,boot.fun, R = nboot) # run the bootstrap using the boot.fun
> boot.out
>
> Methods attempted:
>
> Using the multicore package, I tried
> boot.out<-collect(parallel(boot(dat.res,boot.fun, R = nboot))). This
> returned a correct result, but did not speed things up. Not sure why...

Hi Anthony,

When the individual calls passed on to the cluster are very short (which
might be the case for your bootstrap), the overhead of running them in
parallel becomes large relative to the work each call does, negating the
benefit of running the processes in parallel. This could explain the lack
of speed improvement. A solution could be to send sets of bootstrap calls
to the cluster rather than individual calls. This decreases the overhead
of running in parallel.

cheers,
Paul

> I also tried snowfall and snow. While I can create a cluster and run
> simple processes (e.g., provided example from literature), I can't get
> the bootstrap to run. For example, using snow:
>
> cl<- makeCluster(8)
> clusterSetupRNG(cl)
> clusterEvalQ(cl,library(boot))
> clusterEvalQ(cl,library(lme4))
> boot.out<-clusterCall(cl,boot(dat.res,boot.fun, R = nboot))
> stopCluster()
>
> returns the following error:
>
> Error in checkForRemoteErrors(lapply(cl, recvResult)) :
> 8 nodes produced errors; first error: could not find function "fun"
>
> I am stuck and at the limit of my programming knowledge and am punting
> to the R-help list. I need to run this process thousands of times, which
> is the reason to make it parallel. Any suggestions are much appreciated.
>
> Anthony

--
Paul Hiemstra, MSc
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
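
The suggestion above (send sets of bootstrap calls to each worker rather
than single calls) could be sketched roughly as follows. This is only an
illustration under several assumptions, not code from the thread: it uses
the parallel package that ships with current R in place of snow/multicore,
simulated data in place of toy2.dat, and lm() as a lightweight stand-in for
lmer() so that it runs without extra packages. The names boot.chunk,
n.cores and chunks are hypothetical.

```r
library(parallel)  # assumption: base-R successor to snow/multicore

set.seed(1)
# Simulated stand-in for toy2.dat (hypothetical data, not the original file)
n <- 200
dat <- data.frame(x = rnorm(n), z = rnorm(n))
dat$y <- 1 + 0.5 * dat$x - 0.3 * dat$z + 0.2 * dat$x * dat$z + rnorm(n)

# lm() stands in for lmer(); for lmer the workers would also need
# clusterEvalQ(cl, library(lme4)) and the refit line below changed
fit <- lm(y ~ x * z, data = dat)
dat.res <- data.frame(dat, res = resid(fit), fitted = fitted(fit))

# One chunk = many residual-bootstrap replicates run on a single worker,
# so the communication cost is paid once per chunk, not once per replicate
boot.chunk <- function(n.rep, d) {
  t(replicate(n.rep, {
    d$y <- d$fitted + sample(d$res, replace = TRUE)  # resample residuals
    coef(lm(y ~ x * z, data = d))                    # refit, keep coefficients
  }))
}

nboot   <- 1000
n.cores <- 2  # assumption: set to the number of available cores
chunks  <- rep(nboot %/% n.cores, n.cores)         # replicates per worker
chunks[1] <- chunks[1] + nboot %% n.cores          # remainder to first worker

cl <- makeCluster(n.cores)
clusterSetRNGStream(cl, 123)  # independent random streams on each worker
res <- clusterApply(cl, chunks, boot.chunk, d = dat.res)
stopCluster(cl)

boot.coefs <- do.call(rbind, res)  # nboot rows, one column per coefficient
apply(boot.coefs, 2, sd)           # bootstrap standard errors
```

Note that the function is passed to clusterApply by name, with its
arguments supplied separately. The "could not find function" error quoted
above probably comes from passing the already-evaluated result of
boot(...) where clusterCall expects a function as its second argument.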