Hi fellow users of R,

My research requires the simultaneous optimization of several response functions. Therefore, I am using the nsga2 method in the mco package, which works beautifully.

However, I am running into a significant timing difference that is causing me grief. The details: say I have the following function to optimize, built from the results of the rsm method.

> library(rsm)
> #run the rsm analysis
> rsm.ave<-rsm(ave ~ SO(x1, x2, x3, x4), data=heli)
> rsm.sd<-rsm(log100s ~ SO(x1, x2, x3, x4), data=heli)
>
> #extract coefficients
> coef.ave<-rsm.ave$coefficients
> coef.sd<-rsm.sd$coefficients
>
> #function to optimize (maximize y1, minimize y2)
> opt.func<-function(x){
+ y<-numeric(2)
+
+ y[1]<--1*(coef.ave[1]+coef.ave[2]*x[1]+coef.ave[3]*x[2]+coef.ave[4]*x[3]+coef.ave[5]*x[4]+
+ coef.ave[6]*x[1]*x[2]+coef.ave[7]*x[1]*x[3]+coef.ave[8]*x[1]*x[4]+coef.ave[9]*x[2]*x[3]+
+ coef.ave[10]*x[2]*x[4]+coef.ave[11]*x[3]*x[4]+
+ coef.ave[12]*x[1]^2+coef.ave[13]*x[2]^2+coef.ave[14]*x[3]^2+coef.ave[15]*x[4]^2)
+
+ y[2]<-coef.sd[1]+coef.sd[2]*x[1]+coef.sd[3]*x[2]+coef.sd[4]*x[3]+coef.sd[5]*x[4]+
+ coef.sd[6]*x[1]*x[2]+coef.sd[7]*x[1]*x[3]+coef.sd[8]*x[1]*x[4]+coef.sd[9]*x[2]*x[3]+
+ coef.sd[10]*x[2]*x[4]+coef.sd[11]*x[3]*x[4]+
+ coef.sd[12]*x[1]^2+coef.sd[13]*x[2]^2+coef.sd[14]*x[3]^2+coef.sd[15]*x[4]^2
+ return(y)
+ }
>
> library(mco)
> print(system.time(nsga.res<-nsga2(opt.func, 4, 2, generations=150,
+ popsize=100, cprob=0.20,
+ cdist=100, mprob=0.20, mdist=100, lower.bounds=rep(-2, 4),upper.bounds=rep(2, 4))))
   user  system elapsed
   2.42    0.00    2.43

That is impressive, and is exactly what I am looking for in my code. However, it has the drawback that the structure of the function to be optimized has to be built manually, and cannot be built automatically from the rsm object. Also, it is hard on the eyes.

Another way of achieving this end is to use the model.matrix method, which is advantageous in that it is completely general and can easily be automated.

> terms<-delete.response(terms(rsm.ave))
> opt.func2<-function(x, coef.ave, coef.sd, terms){
+ y<-numeric(2)
+ x.df<-data.frame(t(x))
+ names(x.df)=all.vars(terms)
+ X<-model.matrix(terms, data=x.df)
+ #negate y1 so that minimizing it maximizes the response, as in opt.func
+ y[1]<--1*crossprod(t(X), coef.ave)
+ y[2]<-crossprod(t(X),coef.sd)
+ return(y)
+ }
>
> print(system.time(nsga.res2<-nsga2(opt.func2, 4, 2, coef.ave=coef.ave,
+ coef.sd=coef.sd, terms=terms,
+ generations=150, popsize=100, cprob=0.20,
+ cdist=100, mprob=0.20, mdist=100, lower.bounds=rep(-2, 4),upper.bounds=rep(2, 4))))
   user  system elapsed
  59.42    0.00   60.48

My issue is self-evident: using this method resulted in a roughly 25-fold increase in run time (59.42 versus 2.42 seconds of user time). My question is why? If I time the individual components separately, nothing looks unusual. My hunch is that the cause is some "interaction" between the model.matrix and nsga2 methods. Any ideas on how to speed this process up, or circumvent the issue altogether?

Thanks,
Corey

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
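
One possible workaround, offered as a minimal sketch rather than a confirmed diagnosis: data.frame and model.matrix each carry a fair amount of fixed overhead per call, and nsga2 evaluates the objective once per individual per generation (on the order of 150 x 100 = 15,000 calls with the settings above), so that overhead likely dominates the run time. The sketch below builds the second-order design row directly from x with plain vector arithmetic while staying general in the number of variables. The helper name make.opt.func is hypothetical (it is not part of rsm or mco), and the code assumes the coefficients are ordered as in the hand-written opt.func: intercept, first-order terms, two-way interactions, then pure quadratics. The coefficients used here are made-up numbers for illustration only.

```r
# Hypothetical helper (not part of rsm or mco): returns an objective function
# for a full second-order model in k variables.
# ASSUMPTION: coefficient order is intercept, first-order terms,
# two-way interactions (x1:x2, x1:x3, ..., x3:x4), pure quadratics.
make.opt.func <- function(coef.ave, coef.sd, k) {
  idx <- utils::combn(k, 2)  # index pairs for the interactions, computed once
  stopifnot(length(coef.ave) == 1 + 2 * k + ncol(idx),
            length(coef.sd) == length(coef.ave))
  function(x) {
    # one row of the second-order design matrix, built without a data frame
    z <- c(1, x, x[idx[1, ]] * x[idx[2, ]], x^2)
    # negate the first response so that minimizing it maximizes the mean
    c(-sum(z * coef.ave), sum(z * coef.sd))
  }
}

# Made-up coefficients, for illustration only (not fitted values)
set.seed(1)
demo.ave <- rnorm(15)
demo.sd <- rnorm(15)
opt.func3 <- make.opt.func(demo.ave, demo.sd, k = 4)
opt.func3(c(0.5, -1, 2, 0))
```

With the real fits, one would pass coef.ave and coef.sd in place of the made-up vectors, after checking names(coef.ave) against the assumed ordering, and then hand the returned function to nsga2 in the same way as opt.func.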