On Thu, Feb 17, 2011 at 10:02 AM, Alex F. Bokov <ahupxo...@sneakemail.com> wrote:
> Motivation: during each iteration, my code needs to collect tabular data (and
> use it only during that iteration), but the rows of data may vary. I thought
> I would speed it up by preinitializing the matrix that collects the data with
> zeros to what I know to be the maximum number of rows. I was surprised by
> what I found...
>
> # set up (not the puzzling part)
> x<-matrix(runif(20),nrow=4); y<-matrix(0,nrow=12,ncol=5); foo<-c();
There is no purpose in initializing foo here. Your assignment in the second
version overwrites any assignment here.

> # this is what surprises me... what the?
>> system.time(for(i in 1:100000){n<-sample(1:4,1);y[1:n,]<-x[1:n,];});
>    user  system elapsed
>   1.510   0.000   1.514

This version performs extraction from x and assignment into a submatrix of y.
The second version performs only the extraction and assignment to a name in
the evaluation environment, which is a much faster operation.

>> system.time(for(i in 1:100000){n<-sample(1:4,1);foo<-x[1:n,];});
>    user  system elapsed
>   1.090   0.000   1.085
>
> These results are very repeatable. So, if I'm interpreting them correctly,
> dynamically allocating 'foo' each time to whatever the current output size is
> runs faster than writing to a subset of a preallocated 'y'? How is that
> possible?
>
> And, more generally, I'm sure other people have encountered this type of
> situation. Am I reinventing the wheel? Is there a best practice for storing
> temporary loop-specific data?
>
> Thanks.
>
> PS: By the way, though I cannot write to foo[,] because the size is
> different each time, I tried writing to foo[] and the runtime was worse than
> either of the above examples.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
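To make the comparison above concrete, here is a minimal sketch that runs one iteration of each variant side by side and then times both loops. A few things in it are my own assumptions rather than part of the original post: the seed, the reduced repetition count (reps), the names k, foo2, t_sub, t_bind and elapsed, and the use of drop = FALSE so that a one-row extraction stays a matrix. No timings are shown because they depend on the machine and the R version.

```r
set.seed(42)                          # arbitrary seed, only for repeatability
x <- matrix(runif(20), nrow = 4)      # 4 x 5 source, as in the post
y <- matrix(0, nrow = 12, ncol = 5)   # preallocated 12 x 5 target

# One iteration of each variant, side by side.
n <- sample(1:4, 1)
y[seq_len(n), ] <- x[seq_len(n), , drop = FALSE]   # extract, then subassign into y
foo <- x[seq_len(n), , drop = FALSE]               # extract, then bind to a name

# Same data either way, but y keeps all 12 rows, so n has to travel with it.
stopifnot(identical(y[seq_len(n), , drop = FALSE], foo))
stopifnot(nrow(y) == 12, nrow(foo) == n)

# Timing comparison (reps is smaller than the post's 100000; an assumption).
reps <- 10000
t_sub <- system.time(for (i in seq_len(reps)) {
  k <- sample(1:4, 1)
  y[1:k, ] <- x[1:k, ]        # extraction plus subassignment
})
t_bind <- system.time(for (i in seq_len(reps)) {
  k <- sample(1:4, 1)
  foo2 <- x[1:k, ]            # extraction plus binding a name
})
elapsed <- c(subassign = t_sub[["elapsed"]], bind = t_bind[["elapsed"]])
print(elapsed)
```

Note that the preallocated y does not shrink: rows beyond the current n are either zeros or left over from an earlier iteration, so any code consuming y needs to know n. Binding the extracted rows to a name avoids that bookkeeping.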