First, in R there is no need to declare the dimensions of your objects before they are populated, so couldn't you cut some run time by skipping the double data.frame step?

> df <- data.frame()
> df
data frame with 0 columns and 0 rows
> for(i in 1:100) for(j in 1:3) df[i,j] <- runif(1)
> str(df)
'data.frame':   100 obs. of  3 variables:
 ...
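For comparison, here is a minimal sketch of building the same 100 x 3 frame in one step instead of cell by cell. The names n_row, n_col, m and df2 are made up for illustration, and whether this fits your case depends on how your file is laid out.

```r
# Hypothetical sizes, chosen to match the 100 x 3 loop above
n_row <- 100
n_col <- 3

# Draw every value at once, shape them into a matrix,
# then convert to a data frame a single time
m   <- matrix(runif(n_row * n_col), nrow = n_row, ncol = n_col)
df2 <- as.data.frame(m)

str(df2)
```

The point is only that the data frame is created once at the end, so none of the per-cell assignment overhead is paid.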
Second, about populating an environment: ?assign might work better for you.

> e <- new.env()
> system.time(for(i in 1:10000) e$a[i] <- rnorm(1,i))
   user  system elapsed
   0.97    0.00    0.96
> rm(e)
> e <- new.env()
> system.time(for(i in 1:10000) assign('a',rnorm(1,i),env=e))
   user  system elapsed
   0.17    0.00    0.17

(Note that the assign() loop replaces a on every pass instead of filling a[i], so the two timings are not strictly like for like.)

Third, how are you reading in the file? And what does "not knowing in advance..." mean?

Bill's suggestion to not populate the data.frame line by line is probably the "real" solution to your problem. Otherwise it's a little like kicking a turtle to make it go faster... try to find a rabbit instead. Posting a minimal example of your file format would have really helped. Often, using ?scan to read the whole file (or big chunks of it) into R, followed by a customized formatting function that uses ?grep and ?strsplit to reconstruct the data you want in columns, removes the NEED to populate a data frame line by line.

Hope this helps,
Elai

> One complication is I don't know the names of the columns I'm assigning to
> before I read them off the file. And crazily, if I change this:
> data$x[i] <- i + 0.1
>
> where data is an environment and x a primitive vector, to use a computed
> name instead:
>
> data[[colname]][i] <- i + 0.1
>
> Then I get back to way-superlinear performance. Eventually I found I could
> work around it like:
>
> eval(substitute(var[ix] <- data,
>                 list(var=as.name(colname), ix=i, data = i+0.1)),
>      envir = data)
>
> but... as workarounds go that seems to be on the crazy nuts end of the
> scale. Why does [[]] impose a performance penalty?
>
> Peter
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
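
P.S. To make the scan/grep/strsplit idea a bit more concrete, here is a rough sketch. Since the real file format was not posted, it assumes a made-up layout with one "name=value" record per line; the temp file, the sample records and the names tmp, lines, parts, nms, vals, cols and df3 are all hypothetical.

```r
# Hypothetical input: one "name=value" record per line
# (an assumption, the actual file format is unknown)
tmp <- tempfile()
writeLines(c("x=1.1", "y=2.5", "x=3.1", "y=4.5"), tmp)

# Read the whole file in one go, one string per line
lines <- scan(tmp, what = character(), sep = "\n", quiet = TRUE)

# Keep only the lines that look like records
lines <- grep("=", lines, value = TRUE, fixed = TRUE)

# Split each record into its name and value parts
parts <- strsplit(lines, "=", fixed = TRUE)
nms   <- vapply(parts, `[`, character(1), 1)
vals  <- as.numeric(vapply(parts, `[`, character(1), 2))

# Group the values by column name; the names come from the file itself,
# so they do not need to be known in advance
cols <- split(vals, nms)
df3  <- as.data.frame(cols)
df3

unlink(tmp)
```

The data frame is built once at the end from complete vectors, which sidesteps element-by-element assignment entirely.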