On Tue, 22 Aug 2006, hadley wickham wrote:

>> The loop method took 195 secs. Just assigning to an answer of the correct
>> length reduced this to 5 secs. e.g. use
>>
>>    ADDRESSES <- character(length(VECTOR)-4)
>>
>> Moral: don't grow vectors repeatedly.
>
> Other languages (eg. Java) grow the size of the vector independently
> of the number of observations in it (I think Java doubles the size
> whenever the vector is filled), thus changing O(n) behaviour to O(log
> n). I've always wondered why R doesn't do this.
>
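As a minimal sketch of the advice quoted above, the following compares growing a character vector one element at a time with filling a preallocated one. The function names grow_by_append and fill_preallocated, the "item-" labels, and the input are hypothetical and not from the original thread; both functions return the same result, and only the allocation pattern differs.

```r
# Hypothetical helpers for illustration; not from the original thread.

# Grows the result one element at a time: each c() call allocates a new
# vector and copies everything accumulated so far.
grow_by_append <- function(x) {
  out <- character(0)
  for (i in seq_along(x)) {
    out <- c(out, paste0("item-", x[i]))
  }
  out
}

# Allocates the full result once, then fills it in place.
fill_preallocated <- function(x) {
  out <- character(length(x))
  for (i in seq_along(x)) {
    out[i] <- paste0("item-", x[i])
  }
  out
}

x <- seq_len(1000)
a <- grow_by_append(x)
b <- fill_preallocated(x)
stopifnot(identical(a, b))
```

Wrapping each call in system.time() with a larger input should show the gap; the exact timings depend on the machine and the R version.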
(redirected to r-devel, a better location for wonder of this type)

This was apparently the intention at the beginning of time, thus the
LENGTH and TRUELENGTH macros in the source.

In many cases, though, there is duplication as well as length change, e.g.
x <- c(x, something) will set NAMED(x) to 2 by the second iteration,
forcing duplication at each subsequent iteration. The doubling strategy
would still leave us with O(n) behaviour, just with a smaller constant.

The only case I can think of where the doubling strategy actually helps a
lot is the one in Atte's example, assigning off the end of an existing
vector. That wasn't legal in early versions of R (and I think most people
would agree that it shouldn't be encouraged).

A reAllocVector() function would clearly have some benefits, but not as
many as one would expect. That's probably why it hasn't been done (which
doesn't mean that it shouldn't be done).

Providing the ability to write assignment functions that don't duplicate
is a more urgent problem.

	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
[EMAIL PROTECTED]	University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
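
To put rough numbers on the complexity discussion above, here is a small sketch that counts how many element copies are needed to reach length n when a buffer is reallocated on every append versus when its capacity is doubled whenever it fills. The function names are hypothetical, and this is arithmetic about reallocation only: it does not model the NAMED-driven duplication described in the reply, which copies the whole vector at each iteration regardless of spare capacity.

```r
# Hypothetical helpers for illustration; they count copies, not R internals.

# Reallocate on every append: each append copies all existing elements.
copies_grow_by_one <- function(n) {
  copies <- 0
  len <- 0
  for (i in seq_len(n)) {
    copies <- copies + len
    len <- len + 1
  }
  copies
}

# Double the capacity whenever the buffer is full: copy only on those steps.
copies_doubling <- function(n) {
  copies <- 0
  len <- 0
  capacity <- 1
  for (i in seq_len(n)) {
    if (len == capacity) {
      copies <- copies + len
      capacity <- 2 * capacity
    }
    len <- len + 1
  }
  copies
}

print(copies_grow_by_one(1024))  # 523776, i.e. n * (n - 1) / 2
print(copies_doubling(1024))     # 1023, i.e. fewer than 2 * n
```

Under these assumptions the copying cost averaged over all appends is constant with doubling and linear in the current length when growing by one, which is why doubling only pays off where no full duplication is forced.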