Hi:

On Mon, Dec 13, 2010 at 4:28 PM, chandu <chandrasekhar.kar...@gmail.com> wrote:

> Dear all,
>
> I am relatively new to R. I would like to know how can we write the
> realizations (for example generated through rnorm or runif) in to a data
> file. It would be very inefficient to first generate values and then write
> them in to file using "write" function. Instead, is there a way to generate
> 1 value at a time and append them in to the file.
>

On the contrary, it is very inefficient in R to generate one value at a time and then append it to a file. R can do vectorized calculations, so generating random data takes one line of code; e.g.,

  rnorm(1000000, 0, 5)

generates a vector of 1000000 random numbers from a normal distribution with mean 0 and standard deviation 5. Generating the data and writing it to a file can also take one line:

  write.csv(rnorm(1000000, 0, 5), file = 'myRandomNumbers.csv', row.names = FALSE, quote = FALSE)

On my system, it took 4.24 seconds to write 1000000 random numbers to a file (its size is 18.6 MB). Now, let's try your for loop approach, without writing to a file:

  # Pre-allocate space for the vector:
  > u <- vector('numeric', 1000000)
  > system.time(for(i in seq_along(u)) u[i] <- rnorm(1, 0, 5))
     user  system elapsed
     6.86    0.00    6.88

  # Initialize an empty object and populate it one element at a time:
  > u <- NULL
  > system.time(for(i in 1:1000000) u <- c(u, rnorm(1, 0, 5)))

The second one is so inefficient because of how R handles the repeated append: each call to c(u, ...) allocates a new, slightly longer vector and copies every existing element into it. Because you are repeatedly appending to an object that grows and grows, each iteration copies more than the last, so the total work rises roughly with the square of the number of elements, and the loop slows down precipitously as R expends more and more effort allocating and copying memory.

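If the real worry is holding all of the values in memory at once, a middle ground is to generate the values in vectorized chunks and write each chunk to a file connection that is opened only once. The following is a minimal sketch under stated assumptions, not a definitive implementation: the helper name write_rnorm_chunked, its arguments, the chunk size and the temporary output file are all made up for illustration and do not come from the original post.

```r
# Hypothetical helper (an illustration, not from the original post):
# write n normal draws to `path` in vectorized chunks, so that only
# `chunk_size` values are held in memory at any one time.
write_rnorm_chunked <- function(n, path, chunk_size = 100000, mu = 0, sigma = 5) {
  con <- file(path, open = "w")  # open once; successive writes follow each other
  on.exit(close(con))            # close the connection even if an error occurs
  remaining <- n
  while (remaining > 0) {
    k <- min(chunk_size, remaining)
    # One vectorized rnorm() call per chunk instead of one call per value
    writeLines(as.character(rnorm(k, mu, sigma)), con)
    remaining <- remaining - k
  }
  invisible(path)
}

# Usage: write 250000 values, one per line, to a temporary file
out <- tempfile(fileext = ".txt")
write_rnorm_chunked(250000, out, chunk_size = 100000)
n_written <- length(scan(out, quiet = TRUE))
```

Here the loop runs once per chunk rather than once per value, and nothing is ever grown with c(), so the copying cost described above does not arise. Whether this is faster than a single write.csv() call on a given system is something to measure with system.time() rather than assume.
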
I got impatient waiting for the 1000000-iteration loop, so I interrupted it:

  Timing stopped at: 571.83 9.08 585.53

  > system.time(for(i in 1:1000) u <- c(u, rnorm(1, 0, 5)))
     user  system elapsed
     2.64    0.08    2.74
  > system.time(for(i in 1:10000) u <- c(u, rnorm(1, 0, 5)))
     user  system elapsed
    27.47    0.50   28.08

Multiply the last one (total elapsed time is on the far right) by 100 to get a probable lower bound for how long the full loop takes. There are more efficient ways to do this (pre-allocating the vector, as in the first example, for one), but the point is that one thing you definitely do NOT want to do in R is to append to an object one value at a time. It wouldn't be much different if you were writing to an external file.

> The question might be trivial to many experts. I appreciate your help.
>

The question is far from trivial, and many people have put in great amounts of effort to make R efficient. If possible, vectorized operations are a good way to go because they are generally fast. Having said that, there are occasions where it is more efficient to use loops, but for novices who are used to Fortran/C/Java looping constructs, there is usually a very fast way to do the same thing in R without a loop, using vectorized operations.

HTH,
Dennis

>
> Thank you
> --
> View this message in context:
> http://r.789695.n4.nabble.com/writing-sample-values-in-to-a-file-tp3086286p3086286.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>