Dear R community,

I am still struggling a bit to understand how R allocates memory and how to optimize my code to minimize the working memory load. Simon (thanks!) and others gave me a hint to use the command "gc()" to clean up memory, which works quite nicely but seems to me more like a "fix" for the problem than a solution.
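To make the gc() hint concrete, here is a minimal sketch of how the memory R itself reports could be checked around an allocation, rather than relying only on the system monitor. It assumes that the second column of the matrix returned by gc() is the memory in use in Mb; used_mb is a hypothetical helper name of mine, not something from Simon's reply, and the vector is kept small on purpose.

```r
# Hypothetical helper: Mb currently in use according to gc().
# gc() runs a garbage collection and returns a matrix whose second
# column is the memory in use (in Mb) for Ncells and Vcells.
used_mb <- function() {
  g <- gc()
  sum(g[, 2])
}

before <- used_mb()

x <- numeric(1e6)            # one million doubles, roughly 8 MB
size_x <- object.size(x)     # size of the object itself, in bytes
during <- used_mb()

rm(x)                        # drop the only reference to the vector
after <- used_mb()           # gc() inside used_mb() can now reclaim it

print(size_x, units = "MB")
cat("Mb in use before / during / after:", before, during, after, "\n")
```

The numbers from gc() and from a system monitor need not agree, so this is only meant as a second measurement to compare against.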
To give you an impression of what I am talking about, here is a short code example, together with a rough measure (taken from a system monitoring app) of the working memory needed at each computational step (latest 64-bit version of R on a 64-bit Windows 7 system, 2 cores, approx. 4 GB RAM):

##########################
# example 1:
y = matrix(rep(1, 50000000), nrow = 50000000/2, ncol = 2)
# used working memory increases from 1044 --> 1808 MB

# (same command again, i.e.)
y = matrix(rep(1, 50000000), nrow = 50000000/2, ncol = 2)
# 1808 MB --> 2178 MB
# Why does memory increase?

# (give the matrix column names)
colnames(y) = c("col1", "col2")
# 2178 MB --> 1781 MB
# Why does the memory use decrease if I assign column labels?

###
# example 2:
y = matrix(rep(1, 50000000), nrow = 50000000/2, ncol = 2)
# 1016 --> 1780 MB
y = data.frame(y)
# increase from 1780 MB --> 3315 MB
##########################

Why does it take so much extra memory to store this matrix as a data.frame? It is not the object per se (i.e. that data.frames need more memory), because if I call gc() the memory use drops to 1387 MB. Does this mean that it may be more memory-efficient to avoid data.frames and use matrices only?

This puzzles me a lot. In my experience these effects are even more pronounced for larger objects.

As an anecdotal comparison: I also used Stata in my last project because of these memory problems, and I could do a lot of variable manipulations on the same (!) data with significantly less memory (I am talking about GB).

Best,
Marc

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.