On Tue, 2013-09-17 at 12:06 -0700, Ross Boylan wrote:
> Saving and loading data is roughly doubling memory use.  I'm trying to
> understand and correct the problem.

Apparently I had the process memories mixed up: R1 below was the one
with 4G and R2 with 2G.  So there's less of a mystery.  However...

>
> R1 was an R process using just over 2G of memory.
> I did save(r3b, r4, sflist, file="r4.rdata")
> and then, in a new process R2,
> load(file="r4.rdata")
>
> R2 used just under 4G of memory, i.e., almost double the original
> process.  The r4.rdata file was just under 2G, which seemed like very
> little compression.
>
> r4 was created by
> r4 <- sflist2stanfit(sflist)
>
> I presume that r4 and sflist shared most of their memory.
> The save() apparently lost the information that the memory was shared,
> doubling memory use.

Still wondering if this is going on.

>
> R 2.15.1, 64 bit on linux.
>
> First, does my diagnosis sound right?  The reports of memory use in R2
> are quite a bit lower than the process footprint; is that normal?
> > gc()  # after loading data
>             used   (Mb) gc trigger   (Mb)  max used   (Mb)
> Ncells   1988691  106.3    3094291  165.3   2432643  130.0
> Vcells 266976864 2036.9  282174979 2152.9 268661172 2049.8
> > rm("r4")
> > gc()
>             used   (Mb) gc trigger   (Mb)  max used   (Mb)
> Ncells   1949626  104.2    3094291  165.3   2432643  130.0
> Vcells 190689777 1454.9  282174979 2152.9 268661172 2049.8
> > r4 <- sflist2stanfit(sflist)
> > gc()
>             used   (Mb) gc trigger   (Mb)  max used   (Mb)
> Ncells   1970497  105.3    3094291  165.3   2432643  130.0
> Vcells 228827252 1745.9  296363727 2261.1 268661172 2049.8

It seems the recreated r4 used about 300M less memory than the one read
in from disk.  This suggests that some of the sharing was lost in the
save/load process.
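
One way to check the suspicion above without rstan is to save the same vector under two names and look at the file size. This is only a rough sketch: x and y are made-up stand-ins for sflist and r4, and it assumes (as the numbers above suggest) that save() writes each ordinary vector out in full even when two names point at the same memory.

```r
# Hypothetical stand-ins: y is a second name for x, so in memory the
# two share one vector until either of them is modified.
x <- runif(1e6)
y <- x

f_one  <- tempfile(fileext = ".rdata")
f_both <- tempfile(fileext = ".rdata")

# compress = FALSE so the file sizes reflect the bytes actually written.
save(x, file = f_one, compress = FALSE)
save(x, y, file = f_both, compress = FALSE)

size_one  <- file.info(f_one)$size
size_both <- file.info(f_both)$size

# A ratio near 2 means the shared vector was written twice.
ratio <- size_both / size_one
print(ratio)
```

If the ratio comes out near 2, loading the file would also be expected to create two separate vectors, which would fit the extra memory seen after load().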

>
> Even weirder, R1 reports memory use well beyond the memory I show the
> process using (2.2G)

Not a mystery after getting the right processes.  Actually, I'm a
little surprised the process memory is less than the max used memory; I
thought giving back memory was not possible on Linux.

> > gc()
>             used   (Mb) gc trigger   (Mb)  max used   (Mb)
> Ncells   3640941  194.5    5543382  296.1   5543382  296.1
> Vcells 418720281 3194.6  553125025 4220.1 526708090 4018.5
>
>
> Second, what can I do to avoid the problem?

Now a more modest problem, though still a problem.

>
> I guess in this case I could not save r4 and recreate it, but is there a
> more general solution?
>
> If I did myboth <- list(r4, sflist) and
> save(myboth, file="myfile")
> would that be enough to keep the objects together?  Judging from the
> size of the file, it seems not.
>
> Even if the myboth trick worked it seems like a kludge.
>
> Ross Boylan
>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
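
For the "don't save r4, recreate it" option mentioned above, a minimal sketch of the workflow might look like the following. The names are assumptions for illustration only: src stands in for sflist, and derive() is a hypothetical placeholder for sflist2stanfit(), which is not called here so that the example runs without rstan.

```r
# Hypothetical placeholder for sflist2stanfit(): builds one object
# from a list of parts.
derive <- function(parts) do.call(cbind, parts)

src     <- list(a = runif(1e5), b = runif(1e5))  # stand-in for sflist
derived <- derive(src)                           # stand-in for r4

# Save only the source object, not the derived one.
f <- tempfile(fileext = ".rdata")
save(src, file = f)

# Later, possibly in a fresh R process: load the source and rebuild.
env <- new.env()
load(f, envir = env)
rebuilt <- derive(env$src)

# The rebuilt object should match the original derived object.
print(identical(rebuilt, derived))
```

This only helps when the derived object is cheap to recompute from what was saved; it does not address the more general question of keeping shared memory shared across save() and load().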