On Dec 11, 2009, at 11:08 AM, Tom Knockinger wrote:
Hi,
I am new to the R project, but until now I have found solutions for
every problem in tutorials, the R wikis and this mailing list. Now,
however, I have some problems which I can't solve with that knowledge.
I have some data like this:
# sample data
head1 = "a;b;c;d;e;f;g;h;i;k;l;m;n;o"
data1 = "1;1;1;1;1;1;1;1;1;1;1;1;1;1"
data2 = "2;2;2;2;2;2;2;2;2;2;2;2;2;2"
data3 = "3;3;3;3;3;3;3;3;3;3;3;3;3;3"
datastring = paste("", head1,data1,data2,data3,"",sep="\n")
# import operation
res = read.table(textConnection(datastring), header = TRUE, sep = ";")
closeAllConnections()
# I use these two lines in a for-loop like this:
# for (j in 1:length(data)) {
#   res[[j]] <- read.table(textConnection(datastring[j]), header = TRUE, sep = ";")
#   closeAllConnections()
# }
I get these strings from a file which contains about 50 to 1000 of
them, so I read them all into a list. I am not sure whether there is
a better way to do this, but it works for me. Maybe you have some
suggestions for a better solution.
Now, after this short introduction to the R code I use, here are my
two problems with this approach.
1) warnings
I get warnings like "unused connection 3 (datastring) closed" after
some other operations from time to time, even though all connections
should already be closed and I don't create new ones.
2) RAM usage and program shutdowns
length(data) is usually between 50 and 1000, so the data takes some
space in RAM (approx. 100-200 MB), which is no problem; the analysis
code I then run pushes usage to about 500-700 MB, also not a real
problem.
The results are matrices (50x14 to 1000x14), so they are small
enough to work with afterwards: create plots, or do some more
analysis.
So I wrote a function which does the analysis one file after another
and keeps only the results in a list. But after about 2-4 files my R
process uses about 1500 MB, and then the trouble begins.
Windows?
The R console terminates, or prints the error that no more space can
be allocated. So I have to process each file separately, save each
result to a file, and restart R after every 2 processed files,
repeating this 3-5 times until all files are processed, which is a
bit annoying.
I did some research on this problem and found out that:
-) after I import the data into the same variable, RAM usage goes up
by about 100-200 MB each time, instead of the old data being reused
or purged, although it should be overwritten since it is no longer
reachable after I import a new file;
-) the same occurs with the analysis functions, which use much more
space and also don't release the old, no-longer-used variables, even
though ls() doesn't show them at all;
-) even after I clear all variables with "rm(list=ls(all=TRUE))",
the used RAM space stays the same.
So is there a way to get the RAM back, so that I can do all the
analysis in one session and don't have to mess around with
additional files?
It is possible to call the garbage collector with gc(). Supposedly
that should not be necessary, since garbage collection is automatic,
but I have the impression that it helps prevent situations that
otherwise lead to virtual memory getting invoked on the Mac (which I
also thought should not be happening, but I will swear that it does).
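To illustrate the pattern (a minimal sketch, not the poster's actual workflow; the object name and size are made up): remove the binding with rm(), then call gc(), which runs a collection and returns a small table of current memory usage.

```r
# Hypothetical large object, roughly 200 MB of doubles
big <- matrix(0, 5000, 5000)
rm(big)           # remove the binding to the matrix ...
stats <- gc()     # ... then collect; gc() returns a usage summary matrix
print(stats)      # rows Ncells/Vcells with "used" and "gc trigger" columns
```

Note that even after gc() frees the memory inside R, whether the process's footprint shrinks as seen by the operating system depends on the platform's allocator; the space is at least available for reuse by subsequent R allocations.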
--
David
Thanks for your help
Tom
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT