On Aug 5, 2010, at 7:40 PM, noclue_ wrote:
I have a 64-bit windows box -
Intel Xeon CPU E7340 @ 2.4GHz 31.9GB of RAM
I have R 2.11.1 (64bit) running on it.
Dear noclue_;
What does this return?
.Machine$sizeof.pointer
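For reference, a minimal sketch of how that value can be read. The
variable names are illustrative only. A 64-bit build of R reports 8
(bytes per pointer), a 32-bit build reports 4.

```r
# Pointer size in bytes for the running R build
ptr_bytes <- .Machine$sizeof.pointer

# 8 bytes per pointer means a 64-bit build; 4 means a 32-bit build
build_bits <- 8 * ptr_bytes
cat("This R build is", build_bits, "bit\n")

# R.version$arch gives the architecture string as a cross-check
cat("Architecture:", R.version$arch, "\n")
```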
My csv data is 3.6 GB (with about 15 million obs, 120 variables.)
On my 64-bit setup with 24GB of RAM I can comfortably work with a
dataset that has around the same number of columns but (only) 4.5
million rows. Working with a data.frame of this size in 18GB of RAM
was somewhat uncomfortable because it would often "roll over" into
virtual memory, and then modeling calls took forever... well, twenty
minutes anyway.
I think you may be underestimating the space requirements when
working with larger objects. Numeric values take 8 bytes each. Objects
often need to be copied, so space consumption can quickly double or
triple.
> 8*15000000*120
[1] 1.44e+10
So that's about 14 GB (1.44e10 bytes, if all 120 columns are stored
as numeric) just to hold the object, not to do anything useful with
it.
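The same estimate can be checked empirically on a small object and
scaled up. This is only a sketch under an assumption: every column is
taken to be stored as double, which may not hold for the actual data
(integer columns take 4 bytes per value, and factor or character
columns differ again). The names n_rows, n_cols and small are
illustrative.

```r
# Dimensions taken from the post: about 15 million rows, 120 columns
n_rows <- 15e6
n_cols <- 120

# Back-of-envelope size: 8 bytes per double
est_bytes <- 8 * n_rows * n_cols
est_bytes / 1e9    # size in decimal GB
est_bytes / 2^30   # size in GiB

# Empirical check: measure a small all-numeric data frame with the
# same number of columns, then scale the per-row cost up
small <- as.data.frame(matrix(0, nrow = 1000, ncol = n_cols))
bytes_per_row <- as.numeric(object.size(small)) / nrow(small)
bytes_per_row * n_rows / 2^30   # projected size in GiB
```

The projected figure covers a single copy of the data only; any
intermediate copies made during modeling come on top of it.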
--
David.
------------------------------------------------
I have successfully imported the data above into R. No problem.
Now I am trying to run 'rpart' on my data. But I got the following
error :
Error: cannot allocate vector of size 53.5 Mb
In addition: Warning messages:
1: In lapply(x, "is.na") :
Reached total allocation of 32764Mb: see help(memory.size)
2: In lapply(x, "is.na") :
Reached total allocation of 32764Mb: see help(memory.size)
3: In lapply(x, "is.na") :
Reached total allocation of 32764Mb: see help(memory.size)
4: In lapply(x, "is.na") :
Reached total allocation of 32764Mb: see help(memory.size)
===========================================================
Can anybody give me a hint on how to solve this?
Post better details? See:
?sessionInfo
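A short sketch of what could be pasted into a follow-up post. It only
uses base R calls; which details are actually needed is a judgment
call, and the variable name info is illustrative.

```r
# R version, platform, locale and attached packages in one object
info <- sessionInfo()
print(info)

# The two fields most relevant to a 32-bit versus 64-bit question
info$R.version$version.string
info$platform

# Current memory use as seen by R's garbage collector
gc()
```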
Thanks!
=========================================================
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.