On Aug 5, 2010, at 7:40 PM, noclue_ wrote:


I have a 64-bit Windows box -
   Intel Xeon CPU E7340 @ 2.4GHz 31.9GB of RAM
I have R 2.11.1 (64-bit) running on it.

Dear noclue_;

What does this return?

.Machine$sizeof.pointer


My csv data is 3.6 GB (with about 15 million obs, 120 variables.)

On my 64-bit setup with 24GB of RAM I can comfortably work with a dataset that has about the same number of columns but (only) 4.5 million rows. Working with a data.frame of this size in 18GB of RAM was somewhat uncomfortable because it would often "roll over" into virtual memory, and then modeling calls took forever... well, twenty minutes anyway.

I think you may be underestimating the space requirements when working with larger objects. Each numeric value takes 8 bytes. Objects often need to be copied during processing, so space consumption can quickly double or triple.

> 8*15000000*120
[1] 1.44e+10

So that's roughly 14 GB (about 13.4 GiB) just to hold the object, not to do anything useful with it.
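
The arithmetic above can be spelled out as a small sketch. It assumes every column is stored as a double (8 bytes per value), which is a simplification: integer, factor, or character columns would change the total. The variable names and the 1000-row check frame are illustrative only, not something from the original data.

```r
# Back-of-envelope estimate for an all-numeric table (an assumption:
# every one of the 120 variables is stored as an 8-byte double).
n_rows <- 15e6    # about 15 million observations
n_cols <- 120     # 120 variables
bytes  <- 8 * n_rows * n_cols

gb  <- bytes / 1e9       # decimal gigabytes
gib <- bytes / 1024^3    # binary gigabytes, the unit RAM is usually quoted in
cat("Estimated size:", round(gb, 1), "GB /", round(gib, 1), "GiB\n")

# Sanity check on a small all-numeric data frame: object.size() reports
# the bytes R actually uses, including a little per-column overhead.
small   <- as.data.frame(matrix(0, nrow = 1000, ncol = n_cols))
per_row <- as.numeric(object.size(small)) / 1000
cat("Measured bytes per row:", per_row, "\n")
```

Multiplying the measured bytes per row by the real row count gives a slightly more honest figure than the raw product, and doubling or tripling it allows for the copies mentioned above.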

--
David.
------------------------------------------------
I have successfully imported the data above into R. No problem.

Now I am trying to run 'rpart' on my data. But I got the following error :

Error: cannot allocate vector of size 53.5 Mb
In addition: Warning messages:
1: In lapply(x, "is.na") :
 Reached total allocation of 32764Mb: see help(memory.size)
2: In lapply(x, "is.na") :
 Reached total allocation of 32764Mb: see help(memory.size)
3: In lapply(x, "is.na") :
 Reached total allocation of 32764Mb: see help(memory.size)
4: In lapply(x, "is.na") :
 Reached total allocation of 32764Mb: see help(memory.size)

===========================================================
Can anybody give me a hint on how to solve this?

Post better details?:

?sessionInfo
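
A minimal sketch of the details that would be worth pasting into a follow-up, assuming the imported data frame is called `dat` (a hypothetical name; substitute whatever the object is actually called):

```r
# Session details: R version, platform, and attached packages.
print(sessionInfo())

# Pointer size in bytes: 8 on a 64-bit build of R, 4 on a 32-bit build.
ptr_bytes <- .Machine$sizeof.pointer
cat("Pointer size:", ptr_bytes, "bytes\n")

# Memory R is currently using, reported after a garbage collection.
print(gc())

# 'dat' is a hypothetical name for the imported data frame.
if (exists("dat")) {
  print(object.size(dat), units = "Mb")  # size of the object in memory
  str(dat, list.len = 10)                # types of the first few columns
}
```

The column types matter because character and factor columns are stored quite differently from numerics, so they change both the memory footprint and how a modeling function treats them.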


Thanks!
=========================================================


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
