On Wed, 17 Feb 2010, S. Few wrote:

Currently using R 2.9.2
Win XP


Goal: scale up computing capacity for large datasets (1-5 million records)

I realize that under 32-bit versions of R, memory.limit() is capped at 4 GB.

Q:
1. What are the limits under 64-bit versions of R? Are those limits
OS-dependent?

I'm not sure of the exact figure, but it is known, and it is large enough not to
be a constraint at present.
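
For example, on a 64-bit Windows build you can query the per-session cap and, if
the machine actually has the RAM, raise it well past 4 GB. The call below is
Windows-only and the size shown is just illustrative:

    memory.limit()                # current cap, in MB
    memory.limit(size = 16000)    # request roughly 16 GB, assuming the RAM exists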

2. Are there limits to the size of individual objects?

Individual vectors are still limited to 2^31 - 1 entries, so a matrix can have
only about 2 billion elements, a data frame only about 2 billion rows and
2 billion columns, and so on. This is likely to be the binding constraint in the
near future, but 2^31 integers already make an 8 GB vector, and 2^31 doubles a
16 GB one.
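
A quick back-of-the-envelope check of those figures (this is only the raw
arithmetic; object.size() reports a small extra header on top):

    .Machine$integer.max        # 2^31 - 1, the per-vector length limit
    2^31 * 4 / 2^30             # an integer vector of that length: about 8 GB
    2^31 * 8 / 2^30             # a double vector of that length: about 16 GB
    object.size(numeric(1e6))   # ~8 MB plus a few bytes of header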

There will also be some limit on the number of objects. I don't know if we even 
know what it is, but it will be large.

3. Are there limits or problems in using functions such as lm(),
glm(), rpart(), doBy package, MASS, etc?

I don't think so.  The differences should not be visible at the interpreted
level, and packages whose compiled code is not 64-bit clean will already have
broken. Obviously, algorithms that scale worse than linearly in the sample size
will become very painful.
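
If you want to gauge how something like lm() behaves before committing to the
full 1-5 million records, a simulated run at the target size is cheap to set up.
The variables below are made up purely for illustration:

    n <- 1e6
    d <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n))
    system.time(fit <- lm(y ~ x1 + x2, data = d))
    print(object.size(fit), units = "Mb")   # lm() keeps a copy of the data it used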

     -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.