Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread William Dunlap
> [*] I recall a student fitting a GLM with about 30 predictors to 1.5m records: at the time (ca. R 2.14) it did not fit in 4GB but did in 8GB. You can easily run out of memory when a few of the variables are factors, each with many levels, and the user looks for interactions between them. This …
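The blow-up is easy to see from the size of the model matrix; a minimal sketch, with the level counts and sample size invented for illustration:

    # Two factors with 50 and 40 levels; their interaction alone
    # contributes (50-1) * (40-1) = 1911 dummy columns.
    set.seed(1)
    n  <- 1000
    f1 <- factor(sample(sprintf("a%02d", 1:50), n, replace = TRUE))
    f2 <- factor(sample(sprintf("b%02d", 1:40), n, replace = TRUE))
    mm <- model.matrix(~ f1 * f2)
    dim(mm)                               # 1000 x 2000
    print(object.size(mm), units = "MB")  # ~15 MB as a dense double matrix

Scale that to 1.5m rows and the same 2000-column dense matrix needs roughly 1.5e6 * 2000 * 8 bytes, about 24GB, before the fitting even starts.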

Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread Prof Brian Ripley
On 16/09/2014 13:56, peter dalgaard wrote: Not sure trolling was intended here. Anyways: Yes, there are ways of working with very large datasets in R, using databases or otherwise. Check the CRAN task views. SAS will for _some_ purposes be able to avoid overflowing RAM by using sequential file access …
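One of the database routes those task views cover, sketched with DBI and RSQLite; the file, table, and column names here are hypothetical:

    library(DBI)
    # Keep the data on disk in SQLite; only the filtered subset
    # that the query returns ever enters R's memory.
    con <- dbConnect(RSQLite::SQLite(), "sales.sqlite")
    sub <- dbGetQuery(con,
      "SELECT region, amount FROM sales WHERE year = 2013")
    summary(sub$amount)
    dbDisconnect(con)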

Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread Hadley Wickham
Hundreds of thousands of records usually fit into memory fine. Hadley On Tue, Sep 16, 2014 at 12:40 PM, Barry King wrote: > Is there a way to get around R’s memory-bound limitation by interfacing with a Hadoop database or should I look at products like SAS or JMP to work with data that has hundreds of thousands of records? Any help is appreciated. …
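A back-of-envelope check of that claim, assuming all-numeric columns (the row and column counts are illustrative):

    # 500,000 rows x 30 double columns is 5e5 * 30 * 8 bytes,
    # i.e. about 120 MB -- comfortable on a machine with a few GB of RAM.
    d <- as.data.frame(matrix(rnorm(5e5 * 30), ncol = 30))
    print(object.size(d), units = "MB")   # ~114 MB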

Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread peter dalgaard
Not sure trolling was intended here. Anyways: Yes, there are ways of working with very large datasets in R, using databases or otherwise. Check the CRAN task views. SAS will for _some_ purposes be able to avoid overflowing RAM by using sequential file access. The biglm package is an example of …
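A minimal sketch of the chunked fitting that biglm supports: fit on the first piece, then fold in the rest with update(), so only one chunk is in RAM at a time. The file and variable names are hypothetical:

    library(biglm)
    chunk <- read.csv("big_part1.csv")
    fit   <- biglm(y ~ x1 + x2, data = chunk)
    for (f in c("big_part2.csv", "big_part3.csv")) {
      chunk <- read.csv(f)           # load one chunk at a time
      fit   <- update(fit, chunk)    # fold it into the running fit
    }
    summary(fit)

This works because the least-squares computation can be updated incrementally from each new block of rows rather than needing all rows at once.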

Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread Jeff Newmiller
If you need to start your question with a false dichotomy, by all means choose the option you seem to have already chosen and stop trolling us. If you actually want an answer here, try Googling on the topic first (is "R hadoop" so un-obvious?) and then phrase a specific question so someone has a …

Re: [R] R's memory limitation and Hadoop

2014-09-16 Thread John McKown
On Tue, Sep 16, 2014 at 6:40 AM, Barry King wrote: > Is there a way to get around R’s memory-bound limitation by interfacing with a Hadoop database or should I look at products like SAS or JMP to work with data that has hundreds of thousands of records? Any help is appreciated. …

[R] R's memory limitation and Hadoop

2014-09-16 Thread Barry King
Is there a way to get around R’s memory-bound limitation by interfacing with a Hadoop database or should I look at products like SAS or JMP to work with data that has hundreds of thousands of records? Any help is appreciated. -- *Barry E. King, Ph.D.* Analytics Modeler …