Not sure trolling was intended here.

Anyway:

Yes, there are ways of working with very large data sets in R, using databases 
or otherwise. Check the CRAN task views (the "High-Performance and Parallel 
Computing with R" view covers large-memory and out-of-memory data).
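
For the database route, a minimal sketch with DBI and RSQLite (the file, table 
and column names here are made up) would look something like the following; the 
point is that the filtering and aggregation happen inside the database, so only 
a small result set ever occupies R's memory:

library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "big_data.sqlite")  # hypothetical file
## let the database do the heavy lifting; only the aggregated
## result is pulled into R
res <- dbGetQuery(con,
    "SELECT grp, AVG(x) AS mean_x FROM big_table GROUP BY grp")
dbDisconnect(con)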

SAS will for _some_ purposes be able to avoid overflowing RAM by using 
sequential file access. The biglm package is an example of using similar 
techniques in R. SAS is not (to my knowledge) able to do this invariably; some 
procedures may need to load the entire data set into RAM.
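
As a rough sketch of the chunked approach with biglm (the data and the chunking 
below are artificial; in practice each chunk would be read from a file or a 
database connection rather than held in memory):

library(biglm)

set.seed(1)
dat <- data.frame(y = rnorm(1e5), x1 = rnorm(1e5), x2 = rnorm(1e5))
chunks <- split(dat, rep(1:10, each = 1e4))    # stand-in for reading 10 chunks

fit <- biglm(y ~ x1 + x2, data = chunks[[1]])  # fit on the first chunk
for (ch in chunks[-1])
    fit <- update(fit, ch)                     # fold in each further chunk
summary(fit)   # same least-squares coefficients as lm() on all 1e5 rows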

JMP's data tables are limited by available RAM, just like R's are.

R does have somewhat inefficient memory strategies (e.g., model matrices expand 
each factor into multiple columns of 0/1 dummy variables, each stored as a 
double at 8 bytes per entry), so it may run out of memory sooner than other 
programs, but it is not as if the competition is not RAM-restricted at all.
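
A small illustration of that expansion (toy data, approximate sizes):

f <- factor(sample(letters[1:20], 1e5, replace = TRUE))
mm <- model.matrix(~ f)   # intercept + 19 dummy columns, all doubles
dim(mm)                   # 100000 x 20
print(object.size(f), units = "Mb")   # the factor itself: ~0.4 Mb of integer codes
print(object.size(mm), units = "Mb")  # numeric storage alone is 8 * 1e5 * 20
                                      # bytes, ~15 Mb; dimnames add more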

- Peter D.

On 16 Sep 2014, at 14:27, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:

> If you need to start your question with a false dichotomy, by all means 
> choose the option you seem to have already chosen and stop trolling us.
> If you actually want an answer here, try Googling on the topic first (is "R 
> hadoop" so un-obvious?) and then phrase a specific question so someone has a 
> chance to help you.
> 
> On September 16, 2014 4:40:29 AM PDT, Barry King <barry.k...@qlx.com> wrote:
>> Is there a way to get around R’s memory-bound limitation by interfacing
>> with a Hadoop database or should I look at products like SAS or JMP to
>> work
>> with data that has hundreds of thousands of records?  Any help is
>> appreciated.
>> 
>> -- 
>> __________________________
>> *Barry E. King, Ph.D.*
>> Analytics Modeler
>> Qualex Consulting Services, Inc.
>> barry.k...@qlx.com
>> O: (317)940-5464
>> M: (317)507-0661
>> __________________________
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
