I have recently been using R - more specifically the GUI packages Rattle and Rcmdr.
I like these products a lot and want to use them for some projects, but I run into problems when I try to push large datasets through them. The datasets have 10-15 million records and usually 15-30 fields (both numerical and categorical). I saw that there are some packages that can handle large datasets in R - bigmemory, ff, ffdf, biganalytics. My problem is that I am not much of a coder (which is why I use the GUIs mentioned above). These GUIs do show the executable R code in the background, so my thought was to run a small sample through the GUI, copy the generated code, and then incorporate one of the large-data packages mentioned above. Has anyone tried this, and do you have working examples?

What I am trying to do with the data is really simple: descriptive statistics, k-means clustering, and possibly some decision trees.

Any help would be greatly appreciated.

Thank you - John

John Filben
Cell Phone - 773.401.2822
Email - johnfil...@yahoo.com
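[A minimal sketch of the workflow described above, for anyone trying the same thing: run the analysis on a small in-memory sample first (the same calls Rattle/Rcmdr generate), then swap in the bigmemory/biganalytics equivalents for the full data. The sample data, the file name "mydata.csv", and its column layout are hypothetical placeholders; the commented big-data lines assume the bigmemory and biganalytics packages are installed and are untested here.]

```r
## Small in-memory sample standing in for the GUI-loaded data frame.
## (Hypothetical data; replace with a sample of your own file.)
set.seed(42)
sample_df <- data.frame(x = rnorm(1000), y = rnorm(1000))

## Descriptive statistics - same kind of call the GUIs generate
print(summary(sample_df))

## k-means clustering on the sample
fit <- kmeans(sample_df, centers = 3, nstart = 10)
print(fit$centers)

## For the full 10-15 million rows, the large-data versions would be
## roughly as follows (assumption - requires bigmemory/biganalytics,
## and "mydata.csv" is a placeholder):
## library(bigmemory)
## library(biganalytics)
## big <- read.big.matrix("mydata.csv", header = TRUE, type = "double")
## fit <- bigkmeans(big, centers = 3, nstart = 10)
```

The point of the sketch is that kmeans() and bigkmeans() take the same basic arguments, so code copied from the GUI needs only the data-loading and clustering calls swapped.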
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.