This is ridiculous! Please read "An Introduction to R" (ships with R) or another online R tutorial. There are many good ones, and there are probably online courses as well. Please make an effort to learn the basics before posting further here.
-- Bert

On Sun, Aug 18, 2013 at 7:13 AM, Dylan Doyle <ddoyle....@gmail.com> wrote:

> Hello all, thank you for your speedy replies,
>
> Here are the first few lines from the head function:
>
>   brewery_id            brewery_name review_time review_overall review_aroma review_appearance review_profilename
> 1      10325         Vecchio Birraio  1234817823            1.5          2.0               2.5            stcules
> 2      10325         Vecchio Birraio  1235915097            3.0          2.5               3.0            stcules
> 3      10325         Vecchio Birraio  1235916604            3.0          2.5               3.0            stcules
> 4      10325         Vecchio Birraio  1234725145            3.0          3.0               3.5            stcules
> 5       1075 Caldera Brewing Company  1293735206            4.0          4.5               4.0     johnmichaelsen
> 6       1075 Caldera Brewing Company  1325524659            3.0          3.5               3.5            oline73
>
>                       beer_style review_palate review_taste              beer_name beer_abv beer_beerid
> 1                     Hefeweizen           1.5          1.5           Sausa Weizen      5.0       47986
> 2             English Strong Ale           3.0          3.0               Red Moon      6.2       48213
> 3         Foreign / Export Stout           3.0          3.0 Black Horse Black Beer      6.5       48215
> 4                German Pilsener           2.5          3.0             Sausa Pils      5.0       47969
> 5 American Double / Imperial IPA           4.0          4.5          Cauldron DIPA      7.7       64883
> 6           Herbed / Spiced Beer           3.0          3.5    Caldera Ginger Beer      4.7       52159
>
> I have only discovered how to import the data set and run some basic R
> functions on it. My goal is to be able to answer questions like: what are
> the top 10 pilsners, or which brewer has the highest average ABV? Also,
> using two factors such as best beer aroma and appearance, which beer style
> should I try? Let me know if I can give you any more information you might
> need to help me.
>
> Thanks again,
>
> Dylan
>
> On Sun, Aug 18, 2013 at 4:16 AM, Paul Bernal <paulberna...@gmail.com> wrote:
>
>> Thank you so much Steve.
>>
>> The computer I'm currently working with runs 32-bit Windows 7, and RAM is
>> only 4GB, so I guess that's a big limitation.
>> On 18/08/2013 03:11, "Steve Lianoglou" <lianoglou.st...@gene.com> wrote:
>>
>> > Hi Paul,
>> >
>> > On Sun, Aug 18, 2013 at 12:56 AM, Paul Bernal <paulberna...@gmail.com>
>> > wrote:
>> > > Thanks a lot for the valuable information.
>> > >
>> > > Now my question would necessarily be, how many columns can R handle,
>> > > provided that I have millions of rows and, in general, what's the
>> > > maximum number of rows and columns that R can effortlessly handle?
>> >
>> > This is all determined by your RAM.
>> >
>> > Prior to R-3.0, R could only handle vectors of length up to 2^31 - 1. If
>> > you were working with a matrix, that meant that you could only have that
>> > many elements in the entire matrix.
>> >
>> > If you were working with a data.frame, you could have data.frames with
>> > 2^31 - 1 rows, and I guess as many columns: since data.frames are really
>> > a list of vectors, the entire thing doesn't have to be in one
>> > contiguous block (and addressable that way).
>> >
>> > R-3.0 introduced "Long Vectors" (search for that section in the release
>> > notes):
>> >
>> > https://stat.ethz.ch/pipermail/r-announce/2013/000561.html
>> >
>> > It lifts the 2^31 - 1 limit on vector length (up to 2^52 elements in
>> > theory, assuming you are running 64-bit). So, if you've got the RAM, you
>> > can have a data.frame/data.table w/ billion(s) of rows, in theory.
>> >
>> > To figure out how much data you can handle on your machine, you need
>> > to know the size of real/integer/whatever and the number of elements
>> > of those you will have so you can calculate the amount of RAM you need
>> > to load it all up.
>> >
>> > Lastly, I should mention there are packages that let you work with
>> > "out of memory" data, like bigmemory, biglm, ff.
>> > Look at the HPC Task view for more info along those lines:
>> >
>> > http://cran.r-project.org/web/views/HighPerformanceComputing.html
>> >
>> > >
>> > > Best regards and again thank you for the help,
>> > >
>> > > Paul
>> > > On 18/08/2013 02:35, "Steve Lianoglou" <lianoglou.st...@gene.com> wrote:
>> > >
>> > >> Hi Paul,
>> > >>
>> > >> First: please keep your replies on list (use reply-all when replying
>> > >> to R-help lists) so that others can help but also the lists can be
>> > >> used as a resource for others.
>> > >>
>> > >> Now:
>> > >>
>> > >> On Aug 18, 2013, at 12:20 AM, Paul Bernal <paulberna...@gmail.com> wrote:
>> > >>
>> > >> > Can R really handle millions of rows of data?
>> > >>
>> > >> Yup.
>> > >>
>> > >> > I thought it was not possible.
>> > >>
>> > >> Surprise :-)
>> > >>
>> > >> As I type, I'm working with a ~5.5 million row data.table pretty
>> > >> effortlessly.
>> > >>
>> > >> Columns matter too, of course -- RAM is RAM, after all and you've got
>> > >> to be able to fit the whole thing into it if you want to use
>> > >> data.table. Once loaded, though, data.table enables one to do
>> > >> split/apply/combine calculations over these data quite efficiently.
>> > >> The first time I used it, I was honestly blown away.
>> > >>
>> > >> If you find yourself wanting to work with such data, you could do
>> > >> worse than read through data.table's vignette and FAQ and give it a
>> > >> spin.
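
The split/apply/combine pattern mentioned in the quoted reply can be illustrated on the kind of question asked earlier in this thread (which brewer has the highest average ABV). What follows is only a minimal sketch in base R, so that it runs without extra packages; it is not the data.table code the quoted author used. The `beer` data frame is a hypothetical six-row copy of a few columns from the head() output quoted earlier, so results on the full data set would differ.

```r
# Toy data: a few columns copied from the six head() rows quoted in this
# thread. This is an illustrative assumption, not the full data set.
beer <- data.frame(
  brewery_name = c(rep("Vecchio Birraio", 4),
                   rep("Caldera Brewing Company", 2)),
  beer_style   = c("Hefeweizen", "English Strong Ale",
                   "Foreign / Export Stout", "German Pilsener",
                   "American Double / Imperial IPA",
                   "Herbed / Spiced Beer"),
  beer_abv     = c(5.0, 6.2, 6.5, 5.0, 7.7, 4.7),
  stringsAsFactors = FALSE
)

# Split the rows by brewery, apply mean() to beer_abv within each group,
# and combine the per-group results into one data frame.
abv_by_brewery <- aggregate(beer_abv ~ brewery_name, data = beer, FUN = mean)

# Sort so that the brewery with the highest average ABV comes first.
abv_by_brewery <- abv_by_brewery[order(-abv_by_brewery$beer_abv), ]
print(abv_by_brewery)
```

If the data.table package is installed and the data is held in a data.table called `DT` (a hypothetical name), a roughly equivalent grouped mean would be `DT[, .(beer_abv = mean(beer_abv)), by = brewery_name]`.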
>> > >>
>> > >> HTH,
>> > >>
>> > >> -steve
>> > >>
>> > >> --
>> > >> Steve Lianoglou
>> > >> Computational Biologist
>> > >> Bioinformatics and Computational Biology
>> > >> Genentech
>> > >
>> > > [[alternative HTML version deleted]]
>> > >
>> > > ______________________________________________
>> > > R-help@r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
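
The RAM estimate described in the quoted reply (size of each element times the number of elements) can be sketched roughly as follows. This is a back-of-the-envelope calculation only. It assumes 8 bytes per double and 4 bytes per integer, and it ignores character columns, per-object overhead, and the temporary copies R makes while working. The function name `estimate_gib` and the column counts are hypothetical, chosen for illustration.

```r
# Rough RAM estimate for a table held entirely in memory.
# Assumed element sizes: 8 bytes per double, 4 bytes per integer.
estimate_gib <- function(n_rows, n_double_cols = 0, n_integer_cols = 0) {
  bytes <- n_rows * (8 * n_double_cols + 4 * n_integer_cols)
  bytes / 1024^3  # convert bytes to GiB
}

# Hypothetical example: 5.5 million rows with 10 double and 3 integer columns.
est <- estimate_gib(5.5e6, n_double_cols = 10, n_integer_cols = 3)
print(round(est, 2))

# Sanity check against what R reports for a single column of one million
# doubles.
x <- numeric(1e6)
print(format(object.size(x), units = "Mb"))
```

Leaving generous headroom above such an estimate seems prudent, since many operations copy the data at least once.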