Thank you, Gabor, this is fantastic: easy to use and so powerful. I was instantly able to do many things with .csv files that are much too large for my PC's memory. This is clearly my new favorite way to read in data. I love it!
Is it possible to use sqldf with a fixed-width format that requires a file layout? For example, say you have a .dat file called madeup.dat with no header row. This hypothetical file has 3 variables (state, zipcode, and score), is 10 characters wide, and has 20 rows (again, just a made-up file). Here is my fumbling attempt at code that would read in only state and score, and randomly select 10 observations:

library(sqldf)

# Source pulls in the development version of sqldf.
source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")

# Open a connection to that file.
MyConnection <- file("madeup.dat")

# Read in only the state and score variables, and randomly select only 10 rows.
MyData <- sqldf("select state,score from MyConnection order by random(*) limit 10")

# I think everything about this would work, except that it does not currently know
# which text columns hold the state variable (1-2), that the text columns for
# zipcode (3-7) should be ignored, and finally that score (text columns 8-10)
# should be included again. If I have overlooked this, I apologize.
# Thank you.

--
View this message in context: http://www.nabble.com/Dealing-With-Extremely-Large-Files-tp19695311p19750580.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
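
For reference, the layout described in the question (state in text columns 1-2, zipcode in 3-7, score in 8-10) can be written out in base R. This is only a sketch of the layout, not an sqldf solution: it uses read.fwf, which reads the whole file into memory, so it does not by itself help with files larger than RAM. The file contents, the state codes, and the names path and MySample are made up for illustration.

```r
# Build a hypothetical stand-in for madeup.dat: 20 rows, 10 characters each,
# no header (state = cols 1-2, zipcode = cols 3-7, score = cols 8-10).
set.seed(1)
states <- sample(c("NY", "CA", "TX", "OH"), 20, replace = TRUE)
zips   <- sprintf("%05d", sample(10000:99999, 20))
scores <- sprintf("%03d", sample(0:999, 20))
path   <- tempfile(fileext = ".dat")
writeLines(paste0(states, zips, scores), path)

# A negative width tells read.fwf to skip that many characters,
# so the 5 zipcode characters are never turned into a column.
MyData <- read.fwf(path,
                   widths     = c(2, -5, 3),
                   col.names  = c("state", "score"),
                   colClasses = c("character", "integer"))

# Randomly keep 10 of the 20 rows.
MySample <- MyData[sample(nrow(MyData), 10), ]
```

The widths vector c(2, -5, 3) is the "file layout" here: two characters kept, five skipped, three kept.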