Re: [R] Dealing With Extremely Large Files

zerfetzen Tue, 30 Sep 2008 15:41:50 -0700

Thank you Gabor, this is fantastic, easy to use and so powerful.  I was
instantly able to do many things with .csv files that are much too large for
my PC's memory.  This is clearly my new favorite way to read in data; I love
it!

Is it possible to use sqldf with a fixed width format that requires a file
layout?

For example, let's say you have a .dat file called madeup.dat, without a
header row.  The hypothetical file madeup.dat for discussion has 3 variables
(state, zipcode, and score), is 10 characters wide, and has 20 rows (again,
just a made-up file).

Here is my fumbling attempt at code that will read in only state and score,
and randomly select 10 obs:

library(sqldf)

# Source pulls in the development version of sqldf.
source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R")

# Open a connection to that file.
MyConnection <- file("madeup.dat")

# Read in only state and score variables, and randomly select only 10 rows.
MyData <- sqldf("select state,score from MyConnection order by random(*)
limit 10")

# I think everything about this would work, except that sqldf does not
# currently know which text columns hold the state variable (1-2), that
# the text columns for zipcode (3-7) should be ignored, and finally that
# score (text columns 8-10) should be included.  If I have overlooked
# this, I apologize.
# Thank you.
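
For comparison, here is a rough base R sketch of the layout described above
(state in text columns 1-2, zipcode in 3-7, score in 8-10).  It does not use
sqldf at all, so it does not answer the sqldf question; it only shows the
column positions in action.  The file contents are invented, the file is
written to a temporary directory rather than the working directory, and the
helper read_state_score() and its chunk_size argument are hypothetical names
made up for this illustration.  Note that read.fwf loads the whole file, so
the chunked variant is probably the better fit for a file too large for
memory, though it still keeps state and score for every row before sampling.

```r
# Build a made-up 20-row, 10-character-wide file (hypothetical values):
# columns 1-2 state, 3-7 zipcode, 8-10 score.
set.seed(1)
states <- sample(c("CA", "NY", "TX"), 20, replace = TRUE)
zips   <- sprintf("%05d", sample(0:99999, 20))
scores <- sprintf("%3d", sample(100:999, 20))
path   <- file.path(tempdir(), "madeup.dat")
writeLines(paste0(states, zips, scores), path)

# Option 1: read.fwf.  A negative width skips that many characters, so
# c(2, -5, 3) keeps state, drops zipcode, and keeps score.
MyData <- read.fwf(path, widths = c(2, -5, 3),
                   col.names = c("state", "score"),
                   colClasses = c("character", "integer"))

# Option 2: read the file in chunks and cut out the two fields with
# substr(), so the zipcode text is never kept.
read_state_score <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    pieces[[length(pieces) + 1L]] <- data.frame(
      state = substr(lines, 1, 2),
      score = as.integer(substr(lines, 8, 10)),
      stringsAsFactors = FALSE
    )
  }
  do.call(rbind, pieces)
}

AllRows <- read_state_score(path, chunk_size = 7L)

# Randomly select 10 of the rows.
MySample <- AllRows[sample(nrow(AllRows), 10), ]
```

The small chunk_size of 7 is only there to exercise the loop on a 20-row
file; a real file would use a much larger value.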
-- 
View this message in context: 
http://www.nabble.com/Dealing-With-Extremely-Large-Files-tp19695311p19750580.html
Sent from the R help mailing list archive at Nabble.com.

