Gabor, thanks a lot - sqldf might be a solution. However, do you know if sqldf can also read .txt files with delimiters other than a comma? The data I am dealing with is "|"-delimited, so I was using read.table(..., sep="|"). I looked at the sqldf documentation but did not see any examples with .txt files.
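For what it's worth, read.csv.sql appears to take a sep argument much like read.table does, so a call along the following lines may be worth trying. This is only a minimal sketch under that assumption: the sample file, its column names, and the limit/offset values are made up for illustration, and a real 17,000,000-row file may need other arguments (eol, for instance).

```r
library(sqldf)

# Hypothetical sample data: a small "|"-delimited text file with a header row.
path <- tempfile(fileext = ".txt")
writeLines(c("id|name|score",
             "1|alpha|10",
             "2|beta|20",
             "3|gamma|30",
             "4|delta|40"), path)

# sep = "|" sets the field delimiter; the file extension should not matter.
# "limit 2 offset 1" asks SQLite to skip one data row and return the next two.
chunk <- read.csv.sql(path,
                      sql = "select * from file limit 2 offset 1",
                      header = TRUE, sep = "|")
print(chunk)

# Remove the temporary sample file.
unlink(path)
```

For the real file, the same pattern with a larger limit and offset would presumably pull one block of rows at a time without reading the skipped rows into R.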
Thanks a lot!
Dimitri

On Fri, Oct 22, 2010 at 6:28 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> On Fri, Oct 22, 2010 at 5:17 PM, Dimitri Liakhovitski
> <dimitri.liakhovit...@gmail.com> wrote:
>> I know I could figure it out empirically - but maybe based on your
>> experience you can tell me if it's doable in a reasonable amount of
>> time:
>> I have a table (in .txt) with 17,000,000 rows (and 30 columns).
>> I can't read it all in (there are many strings). So I thought I could
>> read it in in parts (e.g., 1 million) using nrows= and skip.
>> I was able to read in the first 1,000,000 rows no problem in 45 sec.
>> But then I tried to skip 16,999,999 rows and then read in things. Then
>> R crashed. Should I try again - or is it too many rows to skip for R?
>>
>
> You could try read.csv.sql in sqldf.
>
> library(sqldf)
> read.csv.sql("myfile.csv", skip = 1000, header = FALSE)
> or
> read.csv.sql("myfile.csv", sql = "select * from file limit 1000, 2000")
>
> The first skips the first 1000 lines including the header and the
> second one skips 1000 rows (but still reads in the header) and then
> reads 2000 rows. You may or may not need to specify other arguments
> as well. For example, you may need to specify eol = "\n" or other
> depending on your line endings.
>
> Unlike read.csv, read.csv.sql reads the data directly into an sqlite
> database (which it creates on the fly for you). The data does not go
> through R during this operation. From there it reads only the data
> you ask for into R so R never sees the skipped over data. After all
> that it automatically deletes the database.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.