I just tried it:

for(i in 11:16){
  #i<-11
  start<-Sys.time()
  print(start)
  flush.console()
  filename<-paste("skipped millions- ",i,".txt",sep="")
  mydata<-read.csv.sql("myfilel.txt", sep="|", eol="\r\n", sql = "select * from file limit 1000000, (1000000*i-1)")
  write.table(mydata,file=filename,sep="\t",row.names=F)
  end<-Sys.time()
  print(end-start)
  flush.console()
}
It started running at 9:52 am. Around 10:05 am I got this error:

Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (error in statement: no such column: i)

What does it mean?
Thank you!
Dimitri

On Sat, Oct 23, 2010 at 9:44 AM, Dimitri Liakhovitski
<dimitri.liakhovit...@gmail.com> wrote:
> Oh, I understand - I did not realize it's reading in the whole file.
> So, is there any way to make it read it in only once and the spit into
> R just one piece (e.g., 1 million rows), write a regular file out
> (e.g., a txt using write.table), and then grab the next million?
> Because I was planning to do something like this (I have 17+ million rows):
>
> for(1:16){
> filename<-paste("million number ",i,".txt",sep="")
> mydata<-read.csv.sql("myfile.txt", sep="|", eol="\r\n", sql = "select
> * from file limit 1000000, (1000000*i-1)")
> write.table(mydata,file=filename,sep="\t",row.names=F)
> }
>
> But if each iteration it will be reading in the whole file for a long
> time - it'll take a long time... Is there any way to make it read the
> file in only once? I guess not - because there is not enough memory to
> hold it in the first place...
> Thanks again for your advice!
> Dimitri
>
> On Sat, Oct 23, 2010 at 9:32 AM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>> On Sat, Oct 23, 2010 at 9:20 AM, Dimitri Liakhovitski
>> <dimitri.liakhovit...@gmail.com> wrote:
>>> This is very helpful, Gabor.
>>> I've run the code to figure out the end of the line and here is what I
>>> am seeing at the end of each line: \r\n
>>> So, I specified like this:
>>> mydata<-read.csv.sql("myfile.txt", sep="|", eol="\r\n", sql = "select
>>> * from file limit 200, 100")
>>>
>>> However, again it's hanging again. Another typo?
>>>
>>
>> I wonder if its just taking longer than you think. It does read the
>> entire file into sqlite even if you only read a portion from sqlite to
>> R so if the file is very long it will still take some time. Try
>> creating a small file of a few hundred lines from your file and
>> experiment on that until you get it working.
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
>

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
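
A note on the "no such column: i" error: the sql argument is passed to SQLite as literal text, so SQLite looks for a column named i instead of seeing the R loop variable. The sketch below builds the statement in R before it is sent. It assumes the intent is to fetch consecutive blocks of one million rows, with block i covering rows (i-1)*1000000+1 through i*1000000; chunk_sql and chunk_size are hypothetical names, not part of sqldf. In SQLite's "limit a, b" form the first number is the offset and the second is the row count.

```r
# Hypothetical helper (not part of sqldf): build the SQL text for the i-th
# block of rows, substituting the numbers in R before SQLite sees the string.
chunk_sql <- function(i, chunk_size = 1000000) {
  offset <- (i - 1) * chunk_size
  # "%.0f" keeps large values out of scientific notation (e.g. 1e+07)
  sprintf("select * from file limit %.0f, %.0f", offset, chunk_size)
}

# Inspect the statements that would be sent for blocks 11 and 12
print(chunk_sql(11))
print(chunk_sql(12))

# Assumed usage inside the loop from the message (needs the sqldf package
# and the real input file, so it is left commented out here):
# mydata <- read.csv.sql("myfile.txt", sep = "|", eol = "\r\n", sql = chunk_sql(i))
```

This only addresses the error message. As the quoted reply points out, each read.csv.sql call still loads the whole file into SQLite, so every iteration would remain slow.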