On 08/02/2013 01:29 PM, Babu Guha wrote:
I have a comma-delimited file with 62 fields, some of which are comments.
There are about 1.5 million records/lines. Some of the fields that hold
comments, which I do not need, have 40 characters. Of the 62 fields, I
will need at most 12. What's the best way to read in only the fields I need?
If I read the entire file at once I will run out of memory. Could anyone
please suggest a solution?

Hi Babu,
Assuming that you know which fields you want, you could process the file line by line:

# say your file is "mydata.csv" and you want fields 1 to 12
mycon<-file("mydata.csv",open="r")
# assume you have exactly 1.5 million lines
mydata<-matrix(NA,nrow=1500000,ncol=12)
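lineindex<-1
repeat {
 # read a line; readLines returns character(0) at end of file
 inputline<-readLines(mycon,1)
 if(length(inputline) == 0) break
 # skip blank lines rather than stopping at them
 if(nchar(inputline)) {
  # naive split: assumes no quoted field contains a comma
  mydata[lineindex,]<-
   strsplit(inputline,",",fixed=TRUE)[[1]][1:12]
  lineindex<-lineindex+1
 }
}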
close(mycon)
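
If the line-by-line loop is too slow, another option is to let read.csv drop the unwanted fields as it reads: a colClasses entry of "NULL" skips that column, and NA leaves the type to be guessed. The sketch below is only an illustration. It builds a small temporary file in place of your "mydata.csv", and it assumes the fields you want are the first 12 and that there is no header line.

```r
# hypothetical stand-in for the real 62-field file
demo_file <- tempfile(fileext = ".csv")
demo_rows <- replicate(5, paste(sample(100, 62, replace = TRUE), collapse = ","))
writeLines(demo_rows, demo_file)

# "NULL" skips a column, NA lets read.csv guess its type
wanted <- 1:12                 # assumed positions of the fields to keep
classes <- rep("NULL", 62)
classes[wanted] <- NA
mydata <- read.csv(demo_file, header = FALSE, colClasses = classes)

# only the 12 wanted columns are kept
dim(mydata)
unlink(demo_file)
```

For the real file you would replace demo_file with "mydata.csv" and set wanted to the positions of your 12 fields. The result still has to fit in memory, but 12 columns take far less room than 62.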

Jim
