Thus the post-processing, which I assume you'd have to do with scan() as well.
> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
> allfile <- readLines(tcon, n=10000)
> strsplit(paste(allfile, collapse="\n"), "\"")
[[1]]
[1] "A " "Two line\nentry" "\n\n" "Three\nline\nentry"
[5] " D E"

Or similar, depending on exactly what you want the result to look like.

On Thu, Oct 15, 2015 at 4:56 PM, William Dunlap <wdun...@tibco.com> wrote:
> readLines() does not work for me since it breaks up
> multiline fields that are enclosed in quotes. E.g., the
> text file line
>    A "Two line\nentry"
> should be imported as 2 strings, the second being
> "Two line\nentry", not "\"Two line" with the next call to
> readLines bringing in "entry\"".
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Thu, Oct 15, 2015 at 1:44 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:
>> I've always used system("wc -l myfile") to get the number of lines in
>> advance. But here are two other R-only options, both using readLines
>> instead of scan. There's probably something more efficient, too.
>>
>> Your setup:
>> t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
>> tfile <- tempfile()
>> cat(t, file=tfile)
>> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
>>
>> readLines() produces character(0) for nonexistent lines and "" for empty
>> lines.
>>
>> > readLines(tcon, n=1)
>> [1] "A \"Two line"
>> > readLines(tcon, n=1)
>> [1] "entry\""
>> > readLines(tcon, n=1)
>> [1] ""
>> > readLines(tcon, n=1)
>> [1] "\"Three"
>> > readLines(tcon, n=1)
>> [1] "line"
>> > readLines(tcon, n=1)
>> [1] "entry\" D E"
>> > readLines(tcon, n=1)
>> character(0)
>> > readLines(tcon, n=1)
>> character(0)
>>
>> Or if the file isn't too large for memory, you can read the whole
>> thing in then process it line by line:
>>
>> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
>> allfile <- readLines(tcon, n=10000)
>>
>> > length(allfile)
>> [1] 6
>>
>> On Thu, Oct 15, 2015 at 4:16 PM, William Dunlap <wdun...@tibco.com> wrote:
>>> I would like to read a connection line by line with scan but
>>> don't know how to tell when to quit trying. Is there any
>>> way that you can ask the connection object if it is at the end?
>>>
>>> E.g.,
>>>
>>> t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
>>> tfile <- tempfile()
>>> cat(t, file=tfile)
>>> tcon <- file(tfile, "r") # or tcon <- textConnection(t)
>>> scan(tcon, what="", nlines=1)
>>> #Read 2 items
>>> #[1] "A" "Two line\nentry"
>>> scan(tcon, what="", nlines=1) # empty line
>>> #Read 0 items
>>> #character(0)
>>> scan(tcon, what="", nlines=1)
>>> #Read 3 items
>>> #[1] "Three\nline\nentry" "D" "E"
>>> scan(tcon, what="", nlines=1) # end of file
>>> #Read 0 items
>>> #character(0)
>>> scan(tcon, what="", nlines=1) # end of file
>>> #Read 0 items
>>> #character(0)
>>>
>>> I am reading virtual line by virtual line because the lines
>>> may have different numbers of fields.
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
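
One way the two approaches in this thread might be combined is sketched below. It is not taken from any of the posts, and read_virtual_line is a hypothetical helper name. It uses the behaviour shown above, that readLines() returns character(0) once the connection is exhausted, as the end-of-file test. It keeps appending physical lines while the number of double quotes seen so far is odd, then hands the joined record to scan(text=) to split it into fields. It assumes fields are quoted with double quotes only and never contain escaped or doubled quotes.

```r
# Hypothetical helper, not from the thread: read one "virtual" line, i.e.
# physical lines joined until the double quotes balance.
# Returns NULL at end of file, otherwise a character vector of fields
# (character(0) for an empty line).
read_virtual_line <- function(con) {
  buf <- readLines(con, n = 1)
  if (length(buf) == 0) return(NULL)            # character(0) means EOF

  # Count the double quotes in the text collected so far.
  n_quotes <- function(x) lengths(regmatches(x, gregexpr('"', x, fixed = TRUE)))

  # An odd count means a quoted field is still open, so keep reading.
  while (n_quotes(buf) %% 2 == 1) {
    nxt <- readLines(con, n = 1)
    if (length(nxt) == 0) break                 # unterminated quote at EOF
    buf <- paste(buf, nxt, sep = "\n")
  }

  # Assumption: only double quotes delimit fields, so restrict scan's quote set.
  scan(text = buf, what = "", quote = "\"", quiet = TRUE)
}

# Same sample data as in the thread.
t <- 'A "Two line\nentry"\n\n"Three\nline\nentry" D E\n'
tfile <- tempfile()
cat(t, file = tfile)
tcon <- file(tfile, "r")

records <- list()
while (!is.null(fields <- read_virtual_line(tcon))) {
  records[[length(records) + 1]] <- fields      # one entry per virtual line
}
close(tcon)
str(records)
```

Returning NULL at end of file and character(0) for an empty line keeps the two cases apart, which is the distinction scan(nlines=1) does not make on its own. The loop grows the list one element at a time for clarity; for large files it would be better to preallocate or collect in chunks.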