Hello,

Try the following, readaing your file into 'x', using readLines.



tc <- textConnection("
TABLE NO.  1
COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8 COL9 COL10 COL11 COL12 1.0010E+05 0.0000E+00 1.0000E+00 1.0000E+03 -1.0000E+00 1.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.0010E+05 1.0001E+01 1.0000E+00 1.0000E+03 -1.0000E+00 1.0000E+00 2.2737E-14 -2.2737E-14 0.0000E+00 1.9281E-08 0.0000E+00 0.0000E+00 1.0010E+05 2.4000E+01 1.0000E+00 2.0000E+03 -1.0000E+00 1.0000E+00 5.7541E-15 -5.7541E-15 0.0000E+00 5.1115E-13 0.0000E+00 0.0000E+00

TABLE NO.  1
COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8 COL9 COL10 COL11 COL12 1.0010E+05 0.0000E+00 1.0000E+00 1.0000E+03 -1.0000E+00 1.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.0010E+05 1.0001E+01 1.0000E+00 1.0000E+03 -1.0000E+00 1.0000E+00 2.2737E-14 -2.2737E-14 0.0000E+00 1.9281E-08 0.0000E+00 0.0000E+00 1.0010E+05 2.4000E+01 1.0000E+00 2.0000E+03 -1.0000E+00 1.0000E+00 5.7541E-15 -5.7541E-15 0.0000E+00 5.1115E-13 0.0000E+00 0.0000E+00
")

x <- readLines(tc)
close(tc)

#------------------------ starts here
x <- x[ x != "" ]

i1 <- grep("TABLE", x)
i2 <- grep("COL", x)
y <- x[-c(i1, i2)]

tc <- textConnection(y)
dat <- read.table(tc)
close(tc)

cnames <- unlist(strsplit(x[2], " "))
names(dat) <- cnames[cnames != ""]


Hope this helps,

Rui Barradas
Em 18-10-2012 14:57, Fisher Dennis escreveu:
R 2.15.1
OS X

Colleagues,

I am reading a 1 GB file into R using read.table.  The file consists of 100 
tables, each of which is headed by two lines of characters.
The first of these lines is:
        TABLE NO.  1
The second is a list of column headers.

For example:
TABLE NO.  1
  COL1        COL2        COL3        COL4        COL5        COL6        COL7  
      COL8        COL9        COL10       COL11       COL12
   1.0010E+05  0.0000E+00  1.0000E+00  1.0000E+03 -1.0000E+00  1.0000E+00  
0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00
   1.0010E+05  1.0001E+01  1.0000E+00  1.0000E+03 -1.0000E+00  1.0000E+00  
2.2737E-14 -2.2737E-14  0.0000E+00  1.9281E-08  0.0000E+00  0.0000E+00
   1.0010E+05  2.4000E+01  1.0000E+00  2.0000E+03 -1.0000E+00  1.0000E+00  
5.7541E-15 -5.7541E-15  0.0000E+00  5.1115E-13  0.0000E+00  0.0000E+00

Later something similar appears:
TABLE NO.  1
  COL1        COL2        COL3        COL4        COL5        COL6        COL7  
      COL8        COL9        COL10       COL11       COL12
   1.0010E+05  0.0000E+00  1.0000E+00  1.0000E+03 -1.0000E+00  1.0000E+00  
0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00  0.0000E+00
   1.0010E+05  1.0001E+01  1.0000E+00  1.0000E+03 -1.0000E+00  1.0000E+00  
2.2737E-14 -2.2737E-14  0.0000E+00  1.9281E-08  0.0000E+00  0.0000E+00
   1.0010E+05  2.4000E+01  1.0000E+00  2.0000E+03 -1.0000E+00  1.0000E+00  
5.7541E-15 -5.7541E-15  0.0000E+00  5.1115E-13  0.0000E+00  0.0000E+00

I will use the term "problematic lines" to refer to the repeated occurrences of 
the two non-data lines

read.table is not successful in reading the table because of these problematic lines (I 
get around the first "TABLE NO." line using the skip option)

My word-around has been to:
        1.  read the table with readLines
        2.  remove the problematic lines
        3.  write the file to disk
        4.  read the file with read.table.
However, this process is slow.

I though about using "comment.char" as a means of avoiding reading the problematic lines. 
 However, comment.char does not accept ="[A-Z]"

Are there any clever workarounds for this?

Dennis


Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to