Colleagues,

Using R 2.7.0 on OS X, I am having trouble understanding the command textConnection. My situation is as follows:

1. I am trying to read a lengthy file (45000 lines) that has headers ~every 1000 lines. read.table (and its variants) fail because of the recurrent headers.

2. My present approach is the following:
   a. use readLines to read the file, save as an array
   b. use grep to find the recurrent headers (not including the first set)
   c. delete the recurrent headers from the array
   d. write the array to a temp file
   e. read the temp file using read.table
   f. delete the temp file

3. My understanding is that textConnection might enable me to replace steps d-f with a single step akin to read.table(textConnection(array)). This appears to work, but it is very slow. I executed code on successively larger chunks of the array:

    for (Each in 1000 * 1:45) {
        cat("N lines =", Each, "\t", date(), "\n")
        A <- read.table(textConnection(Z[1:Each]), header=TRUE)
    }

yielding:

    N lines = 1000     Sun Oct 12 07:09:48 2008
    N lines = 2000     Sun Oct 12 07:09:48 2008
    N lines = 3000     Sun Oct 12 07:09:48 2008
    N lines = 4000     Sun Oct 12 07:09:50 2008
    N lines = 5000     Sun Oct 12 07:09:52 2008
    N lines = 6000     Sun Oct 12 07:09:56 2008
    N lines = 7000     Sun Oct 12 07:10:01 2008
    N lines = 8000     Sun Oct 12 07:10:09 2008
    N lines = 9000     Sun Oct 12 07:10:18 2008
    N lines = 10000    Sun Oct 12 07:10:31 2008
    N lines = 11000    Sun Oct 12 07:10:46 2008
    N lines = 12000    Sun Oct 12 07:11:04 2008
    N lines = 13000    Sun Oct 12 07:11:25 2008
    N lines = 14000    Sun Oct 12 07:11:51 2008
    N lines = 15000    Sun Oct 12 07:12:20 2008
    N lines = 16000    Sun Oct 12 07:12:54 2008
    N lines = 17000    Sun Oct 12 07:13:32 2008
    N lines = 18000    Sun Oct 12 07:14:16 2008
    N lines = 19000    Sun Oct 12 07:15:04 2008
    N lines = 20000    Sun Oct 12 07:15:58 2008
    N lines = 21000    Sun Oct 12 07:16:58 2008
    N lines = 22000    Sun Oct 12 07:18:04 2008
    N lines = 23000    Sun Oct 12 07:19:17 2008
    N lines = 24000    Sun Oct 12 07:20:36 2008
    N lines = 25000    Sun Oct 12 07:22:02 2008
    N lines = 26000    Sun Oct 12 07:23:36 2008
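
The workflow in steps a-f, and the proposed textConnection shortcut, can be sketched roughly as follows. This is only an illustration on made-up data: the "id value" header, the three small blocks of rows, and the names src, tmp, hdr, A_file and A_conn are all hypothetical placeholders, not the real file or the real code.

```r
# Hypothetical stand-in for the real file: the same header line repeated
# before every block of data rows (3 blocks of 4 rows here).
header <- "id value"
block  <- function(start) paste(start:(start + 3), (start:(start + 3)) * 10)
Z_raw  <- c(header, block(1), header, block(5), header, block(9))

src <- tempfile(fileext = ".txt")   # placeholder for the real input file
writeLines(Z_raw, src)

# a. read the file into a character vector
Z <- readLines(src)

# b. find the recurrent headers, not including the first one
hdr <- grep("^id value$", Z)        # assumed header pattern
hdr <- hdr[-1]

# c. delete the recurrent headers (guard against the no-match case)
if (length(hdr) > 0) Z <- Z[-hdr]

# d-f. write to a temp file, read it with read.table, delete the temp file
tmp <- tempfile()
writeLines(Z, tmp)
A_file <- read.table(tmp, header = TRUE)
unlink(tmp)

# Proposed replacement for d-f: read directly from the character vector.
# The connection is closed explicitly so it is not left open.
con <- textConnection(Z)
A_conn <- read.table(con, header = TRUE)
close(con)

unlink(src)
```

On this small example both routes give the same data frame; it says nothing about the timing problem on the full 45000 lines.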

Any clever ideas will be greatly appreciated.

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.