The small reproducible example below works but is far too slow on the real problem, which extracts ~2920 repeated arrays from a 60 MB file and takes ~80 minutes. I'm wondering how I might re-engineer the script to avoid opening and closing the file 2920 times, as it does now. That is, is there a way to keep the file open, peel out the arrays, and stuff them into a list of data.tables, as is done in the small reproducible example below, but significantly faster?

wha <- " INITIAL PRESSURE HEAD
 INITIAL TEMPERATURE SET TO 4.000E+00 DEGREES C
VS2DH - MedSand for TL test
 TOTAL ELAPSED TIME = 0.000000E+00 sec
 TIME STEP 0
 MOISTURE CONTENT
 Z, IN m
 X OR R DISTANCE, IN m
 0.500
 0.075 0.1475
 0.225 0.1475
 0.375 0.1475
 0.525 0.1475
 0.675 0.1475
blah
blah
blah
 TEMPERATURE, IN DECREES C
 Z, IN m
 X OR R DISTANCE, IN m
 0.500
 0.075 1.1475
 0.225 2.1475
 0.375 3.1475
 0.525 4.1475
 0.675 5.1475
blah
blah
blah
 TOTAL ELAPSED TIME = 8.6400E+04 sec
 TIME STEP 0
 MOISTURE CONTENT
 Z, IN m
 X OR R DISTANCE, IN m
 0.500
 0.075 0.1875
 0.225 0.1775
 0.375 0.1575
 0.525 0.1675
 0.675 0.1475
blah
blah
blah
 TEMPERATURE, IN DECREES C
 Z, IN m
 X OR R DISTANCE, IN m
 0.500
 0.075 1.1475
 0.225 2.1475
 0.375 3.1475
 0.525 4.1475
 0.675 5.1475
blah
blah
blah"

example_content <- textConnection(wha)

srchStr1 <- ' MOISTURE CONTENT'
srchStr2 <- 'TEMPERATURE, IN DECREES C'

lines <- readLines(example_content)

mc_list <- NULL
for (i in 1:length(lines)){
  # Look for start of water content
  if(grepl(srchStr1, lines[i])){
    mc_list <- c(mc_list, i)
  }
}

tmp_list <- NULL
for (i in 1:length(lines)){
  # Look for start of temperature data
  if(grepl(srchStr2, lines[i])){
    tmp_list <- c(tmp_list, i)
  }
}

# Store the water content arrays
wc <- list()
# Read all the moisture content profiles
for(i in 1:length(mc_list)){
  lineNum <- mc_list[i] + 3
  mct <- read.table(text = wha, skip=lineNum, nrows=5,
                    col.names=c('depth','wc'))
  wc[[i]] <- mct
}

# Store the water temperature arrays
tmp <- list()
# Read all the temperature profiles
for(i in 1:length(tmp_list)){
  lineNum <- tmp_list[i] + 3
  tmpt <- read.table(text = wha, skip=lineNum, nrows=5,
                     col.names=c('depth','tmp'))
  tmp[[i]] <- tmpt
}

# quick inspection
length(wc)
wc[[1]]
# Looks like what I'm after, but too slow in real world problem
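
One possible direction, sketched under assumptions rather than tested on the real file: read the whole file into memory once with readLines(), locate every header line with a single vectorised grep(), and parse each block from the in-memory lines instead of re-reading the source for every block. The helper names read_blocks and make_block are hypothetical, and the sketch assumes each block has exactly five data rows that start four lines after its header, as in the example above.

```r
# Hypothetical helper (not part of the original script): parse every block
# that follows a header line, using lines that are already in memory.
read_blocks <- function(lines, header, value_name, offset = 3L, n_rows = 5L) {
  # Line numbers of all header lines, found in one vectorised pass
  starts <- grep(header, lines, fixed = TRUE)
  lapply(starts, function(s) {
    # Data rows sit at s + offset + 1 ... s + offset + n_rows
    block <- lines[(s + offset + 1L):(s + offset + n_rows)]
    read.table(text = block, col.names = c("depth", value_name))
  })
}

# Small stand-in for the real file, built to mimic the example layout.
# With the real data this would be a single call such as
#   all_lines <- readLines("path/to/output_file")   # placeholder path
make_block <- function(header, values) {
  depth <- c(0.075, 0.225, 0.375, 0.525, 0.675)
  c(header, " Z, IN m", " X OR R DISTANCE, IN m", " 0.500",
    sprintf(" %.3f %.4f", depth, values))
}

all_lines <- c(
  " TOTAL ELAPSED TIME = 0.000000E+00 sec",
  make_block(" MOISTURE CONTENT", rep(0.1475, 5)),
  "blah",
  make_block(" TEMPERATURE, IN DECREES C", 1:5 + 0.1475),
  "blah",
  " TOTAL ELAPSED TIME = 8.6400E+04 sec",
  make_block(" MOISTURE CONTENT", c(0.1875, 0.1775, 0.1575, 0.1675, 0.1475)),
  "blah",
  make_block(" TEMPERATURE, IN DECREES C", 1:5 + 0.1475)
)

# One list of data frames per variable, one element per time step
wc  <- read_blocks(all_lines, " MOISTURE CONTENT", "wc")
tmp <- read_blocks(all_lines, "TEMPERATURE, IN DECREES C", "tmp")
```

With this layout the file is read only once, and each read.table() call parses just five short strings. If the block length or header offset varies in the real output, the offset and n_rows arguments would need adjusting.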