On Friday, October 4, 2002, at 09:21 AM, Jerry Preston wrote:

> Hi!,

Howdy.

> I am look for a better way and a faster way to deal with a 4 - 8 meg
> data
> file. This file has been saved as an .cvs file for excel to read in.

A "better" way is pretty open to interpretation, so here's my interpretation.

> All I am interested in is the first three cells of ',' delimited data.
>
> Die,Row 0, Column 11
> Test Result,1
> Score,1
> PMark Score,0
> k Score,0
> Score,0
>
> Defects,0
>
> Mark Measurements,276
> Measurement
> 0,0,8.030399,740.998413,21.542923,16.721996,817.562500,22.048611,881.06
> 2500,
> 29.847174,11.604215,17.210899,685.522644,16.721996,0,0
> Measurement
> 1,1,12.346605,804.399353,25.516476,8.607447,817.562500,8.607447,881.062
> 500,2
> 6.055706,28.836847,20.028336,748.923584,9.931009,0,0
>
> open( FI, $file_path ) || die "unable to open $file_path $!\n";
> @file_data = <FI>;
> close FI;

That array assignment above "slurps" the entire file into memory. Let's not do that. Remove it and the close line.

> LINE: foreach $_ ( @file_data ) {

Then we can make a simple change right here:

    LINE: while (<FI>) {

This reads one line at a time and assigns it to $_, so we process each line as it comes in instead of slurping a huge file into memory. Much better.

>   if( /^Die,Row/ ) {
>     ($row, $col) = /(\d+)/g;
>   }

Better would be:

    if (/^Die,Row (\d+), Column (\d+)/) { ($row, $col) = ($1, $2); }

>   if( /^PMark Measurements/ ) {
>     ($cnt) = /(\d+)/g;

The if should probably be an elsif here, so it's only checked if the first if failed. In other words, it's a Die,Row line or a Mark Measurements line, not both. I believe you also have a rogue P in that pattern. And again, better is:

    elsif (/^Mark Measurements,(\d+)/) { $cnt = $1; }

Below this your code gets pretty confusing. Here are some thoughts:

* use strict; at the top of the program. This forces you to declare your variables before you use them, making your code easier for us to read and thus help you with. Always, always, ALWAYS do this!

* Why are we checking the line type again below?
  Let's do everything we need to do with a line before we move on. (The /^Measurement/ check below should be /^Mark Measurement/ too, I think.)

* Yikes, while (<FI>) {, are we reading from a file handle we closed? Let's not do that. Try to rethink your logic (or explain it to us and let us help you rethink it) to handle one line at a time. That's a pretty good general rule for parsing.

* Do you realize that

      @fields = split /,/, $_;

  fills the fields array with all the values on a line, which you could then walk through? WARNING: This only works if commas do not appear in the fields, but that looks true in your example data.

Clean it up a bit. Help us read it and send it back to us if you're still having problems.

James Gray

>     if( $cnt > $max ) {
>       $max = $cnt;
>     }
>     if( $cnt > 0 ) {
>       $row_col[ $jp++ ] = "$row,$col";
>       while( <FI> ) {
>         if( /^Die,Row/ ) {
>           ($row, $col) = /(\d+)/g;
>           $row_col[ $jp++ ] = "$row,$col";
>           next LINE;
>         }
>         $Z=0;
>         if( /^Measurement/ ) {
>           (@data) = split( /\,/ );
>           if( $data[ 2 ] > 0 ) {
>             ($meas) = ($data[0]) =~ /(\d+)/g;
>             $data{ "$row,$col" }{ $meas } = $data[2];
>             $data{ "$row,$col" }{ $meas }{ $data[1] } = $data[2];
>           }
>         }
>       }
>     }
>   }
> }
> @file_data = ();
>
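
For what it's worth, here is roughly how the suggestions above (read with while, anchored captures, elsif, split on commas, use strict) might fit together. It is only a sketch: the sub name parse_dies and the in-memory sample data are made up for illustration, and it assumes each per-measurement line looks like "Measurement N,..." once the mail wrapping is undone, which is a guess from the sample data. Adjust the patterns if the real file differs.

```perl
use strict;
use warnings;

# Hypothetical helper: read die records one line at a time from an open
# filehandle. Returns a hash ref keyed by "row,col" (measurement number
# => third comma-separated field) plus the largest Mark Measurements count.
sub parse_dies {
    my ($fh) = @_;
    my ( %data, $key );
    my $max = 0;

    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^Die,Row (\d+), Column (\d+)/ ) {
            $key = "$1,$2";                 # start of a new die record
        }
        elsif ( $line =~ /^Mark Measurements,(\d+)/ ) {
            $max = $1 if $1 > $max;
        }
        elsif ( defined $key && $line =~ /^Measurement (\d+),/ ) {
            my $meas   = $1;
            my @fields = split /,/, $line;  # safe only if fields hold no commas
            $data{$key}{$meas} = $fields[2]
                if @fields > 2 && $fields[2] > 0;
        }
    }
    return ( \%data, $max );
}

# Made-up sample in the shape of the posted data, read through an
# in-memory filehandle so the sketch runs without a real file.
my $sample = <<'END';
Die,Row 0, Column 11
Test Result,1
Mark Measurements,276
Measurement 0,0,8.030399,740.998413
Measurement 1,1,12.346605,804.399353
END

open my $fh, '<', \$sample or die "unable to open sample data: $!\n";
my ( $data, $max ) = parse_dies($fh);
close $fh;

for my $die ( sort keys %$data ) {
    for my $meas ( sort { $a <=> $b } keys %{ $data->{$die} } ) {
        print "$die measurement $meas: $data->{$die}{$meas}\n";
    }
}
```

For the real file, open it with something like open my $fh, '<', $file_path and pass that handle in instead. Each line is looked at exactly once and nothing is held in memory beyond the values you keep.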