Beautiful; I should've known that brian d foy would have come up with a solution--I even have a copy of that book!
Thanks,

--Marc

On Thu, Apr 21, 2011 at 3:10 PM, Brian Fraser <frase...@gmail.com> wrote:

> http://www.effectiveperlprogramming.com/blog/314
>
> Brian.
>
> On Thu, Apr 21, 2011 at 2:42 PM, Marc Perry <marcperrys...@gmail.com> wrote:
>
>> Hi,
>>
>> I was parsing a collection of HTML files where I wanted to extract a
>> certain block from each file, like this:
>>
>> > ./script.pl *.html
>>
>> my $accumulator;
>> my $capture_counter;
>>
>> while ( <> ) {
>>     if ( /<h1>/.../labelsub/ ) {
>>         $accumulator .= $_ unless /labelsub/;
>>         if ( /labelsub/ && !$capture_counter ) {
>>             print $accumulator;
>>             $capture_counter = 1;
>>         }
>>         else {
>>             next;
>>         }
>>     }
>>     else {
>>         next;
>>     }
>> }
>> continue { # flush out the variables and clean up
>>     if ( eof ) {
>>         close ARGV;
>>         $accumulator = '';
>>         $capture_counter = '';
>>     }
>> }
>>
>> The bit about the $capture_counter is because some of the files have
>> multiple blocks of text that could be accumulated, and I only want the
>> first block in the file.
>>
>> This usually works fine, until I encountered an input file that did not
>> contain the string 'labelsub' after the first '<h1>' regex pattern match.
>> Then the conditional if test continued to search in the incoming lines in
>> the next file (because I am processing a whole batch using the while (<>)
>> operator), which it eventually found, and then printed nothing, because at
>> the end-of-file of the previous file, the script flushed the contents of
>> the accumulator.
>>
>> One solution is to just run the same script individually on each file, but
>> I was wondering if there was a way to reset the 'state' of the range
>> operator pattern match at the end of the physical file (or at any other
>> time for that matter)?
>>
>> Thanks,
>>
>> --Marc
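
For anyone finding this thread later: the sketch below is one way to sidestep the problem described in the original question. It is not the approach from the linked article. It simply drops the range operator and keeps the "inside a block" state in an ordinary variable, so the state starts fresh for every file. The names first_blocks, $in_block and $block are made up for illustration. The opening line is never tested against the end pattern, which is meant to mirror the three-dot form of the operator, and edge cases may differ slightly from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the first <h1> ... labelsub block of each
# file. A block that is never closed by 'labelsub' is discarded rather
# than leaking into the next file.
sub first_blocks {
    my @files = @_;
    my @blocks;

    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";

        my $in_block = 0;     # explicit state, reset for every file
        my $block    = '';

        while ( my $line = <$fh> ) {
            if ( !$in_block ) {
                next unless $line =~ /<h1>/;
                $in_block = 1;
                $block .= $line;    # opening line: do not test the end pattern
                next;
            }
            if ( $line =~ /labelsub/ ) {
                push @blocks, $block;   # first complete block only
                last;
            }
            $block .= $line;
        }
        close $fh;
    }
    return @blocks;
}

# Run as a script: ./script.pl *.html
print first_blocks(@ARGV) unless caller;
```

Because $in_block and $block are declared inside the per-file loop, there is nothing to flush at end-of-file, and no separate capture counter is needed: the `last` stops reading a file once its first block has been collected.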