I'm trying to learn web scraping and am stuck on the basic step of
scraping a portion of a web page. I can scrape a full page and save it
as *.xml or *.htm, and I think I understand regex, but the following
fails:
**************
# Prints a portion of a red cross web page to a new htm file.
use strict;
use warnings;
use LWP::Simple;
use WWW::Mechanize;
my $url = 'http://www.redcrossnca.org/ServiceCenters/montgomery.php3';
getstore( $url, 'c://redcross.htm' );
open PAGE, 'c://redcross.htm';
while( my $line = <PAGE> ) {
$line =~ /Health and Safety Classes/
print "$1\n";
}
close PAGE;
********
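
For comparison, here is a minimal sketch of a match that actually fills $1. It is only a guess at the intent: it prints each line that contains the phrase. The sample markup and the helper name matching_lines are made up for illustration and are not the real Red Cross page or any library function. In the real script the text would come from the saved file or from LWP::Simple's get($url).

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the saved page; the real markup
# of the Red Cross page is not known here.
my $html = <<'HTML';
<html><body>
<h2>Health and Safety Classes</h2>
<p>CPR, first aid</p>
<h2>Volunteer</h2>
</body></html>
HTML

# Return every line of $text that contains $phrase.
# The parentheses in the pattern create the capture group that sets $1;
# a pattern without parentheses leaves $1 undefined.
sub matching_lines {
    my ($text, $phrase) = @_;
    my @found;
    for my $line (split /\n/, $text) {
        # \Q...\E quotes any regex metacharacters in the phrase.
        if ($line =~ /(.*\Q$phrase\E.*)/) {
            push @found, $1;
        }
    }
    return @found;
}

print "$_\n" for matching_lines($html, 'Health and Safety Classes');
```

A line-by-line loop only ever sees one line at a time, so a portion that spans several lines would need the whole file read into a single string before matching.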
Once I get the syntax straight I'll go after more detailed scrapes.
Ken