I'm trying to learn web scraping and am stuck on the basic step of scraping a portion of a web page. I'm able to scrape a full page and save it as *.xml or *.htm, and I think I understand regex, but the following fails:

**************
# Prints a portion of a red cross web page to a new htm file.

use strict;
use warnings;
use LWP::Simple;
use WWW::Mechanize;

my $url = 'http://www.redcrossnca.org/ServiceCenters/montgomery.php3';

getstore( $url, 'c://redcross.htm' );

open PAGE, 'c://redcross.htm';
while( my $line = <PAGE> ) {
    $line =~ /Health and Safety Classes/
    print "$1\n";
}

close PAGE;
********

Once I get the syntax straight I'll go after more detailed scrapes.

Ken
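
For what it's worth, here is a minimal sketch of how the matching loop might be written, assuming the goal is to print every line of the saved page that mentions the phrase. Two things in the script above stand out: the match statement has no terminating semicolon, and $1 is only set when the pattern contains capturing parentheses. The helper name matching_lines and the command-line file argument are made up for illustration, and the sketch assumes the page has already been saved with getstore as in the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text captured from every line that
# contains the phrase. The parentheses in the pattern are what set $1.
sub matching_lines {
    my ( $phrase, @lines ) = @_;
    my @found;
    for my $line (@lines) {
        # \Q...\E quotes any regex metacharacters in the phrase.
        # "." does not match a newline, so $1 excludes the trailing "\n".
        if ( $line =~ /(.*\Q$phrase\E.*)/ ) {
            push @found, $1;
        }
    }
    return @found;
}

# Scan a saved page only when a file name is given, e.g.
#   perl scrape.pl redcross.htm
if (@ARGV) {
    my $file = shift @ARGV;
    # Three-argument open with a lexical handle, checked for failure.
    open my $page, '<', $file or die "Cannot open $file: $!";
    print "$_\n" for matching_lines( 'Health and Safety Classes', <$page> );
    close $page;
}
```

Testing the match with "if" before using $1 matters, because after a failed match $1 still holds whatever an earlier successful match left in it.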
