Very Basic Web Scrape

kc68 Fri, 07 Apr 2006 13:15:43 -0700

I'm trying to learn web scraping and am stopped at the basic point of scraping a portion of a web page. I'm able to scrape a full page and save it as *.xml or *.htm, and I think I understand regex, but the following fails:


**************
# Prints a portion of a red cross web page to a new htm file.
use strict;
use warnings;
use LWP::Simple;
use WWW::Mechanize;

my $url = 'http://www.redcrossnca.org/ServiceCenters/montgomery.php3';
getstore( $url, 'c://redcross.htm' );

open PAGE, 'c://redcross.htm';
while( my $line = <PAGE> ) {
    $line =~ /Health and Safety Classes/
    print "$1\n";
}
close PAGE;
********
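
For reference, two things in the loop above would stop it from working: the match statement has no terminating semicolon, and the pattern contains no capturing parentheses, so $1 is never set. The following is only a minimal sketch of one way to print the matching line, not a definitive fix. The helper name lines_matching_in_file is hypothetical, and the sample markup written to a temporary file is a stand-in for the saved page, not the real content of the Red Cross site.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return every line of the file at $path that
# contains the literal text $phrase.
sub lines_matching_in_file {
    my ($path, $phrase) = @_;

    # Three-argument open with an error check, so a bad path is reported.
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        # The parentheses create capture group $1; \Q...\E quotes the
        # phrase so it is matched literally. Test the match before using $1.
        if ($line =~ /(.*\Q$phrase\E.*)/) {
            push @hits, $1;
        }
    }
    close $fh;
    return @hits;
}

# Stand-in for the saved page (assumed markup, for illustration only).
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} join("\n",
    '<h1>Montgomery Service Center</h1>',
    '<a href="classes.php3">Health and Safety Classes</a>',
    '<p>Contact us</p>',
), "\n";
close $tmp_fh;

print "$_\n" for lines_matching_in_file($tmp_path, 'Health and Safety Classes');
```

With the real page, the path passed to the helper would be the file written by getstore instead of the temporary file.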

Once I get the syntax straight I'll go after more detailed scrapes.

Ken
