The script below scrapes a House of Representatives vote page, which is
served as XML, and saves it to a spreadsheet that is best opened as a
read-only .xls. How can I:
1) scrape multiple vote pages into individual spreadsheets with a single
script?
2) scrape only columns C, F, G, and H of the result? I'd also prefer to
save the output as a CSV, but simply renaming *.xls to *.csv doesn't
work. Thanks in advance.
Ken
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my $output_dir   = "c:/training/bc";
my $starting_url = "http://clerk.house.gov/evs/2005/roll667.xml";

my $browser = WWW::Mechanize->new();
$browser->get($starting_url);

# Echo the page to the terminal for inspection.
foreach my $line (split /[\n\r]+/, $browser->content) {
    print "$line\n";
}

# Save the raw XML under an .xls extension (Excel parses the XML on open).
open my $out, '>', "$output_dir/vote667.xls" or die "Can't open file: $!";
foreach my $line (split /[\n\r]+/, $browser->content) {
    print $out "$line\n";
}
close $out;
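One way to tackle both questions is sketched below: take the roll numbers as command-line arguments, fetch each page in a loop, and write one CSV per vote instead of the raw XML. The `<recorded-vote>`, `<legislator>`, and `<vote>` element names used in the extraction regex are assumptions about the clerk.house.gov markup (and a regex stand-in for which spreadsheet columns C, F, G, H correspond to); inspect one page and adjust the captures to the fields you actually want.

```perl
#!/usr/bin/perl
# Sketch, not a finished tool: loops over several roll-call pages (q1)
# and writes CSV rows instead of raw XML (q2). The XML element names
# below are guesses at the clerk.house.gov format -- verify and adjust.
use strict;
use warnings;

# Turn one <recorded-vote> fragment into a quoted CSV line, or return
# nothing if the expected fields are missing.
sub vote_to_csv {
    my ($xml) = @_;
    my ($name) = $xml =~ m{<legislator[^>]*>([^<]*)</legislator>};
    my ($vote) = $xml =~ m{<vote>([^<]*)</vote>};
    return unless defined $name and defined $vote;
    return join ',', map { qq{"$_"} } $name, $vote;
}

# Fetch only when roll numbers are given on the command line, e.g.:
#   perl rollcsv.pl 665 666 667
if (@ARGV) {
    require WWW::Mechanize;
    my $output_dir = 'c:/training/bc';
    my $browser    = WWW::Mechanize->new();
    for my $roll (@ARGV) {
        $browser->get("http://clerk.house.gov/evs/2005/roll$roll.xml");
        open my $out, '>', "$output_dir/vote$roll.csv"
            or die "Can't open file: $!";
        # Pull each vote record out of the page and emit it as one CSV row.
        for my $rec ($browser->content =~ m{<recorded-vote>(.*?)</recorded-vote>}gs) {
            my $csv = vote_to_csv($rec);
            print $out "$csv\n" if defined $csv;
        }
        close $out;
    }
}
```

The naive `qq{"$_"}` quoting breaks if a field itself contains a double quote; for anything beyond a quick job, Text::CSV from CPAN handles escaping and quoting properly.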