Thanks, but this is complicated for true beginners. The first issue was which
of three choices was XML::Simple. I chose to install XML-Simple-DTDReader
over XML-Simpler or Test-XML-Simple. I later read that XML::Simple
probably comes with ActivePerl.
Then I read the FAQ for XML::Simple and found that "Although you can get
by without using any options, you shouldn't even consider using
XML::Simple in production until you know what these two options do:
forcearray and keyattr."
I'm starting to understand hashes, but sample code would help. Thank you.
Ken
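
A minimal sketch of what those two options change, for anyone else
following the thread. The <votes>/<member> sample is hypothetical and is
not the format of the House vote page; the option spellings ForceArray
and KeyAttr are the mixed-case forms of the names quoted above. Treat it
as an illustration under those assumptions, not as the recommended setup.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical one-record sample, not the real vote page.
my $xml = '<votes><member name="Smith" party="D">Yea</member></votes>';

# No options: a lone <member> comes back as a plain hash, not a list.
my $plain = XMLin($xml);
print ref $plain->{member}, "\n";                  # HASH

# ForceArray keeps <member> a list even when only one is present.
# KeyAttr => [] switches off folding on the name/key/id attributes.
my $list = XMLin($xml, ForceArray => ['member'], KeyAttr => []);
print $list->{member}[0]{name}, "\n";              # Smith

# KeyAttr folds the list into a hash keyed by the chosen attribute.
my $by_name = XMLin($xml,
    ForceArray => ['member'],
    KeyAttr    => { member => 'name' },
);
print $by_name->{member}{Smith}{content}, "\n";    # Yea
```

The point of ForceArray is that the same loop works whether the file
holds one record or several hundred.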
****************
On Tue, 06 Jun 2006 11:58:21 -0400, Anthony Ettinger
<[EMAIL PROTECTED]> wrote:
Since it's in native XML format, I would use XML::Simple to parse it into
a hash; then you can format it however you want by looping through the
hash.
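
The parse-then-loop idea could be sketched roughly like this. The sample
XML, the field names, and the csv_lines helper are all hypothetical; the
real vote page has its own element names, so the field list would need
adjusting after looking at the parsed structure (Data::Dumper helps).

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical sample; the real vote page uses its own element names.
my $xml = <<'XML';
<votes>
  <member name="Smith" party="D" state="HI">Yea</member>
  <member name="Jones" party="R" state="TX">Nay</member>
</votes>
XML

# Turn parsed records into CSV lines, keeping only the requested fields.
sub csv_lines {
    my ($records, @fields) = @_;
    my @lines;
    for my $rec (@$records) {
        # Quote every value and double any embedded quotes.
        my @cells = map {
            my $v = defined $rec->{$_} ? $rec->{$_} : '';
            $v =~ s/"/""/g;
            qq("$v");
        } @fields;
        push @lines, join ',', @cells;
    }
    return @lines;
}

# Text inside an element that also has attributes lands under 'content'.
my $data = XMLin($xml, ForceArray => ['member'], KeyAttr => []);
print "$_\n" for csv_lines($data->{member}, qw(name party content));
```

Leaving a field out of the list (state, here) is all it takes to drop a
column from the output.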
On 6/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
The script below scrapes a House of Representatives vote page, which is in
XML, and saves it as a spreadsheet that is best opened as a read-only xls
file. How can I:
1) scrape multiple vote pages into individual spreadsheets with a single
script?
2) scrape only columns C, F, G, and H in the result here? I'd also prefer
to have the spreadsheet as a CSV, but that doesn't work by just changing
*.xls to *.csv. Thanks in advance.
Ken
#!/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my $output_dir   = "c:/training/bc";
my $starting_url = "http://clerk.house.gov/evs/2005/roll667.xml";

# Fetch the vote page.
my $browser = WWW::Mechanize->new();
$browser->get($starting_url);

# Split once on line breaks and reuse the list for both outputs.
my @lines = split /[\n\r]+/, $browser->content;

# Echo the page to the screen, one line per row.
print "$_\n" for @lines;

# Save the same lines to disk (the content is still XML despite the .xls name).
open my $out, '>', "$output_dir/vote667.xls" or die "Can't open file: $!";
print {$out} "$_\n" for @lines;
close $out or die "Can't close file: $!";
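
For question 1, one possible approach is to loop over roll numbers and
build each URL and file name from the number. This is only a sketch: the
roll_url, output_path, and save_rolls helpers are hypothetical names, the
zero-padded three-digit roll number is an assumption inferred from the
single URL in the script, and the files are saved as .xml because the
content is still XML at this stage.

```perl
use strict;
use warnings;

# Build the URL for one roll call. The %03d padding is an assumption
# based on the one known URL (roll667.xml).
sub roll_url {
    my ($year, $roll) = @_;
    return sprintf 'http://clerk.house.gov/evs/%d/roll%03d.xml', $year, $roll;
}

# Build the output file name for one roll call.
sub output_path {
    my ($dir, $roll) = @_;
    return sprintf '%s/vote%03d.xml', $dir, $roll;
}

# Fetch each roll and save it under its own name.
sub save_rolls {
    my ($dir, $year, @rolls) = @_;
    require WWW::Mechanize;    # loaded only when pages are actually fetched
    my $browser = WWW::Mechanize->new();    # dies on a failed get by default
    for my $roll (@rolls) {
        $browser->get( roll_url($year, $roll) );
        my $path = output_path($dir, $roll);
        open my $out, '>', $path or die "Can't open $path: $!";
        print {$out} $browser->content;
        close $out or die "Can't close $path: $!";
    }
    return;
}

# Example call (not run here because it needs network access):
# save_rolls('c:/training/bc', 2005, 665 .. 667);
print roll_url(2005, 667), "\n";
```

Keeping the URL and file-name logic in small subs makes it easy to check
the names before any page is downloaded.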
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>