web::scraper xpath

shawn wilson Thu, 09 Dec 2010 10:01:09 -0800

I decided to use another module to get my data, but I'm having a bit
of an issue with XPath.


The data I want looks like this:

<table class="someclass" style="width:508px;" id="Any_20">
 <tbody>
  <tr>
   <td>name</td>
   <td>attribute</td>

   <td>name2</td>
   <td>attribute2</td>

   <td>possible name3</td>
   <td>possible attribute3</td>

   <td>
....
   </tr><tr>
more of the same format


With this code, I'm only getting the first line of data (i.e., <td> ...
</td>). I realize that I'm only asking for the first and second td, which
is fine, but how do I get multiple rows? I'm also reading the HTML
from a file so that I don't needlessly keep hitting their web
server.

#!/usr/bin/perl

use strict;
use warnings;


use LWP::UserAgent;
use LWP::Simple;
use Web::Scraper;
use Data::Dumper::Simple;

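# Accept only a plain file name (letters, digits, space, _ . -).
my( $infile ) = ( $ARGV[ 0 ] // '' ) =~ m/^([\ A-Z0-9_.-]+)$/i
   or die "Usage: $0 <html-file>\n";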

my $pagedata = scraper {
   process '//*/table[@class="someclass"]', 'table[]' => scraper {
      process '//tr/td[1]', 'name' => 'TEXT';
      process '//tr/td[2]', 'attr' => 'TEXT';
   };
};


open( my $fh, '<', $infile )
   or die "Can't open $infile: $!";

my $content = do { local $/; <$fh> };   # slurp the whole file
close( $fh );

my $res = $pagedata->scrape( $content )
   or die "Can't scrape content from $infile";

print Dumper( $res );
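
One possible way to pick up every row, offered as a sketch rather than a
definitive answer: key the outer process on the <tr> elements with a
'rows[]' result, and collect every <td> inside each row with 'cells[]'.
The sample markup, the names $rowdata, rows, cells and @pairs, and the
choice of a CSS selector instead of XPath are all assumptions made for
illustration; only the table class comes from the data above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Web::Scraper;

# Hypothetical sample markup, modelled on the table shown earlier.
my $html = <<'HTML';
<table class="someclass" id="Any_20">
 <tbody>
  <tr><td>name</td><td>attribute</td><td>name2</td><td>attribute2</td></tr>
  <tr><td>name3</td><td>attribute3</td></tr>
 </tbody>
</table>
HTML

# 'rows[]' collects one entry per matching <tr>; the inner scraper runs
# against each row in turn, and 'cells[]' collects every <td> in that row.
my $rowdata = scraper {
   process 'table.someclass tr', 'rows[]' => scraper {
      process 'td', 'cells[]' => 'TEXT';
   };
};

my $res = $rowdata->scrape( $html );

# Pair the cells up as name/attribute. A trailing unpaired cell is kept
# with an undefined attribute rather than dropped.
my @pairs;
for my $row ( @{ $res->{rows} || [] } ) {
   my @cells = @{ $row->{cells} || [] };
   while ( @cells ) {
      my( $name, $attr ) = splice( @cells, 0, 2 );
      push @pairs, { name => $name, attr => $attr };
   }
}

printf "%s => %s\n", $_->{name}, $_->{attr} // '(none)' for @pairs;
```

Without the [] suffix a result key holds only the first matching node,
which would explain seeing a single name and attr in the original dump.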

