On Dec 9, 10:00 am, ag4ve...@gmail.com (shawn wilson) wrote:
> i decided to use another module to get my data but, i'm having a bit
> of an issue with xpath.
>
> the data i want looks like this:
>
> <table class="someclass" style="width:508px;" id="Any_20">
> <tbody>
> <tr>
> <td>name</td>
> <td>attribute</td>
>
> <td>name2</td>
> <td>attribute2</td>
>
> <td>possible name3</td>
> <td>possible attribute3</td>
>
> <td>
> ....
> </tr><tr>
> more of the same format
>
> with this code, i'm only getting the first line of data (ie, <td> ...
> </td>). i realize that i'm only getting the first and second td which
> is fine, but how do i get multiple rows? i'm also grabbing the html
> from a file so that i don't needlessly keep hitting up their web
> server.
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> use LWP::UserAgent;
> use LWP::Simple;
> use Web::Scraper;
> use Data::Dumper::Simple;
>
> my( $infile ) = $ARGV[ 0 ] =~ m/^([\ A-Z0-9_.-]+)$/ig;
>
> my $pagedata = scraper {
>     process '//*/table[@class="someclass"]', 'table[]' => scraper {
>         process '//tr/td[1]', 'name' => 'TEXT';
>         process '//tr/td[2]', 'attr' => 'TEXT';
>     };
>
> };
>
> open( FILE, "< $infile" );
>
> my $content = do { local $/; <FILE> };
>
> my $res = $pagedata->scrape( $content )
>     or die "Can't define content to parser $!";
>
> print Dumper( $res );

I don't know Web::Scraper well, but here is a possible alternative
using XML::LibXML:

    use XML::XPath;
    use XML::LibXML;

    # $infile is the file name from your script above
    my $parser  = XML::LibXML->new;
    my $content = $parser->parse_file( $infile );

    # single-quoted so that Perl does not try to interpolate @class
    my @nodes = $content->findnodes( '//table[@class="someclass"]/tbody/tr' );

    # one iteration per <tr>, so every row is printed
    foreach my $node ( @nodes ) {
        print XML::XPath::XMLParser::as_string($node);
    }

output:

    <tr>
    <td>name</td>
    <td>attribute</td>
    <td>name2</td>
    <td>attribute2</td>
    <td>possible name3</td>
    <td>possible attribute3</td>
    </tr>

--
Charles DeRykus
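
To get at the individual name/attribute cells of every row rather than
the serialized <tr>, something along these lines might work. It is only
a sketch: the sample markup is made up to mirror the table in the
question, rows_from_string is a hypothetical helper (not part of
XML::LibXML), and it assumes the input parses as well-formed XML. A
saved real-world page may need XML::LibXML->load_html with the recover
option instead of load_xml.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical sample input mirroring the table in the question.
my $html = <<'END';
<table class="someclass" style="width:508px;" id="Any_20">
<tbody>
<tr>
<td>name</td><td>attribute</td>
<td>name2</td><td>attribute2</td>
</tr>
<tr>
<td>name3</td><td>attribute3</td>
</tr>
</tbody>
</table>
END

# rows_from_string is a hypothetical helper, not a library function.
# It returns one array ref per <tr>, each holding [name, attribute] pairs.
sub rows_from_string {
    my ($markup) = @_;
    my $doc = XML::LibXML->load_xml( string => $markup );

    my @rows;
    for my $tr ( $doc->findnodes('//table[@class="someclass"]/tbody/tr') ) {
        # "./td" is relative to the current row, so each <tr>
        # contributes only its own cells.
        my @cells = map { $_->textContent } $tr->findnodes('./td');

        # Consume the cells two at a time: name, then attribute.
        my @pairs;
        while ( my ( $name, $attr ) = splice( @cells, 0, 2 ) ) {
            push @pairs, [ $name, defined $attr ? $attr : '' ];
        }
        push @rows, \@pairs;
    }
    return @rows;
}

my @rows = rows_from_string($html);
for my $i ( 0 .. $#rows ) {
    print "row $i: $_->[0] => $_->[1]\n" for @{ $rows[$i] };
}
```

The outer XPath selects every <tr>, and the relative "./td" inside the
loop keeps the cells of one row apart from the next, which is what the
absolute '//tr/td[1]' in the original scraper did not do.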