...
Two Web questions in one day! It's hard to know exactly how you're going to
write your code, Wesley, but the stuff below should be a good starter. It
pulls in the web site and parses it using HTML::TreeBuilder. It looks for all
table row <tr> elements that contain exactly five table data <td> elements,
which gives all the item details plus a few stragglers. The real item data
has an item number in the format #9999 in the second <td> element, so
everything that doesn't look like that is ignored. Finally the description
and price are pulled from the relevant elements, and the numeric price value
is extracted with a regex. Everything that falls within your price bracket is
then printed. I didn't restrict it to 2006 stuff as there weren't any 2006
items listed at the time I wrote this, but it's easy to see how to do it, I
hope.

HTH,
Rob
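
use strict;
use warnings;
use LWP::Simple;
use HTML::TreeBuilder;

# Fetch the page; get() returns undef on failure
my $url  = 'http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0';
my $html = get($url);
die "Couldn't fetch $url\n" unless defined $html;

my $tree = HTML::TreeBuilder->new_from_content($html);

foreach my $tr ($tree->find_by_tag_name('tr')) {

  # Item rows have exactly five <td> elements
  my @td = $tr->find_by_tag_name('td');
  next unless @td == 5;

  my ($number, $desc, $price) = map $_->as_trimmed_text, @td[1, 2, 4];

  # Real items have a number like #9999 in the second cell
  next unless $number =~ /#\d+/;

  # Pull the numeric value out of the price text; skip rows without one
  my ($dollars) = $price =~ /\$([\d.]+)/;
  next unless defined $dollars and $dollars >= 20 and $dollars < 100;

  print $desc, "\n", $price, "\n\n";
}

$tree->delete;    # free the parse tree
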
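
The 2006 restriction mentioned in the message is left to the reader. One way
it might be done is sketched here, assuming the year appears somewhere in the
description text (an assumption, since the page content isn't shown in this
thread). The helper name wanted_price is hypothetical, and the comma handling
is an added guess about how larger prices are formatted. Note that a
description such as "1986-2006" also matches the year test.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the numeric price when a row looks like a
# 2006 item inside the $20-$100 bracket, and nothing otherwise.
sub wanted_price {
    my ($number, $desc, $price) = @_;

    # Same item-number test as the original script
    return unless $number =~ /#\d+/;

    # Assumption: the year is part of the description text
    return unless $desc =~ /\b2006\b/;

    # Extract the amount, allowing thousands separators (an assumption)
    my ($dollars) = $price =~ /\$([\d,]+(?:\.\d+)?)/;
    return unless defined $dollars;
    $dollars =~ tr/,//d;

    return unless $dollars >= 20 and $dollars < 100;
    return $dollars;
}

# Made-up sample rows, for illustration only
for my $row (
    [ '#1234', '2006 Silver Eagle', '$25.50' ],
    [ '#1235', '2005 Silver Eagle', '$25.50' ],
) {
    my $dollars = wanted_price(@$row);
    print "$row->[1]: $row->[2]\n" if defined $dollars;
}
```

Inside the loop of the script, the same tests could simply be written as
extra "next unless" lines.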
Thanks for your example script using HTML::TreeBuilder. However, I'm trying
to figure out why it appears to grab some items but not others. I've removed
the $20-100 limitation (I didn't need it, I really just need to poll one
item) but am still missing some of the items. The most obvious example is the
two 1986-2006 eagles at the top of the page: the script grabs one but not the
other. Any idea why? Does it have to do with it looking for the five <td>
elements?
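
One way to check whether the five-<td> test is the culprit is to print the
cell count of every row and see what the missing row reports. The following
is only a diagnostic sketch: the sample markup is made up, since the real
page's structure isn't shown in this thread, and td_counts is a hypothetical
helper name. It reports both the number find_by_tag_name returns (which
includes <td> elements at any depth under the row) and the number of <td>
elements that are direct children of the row.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Made-up sample: the second row uses colspan, so it has only four cells
my $sample = <<'HTML';
<table>
<tr><td>a</td><td>#1001</td><td>Item one</td><td>x</td><td>$25.00</td></tr>
<tr><td>a</td><td>#1002</td><td colspan="2">Item two</td><td>$30.00</td></tr>
</table>
HTML

# Hypothetical helper: for each <tr>, return [all td count, direct td count, row text]
sub td_counts {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @counts;
    foreach my $tr ($tree->find_by_tag_name('tr')) {
        # Every <td> at or under this row, however deeply nested
        my @all = $tr->find_by_tag_name('td');
        # Only <td> elements that are immediate children of the row
        my @direct = grep { ref $_ and $_->tag eq 'td' } $tr->content_list;
        push @counts, [ scalar @all, scalar @direct, $tr->as_trimmed_text ];
    }
    $tree->delete;
    return @counts;
}

printf "%d td (%d direct): %s\n", @$_ for td_counts($sample);
```

Running the same function on the HTML fetched from the real page should show
whether the skipped item sits in a row whose count is something other than
five, or whether it fails the item-number test instead.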