...
Two Web questions in one day! It's hard to know exactly how you're going to use your code, Wesley, but the stuff below should be a good starter. It pulls in the web page and parses it using HTML::TreeBuilder. It looks for all table row <tr> elements that contain exactly five table data <td> elements, which matches all the item rows plus a few stragglers. The real item data has an item number in the format #9999 in the second <td> element, so it skips every row that doesn't look like that. Finally the description and price are pulled from the relevant elements, and the numeric price value is extracted with a regex. Everything that falls within your price bracket is then printed. I didn't restrict it to 2006 stuff as there weren't any such items at the time I wrote this, but I hope it's easy to see how to do it.

HTH,

Rob


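use strict;
use warnings;

use LWP::Simple;
use HTML::TreeBuilder;

# Fetch the page; get() returns undef if the request fails
my $html = get 'http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0';
die "Couldn't fetch the page\n" unless defined $html;

my $tree = HTML::TreeBuilder->new_from_content($html);

my @tr = $tree->find_by_tag_name('tr');

foreach my $tr (@tr) {

  # Item rows have exactly five <td> elements
  my @td = $tr->find_by_tag_name('td');
  next unless @td == 5;

  # Second cell is the item number, third the description, fifth the price
  my ($number, $desc, $price) = map $_->as_trimmed_text, @td[1, 2, 4];
  next unless $number =~ /#\d+/;

  # Pull the numeric value out of the price; skip rows with no dollar amount
  my ($dollars) = $price =~ /\$([\d.]+)/;
  next unless defined $dollars and $dollars >= 20 and $dollars < 100;

  print $desc, "\n", $price, "\n\n";
}

# Release the memory held by the parse tree
$tree->delete;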
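
As a rough sketch of the 2006 restriction mentioned above (this is not part of the original script): assuming the year shows up as a standalone word in the description text, the check could look something like the code below. The sample rows and the helper name wanted() are made up for illustration, since the real page content isn't reproduced here.

```perl
use strict;
use warnings;

# Hypothetical ($desc, $price) pairs standing in for what the loop
# in the script extracts from each table row
my @items = (
  [ '2006 1 oz Silver American Eagle', '$21.49'  ],
  [ '1986 1 oz Silver American Eagle', '$35.00'  ],
  [ '2006 Silver American Eagle Proof', '$120.00' ],
  [ '2006 Silver American Eagle Set',   'Call'    ],
);

# Returns 1 for a 2006 item priced from $20 up to (not including) $100
sub wanted {
  my ($desc, $price) = @_;

  # Assumption: the year appears as a separate word in the description
  return 0 unless $desc =~ /\b2006\b/;

  # Same price regex as the script; no dollar amount means no match
  my ($dollars) = $price =~ /\$([\d.]+)/;
  return 0 unless defined $dollars;

  return ($dollars >= 20 && $dollars < 100) ? 1 : 0;
}

my @matches = grep { wanted(@$_) } @items;

print $_->[0], "\n", $_->[1], "\n\n" for @matches;
```

In the script itself the same idea would be one extra line in the loop, along the lines of: next unless $desc =~ /\b2006\b/;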

Thanks for your example script using HTML::TreeBuilder. However, I'm trying to figure out why it appears to grab some items but not others. I've removed the $20-100 limitation (I didn't need it; I really just need to poll one item) but am still missing some of the items. The most obvious example is the two 1986-2006 Eagles at the top of the page: the script grabs one but not the other. Any idea why? Does it have to do with it looking for exactly five <td> elements?
