Wesley Bresson wrote:
>
> I'm pretty new to Perl. My past experience has been in modifying other
> people's code to do what I want, but now I'm trying to write my own
> for a specific task that I can't find code for, and I'm having issues.
> I am trying to retrieve data from a webpage, say
> http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0
> for example, the price of a 2006 1oz Silver American Eagle in the
> 20-99 price break quantity. Should I use a regex to do that, or would
> I be better off with HTML::Parser? I've attempted a regex since I seem
> to understand it better, but haven't had much success in getting it to
> pull the right price. HTML::Parser I understand even less than regexes,
> but I've read that it's a more reliable way of pulling webpage data.
> I can't seem to find easy-to-understand documentation on it though, so
> I'm even further from getting it to work than with a regex. Any advice?

Two Web questions in one day!

It's hard to know exactly how you're going to use your code, Wesley, but
the stuff below should be a good starter. It pulls in the web page and
parses it using HTML::TreeBuilder. It looks for all table row <tr>
elements that contain exactly five table data <td> elements, which gives
all the item details plus a few stragglers. The real item data has an
item number in the format #9999 in the second <td> element, so everything
that's not like that is ignored. Finally the description and price are
pulled from the relevant elements, and the numeric price value is
extracted with a regex. Everything that falls within your price bracket
is then printed.

I didn't restrict it to 2006 stuff as there weren't any 2006 items at the
time I wrote this, but it's easy to see how to do it, I hope.

HTH,

Rob

use strict;
use warnings;

use LWP::Simple;
use HTML::TreeBuilder;

my $url  = 'http://www.apmex.com/shop/buy/Silver_American_Eagles.asp?orderid=0';
my $html = get($url) or die "Couldn't fetch $url\n";
my $tree = HTML::TreeBuilder->new_from_content($html);

# Item rows are the <tr> elements holding exactly five <td> cells
foreach my $tr ($tree->find_by_tag_name('tr')) {
  my @td = $tr->find_by_tag_name('td');
  next unless @td == 5;

  my ($number, $desc, $price) = map $_->as_trimmed_text, @td[1, 2, 4];

  # Real items carry an item number such as #9999 in the second cell
  next unless $number =~ /#\d+/;

  # Extract the numeric value; skip rows with no dollar amount
  my ($dollars) = $price =~ /\$([\d.]+)/;
  next unless defined $dollars and $dollars >= 20 and $dollars < 100;

  print $desc, "\n", $price, "\n\n";
}

$tree->delete;   # free the parse tree
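
The reply leaves the 2006 restriction as an exercise. One possible sketch of that filter follows. It is only an illustration under assumptions: the helper name wanted_price is hypothetical, the sample descriptions and prices are made up rather than taken from the site, and it assumes the year appears as a standalone "2006" in the description text. It also tolerates thousands separators in the price, which the script above does not.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the
# numeric price for a row whose description mentions 2006, or
# undef (empty list in list context) for anything else.
sub wanted_price {
    my ($desc, $price) = @_;

    # Assumption: the year shows up as a standalone "2006" in the description
    return unless $desc =~ /\b2006\b/;

    # Accept amounts such as $14.39 or $1,234.50
    my ($dollars) = $price =~ /\$([\d,]+(?:\.\d+)?)/;
    return unless defined $dollars;

    $dollars =~ tr/,//d;    # strip thousands separators
    return $dollars;
}

# Made-up sample rows, standing in for the ($desc, $price) pairs
# that the loop in the script pulls from each table row
my @rows = (
    [ '2006 1 oz Silver American Eagle', '$14.39'    ],
    [ '2005 1 oz Silver American Eagle', '$15.10'    ],
    [ '2006 Silver Eagle Monster Box',   '$1,234.50' ],
    [ '2006 1 oz Silver American Eagle', 'Call'      ],
);

for my $row (@rows) {
    my $dollars = wanted_price(@$row);
    print "$row->[0]: $dollars\n" if defined $dollars;
}
```

In the script itself, the same description test could sit next to the item-number check, for example as: next unless $desc =~ /\b2006\b/;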