Hi All,
 
I'm trying to get only the text from within a certain table in the HTML
source. Right now I am getting all of the text in the source. Here is my
script; I have added notes in the comments.
 
 #!/usr/bin/perl -w
 
 use HTML::TokeParser::Simple;
 use LWP::Simple;
 
 my $url = "http://www.kcprofessional.com/us/product-details.asp?search=v1&searchtext=01970&x=0&y=0";
 my $page = get($url) ||
  die "Could not load URL\n";
 
 # new() takes either a file name, or a reference to a string
 # that contains the HTML document
 
 open LGDESC, "> largedecs.txt" 
 or die "Cannot open largedecs.txt for writing: $!";
 
 my $parser = HTML::TokeParser->new(\$page) ||
  die "Could not parse page";
 
# ---------- With this I get "Use of uninitialized value in print".
# I think LWP::Simple is returning that? ---------
 
# my $tag = $parser->get_tag("table") foreach (1..28);
# my $attr = $tag->[1]->{"colspan"};
# print $tag->[0], ":",  $attr, "\n";

# ---------- I have get_attr commented out because I get a
# "can't locate in @INC" error (using ActiveState Perl) ----------

 while ($parser->get_tag("tr")) {
  $parser->get_tag("td");
# $parser->get_attr("colspan");
  print LGDESC $parser->get_text(), "\n";
 }
 
 close LGDESC;
# ----------- end --------------
 
Can someone please help me narrow down the returned text?
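
To make the goal concrete, here is a minimal sketch of the kind of narrowing being asked about. It is not a tested fix for the page above. It assumes the wanted table can be identified by its position in the document (the $wanted value and the sample HTML are hypothetical stand-ins), and it uses only the get_tag and get_trimmed_text methods of HTML::TokeParser. Nested tables are counted like any other table and would end the collection early at their closing tag, so this only suits simple layouts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical stand-in HTML; the real page would come from LWP::Simple's get().
my $page = <<'HTML';
<table><tr><td>skip me</td></tr></table>
<table><tr><td>first cell</td><td>second cell</td></tr>
<tr><td>third cell</td></tr></table>
<table><tr><td>skip me too</td></tr></table>
HTML

# Assumption: the table of interest is the 2nd <table> start tag in the document.
my $wanted = 2;

my $parser = HTML::TokeParser->new(\$page)
    or die "Could not parse page";

# Skip forward to the opening tag of the wanted table.
for my $n (1 .. $wanted) {
    $parser->get_tag("table")
        or die "Page has fewer than $wanted tables\n";
}

# Collect the text of each cell until that table's closing tag is reached.
my @cells;
while (my $tag = $parser->get_tag("td", "/table")) {
    last if $tag->[0] eq "/table";
    push @cells, $parser->get_trimmed_text("/td");
}

print "$_\n" for @cells;
```

With the sample HTML above, only the three cells of the second table are collected; the text of the first and third tables is never read. In the real script the print would go to the LGDESC filehandle instead of standard output.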
 
 
Brian Volk
HP Products
317.298.9950 x1245
 [EMAIL PROTECTED]
 
 
