Dan Muey wrote:
>> I am trying to use HTML::TokeParser
>> From the CPAN page for this I used this example:
>>
>> while (my $token = $p->get_tag("a")) {
>>     my $url = $token->[1]{href} || "-";
>>     my $text = $p->get_trimmed_text("/a");
>>     print "$url\t$text \n";
>> }
>>
>> Worked great.
>> So I tried to do something similar with the img tag:
>>
>> while (my $token = $p->get_tag("img")) {
>>     my $src = $token->[1]{src} || "-";
>>     my $alt = $token->[1]{alt} || "-";
>>     print "$src\t$alt\n";
>> }
>>
>> and I get nothing, even though I know there are lots of image tags
>
> I tried commenting out the 'a' version and the 'img' version worked!
> So both chunks of code work, they just don't work if you run them back
> to back.
> I tried undef $token;
> I tried using different names for the tokens ($token and $tokenq
> respectively).
> I tried removing 'my' from before $token and basically it seems that
> you can only get results from get_tag once.
>
> Is there any way to reset this so that I can do both chunks of code
> above, one after the other, i.e. call $p->get_tag() more than once?
Hi Dan.

Your first loop reads all the way to the end of the document, so by the
time the second loop starts there is nothing left for get_tag to find.
The HTML::TokeParser constructor will take a filehandle as its
parameter, so you can rewind the file and build a fresh parser for the
second pass:

    use strict;
    use warnings;

    use HTML::TokeParser;
    use Fcntl qw(:seek);    # to import the SEEK_SET constant

    open my $html, '<', 'sample.htm' or die "Cannot open sample.htm: $!";

    # First pass: links
    my $p = HTML::TokeParser->new($html);

    while (my $token = $p->get_tag("a")) {
        my $url = $token->[1]{href} || "-";
        my $text = $p->get_trimmed_text("/a");
        print "$url\t$text \n";
    }

    seek $html, 0, SEEK_SET;    # rewind to start of file

    # Second pass: images, using a new parser on the rewound handle
    $p = HTML::TokeParser->new($html);

    while (my $token = $p->get_tag("img")) {
        my $src = $token->[1]{src} || "-";
        my $alt = $token->[1]{alt} || "-";
        print "$src\t$alt\n";
    }

    close $html;

HTH,

Rob