Dan Muey wrote:
>> I am trying to use HTML::TokeParser
>> From the cpan page for this I used this example :
>>
>> while (my $token = $p->get_tag("a")) {
>> my $url = $token->[1]{href} || "-";
>> my $text = $p->get_trimmed_text("/a");
>> print "$url\t$text \n";
>> }
>>
>> Worked great.
>> So I tried to do something similar with the img tag:
>>
>> while (my $token = $p->get_tag("img")) {
>> my $src = $token->[1]{src} || "-";
>> my $alt = $token->[1]{alt} || "-";
>> print "$src\t$alt\n";
>> }
>>
>> and I get nothing, even though I know there are lots of image tags.
>
> I tried commenting out the 'a' version and the 'img' version worked!
> So both chunks of code work; they just don't work if you run them back
> to back.
> I tried undef $token;
> I tried using different names for the tokens ($token and $tokenq,
> respectively).
> I tried removing 'my' from before $token and basically it seems that
> you can only get results from get_tag once.
>
> Is there any way to reset this so that I can do both chunks of code
> above,
> one after the other, i.e. call $p->get_tag() more than once?
Hi Dan.
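The parser reads through the document only once, so when your first
loop finishes, get_tag has already reached the end of the input and the
second loop finds nothing. The HTML::TokeParser constructor will take a
filehandle as its parameter, so you can rewind the file and build a
fresh parser on it: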
use strict;
use warnings;
use HTML::TokeParser;
use Fcntl qw(:seek); # to import the SEEK_SET constant

open my $html, '<', 'sample.htm' or die "Cannot open sample.htm: $!";

# First pass: links
my $p = HTML::TokeParser->new($html);
while (my $token = $p->get_tag("a")) {
    my $url = $token->[1]{href} || "-";
    my $text = $p->get_trimmed_text("/a");
    print "$url\t$text \n";
}

seek $html, 0, SEEK_SET; # rewind to start of file

# Second pass: images, using a fresh parser on the rewound filehandle
$p = HTML::TokeParser->new($html);
while (my $token = $p->get_tag("img")) {
    my $src = $token->[1]{src} || "-";
    my $alt = $token->[1]{alt} || "-";
    print "$src\t$alt\n";
}

close $html;
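
If you would rather not read the file twice, get_tag also accepts a
list of tag names, so one pass can pick up both kinds of tag. This is
only a rough sketch: the inline HTML is made-up sample data standing in
for your file, and the @links and @images arrays are names I have
invented so that the links still print before the images.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document, inlined so the sketch is self-contained.
my $html = <<'HTML';
<html><body>
<a href="http://example.com/">Example</a>
<img src="logo.png" alt="Logo">
<a href="/about.html">About</a>
<img src="photo.jpg">
</body></html>
HTML

# Passing a reference to a string makes the parser read from that string.
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my (@links, @images);

# get_tag with several names returns the next start tag matching any of them.
while (my $token = $p->get_tag("a", "img")) {
    my ($tag, $attr) = @$token;   # element 0 is the tag name, 1 the attributes
    if ($tag eq "a") {
        my $url  = $attr->{href} || "-";
        my $text = $p->get_trimmed_text("/a");
        push @links, "$url\t$text";
    }
    else {
        my $src = $attr->{src} || "-";
        my $alt = $attr->{alt} || "-";
        push @images, "$src\t$alt";
    }
}

# Print links first, then images, as the two-pass version does.
print "$_\n" for @links, @images;
```

Collecting into arrays keeps the output grouped by type; if you don't
mind links and images interleaved in document order, you could print
directly inside the loop instead.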
HTH,
Rob