Hello,

I am new to Perl, and I'm trying to use the HTML::TokeParser module to
extract text from an HTML file. Here's the Perl code:

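use strict;
use warnings;
use HTML::TokeParser;

# new() returns undef if the file cannot be opened
my $p = HTML::TokeParser->new("test.htm")
    or die "Cannot open test.htm: $!";

# Print the text that follows each <b> start tag
while ($p->get_tag('b')) {
    print $p->get_text(), "\n";
}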

This seems to work only for <b> tags that are not nested with other tags.
For the following HTML:

<html>
<body>
<h1>Head 1</h1>
<b>Bolded</b>
<p><b><u>Bolded and underlined</u></b></p>
<p>New line</p>
</body>
</html>

The output contains only "Bolded", not "Bolded and underlined" as I
expected.

What could be going on?

Thanks!

