Problem with HTML::TreeBuilder and look_down()

Craig Mon, 14 Jun 2010 16:04:52 -0700

Hello All,

I'm new to Perl, with only a week or two's experience, but I'm
experienced in other programming languages.


I'm writing a script that reads an HTML file from disk and prints
the relevant parts for me. I have many HTML files, all with a
similar format, but some format variations are causing me problems.
Each interesting part of the file is in a <p> tag, and there are many in
each file. I'm using right() to navigate my way through the list of <p>
tags and then grabbing the interesting bits with a regex.
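
For reference, a minimal sketch of the approach just described. It assumes
the CPAN module HTML::TreeBuilder is installed. The inline sample HTML and
the regex are placeholders, not the real files or patterns.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Placeholder input; the real script reads each file from disk instead.
my $html = '<html><body><p>first part</p><p>second part</p></body></html>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# Find the first <p>, then collect its right-hand siblings with right().
# In list context right() returns every sibling to the right; some of
# them can be plain text strings, so keep only <p> elements.
my $first = $tree->look_down(_tag => 'p');
my @paras = $first
    ? ($first, grep { ref $_ && $_->tag eq 'p' } $first->right)
    : ();

for my $p (@paras) {
    my $chunk = $p->as_HTML();
    # Placeholder regex: keep only paragraphs that mention "part".
    print $chunk, "\n" if $chunk =~ /part/;
}

$tree->delete;    # release the tree when finished
```

Calling right() in scalar context instead returns only the next sibling,
which may be a text string rather than an element, so the list form is
used here to avoid calling methods on plain strings.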

This works fine for most files, but there is a problem with this HTML
segment:
<p align="left"> In the second picture you have both a header tank
bracket for a 4 cylinder, and for a V6. <br /><br />You got lucky,
free parts from GTM! :-)<br /><br />Bye,<br /><br />Bertram<br /><br /
><b><font color="blue"><h6>Audi Turbo power: PROPER BO! </b></font
id="blue"></h6><br /><br /><img src="member_images/
Bertram_Bakker_geelklein.jpg" border="0"><br /></p>
    <hr></p>

When printed with as_HTML() I have:
In the second picture you have both a header tank bracket for a 4
cylinder, and for a V6. <br /><br />You got lucky, free parts from
GTM! :-)<br /><br />Bye,<br /><br />Bertram<br /><br /><b><font
color="blue"></font></b>

The leading <p... is normally removed, so I'm happy with that, but some
later tags are missing or reordered:
<font color="blue"><h6>Audi Turbo power: PROPER BO! </b></font
id="blue"></h6><br /><br /><img src="member_images/
Bertram_Bakker_geelklein.jpg" border="0"><br /></p>
    <hr>
vs.
<font color="blue"></font></b>

Can anyone give me a clue as to why this is happening?

Thanks,
Craig


