On Fri, Aug 14, 2009 at 5:35 PM, Shawn H. Corey wrote:
> Roman Makurin wrote:
>>
>> The dump result contains HTML-encoded entities:
>>
>> @0.1.5.1
>> > title="Ссылка ">@0.1.5.1.0
>>
>> All of the HTML entities are valid Unicode code points for the symbols. But why
>> does HTML::TreeBuilder convert the symbols to entities?
>
>
Because some browsers do not understand Unicode. Or they didn't.
If I just do
print $conten
Hi All.
I have a problem with HTML::TreeBuilder. Here is sample code without any error
checking:
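use LWP::UserAgent;
use Encode qw(decode);
use HTML::Entities qw(decode_entities);
use HTML::TreeBuilder;
$ua = LWP::UserAgent->new(timeout => 10);   # method-call syntax; the option name is 'timeout'
$resp = $ua->get($url);
$content = decode('encoding_of_web_page', $resp->content);   # raw bytes to Perl characters
decode_entities($content);   # replaces entities in $content in place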
$r = HTML::TreeBuilder->new_from_content($conten