Re: HTML::TreeBuilder encode symbols as html entities

2009-08-14 Thread Roman Makurin
On Fri, Aug 14, 2009 at 5:35 PM, Shawn H. Corey wrote:
> Roman Makurin wrote:
>>
>> dump result is html encoded entities:
>>
>> @0.1.5.1
>> title="Ссылка ">@0.1.5.1.0
>>
>> all html entities are valid unicode code points of symbols. But why
>> HTML::TreeBuilder convert symbols to entities ?

Re: HTML::TreeBuilder encode symbols as html entities

2009-08-14 Thread Shawn H. Corey
Roman Makurin wrote:
> dump result is html encoded entities:
>
> @0.1.5.1 @0.1.5.1.0
>
> all html entities are valid unicode code points of symbols. But why
> HTML::TreeBuilder convert symbols to entities ?

Because some browsers do not understand Unicode. Or they didn't.

If I just do print $conten
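
For readers who want the characters kept as-is on output, here is a minimal sketch, not taken from the thread. It relies on the optional first argument of HTML::Element's as_HTML, which names the characters to escape. The sample markup and variable names are made up for illustration.

```perl
use strict;
use warnings;
use utf8;                       # this source file contains literal Cyrillic text
use HTML::TreeBuilder;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample markup standing in for the fetched page.
my $html = '<p><a href="/x" title="Ссылка">текст &amp; ещё</a></p>';

my $tree = HTML::TreeBuilder->new_from_content($html);
my $link = $tree->look_down(_tag => 'a');

# With no argument, as_HTML escapes every character it considers unsafe.
my $default = $link->as_HTML;

# With '<>&', only those three characters are escaped; Cyrillic stays literal.
my $minimal = $link->as_HTML('<>&');

print "default: $default\n";
print "minimal: $minimal\n";

$tree->delete;                  # free the tree explicitly
```

Either way the tree itself holds decoded characters. The entities appear only when an element is serialized or dumped.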

HTML::TreeBuilder encode symbols as html entities

2009-08-14 Thread Roman Makurin
Hi All.

I have a problem with HTML::TreeBuilder. Here is sample code without any error checking:

    $ua = new LWP::UserAgent -timeout=>10;
    $resp = $ua->get($url);
    $content = decode('encoding_of_web_page', $resp->content);
    decode_entities($content);
    $r = HTML::TreeBuilder->new_from_content($conten
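
The decode-then-parse flow described in the post can be sketched as follows. This is only an illustration under stated assumptions: the page bytes are simulated with a hard-coded string instead of an LWP request, UTF-8 is assumed as the page encoding, and the variable names are hypothetical. The separate decode_entities call is left out, because the parser decodes entities itself while building the tree.

```perl
use strict;
use warnings;
use utf8;                       # this source file contains literal Cyrillic text
use Encode qw(decode encode);
use HTML::TreeBuilder;

binmode STDOUT, ':encoding(UTF-8)';

# Simulated response body: raw UTF-8 bytes, as a web server would send them.
# One attribute uses numeric entities, the other uses literal characters.
my $bytes = encode('UTF-8',
    '<a title="&#1057;&#1089;&#1099;&#1083;&#1082;&#1072;">x</a>'
  . '<a title="Ссылка">y</a>');

# Turn the bytes into a Perl character string before parsing.
my $content = decode('UTF-8', $bytes);

my $tree   = HTML::TreeBuilder->new_from_content($content);
my @titles = map { $_->attr('title') } $tree->look_down(_tag => 'a');

# Both titles are now plain character strings with no entities left.
print "$_\n" for @titles;

$tree->delete;
```

Reading values through attr() or as_text() returns the characters directly, so entity encoding only matters at the point where the tree is written back out.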