Re: HTML::TokeParser munging characters

2013-12-28 Thread Lars Noodén

On 12/28/2013 05:52 PM, Shawn Wilson wrote:
> The parser has done what it's supposed to. IDK if you can alter the
> encoding in it. Maybe you can and that's what you're looking for
> (encoding or character set). I'd first try binmode UTF-8 but you'll
> probably just end up handling this with a regex.

Re: HTML::TokeParser munging characters

2013-12-28 Thread Shawn Wilson

The parser has done what it's supposed to. IDK if you can alter the encoding in it. Maybe you can and that's what you're looking for (encoding or character set). I'd first try binmode UTF-8 but you'll probably just end up handling this with a regex.

"Lars Noodén" wrote:
> If there is a better list

HTML::TokeParser munging characters

2013-12-28 Thread Lars Noodén

If there is a better list for discussing HTML::TokeParser, I can post there.

I have a code snippet which successfully extracts a piece of a web page. However, something goes south with the conversion to text. What should come out as the following

Temperature 3.2°C Humidity 94%
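
For reference, here is a minimal sketch of the "binmode UTF-8" suggestion from the reply. The original snippet isn't shown in the thread, so this is not a fix for it. It assumes the page arrives as UTF-8 encoded bytes and that the values sit in `td` cells. The sample HTML and the helper name `extract_cells` are hypothetical. The idea is to decode the bytes into Perl characters before parsing, and to put an encoding layer on STDOUT so the degree sign is written out as UTF-8 again.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::TokeParser;

# Hypothetical helper: takes raw UTF-8 bytes of an HTML fragment and
# returns the trimmed text of each <td> cell as decoded character strings.
sub extract_cells {
    my ($bytes) = @_;

    # Decode bytes into Perl characters *before* handing them to the parser.
    my $html = decode('UTF-8', $bytes);

    # HTML::TokeParser accepts a reference to a scalar holding the document.
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my @cells;
    while ($p->get_tag('td')) {
        push @cells, $p->get_trimmed_text('/td');
    }
    return @cells;
}

# Assumed sample input: "\xC2\xB0" is the UTF-8 byte sequence for the degree sign.
my $sample = "<table><tr><td>Temperature 3.2\xC2\xB0C</td>"
           . "<td>Humidity 94%</td></tr></table>";

# Encode characters back to UTF-8 on the way out.
binmode(STDOUT, ':encoding(UTF-8)');

print "$_\n" for extract_cells($sample);
```

If the page is actually served in a different encoding (ISO-8859-1, say), the name passed to `decode` would need to change to match. The output layer can stay UTF-8 as long as the terminal expects that.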