On 16 Dec 2013, at 2:18 AM, peter <peterchutchin...@gmail.com> wrote:
> I am impressed at how helpful you have been on this Jonathon.
> 
> It does say in the mht file that it is windows-1252 encoded.
> 
> It turns out that 
>     s.decode('cp1252').encode('utf-8')
> 
> is working correctly. I mistakenly thought it was not, because I got this
> error:
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u2018' in 
> position 193: character maps to <undefined>
> This was from a print statement. It turns out you get this error when trying 
> to print the left single quotation mark, which is correctly coded in Unicode. So 
> this is why I was having such problems. The error is presumably because the 
> print statement is working in DOS mode, and this character is not in the DOS 
> character set. So using print to check what is going on in Python is not 
> a good idea when using Unicode.
> 
> Obvious with hindsight, not so obvious without hindsight.
>  
> Thanks again Jonathon for all your support on this.
> 

You're welcome.

I'm sympathetic, having struggled to understand character encoding myself. 

FWIW, I've been leaning more and more toward the Python 3 approach to text: 
decoding all my strings to Unicode on input, encoding as needed on output. (Python 3's 
standardization on Unicode almost persuades me that Python 3 was a good idea.)
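
For what it's worth, here is a minimal sketch of that decode-on-input, encode-on-output pattern, written for Python 3. The sample bytes, the helper name safe_for_console, and the choice of cp437 as the console code page are my own assumptions for illustration; the code page on your machine may differ. It also reproduces the failure peter described: the text itself is fine, and only the encode step for a DOS-style console fails.

```python
# Sample input (an assumption): cp1252 bytes where 0x91 and 0x92 are the
# left and right single quotation marks.
raw = b"\x91quoted\x92"

# Decode once, on input. From here on we work with Unicode text only.
text = raw.decode("cp1252")

# Encode once, on output, to whatever the destination needs.
utf8_bytes = text.encode("utf-8")

# A DOS-style code page such as cp437 (assumed here) has no U+2018, so
# encoding for that console raises UnicodeEncodeError even though the
# decoded text is correct.
try:
    text.encode("cp437")
    console_ok = True
except UnicodeEncodeError:
    console_ok = False


def safe_for_console(value, console_encoding="cp437"):
    """Hypothetical helper: replace characters the console cannot show."""
    encoded = value.encode(console_encoding, errors="replace")
    return encoded.decode(console_encoding)


# Two ways to inspect the text that do not depend on the console code page.
print(safe_for_console(text))  # unencodable characters become "?"
print(ascii(text))             # non-ASCII characters shown as \u escapes
```

Printing the escaped form is usually enough to confirm that the decode step worked, without involving the console's character set at all.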

