Re: utf8 silly question

Jeff Epler Tue, 21 Jun 2005 11:36:35 -0700

If you want to work with Unicode, then write
    us = u"\N{COPYRIGHT SIGN} some text"
You can also write this as
    us = unichr(169) + u" some text"
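
As a quick check that these spellings really give the same string, here is a
small sketch.  The names to_char, us_named, us_escaped and us_built are made up
for the example, and the version test is only there so the snippet also runs on
Python 3, where unichr() no longer exists and chr() plays that role.

```python
import sys

# Assumption for this sketch: pick the character constructor by version.
# unichr() is Python 2 only; chr() returns a Unicode character on Python 3.
to_char = unichr if sys.version_info[0] == 2 else chr

us_named = u"\N{COPYRIGHT SIGN} some text"   # character given by name
us_escaped = u"\xa9 some text"               # same character as a hex escape
us_built = to_char(169) + u" some text"      # same character by code point

# All three are the same 11-character Unicode string.
assert us_named == us_escaped == us_built
print(len(us_named))  # 11
```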



When you have a Unicode string, you can encode it into a byte string in
a particular encoding with
    bs = us.encode("utf-8")
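
To make the difference between the Unicode string and its encoded bytes
concrete, here is a minimal sketch.  The b"..." literals need Python 2.6 or
later, and the Latin-1 comparison is my own addition for contrast, not
something the code above depends on.

```python
us = u"\N{COPYRIGHT SIGN} some text"

# Encode the Unicode string into UTF-8 bytes.
bs = us.encode("utf-8")

# In UTF-8 the copyright sign (U+00A9) takes two bytes, 0xC2 0xA9,
# so the byte string is one longer than the Unicode string.
assert bs == b"\xc2\xa9 some text"
assert len(us) == 11
assert len(bs) == 12

# Decoding with the same codec gives back the original string.
assert bs.decode("utf-8") == us

# A different encoding gives different bytes for the same text.
latin1 = us.encode("latin-1")
assert latin1 == b"\xa9 some text"
```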


It's generally a mistake to use the .encode() method on a byte string,
but that's what code like
    bs = "\xa9 some text"
    bs = bs.encode("utf-8")
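does.  In Python 2, calling .encode() on a byte string first decodes it
implicitly with the default ASCII codec, so it can lull you into believing
it works if the test data contains only US-ASCII, then break with a
UnicodeDecodeError when you go into production and meet non-ASCII strings.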
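
Here is a rough sketch of that failure mode.  The helper py2_style_encode is
hypothetical: it spells out the ASCII decode that Python 2 performs implicitly
before encoding a byte string, so the snippet behaves the same on Python 3
(where byte strings have no .encode() method at all).  The last step assumes
the input bytes are Latin-1, which is a guess made only for the example; in
real code you have to know the actual source encoding.

```python
def py2_style_encode(bs, encoding):
    # Hypothetical helper: Python 2 decodes the byte string with the
    # default ASCII codec before encoding; here that step is explicit.
    return bs.decode("ascii").encode(encoding)

# ASCII-only test data passes through unchanged, so the bug stays hidden.
ascii_result = py2_style_encode(b"(c) some text", "utf-8")
assert ascii_result == b"(c) some text"

# A non-ASCII byte makes the hidden ASCII decode fail.
failed = False
try:
    py2_style_encode(b"\xa9 some text", "utf-8")
except UnicodeDecodeError:
    failed = True
assert failed

# Decode explicitly with the real source encoding (assumed Latin-1 here),
# then encode the resulting Unicode string.
fixed = b"\xa9 some text".decode("latin-1").encode("utf-8")
assert fixed == b"\xc2\xa9 some text"
```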

Jeff
