Re: string to unicode

Tim Roberts Tue, 16 Aug 2011 17:38:17 -0700

Artie Ziff <artie.z...@gmail.com> wrote:
>
>if I am using the standard csv library to read contents of a csv file 
>which contains Unicode strings (short example: 
>'\xe8\x9f\x92\xe8\x9b\x87'),


You need to be rather precise when talking about this.  That's not a
"Unicode string" in Python terms.  It's an 8-bit (byte) string.  It might be
UTF-8 encoded text.  If so, it decodes to two Unicode code points, U+87D2
and U+86C7, which are both CJK ideographs.  Is that what you expected?

  C:\Dev\videology\sw\viewer>python
  Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> x = '\xe8\x9f\x92\xe8\x9b\x87'
  >>> x.decode('utf8')
  u'\u87d2\u86c7'
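
If the file does turn out to be UTF-8, the same decode can be applied to each
cell the csv module hands back (in Python 2 those cells are 8-bit strings).
What follows is only a rough sketch: decode_cell is a made-up helper name, not
part of any library, and the UTF-8 default is an assumption about your data.

```python
# The 8-bit string from the original post (bytes, not Unicode text).
raw = b'\xe8\x9f\x92\xe8\x9b\x87'


def decode_cell(data, encoding='utf-8'):
    """Hypothetical helper: return Unicode text for one CSV cell,
    or None if the bytes are not valid in the assumed encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = decode_cell(raw)

# Six bytes become two characters; list their code points.
code_points = ['U+%04X' % ord(ch) for ch in text]
print(len(raw), len(text), code_points)

# A truncated UTF-8 sequence does not decode, so the helper returns None.
print(decode_cell(b'\xe8\x9f'))
```

Returning None on failure is just one choice.  If the bytes are not UTF-8 at
all, you will want to find out what encoding wrote the file rather than guess.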
-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
-- 
http://mail.python.org/mailman/listinfo/python-list
