Can I get the 8bit-string representation of any unicode string

wanghz Sun, 12 Feb 2006 07:16:33 -0800

Hello, everyone.

I have a problem when I'm processing unicode strings.  Is it possible
to get the 8bit-string representation of any unicode string?


Suppose I get a unicode string:
  a = u'\xc8\xce\xcf\xcd\xc6\xeb'
then, by
  a.encode('latin-1')
I can get the 8bit-string representation of it, that is, the physical
storage format of this string.

But for another kind of unicode string, say:
  b = u'\u4efb\u8d24\u9f50'
I have to:
  b.encode('utf-8')
to get the 8bit-string format of it.
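
The two cases above can be put side by side in a small sketch. It uses
only the built-in encode() method, and the names a and b are the example
strings from above; the flag b_fits_latin1 is just an illustrative name.
The u'' prefix is accepted by Python 2 and by Python 3.3 or later. This
only illustrates the difference and is not meant as a recommended approach.

```python
# The two example strings from above.
a = u'\xc8\xce\xcf\xcd\xc6\xeb'   # every code point is below 256
b = u'\u4efb\u8d24\u9f50'         # code points far above 255

# latin-1 maps code points 0..255 one-to-one onto single bytes.
a_bytes = a.encode('latin-1')
print(repr(a_bytes))

# The same codec cannot represent b, so encoding raises an error.
try:
    b.encode('latin-1')
    b_fits_latin1 = True
except UnicodeEncodeError:
    b_fits_latin1 = False
print(b_fits_latin1)

# utf-8 can encode b; each of these three characters takes three bytes.
b_bytes = b.encode('utf-8')
print(len(b_bytes))
```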

Since these unicode strings are returned by an external library function,
I don't know which kind a given string is until I receive it at runtime.
So I wonder whether there is a unified way to get the 8bit-string
representation, say, byte by byte, of any unicode string?
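
To show the shape of what is being asked for, here is a rough sketch of
such a helper. The name to_bytes is hypothetical, and the choice of UTF-8
is an assumption: it can encode any string of valid code points, but for
a string like a it yields different bytes than latin-1 does, so it may
not be the representation wanted in the first case.

```python
def to_bytes(text, encoding='utf-8'):
    """Hypothetical helper: encode any unicode string with one codec.

    UTF-8 is assumed as the default because it can represent every
    valid code point, unlike latin-1.
    """
    return text.encode(encoding)


a = u'\xc8\xce\xcf\xcd\xc6\xeb'
b = u'\u4efb\u8d24\u9f50'

# Both strings go through the same call, with no per-string codec choice.
a_utf8 = to_bytes(a)
b_utf8 = to_bytes(b)

# For a, the UTF-8 bytes differ from the latin-1 bytes shown earlier:
# each of its six characters needs two bytes in UTF-8.
print(len(a_utf8), len(a.encode('latin-1')))

# Decoding with the same codec gives back the original strings.
print(a_utf8.decode('utf-8') == a, b_utf8.decode('utf-8') == b)
```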

Thank you very much.

-- 
http://mail.python.org/mailman/listinfo/python-list
