Re: convert \uXXXX to native character set?

Bengt Richter Tue, 21 Dec 2004 15:05:04 -0800

On Mon, 20 Dec 2004 12:49:39 +0200, Miki Tebeka <[EMAIL PROTECTED]> wrote:


>Hello Joe,
>
>>     Is there any library to convert HTML page with \uXXXX encoded text to
>>    native character set, e.g. BIG5.
>Try: help("".decode)
>
But I think the OP wants to en-code, not de-code. E.g. (I don't know what the
Chinese for ichi is ;-)

 >>> ichi = u'\u4e00'
 >>> ichi
 u'\u4e00'
 >>> ichi.encode('big5')
 '\xa4@'

Unless I am mistaken, that created a two-byte str constituting the big5 code
for the single-horizontal-stroke glyph whose unicode code point is u'\u4e00':

 >>> list(ichi.encode('big5'))
 ['\xa4', '@']

Going from a big5-encoded str back to unicode then takes de-coding:

 >>> '\xa4@'.decode('big5')
 u'\u4e00'
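
If the page really contains the six literal characters backslash, u, and four
hex digits (rather than a unicode object already holding the character), the
escapes have to be turned into characters before encoding. Below is a minimal
sketch of one way to do that. It is written for Python 3, where str is the
unicode type, so it does not match the 2.x session above line for line. The
function name unescape_to_big5 and the sample page string are made up for
illustration, and surrogate pairs are not handled.

```python
import re

# Matches a literal backslash, the letter u, and exactly four hex digits.
_U_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')

def unescape_to_big5(text, errors='strict'):
    # Replace each literal \uXXXX sequence with the character it names,
    # leaving the rest of the text untouched.
    decoded = _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Encode the resulting text as Big5 bytes; 'errors' is passed through
    # to str.encode for characters that Big5 cannot represent.
    return decoded.encode('big5', errors)

page = '<p>\\u4e00</p>'   # hypothetical input: the escape is literal text
print(unescape_to_big5(page))
```

Passing errors='replace' instead of the default makes the encode step
substitute '?' for anything outside Big5 rather than raising
UnicodeEncodeError.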

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list
