subject:"Encode exception for chinese text"

Re: Encode exception for chinese text

2006-05-19 Thread John Machin

MvL wrote:
> Also, *by definition*, though :-)

Ah yes, indeed; and thanks for reminding me.

Aside: Similar definition, but not similar design: IMHO utf-8 sits on top of ASCII like a rose on a stalk, whereas gb18030 sits on top of gb2312 like a rhinoceros on a unicycle :-)

Cheers,
John
-- http://

Re: Encode exception for chinese text

2006-05-19 Thread Martin v. Löwis

John Machin wrote:
> 1. *By definition*, you can encode *any* Unicode string into utf-8.
> Proves nothing.
> 2. \u00a0 [no-break space] has no equivalent in gb2312, nor in the
> later gbk alias cp936. It does have an equivalent in the latest Chinese
> encoding, gb18030.

Also, *by definition*, thou

Re: Encode exception for chinese text

2006-05-19 Thread Vinayakc

Hey Serge, John,

Thank you very much. I was really not aware of these facts. Anyway, this is happening only for one in millions, so I can ignore it for now.

Thanks again,
Vinayakc
-- http://mail.python.org/mailman/listinfo/python-list

Re: Encode exception for chinese text

2006-05-19 Thread Serge Orlov

Vinayakc wrote:
> Yes serge, I have removed the first character but it is still giving
> encoding exception.

Then I guess this character was used as a poor man's indentation tool, at least at the beginning of your text. It's up to you to decide what to do with that character, you have several choices
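Serge's list of choices is cut off in this archive view, so the following is only an illustrative sketch of options that the standard `str.encode` error handlers offer, not necessarily the ones he had in mind. It is written for Python 3 (the thread itself used Python 2), and the sample string `text` is a hypothetical stand-in for the poster's data.

```python
# Hypothetical sample: two no-break spaces used as indentation, then Chinese text.
text = "\u00a0\u00a0中文"

# Default (strict) encoding fails, as in the original report.
strict_failed = False
try:
    text.encode("gb2312")
except UnicodeEncodeError:
    strict_failed = True

# Option A: turn each no-break space into an ordinary space before encoding.
as_space = text.replace("\u00a0", " ").encode("gb2312")

# Option B: silently drop anything gb2312 cannot represent.
dropped = text.encode("gb2312", errors="ignore")

# Option C: substitute "?" for each unencodable character.
marked = text.encode("gb2312", errors="replace")

# Option D: emit an XML character reference, which may suit data
# that originally came from an XML file.
as_ref = text.encode("gb2312", errors="xmlcharrefreplace")

print(strict_failed, as_space, dropped, marked, as_ref)
```

Which option is appropriate depends on whether the indentation carries meaning for the consumer of the encoded bytes; options B and C lose information.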

Re: Encode exception for chinese text

2006-05-19 Thread John Machin

1. *By definition*, you can encode *any* Unicode string into utf-8. Proves nothing.
2. \u00a0 [no-break space] has no equivalent in gb2312, nor in the later gbk alias cp936. It does have an equivalent in the latest Chinese encoding, gb18030.
3. gb2312 is outdated. It is not really an "appropriate"
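Points 1 and 2 can be checked directly against the codecs that ship with CPython. This is a minimal sketch in Python 3 (the thread used Python 2); the helper name `can_encode` is made up for the illustration.

```python
NBSP = "\u00a0"  # no-break space, the character from the traceback

def can_encode(text, codec):
    """Return True if `text` encodes cleanly with `codec`, else False."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Try the same character with each codec mentioned in the thread.
results = {
    codec: can_encode(NBSP, codec)
    for codec in ("gb2312", "gbk", "gb18030", "utf-8")
}

for codec, ok in results.items():
    print(f"{codec:8} {'ok' if ok else 'UnicodeEncodeError'}")
```

Under these assumptions, only gb18030 and utf-8 accept the character, which matches what John describes.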

Re: Encode exception for chinese text

2006-05-19 Thread Vinayakc

Yes Serge, I have removed the first character, but it is still giving an encoding exception.

Re: Encode exception for chinese text

2006-05-19 Thread Serge Orlov

Vinayakc wrote:
> Hi all,
>
> I am new to python.
>
> I have written one small application which reads data from xml file and
> tries to encode data using appropriate charset.
> I am facing problem while encoding one chinese paragraph with charset
> "gb2312".
>
> code is:
>
> encoded_str = str_data.

Re: Encode exception for chinese text

2006-05-19 Thread swordsp

Are you sure all the characters in the original text are in the "gb2312" charset? Encoding with "utf8" seems to work for this character (u'\xa0'), but I don't know if the result is correct. Could you give a subset of str_data in unicode?
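One way to answer the question above is to scan the text and list every character the target codec rejects. The sketch below assumes Python 3; the function name `unencodable` and the `sample` string are hypothetical and stand in for the poster's `str_data`.

```python
import unicodedata

def unencodable(text, codec="gb2312"):
    """List (index, char, Unicode name) for each char `codec` cannot encode."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(codec)
        except UnicodeEncodeError:
            name = unicodedata.name(char, "<unnamed>")
            problems.append((index, char, name))
    return problems

# Hypothetical sample: a leading no-break space followed by encodable text.
sample = "\u00a0中文 text"
report = unencodable(sample)
print(report)
```

A report like this shows whether the failure comes from a single stray character or from text that does not belong in gb2312 at all.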

Encode exception for chinese text

2006-05-19 Thread Vinayakc

Hi all,

I am new to Python.

I have written one small application which reads data from an XML file and tries to encode the data using the appropriate charset. I am facing a problem while encoding one Chinese paragraph with charset "gb2312".

The code is:

encoded_str = str_data.encode("gb2312")

The type of str_da

9 matches
