the same strings, different utf-8 repr values?
I have two files:

test.py:
--
# -*- encoding : utf8 -*-
print 'in this file', repr('中文')

# tt.txt is saved as utf8 encoding
f = file('tt.txt')
line1 = f.readline().strip()
print 'another file', repr(line1)
---

tt.txt:

中文
test
---
run test.py and I get the following output:
in this file '\xe4\xb8\xad\xe6\x96\x87'
another file '\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'

and I can't convert line1 like this:
line1.decode('utf8').encode('gbk')
I get this error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence

Why did I get different repr values?
--
http://mail.python.org/mailman/listinfo/python-list
Re: the same strings, different utf-8 repr values?
Got it, thanks.

John Machin wrote:
> [EMAIL PROTECTED] wrote:
> > I have two files:
> >
> > test.py:
> > --
> > # -*- encoding : utf8 -*-
> > print 'in this file', repr('中文')
> >
> > # tt.txt is saved as utf8 encoding
> > f = file('tt.txt')
> > line1 = f.readline().strip()
> > print 'another file', repr(line1)
> > ---
> >
> > tt.txt:
> >
> > 中文
> > test
> > ---
> > run test.py and I get the following output:
> > in this file '\xe4\xb8\xad\xe6\x96\x87'
> > another file '\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'
> >
> > and I can't convert line1 like this:
> > line1.decode('utf8').encode('gbk')
> > I get this error:
> > UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in
> > position 0: illegal multibyte sequence
> >
> > Why did I get different repr values?
>
> Because whatever you used to "save as" that file has retained or
> inserted a BOM (byte order mark, U+FEFF) at the start of the file
> before encoding as UTF-8. It's the '\xef\xbb\xbf' at the start of the
> file, and also the u'\ufeff' that is giving the gbk codec indigestion.
> You can remove it in your script.
>
> HTH
> John
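For anyone finding this thread later, here is a minimal sketch of what "remove it in your script" could look like. It is only an illustration, not necessarily what John had in mind: the helper name strip_utf8_bom is made up for this example, and the bytes are hard-coded to match the output shown above instead of being read from tt.txt. codecs.BOM_UTF8 and the 'utf-8-sig' codec are both in the standard library.

```python
# -*- coding: utf-8 -*-
import codecs


def strip_utf8_bom(raw):
    """Return the byte string without a leading UTF-8 BOM, if one is present.

    Hypothetical helper for this example; not part of the original script.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# The bytes reported for the first line of tt.txt in the original post.
line1 = b'\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'

# Option 1: drop the BOM bytes explicitly, then decode and re-encode.
clean = strip_utf8_bom(line1)
gbk_bytes = clean.decode('utf-8').encode('gbk')

# Option 2: let the 'utf-8-sig' codec skip a leading BOM while decoding.
also_gbk = line1.decode('utf-8-sig').encode('gbk')
```

Both routes should give the same GBK bytes. The 'utf-8-sig' codec is also harmless on input that has no BOM, so it may be the simpler choice when you do not know how the file was saved.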
[OT] Could anyone send me a copy of "timeout sockets for jython"?
Information about this module:
http://www.xhaus.com/alan/python/timeout.html

I can't reach the download URL because of severe network problems these days, and I need this module for work. Could anyone do me a favor and send me a copy?

The download URL:
http://cvs.sourceforge.net/viewcvs.py/jython/jython/Lib/socket.py?rev=1.16&view=log

Thanks.