the same strings, different utf-8 repr values?

2006-09-07 Thread Slowness Chen
I have two files:

test.py:
--
# -*- coding: utf-8 -*-
print 'in this file', repr('中文')

# tt.txt is saved as utf8 encoding
f = file('tt.txt')
line1 = f.readline().strip()
print 'another file', repr(line1)
---

tt.txt:

中文
test
---
When I run test.py, I get the following output:
in this file '\xe4\xb8\xad\xe6\x96\x87'
another file '\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'

and I can't encode line1 like this:
   line1.decode('utf8').encode('gbk')
I get this error:
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence

Why do I get different repr values?

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: the same strings, different utf-8 repr values?

2006-09-07 Thread Slowness Chen
Got it. Thanks.
John Machin wrote:

> [EMAIL PROTECTED] wrote:
> > I have two files:
> >
> > test.py:
> > --
> > # -*- coding: utf-8 -*-
> > print 'in this file', repr('中文')
> >
> > # tt.txt is saved as utf8 encoding
> > f = file('tt.txt')
> > line1 = f.readline().strip()
> > print 'another file', repr(line1)
> > ---
> >
> > tt.txt:
> > 
> > 中文
> > test
> > ---
> > When I run test.py, I get the following output:
> > in this file '\xe4\xb8\xad\xe6\x96\x87'
> > another file '\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'
> >
> > and I can't encode line1 like this:
> >    line1.decode('utf8').encode('gbk')
> > I get this error:
> > UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
> >
> > Why do I get different repr values?
>
> Because whatever you used to "save as" that file has retained or
> inserted a BOM (byte order mark, U+FEFF) at the start of the file
> before encoding as UTF-8. It's the '\xef\xbb\xbf' at the start of the
> file, and also the u'\ufeff' that is giving the gbk codec indigestion.
> You can remove it in your script.
> 
> HTH
> John
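
One way to do the removal suggested above might look like the sketch below. It is written for Python 3 (the thread itself uses Python 2), and the helper name strip_utf8_bom is made up for illustration. codecs.BOM_UTF8 and the 'utf-8-sig' codec are both part of the standard library.

```python
import codecs


def strip_utf8_bom(raw):
    """Return raw bytes without a leading UTF-8 BOM, if one is present."""
    # strip_utf8_bom is a hypothetical helper, not something from the thread.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# The bytes read from tt.txt in the thread: BOM followed by the UTF-8 text.
line1 = b'\xef\xbb\xbf\xe4\xb8\xad\xe6\x96\x87'

# Option 1: drop the BOM bytes by hand, then decode as plain UTF-8.
text = strip_utf8_bom(line1).decode('utf-8')

# Option 2: let the 'utf-8-sig' codec skip a leading BOM while decoding.
text_sig = line1.decode('utf-8-sig')

# With the BOM gone, encoding to GBK no longer raises UnicodeEncodeError.
gbk_bytes = text.encode('gbk')
print(repr(gbk_bytes))
```

When reading the file directly, opening it with io.open('tt.txt', encoding='utf-8-sig') should have the same effect as option 2.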


[OT] Could anyone send me a copy of "timeout sockets for jython"?

2007-01-24 Thread Slowness Chen
Information about this module:
http://www.xhaus.com/alan/python/timeout.html

I can't access the download URL because of severe network problems these
days, and I need this module for work.
Could anyone do me a favor and send me a copy? The download URL:

http://cvs.sourceforge.net/viewcvs.py/jython/jython/Lib/socket.py?rev=1.16&view=log

Thanks.
