Re: Python 2.1 / 2.3: xreadlines not working with codecs.open

Eric Brunel Tue, 28 Jun 2005 05:35:31 -0700

On Thu, 23 Jun 2005 14:23:34 +0200, Eric Brunel <[EMAIL PROTECTED]> wrote:


> Hi all,
>
> I just found a problem in the xreadlines method/module when used with 
> codecs.open: the codec specified in the open does not seem to be taken into 
> account by xreadlines which also returns byte-strings instead of unicode 
> strings.
>
> For example, if a file foo.txt contains some text encoded in latin1:
>
> >>> import codecs
> >>> f = codecs.open('foo.txt', 'r', 'utf-8', 'replace')
> >>> [l for l in f.xreadlines()]
> ['\xe9\xe0\xe7\xf9\n']
>
> But:
>
> >>> import codecs
> >>> f = codecs.open('foo.txt', 'r', 'utf-8', 'replace')
> >>> f.readlines()
> [u'\ufffd\ufffd']
>
> The characters in latin1 are correctly "dumped" with readlines, but are still 
> in latin1 encoding in byte-strings with xreadlines.

Replying to myself. One more funny thing:

>>> import codecs, xreadlines
>>> f = codecs.open('foo.txt', 'r', 'utf-8', 'replace')
>>> [l for l in xreadlines.xreadlines(f)]
[u'\ufffd\ufffd']

So f.xreadlines does not work, but xreadlines.xreadlines(f) does. And this 
happens in Python 2.3, but also in Python 2.1, where the implementation for 
f.xreadlines() calls xreadlines.xreadlines(f) (?!?). Something's escaping me 
here... Reading the source didn't help.
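
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if the object returned by codecs.open forwards any attribute it does not define itself to the underlying byte stream, then f.xreadlines would resolve to the raw file's method and skip decoding, while xreadlines.xreadlines(f) would go through the wrapper's own decoding readlines(). The toy classes below illustrate that effect only. DecodingWrapper, RawStream and raw_lines are hypothetical names, not codecs internals, and the sketch is written for a current Python, where xreadlines no longer exists.

```python
import io


class RawStream(io.BytesIO):
    """In-memory byte stream standing in for the file on disk."""

    def raw_lines(self):
        # Plays the role of a method the wrapper does not override.
        return list(self)


class DecodingWrapper:
    """Toy decoding wrapper (hypothetical, not the codecs source)."""

    def __init__(self, stream, encoding, errors="strict"):
        self.stream = stream
        self.encoding = encoding
        self.errors = errors

    def readlines(self):
        # Defined on the wrapper: every line is decoded before it is returned.
        return [line.decode(self.encoding, self.errors)
                for line in self.stream.readlines()]

    def __getattr__(self, name):
        # Only called for names the wrapper lacks: the lookup falls through
        # to the raw stream, so the caller gets undecoded bytes back.
        return getattr(self.stream, name)


data = b"\xe9\xe0\xe7\xf9\n"  # latin-1 bytes, as in foo.txt
via_override = DecodingWrapper(RawStream(data), "latin-1").readlines()
via_fallthrough = DecodingWrapper(RawStream(data), "latin-1").raw_lines()
```

Here via_override holds decoded text and via_fallthrough holds byte strings, which is the same split seen between readlines() and f.xreadlines() above.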

At least, xreadlines.xreadlines(f) does provide a workaround...
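
For a lazy, line-at-a-time read that does not depend on xreadlines at all, another option is to call the decoding reader's own readline() in a loop. This is a minimal sketch under stated assumptions: iter_decoded_lines is a hypothetical helper, and the demo uses codecs.getreader over in-memory latin-1 bytes in place of foo.txt.

```python
import codecs
import io


def iter_decoded_lines(reader):
    """Yield decoded lines one at a time via the reader's own readline()."""
    while True:
        line = reader.readline()
        if not line:
            # readline() returns an empty string at end of file.
            break
        yield line


# Demo input: two latin-1 encoded lines held in memory (stand-in for foo.txt).
reader = codecs.getreader("latin-1")(io.BytesIO(b"\xe9\xe0\n\xe7\xf9\n"))
lines = list(iter_decoded_lines(reader))
```

Because only readline() is used, the decoding is always done by the reader object itself and never by the underlying byte stream.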
-- 
python -c "print ''.join([chr(154 - ord(c)) for c in 
'U(17zX(%,5.zmz5(17;8(%,5.Z65\'*9--56l7+-'])"
