[issue10370] py3 readlines() reports wrong offset for UnicodeDecodeError

Brian Warner Mon, 08 Nov 2010 17:29:36 -0800

New submission from Brian Warner <war...@users.sourceforge.net>:

I noticed that the UnicodeDecodeError raised by open(fn).readlines() (i.e.
using the default ASCII encoding) on a file that is actually UTF-8 reports the
wrong offset for the first undecodable character. From what I can tell, it
reports (offset % 4096) instead of the actual offset.


I've attached a test case. It emits "all good" when run against py2.x (well,
after converting the print() expressions back into statements), but on py3.1.2
and py3.2a3 it reports an error at offset 4096 (reported as "0"). I'm running
on a Debian (sid) x86 box.

The misreported offset does not occur with read(), just with readlines().
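
For readers without the attachment, here is a minimal sketch of the kind of
comparison described above. It is not the attached test.py. The helper names
(make_test_file, reported_offset, compare) are made up for illustration, and
encoding="ascii" is passed explicitly so the sketch does not depend on the
locale's default encoding. The internal chunk size may differ between Python
versions, so the readlines() result is printed rather than assumed.

```python
import os
import tempfile


def make_test_file(true_offset=4096):
    """Write ASCII lines up to true_offset, then one non-ASCII UTF-8 char."""
    line = b"x" * 63 + b"\n"  # 64 bytes per line
    assert true_offset % len(line) == 0
    data = line * (true_offset // len(line)) + "\u00e9".encode("utf-8") + b"\n"
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def reported_offset(path, method):
    """Return exc.start from the UnicodeDecodeError raised by f.<method>()."""
    with open(path, encoding="ascii") as f:
        try:
            getattr(f, method)()
        except UnicodeDecodeError as exc:
            return exc.start
    return None  # no decode error was raised


def compare(true_offset=4096):
    """Collect the offsets reported by read() and readlines()."""
    path = make_test_file(true_offset)
    try:
        return {m: reported_offset(path, m) for m in ("read", "readlines")}
    finally:
        os.remove(path)


if __name__ == "__main__":
    for method, offset in compare(4096).items():
        print(method, "reported offset", offset, "(actual offset 4096)")
```

If the two reported offsets differ, readlines() is reporting a position
relative to an internal decoding chunk rather than to the start of the file.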

----------
components: IO
files: test.py
messages: 120830
nosy: warner
priority: normal
severity: normal
status: open
title: py3 readlines() reports wrong offset for UnicodeDecodeError
type: behavior
versions: Python 3.1, Python 3.2
Added file: http://bugs.python.org/file19552/test.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10370>
_______________________________________
