Hi, I tried using seek to reverse a text file after reading about the subject in the documentation:
https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects
https://docs.python.org/3/library/io.html#io.TextIOBase.seek

The script "reverse_text_by_seek3.py" produces the expected result on a UTF-8 encoded text file "Moon-utf8.txt" (several lines of Chinese characters):

$ ./reverse_text_by_seek3.py Moon-utf8.txt
[0, 10, 11, 27, 28, 44, 60, 76, 92]
低头思故乡
举头望明月
疑似地上霜
床前明月光
李白(唐)
静夜思

or

$ ./reverse_text_by_seek3.py Moon-utf8.txt seek
[0, 10, 11, 27, 28, 44, 60, 76, 92]
低头思故乡
举头望明月
疑似地上霜
床前明月光
李白(唐)
静夜思

However, an exception is raised if a file with the same content encoded in GBK is provided:

$ ./reverse_text_by_seek3.py Moon-gbk.txt
[0, 7, 8, 19, 21, 32, 42, 53, 64]
低头思故乡
举头望明月
Traceback (most recent call last):
  File "./reverse_text_by_seek3.py", line 21, in <module>
    print(f.readline(), end="")
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaa in position 8: illegal multibyte sequence

Everything works fine again when a seek operation is applied after each readline call:

$ ./reverse_text_by_seek3.py Moon-gbk.txt seek
[0, 7, 8, 19, 20, 31, 42, 53, 64]
低头思故乡
举头望明月
疑似地上霜
床前明月光
李白(唐)
静夜思

Some of the printed positions also differ between the two GBK runs: 21 and 32 without the extra seek, 20 and 31 with it.

I also wrote a Python 2 counterpart, "reverse_text_by_seek2.py", which decodes the lines when printing instead of when reading. No exception occurs with it.

I am doing this just for fun, not for anything useful. Can anyone reproduce the above results? What is really happening here? Is it a bug?

Other information:

Distribution: Arch Linux
Python3 package: 3.4.3-2 (official)
Python2 package: 2.7.10-1 (official)

$ uname -rvom
4.1.2-2-ARCH #1 SMP PREEMPT Wed Jul 15 08:30:32 UTC 2015 x86_64 GNU/Linux

$ env | grep -e LC -e LANG
LC_ALL=en_US.UTF-8
LC_COLLATE=C
LANG=en_US.UTF-8
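
For readers without the attachments, the general approach (record where each line starts with tell(), then seek() back to those positions in reverse order) can be sketched roughly as below. This is only an illustration, not the attached script: the name reversed_lines and the sample text are made up, and the sketch assumes binary mode with each line decoded afterwards, similar in spirit to the Python 2 version, so that the text decoder's state plays no part in tell()/seek().

```python
import os
import tempfile


def reversed_lines(path, encoding):
    """Return the lines of `path` in reverse order (hypothetical helper)."""
    with open(path, "rb") as f:
        # Record the byte offset at which every line starts.
        offsets = []
        while True:
            pos = f.tell()
            if not f.readline():
                break
            offsets.append(pos)
        # Revisit the recorded offsets from last to first and decode
        # each line only after it has been read as raw bytes.
        lines = []
        for pos in reversed(offsets):
            f.seek(pos)
            lines.append(f.readline().decode(encoding))
    return lines


if __name__ == "__main__":
    # Made-up sample data, written as GBK to a temporary file.
    sample = "静夜思\n\n李白\n"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(sample.encode("gbk"))
    try:
        print("".join(reversed_lines(tmp.name, "gbk")), end="")
    finally:
        os.remove(tmp.name)
```

In binary mode the values returned by tell() are plain byte offsets, so they can be passed back to seek() regardless of the encoding. Splitting on the newline byte is safe for both UTF-8 and GBK, because 0x0A never occurs inside a multibyte sequence in either encoding.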
reverse_text_by_seek3.py
Moon-gbk.txt and Moon-utf8.txt (same text, encoded in GBK and UTF-8 respectively):
静夜思
李白（唐）
床前明月光
疑似地上霜
举头望明月
低头思故乡
reverse_text_by_seek2.py
-- https://mail.python.org/mailman/listinfo/python-list