New submission from STINNER Victor <[email protected]>:
In Python 3, the following pattern is becoming common:
    with open(fullname, 'rb') as fp:
        coding, line = tokenize.detect_encoding(fp.readline)
    with open(fullname, 'r', encoding=coding) as fp:
        ...
The file is opened twice, which is unnecessary: it's possible to
reuse the raw buffer to create a text file. And I don't like the
detect_encoding() API: passing the readline function is not intuitive.
I propose to create a tokenize.open_python() function with a very simple API:
just one argument, the filename. This function calls detect_encoding() and only
opens the file once.
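
To illustrate the idea, here is a minimal sketch of how such a function could
reuse the raw buffer. It is not the code from the attached patch, which is not
reproduced in this message. It only assumes the standard io.TextIOWrapper and
tokenize.detect_encoding() APIs, and the body shown is one possible way to do it.

```python
import io
import tokenize


def open_python(filename):
    """Sketch only: open a Python source file read-only, in text mode,
    using the encoding found by tokenize.detect_encoding()."""
    # Open the file once, in binary mode.
    buffer = open(filename, 'rb')
    try:
        # detect_encoding() consumes at most two lines to find a BOM
        # or a coding cookie; it falls back to UTF-8 otherwise.
        encoding, _lines = tokenize.detect_encoding(buffer.readline)
        # Rewind so the text layer starts from the beginning of the file.
        buffer.seek(0)
        # Wrap the same binary buffer instead of opening the file again.
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except BaseException:
        # Do not leak the file object if detection or wrapping fails.
        buffer.close()
        raise
```

With something like this, the two-step pattern above would shrink to a single
"with open_python(fullname) as fp:" block.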
The attached patch adds the function with a unit test and a patch to the
documentation. It also patches the functions currently using detect_encoding().
open_python() only supports read mode. I suppose that it is enough.
----------
components: Library (Lib), Unicode
files: open_python.patch
keywords: patch
messages: 120600
nosy: haypo
priority: normal
severity: normal
status: open
title: tokenize.open_python(): open a Python file with the right encoding
versions: Python 3.2
Added file: http://bugs.python.org/file19518/open_python.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10335>
_______________________________________