New submission from STINNER Victor <victor.stin...@haypocalc.com>:

In Python 3, the following pattern is becoming common:

        with open(fullname, 'rb') as fp:
            coding, line = tokenize.detect_encoding(fp.readline)
        with open(fullname, 'r', encoding=coding) as fp:
            ...

The file is opened twice, which is unnecessary: it is possible to reuse the
raw buffer to create a text file. I also don't like the detect_encoding()
API: passing the readline method is not intuitive.

I propose to create a tokenize.open_python() function with a very simple API:
just one argument, the filename. This function calls detect_encoding() and
opens the file only once.
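
To illustrate the idea, here is a minimal sketch of what such a function
could look like. It is not the attached patch: the body is an assumption
based only on the description above (open in binary mode, detect the
encoding, rewind, then wrap the same buffer in a text layer), and the
line_buffering=True argument is a choice made for this sketch.

```python
import io
import tokenize


def open_python(filename):
    """Open a Python source file in text mode using its declared encoding.

    Hypothetical sketch of the proposed function, not the attached patch.
    """
    buffer = open(filename, 'rb')
    try:
        # detect_encoding() reads at most two lines, looking for a BOM
        # or a coding cookie, and falls back to UTF-8.
        encoding, _ = tokenize.detect_encoding(buffer.readline)
        # Rewind so the text layer sees the file from the beginning.
        buffer.seek(0)
        # Reuse the already opened binary buffer instead of opening
        # the file a second time.
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except BaseException:
        # Do not leak the file object if detection or wrapping fails.
        buffer.close()
        raise
```

With this sketch, the pattern above would shrink to a single
"with open_python(fullname) as fp:" block, and closing fp also closes the
underlying binary buffer.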

The attached patch adds the function with a unit test and a documentation
update. It also patches the functions that currently use detect_encoding().

open_python() only supports read mode. I suppose that it is enough.

----------
components: Library (Lib), Unicode
files: open_python.patch
keywords: patch
messages: 120600
nosy: haypo
priority: normal
severity: normal
status: open
title: tokenize.open_python(): open a Python file with the right encoding
versions: Python 3.2
Added file: http://bugs.python.org/file19518/open_python.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10335>
_______________________________________