New submission from STINNER Victor <victor.stin...@haypocalc.com>:

In Python 3, open() uses the locale encoding when a text file is opened without 
an explicit encoding argument. Many functions rely on this implicit locale 
encoding even though it is not the right one. I see at least three cases where 
the encoding should be changed:

 - UTF-8 should be used instead for portability: using the locale encoding is 
a bug in the module
 - ASCII must be used instead: the module doesn't support non-ASCII characters 
(old file formats, old network protocols, some fields of a document, etc.)
 - ASCII can be used instead: it's just a micro-optimization, the ASCII 
encoding is a little bit faster
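
To illustrate the first two cases, here is a minimal sketch (the helper names 
write_text and demo are hypothetical, not part of any patch). It passes the 
encoding explicitly, so the result no longer depends on the locale: UTF-8 
always produces the same bytes, and ASCII rejects non-ASCII text instead of 
silently writing locale-dependent data.

```python
import os
import tempfile


def write_text(path, text, encoding):
    """Write text with an explicit encoding, never the implicit locale one."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


def demo():
    """Return the bytes written as UTF-8 and whether ASCII rejected the text."""
    text = "caf\u00e9"  # contains one non-ASCII character
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Case 1: UTF-8 gives the same bytes whatever the locale is.
        write_text(path, text, "utf-8")
        with open(path, "rb") as f:
            utf8_bytes = f.read()
        # Case 2: ASCII refuses non-ASCII characters with an exception.
        try:
            write_text(path, text, "ascii")
            ascii_rejected = False
        except UnicodeEncodeError:
            ascii_rejected = True
    finally:
        os.remove(path)
    return utf8_bytes, ascii_rejected
```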

To detect the usage of the implicit locale encoding, some functions can be 
monkeypatched:

 - builtins.open, io.open, _pyio.open
 - io.TextIOWrapper, _pyio.TextIOWrapper
 - other functions that use open/TextIOWrapper directly or indirectly may be 
patched to emit the warning earlier

Attached open_hook.patch implements these hooks (hacks?) in the site module: it 
emits a ResourceWarning. Use python -Werror to raise an error if the locale 
encoding is used implicitly. If you really want to use the locale encoding, use 
encoding='locale' to silence the warning.
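
The idea of such a hook can be sketched roughly as follows. This is not the 
content of open_hook.patch, only an illustration under assumptions: the 
wrapper name warn_open and the warning message are hypothetical, and only 
builtins.open is covered (not io.TextIOWrapper or _pyio). The wrapper warns 
when a text-mode call leaves the encoding unspecified and translates 
encoding='locale' into the actual locale encoding.

```python
import builtins
import locale
import warnings

_original_open = builtins.open  # keep a reference to the real open()


def warn_open(file, mode="r", buffering=-1, encoding=None, *args, **kwargs):
    """Call open(), warning if text mode relies on the implicit locale encoding."""
    if "b" not in mode:
        if encoding is None:
            # Implicit locale encoding: report it at the caller's location.
            warnings.warn("open() called without an explicit encoding",
                          ResourceWarning, stacklevel=2)
        elif encoding == "locale":
            # Explicit request for the locale encoding: no warning.
            encoding = locale.getpreferredencoding(False)
    return _original_open(file, mode, buffering, encoding, *args, **kwargs)
```

Assigning builtins.open = warn_open would install the hook for the whole 
process. With warnings turned into errors (python -Werror, or 
warnings.simplefilter("error") in code), the first implicit use then raises 
an exception instead of printing a warning.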

Almost all functions in Python use the implicit locale encoding. For example, 
Python doesn't even start with the patch and -Werror. If you use -Werror, you 
have to patch *all* calls to open()/TextIOWrapper to be able to locate real 
bugs, or the program will stop before hitting the real problems. Each time, you 
have to check what the real expected encoding is, which takes a lot of time.

I started this huge project. I'm using ASCII most of the time (especially in 
Python tests), but I don't know if it's correct. A second step will be required 
to ensure that the functions really don't use/support non-ASCII characters.

I will use this issue for my commits, attach patches, and more generally 
discuss this topic.

----------
components: Unicode
files: open_hook.patch
keywords: patch
messages: 139473
nosy: haypo
priority: normal
severity: normal
status: open
title: open: avoid the locale encoding when possible
versions: Python 3.3
Added file: http://bugs.python.org/file22520/open_hook.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12451>
_______________________________________