New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>:

Many tests use open() with the locale encoding for writing or reading files. 
They pass because the written and read data are ASCII and the file paths are 
ASCII, but they do not test the case of non-ASCII data and file paths. In 
general, most uses of the locale encoding should be changed.

1. In some cases it is enough to open the file in binary mode, for example when 
creating an empty file or using only the fileno of the opened file.

2. In some cases the file should be opened in binary mode, for example when 
compiling the content of the file or parsing it as XML, because the correct 
encoding is determined by the content (BOM, encoding cookie, XML declaration).

3. tokenize.open() or tokenize.detect_encoding() should be used when we read 
Python source as text.

4. os.fsdecode() and os.fsencode() may be used if the test file contains file 
paths and is read by bash or another external program.

5. encoding='ascii' should be specified if the test data is always ASCII-only.

6. encoding='utf-8' should be specified if the test data can contain arbitrary 
Unicode characters.

7. An encoding different from 'ascii', 'latin1' and 'utf-8' should be used if 
arbitrary encodings should be supported.

8. The implicit locale encoding should be used only if the test is intended to 
test the implicit encoding.
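
A few of the cases above (1, 3 and 6) could look roughly like the following. 
This is only a sketch, not code from the patch: the helper name demo() and the 
file names are made up for illustration.

```python
import os
import tempfile
import tokenize


def demo(dirname):
    """Hypothetical helper illustrating cases 1, 3 and 6 from the list."""
    # Case 1: creating an empty file needs no text encoding at all.
    empty = os.path.join(dirname, "empty")
    with open(empty, "wb"):
        pass

    # Case 6: arbitrary Unicode test data, written and read as UTF-8
    # explicitly instead of relying on the locale encoding.
    data_path = os.path.join(dirname, "data.txt")
    with open(data_path, "w", encoding="utf-8") as f:
        f.write("caf\xe9 \u20ac\n")
    with open(data_path, encoding="utf-8") as f:
        text = f.read()

    # Case 3: Python source is read with tokenize.open(), which detects
    # the encoding from the BOM or the coding cookie.
    src_path = os.path.join(dirname, "mod.py")
    with open(src_path, "wb") as f:
        f.write(b"# -*- coding: latin-1 -*-\nname = '\xe9'\n")
    with tokenize.open(src_path) as f:
        source = f.read()

    return os.path.getsize(empty), text, source


with tempfile.TemporaryDirectory() as tmpdir:
    size, text, source = demo(tmpdir)
```

In the last step the source file is written as bytes on purpose, so that the 
declared encoding (latin-1 here) is the one actually stored on disk.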

It is preferable to add non-ASCII characters in the test data.
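
To illustrate why non-ASCII test data matters, here is a minimal sketch. The 
names round_trip() and SAMPLE are hypothetical and only serve the example: the 
same data round-trips with an explicit 'utf-8' but fails with 'ascii', a 
difference that ASCII-only test data would never expose.

```python
import os
import tempfile


def round_trip(text, encoding):
    """Write text to a temporary file and read it back, using the same
    explicit encoding for both operations (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as dirname:
        path = os.path.join(dirname, "data.txt")
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, encoding=encoding) as f:
            return f.read()


# Test data mixing ASCII, Latin-1, BMP and non-BMP characters.
SAMPLE = "ASCII, caf\xe9, \u20ac, \U0001f40d\n"

result = round_trip(SAMPLE, "utf-8")

# With an encoding that cannot represent the data, the write fails.
try:
    round_trip(SAMPLE, "ascii")
except UnicodeEncodeError:
    ascii_failed = True
else:
    ascii_failed = False
```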

I am working on a large patch for this (>50% is ready). Some parts of it may be 
extracted as separate PRs, and the rest will be submitted as a large PR. If 
changes are required not only in tests, separate issues will be opened.

----------
components: Tests
messages: 371994
nosy: inada.naoki, serhiy.storchaka, vstinner
priority: normal
severity: normal
status: open
title: Avoid using the locale encoding for open() in tests
type: enhancement
versions: Python 3.10, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41063>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com