On 2020/04/03 22:15:06, hanwenn wrote:
> On 2020/04/03 22:00:02, dak wrote:
> > Is this likely related to the problems in `make check` that James currently
> > experiences?
>
> Yes.
>
> Unfortunately, the default encoding depends on the environment
>
> "
> In text mode, if encoding is not specified the encoding used is platform
> dependent: locale.getpreferredencoding(False) is called to get the
>
> "
>
> this means that -depending on locale settings- you may get ascii or utf-8
> encoding.
>
> I didn't get a problem at first, but if I set encoding='ascii' in the
> open_write_file definition, I also get encoding errors.

It's even weirder than that: Python changed its default in version 3.7. See
also one of my commit messages from January:

commit e0c78a4c710c51e1ea87d2b144c0ae713923a2af
Author: Jonas Hahnfeld <hah...@hahnjo.de>
Date:   Wed Jan 15 16:39:56 2020 +0100

    Issue 5663/1: Use codecs.open() to decode as utf-8

    This is in preparation for Python 3.5 where the default encoding
    depends on the value of the LANG environment variable. As far as
    I can tell, this was changed later on and at least Python 3.7 and
    version 3.8 always default to 'utf-8' on Linux. As I'm proposing
    to make Python 3.5 the required minimum, we can't rely on this and
    need to force 'utf-8' when reading files that could contain Unicode.

So James is likely using Python 3.5 or 3.6, which is why some of us (with
other versions of Python) are not seeing the issue.

As such: LGTM!

Please note that codecs.open() is not needed anymore in Python 3; it was only
needed for compatibility with Python 2.4. We should likely replace all
occurrences with plain open(), as this patch does.

https://codereview.appspot.com/563810043/
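
For illustration, here is a minimal sketch of the behaviour discussed above. It is not taken from the patch: the helper names (write_text_utf8, read_text_utf8, roundtrip, ascii_write_fails) are hypothetical and only meant to show why passing encoding='utf-8' explicitly to plain open() sidesteps the locale-dependent default.

```python
import locale
import os
import tempfile

# What open() falls back to in text mode when no encoding is given.
# The value depends on the locale settings and the Python version.
print("default text encoding:", locale.getpreferredencoding(False))


def write_text_utf8(path, text):
    """Hypothetical helper: write text as UTF-8 regardless of locale."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text_utf8(path):
    """Hypothetical helper: read text as UTF-8 regardless of locale."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def roundtrip(text):
    """Write text to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        write_text_utf8(path, text)
        return read_text_utf8(path)
    finally:
        os.remove(path)


def ascii_write_fails(text):
    """Return True if writing text with encoding='ascii' raises."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding="ascii") as f:
            f.write(text)
        return False
    except UnicodeEncodeError:
        return True
    finally:
        os.remove(path)
```

With an explicit encoding the round trip should not depend on LANG, whereas forcing 'ascii' reproduces the kind of encoding error mentioned in the quoted message as soon as the text contains a non-ASCII character.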