On 1/8/21 4:38 PM, Peter Maydell wrote:
> On Fri, 8 Jan 2021 at 15:16, Philippe Mathieu-Daudé <f4...@amsat.org> wrote:
>>
>> When decodetree.py was added in commit 568ae7efae7, QEMU was
>> using Python 2 which happily reads UTF-8 files in text mode.
>> Python 3 requires either UTF-8 locale or an explicit encoding
>> passed to open(). Now that Python 3 is required, explicit
>> UTF-8 encoding for decodetree sources.
>>
>> This fixes:
>>
>>   $ /usr/bin/python3 scripts/decodetree.py test.decode
>>   Traceback (most recent call last):
>>     File "scripts/decodetree.py", line 1397, in <module>
>>       main()
>>     File "scripts/decodetree.py", line 1308, in main
>>       parse_file(f, toppat)
>>     File "scripts/decodetree.py", line 994, in parse_file
>>       for line in f:
>>     File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
>>       return codecs.ascii_decode(input, self.errors)[0]
>>   UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 80:
>>   ordinal not in range(128)
>>
>> Reported-by: Peter Maydell <peter.mayd...@linaro.org>
>> Signed-off-by: Philippe Mathieu-Daudé <f4...@amsat.org>
>> ---
>>  scripts/decodetree.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/scripts/decodetree.py b/scripts/decodetree.py
>> index 47aa9caf6d1..fa40903cff1 100644
>> --- a/scripts/decodetree.py
>> +++ b/scripts/decodetree.py
>> @@ -1304,7 +1304,7 @@ def main():
>>
>>      for filename in args:
>>          input_file = filename
>> -        f = open(filename, 'r')
>> +        f = open(filename, 'r', encoding='utf-8')
>>          parse_file(f, toppat)
>>          f.close()
>
> Should we also be opening the output file explicitly as
> utf-8 ? (How do we say "write to sys.stdout as utf-8" for
> the case where we're doing that?)

I have been wondering about that, but the content written to the
output file is plain C code using only ASCII, which any locale is
able to process. But indeed, maybe we would prefer to ignore the
user's locale... I'm not sure.
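
On the "write to sys.stdout as utf-8" question, one possible approach
is to wrap the binary stream underneath sys.stdout instead of using
the locale-configured text stream. This is only a sketch, not what
decodetree.py does: the helper names utf8_writer() and open_output()
are made up for illustration. On Python 3.7 and later,
sys.stdout.reconfigure(encoding='utf-8') would be a shorter
alternative, but the traceback above comes from Python 3.6.

```python
import io
import sys


def utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8.

    Hypothetical helper, not part of decodetree.py.
    """
    return io.TextIOWrapper(binary_stream, encoding='utf-8')


def open_output(filename=None):
    """Return a UTF-8 text stream for 'filename', or for stdout if None.

    Hypothetical helper, not part of decodetree.py.
    """
    if filename is None:
        # sys.stdout.buffer is the raw binary stream under sys.stdout;
        # wrapping it bypasses whatever encoding the locale selected.
        # Call flush() or detach() on the result rather than close(),
        # since closing the wrapper would also close stdout.
        return utf8_writer(sys.stdout.buffer)
    # For a regular file, pass the encoding to open(), as the patch
    # does for the input file.
    return open(filename, 'w', encoding='utf-8')
```

With something like this, the output path would no longer depend on
the user's locale, regardless of whether the generated C code ever
contains non-ASCII characters.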