Thomas Wouters <tho...@python.org> added the comment:

Py_CompileString() in Python 3.9 and later, using the PEG parser, appears to no 
longer honour source encoding cookies. A reduced test case:

    #include "Python.h"
    #include <stdio.h>

    const char *src = (
    "# -*- coding: Latin-1 -*-\n"
    "'''\xc3'''\n");

    int main(int argc, char **argv)
    {
        Py_Initialize();
        PyObject *res = Py_CompileString(src, "some_path", Py_file_input);
        if (res) {
            fprintf(stderr, "Compile succeeded.\n");
            return 0;
        } else {
            fprintf(stderr, "Compile failed.\n");
            PyErr_Print();
            return 1;
        }
    }

Compiling and running the resulting binary with Python 3.8 (or earlier):

    % ./encoding_bug
    Compile succeeded.

With 3.9 and PYTHONOLDPARSER=1:

    % PYTHONOLDPARSER=1 ./encoding_bug
    Compile succeeded.

With 3.9 (without the env var) or 3.10:

    % ./encoding_bug
    Compile failed.
      File "some_path", line 2
        '''�'''
             ^
    SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xc3 in position 0: unexpected end of data

Writing the same bytes to a file and making python3.9 or python3.10 import them 
works fine, as does passing the bytes to compile():

    Python 3.10.0+ (heads/3.10-dirty:7bac598819, Nov 16 2021, 20:35:12) [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> b = open('encoding_bug.py', 'rb').read()
    >>> b
    b"# -*- coding: Latin-1 -*-\n'''\xc3'''\n"
    >>> import encoding_bug
    >>> encoding_bug.__doc__
    'Ã'
    >>> co = compile(b, 'some_path', 'exec')
    >>> co
    <code object <module> at 0x7f447e1b0c90, file "some_path", line 1>
    >>> co.co_consts[0]
    'Ã'
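
The contrast above can be condensed into a short script. This is only a sketch, not a diagnosis: it uses tokenize.detect_encoding() to check that the cookie is well-formed, compile() on bytes to show the expected result, and a bare UTF-8 decode of the 0xC3 byte to reproduce the wording of the error. That the failing path decodes the string body as UTF-8 is an assumption drawn from the error message, not something confirmed here. The names encoding, namespace and utf8_reason are purely illustrative.

```python
import io
import tokenize

# Same bytes as the reduced test case: a Latin-1 cookie followed by a
# docstring holding the single byte 0xC3.
src = b"# -*- coding: Latin-1 -*-\n'''\xc3'''\n"

# The cookie itself is recognised by the standard library; "Latin-1" is
# normalised to "iso-8859-1".
encoding, _ = tokenize.detect_encoding(io.BytesIO(src).readline)

# compile() on bytes honours the cookie, so 0xC3 decodes to U+00C3.
code = compile(src, "some_path", "exec")
namespace = {}
exec(code, namespace)
docstring = namespace["__doc__"]

# Assumption: the failing path treats the string body as UTF-8. Decoding
# the lone byte that way yields the same reason text as the reported error.
try:
    b"\xc3".decode("utf-8")
    utf8_reason = None
except UnicodeDecodeError as exc:
    utf8_reason = exc.reason

print(encoding, repr(docstring), utf8_reason)
```

If the cookie is honoured, docstring is the one-character string U+00C3; utf8_reason carries the "unexpected end of data" text seen in the SyntaxError.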


It's just Py_CompileString() that fails. I don't understand why, and I do 
believe it's a regression.

----------
nosy: +gregory.p.smith

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45822>
_______________________________________