[issue31778] ast.literal_eval supports non-literals in Python 3
New submission from David Bieber:

# Overview

ast.literal_eval supports some non-literals in Python 3. This behavior runs contrary to the function's documentation.

# The Issue

The [documentation](https://docs.python.org/3/library/ast.html#ast.literal_eval) says of the function "It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing."

However, literal_eval is capable of evaluating expressions with certain operators, particularly the operators "+" and "-".

As has been explained previously, the reason for this is to support "complex" literals such as 2+3j. However, this has unintended consequences which I believe to be indicative of a bug. Examples of the unintended consequences include:

`ast.literal_eval('1+1') == 2`

`ast.literal_eval('2017-10-10') == 1997`

I would expect each of these calls to literal_eval to raise a ValueError, as the input string is not a literal. Instead, literal_eval successfully evaluates the input string, in the latter case giving an unexpected result (since the intent of the string is to represent a 21st century date).

This issue arose as a [Python Fire issue](https://github.com/google/python-fire/issues/97), where the behavior of Python Fire was unexpected for inputs such as those described above (1+1 and 2017-10-10) only in Python 3. For context, Python Fire is a CLI library which uses literal_eval as part of its command line argument parsing procedure.

I think the resolution to this issue is to have literal_eval raise a ValueError if the AST of the input represents anything other than a Python literal, as described in the documentation. That is, "The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None." Additional operations, such as the binary operations "+" and "-", unless they explicitly create a complex number, should produce a ValueError.
If that resolution is not the direction we take, I would also appreciate knowing whether there is another built-in approach for determining if a string or AST node represents a literal.

# Reproducing

The following code snippets produce different behaviors in Python 2 and Python 3.

```python
import ast
ast.literal_eval('1+1')
```

```python
import ast
ast.literal_eval('2017-10-10')
```

# References

- The Python Fire issue is here: https://github.com/google/python-fire/issues/97
- Prior discussion of a similar issue: https://bugs.python.org/issue22525
- I think this is where complex number support was originally added: https://bugs.python.org/issue4907
- In this thread, https://bugs.python.org/issue24146, one commenter explains that literal_eval's support for "2+3" is an unintentional side effect of supporting complex literals.

--
messages: 304294
nosy: David Bieber
priority: normal
severity: normal
status: open
title: ast.literal_eval supports non-literals in Python 3
versions: Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8

___
Python tracker <https://bugs.python.org/issue31778>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
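To check how a given interpreter treats the inputs from the report, a small probe along these lines may help. It is only a sketch: `probe_literal_eval` is a hypothetical helper name, not part of the `ast` module, and the outcome for `'1+1'` and `'2017-10-10'` depends on the Python version in use, so the code reports the result rather than assuming one.

```python
import ast


def probe_literal_eval(source):
    """Hypothetical helper: report what ast.literal_eval does with `source`."""
    try:
        return ("ok", ast.literal_eval(source))
    except (ValueError, SyntaxError, TypeError) as exc:
        # Report the exception class name instead of propagating the error.
        return ("error", type(exc).__name__)


# The two inputs from the report, a complex literal, and a quoted date string.
for text in ("1+1", "2017-10-10", "1+1j", "'2017-10-10'"):
    print(repr(text), "->", probe_literal_eval(text))
```

Quoting the date, as in the last input, makes it a string literal, so it does not rely on how "+" and "-" are handled.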
[issue31778] ast.literal_eval supports non-literals in Python 3
Change by David Bieber:

--
type: -> behavior
[issue31778] ast.literal_eval supports non-literals in Python 3
David Bieber added the comment:

# Replies

> Rolling back previous enhancements would break existing code.

I sympathize completely with the need to maintain backward compatibility. And if this is the reason that this issue gets treated only as a documentation issue, rather than as a behavior issue, I can appreciate that.

If that is the case and literal_eval is not going to evaluate only literals, then for my use case I'll need a way to determine from a string whether it represents a literal. I can implement this myself using ast.parse and walking the resulting tree, looking for non-literal AST nodes. Would such an "is_literal" function be more appropriate in the ast module than as a one-off function in Python Fire?

> The key promise that literal_eval makes is that it will not permit arbitrary
> code execution.

I disagree that this is the only key promise, and here's my reasoning. The docstring has two sentences, and each one makes a promise:

1. "Safely evaluate an expression node or a string containing a Python expression."
2. "The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None."

(1) says that evaluation is safe -- this is the key promise that you reference. (2) is also a promise though, that only certain types are allowed. While one could argue that the behavior of the function is not specified for inputs violating those criteria, I think the clearly correct thing to do is to raise a ValueError if the value doesn't meet the criteria. This is what was done in Python 2, where the docstring for literal_eval is these same two sentences (modulo the inclusion of bytes and sets).

It's my opinion that Python 2's behavior better matches the docstring as well as the behavior implied by the function's name.

# Additional observations

1. Python 2 _does_ support parsing complex literals, but does not support treating e.g. '1+1' as a literal.
    ast.literal_eval('1+1j')  # works in both Python 2 and Python 3
    ast.literal_eval('1j+1')  # raises a ValueError in Python 2, returns 1+1j in Python 3
    ast.literal_eval('1+1')  # raises a ValueError in Python 2, returns 2 in Python 3

2. Python 3 supports parsing addition and subtraction at any level of nesting.

    ast.literal_eval('(1, (0, 1+1+1))')  # raises a ValueError in Python 2, returns (1, (0, 3)) in Python 3

In my opinion, Python 2's behavior is correct in these situations since it matches the documentation and only evals literals as defined in the documentation.

# Source

The relevant code in Python 2.7.3 is [here](https://github.com/enthought/Python-2.7.3/blob/69fe0ffd2d85b4002cacae1f28ef2eb0f25e16ae/Lib/ast.py#L67). It explicitly allows NUM +/- COMPLEX, but not even COMPLEX +/- NUM.

The corresponding code for Python 3 is [here](https://github.com/python/cpython/blob/ef611c96eab0ab667ebb43fdf429b319f6d99890/Lib/ast.py#L76). You'll notice it supports adding and subtracting arbitrary numeric types (int, float, complex).

---

Thanks for your replies and for hearing me out on this issue.
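The "is_literal" idea from this comment (parse with ast.parse, then walk the tree looking for non-literal nodes) could be sketched roughly as follows. This is an illustration under stated assumptions, not an existing ast API: `is_literal` and its underscore-prefixed helpers are hypothetical names, the code assumes Python 3.8 or later (where the parser emits `ast.Constant` nodes), and it deliberately mirrors the Python 2 rule described above by accepting only NUM +/- COMPLEX as a binary operation.

```python
import ast


def _is_signed_constant(node, types):
    """True for a constant of one of `types`, optionally prefixed by + or -."""
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
        node = node.operand
    # type(...) rather than isinstance(...) so that True/False do not count as ints.
    return isinstance(node, ast.Constant) and type(node.value) in types


def _is_literal_node(node):
    if isinstance(node, ast.Constant):
        # str, bytes, numbers, booleans and None; Ellipsis is excluded.
        return node.value is not Ellipsis
    if isinstance(node, (ast.Tuple, ast.List, ast.Set)):
        return all(_is_literal_node(element) for element in node.elts)
    if isinstance(node, ast.Dict):
        # A key of None marks `**mapping` unpacking, which is not a literal.
        return all(
            key is not None and _is_literal_node(key) and _is_literal_node(value)
            for key, value in zip(node.keys, node.values)
        )
    if isinstance(node, ast.UnaryOp):
        # Signed numbers such as -3, +2.5 or -1j.
        return _is_signed_constant(node, (int, float, complex))
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub)):
        # Assumption: only NUM +/- COMPLEX counts, as in the Python 2 code.
        return (_is_signed_constant(node.left, (int, float))
                and _is_signed_constant(node.right, (complex,)))
    return False


def is_literal(source):
    """Hypothetical helper: does `source` parse to a plain Python literal?"""
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except (SyntaxError, ValueError):
        return False
    return _is_literal_node(tree.body)
```

With this sketch, `is_literal('1+1j')` is accepted while `is_literal('1+1')`, `is_literal('2017-10-10')` and `is_literal('(1, (0, 1+1+1))')` are rejected. A caller such as a command-line parser could use the check to decide whether to hand an argument to literal_eval or keep it as a plain string.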