New submission from Brian Skinn <bsk...@alum.mit.edu>:
doctest requires code examples have PS1 as ">>> " and PS2 as "... " -- that is, each is three printed characters, followed by a space:

```
$ cat ell_err.py
import doctest

class Foo:
    """Test docstring.

    >>>print("This is a test sentence.")
    ...a test...

    """

doctest.run_docstring_examples(
    Foo(),
    {},
    optionflags=doctest.ELLIPSIS,
)

$ python3.8 --version
Python 3.8.0a3

$ python3.8 ell_err.py
Traceback (most recent call last):
...
ValueError: line 3 of the docstring for NoName lacks blank after >>>: '    >>>print("This is a test sentence.")'

$ cat ell_print.py
import doctest

class Foo:
    """Test docstring.

    >>> print("This is a test sentence.")
    ...a test...

    """

doctest.run_docstring_examples(
    Foo(),
    {},
    optionflags=doctest.ELLIPSIS,
)

$ python3.8 ell_print.py
Traceback (most recent call last):
...
ValueError: line 4 of the docstring for NoName lacks blank after ...: '    ...a test...'
```

AFAICT, this behavior is consistent across 3.4.10, 3.5.7, 3.6.8, 3.7.3, and 3.8.0a3.

**However**, in this `ell_print.py` above, that "PS2" line isn't actually meant to be a continuation of the 'source' portion of the example; it's meant to be the *output* (the 'want') of the example, with a leading ellipsis to be matched per `doctest.ELLIPSIS` rules.

The regex currently used to look for the 'source' of an example is (https://github.com/python/cpython/blob/4f5a3493b534a95fbb01d593b1ffe320db6b395e/Lib/doctest.py#L583-L586):

```
(?P<source>
    (?:^(?P<indent> [ ]*) >>>    .*)    # PS1 line
    (?:\n           [ ]*  \.\.\. .*)*)  # PS2 lines
\n?
```

Since this pattern is compiled with re.VERBOSE (https://github.com/python/cpython/blob/4f5a3493b534a95fbb01d593b1ffe320db6b395e/Lib/doctest.py#L592), the space-as-fourth-character in PS1/PS2 is not explicitly matched.

I propose changing the regex to:

```
(?P<source>
    (?:^(?P<indent> [ ]*) >>>[ ]    .*)    # PS1 line
    (?:\n           [ ]*  \.\.\.[ ] .*)*)  # PS2 lines
\n?
```

This will then *explicitly* match the trailing space of PS1; it *shouldn't* break any existing doctests, because the parsing code lower down has already been requiring that space to be present in PS1, as shown for `ell_err.py` above.

This will also require an *explicit trailing space* to be present in order for a line starting with three periods to be interpreted as a PS2 line of 'source'; otherwise, it will be treated as part of the 'want'. I made this change in my local user install of 3.8's doctest.py, and it works as I expect on `ell_print.py`, passing the test:

```
$ python3.8 ell_print.py
$
$ cat ell_wrongprint.py
import doctest

class Foo:
    """Test docstring.

    >>> print("This is a test sentence.")
    ...a foo test...

    """

doctest.run_docstring_examples(
    Foo(),
    {},
    optionflags=doctest.ELLIPSIS,
)

$ python3.8 ell_wrongprint.py
**********************************************************************
File "ell_wrongprint.py", line ?, in NoName
Failed example:
    print("This is a test sentence.")
Expected:
    ...a foo test...
Got:
    This is a test sentence.
```

For completeness, the following piece of regex in the 'want' section (https://github.com/python/cpython/blob/4f5a3493b534a95fbb01d593b1ffe320db6b395e/Lib/doctest.py#L589):

```
(?![ ]*>>>)  # Not a line starting with PS1
```

should probably also be changed to:

```
(?![ ]*>>>[ ])  # Not a line starting with PS1
```

I would be happy to put together a PR for this; I would plan to take a ~TDD style approach, implementing a few tests first and then making the regex change.
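
The effect of the proposed change can be sketched without patching doctest itself. The snippet below is only an illustration under stated assumptions: `build_example_re` is a hypothetical helper (not part of doctest) that compiles a simplified copy of the pattern, and the 'want' group around the lookahead quoted above is filled in as an assumption about the rest of the pattern in doctest.py. It compares how the docstring from `ell_print.py` is split into 'source' and 'want' with and without the explicit `[ ]`, then checks the resulting 'want' against the real output using `doctest.OutputChecker` with `doctest.ELLIPSIS`.

```python
import doctest
import re


def build_example_re(require_space):
    """Compile a simplified copy of the doctest example regex.

    require_space=False mirrors the current pattern; require_space=True
    adds the explicit "[ ]" after each prompt, as proposed.
    Hypothetical helper for illustration only, not part of doctest.
    """
    sp = "[ ]" if require_space else ""
    return re.compile(rf'''
        (?P<source>
            (?:^(?P<indent> [ ]*) >>>{sp}    .*)      # PS1 line
            (?:\n           [ ]*  \.\.\.{sp} .*)*)    # PS2 lines
        \n?
        (?P<want> (?:(?![ ]*$)          # Not a blank line
                     (?![ ]*>>>{sp})    # Not a line starting with PS1
                     .+$\n?             # But any other line
                  )*)
        ''', re.MULTILINE | re.VERBOSE)


# Docstring body taken from ell_print.py.
DOCSTRING = '''Test docstring.

    >>> print("This is a test sentence.")
    ...a test...

    '''

current = build_example_re(require_space=False).search(DOCSTRING)
proposed = build_example_re(require_space=True).search(DOCSTRING)

# Current pattern: the "...a test..." line is absorbed into 'source'.
print(repr(current.group("source")), repr(current.group("want")))
# Proposed pattern: the same line ends up in 'want'.
print(repr(proposed.group("source")), repr(proposed.group("want")))

# Compare the extracted 'want' with the actual output under ELLIPSIS rules.
want = proposed.group("want").strip() + "\n"
got = "This is a test sentence.\n"
matches = doctest.OutputChecker().check_output(want, got, doctest.ELLIPSIS)
print(matches)
```

Because a genuine continuation line such as `...       "b")` has a space after the three periods, it should still be captured as 'source' by the proposed pattern; only lines without that space move into 'want'.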
----------
components: Library (Lib)
messages: 340788
nosy: bskinn
priority: normal
severity: normal
status: open
title: Tweak doctest 'example' regex to allow a leading ellipsis in 'want' line
type: enhancement
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36714>
_______________________________________