[issue38663] Untokenize does not round-trip ws before bs-nl

2019-11-03 Thread Terry J. Reedy
Terry J. Reedy added the comment: Since these posts were more or less copied to pydev list, I am copying my response on the list here.
---
> **tl;dr:** Various posts, linked below, discuss a much better replacement for untokenize.
If that were true, I would be interested. But as explained

[issue38663] Untokenize does not round-trip ws before bs-nl

2019-11-03 Thread Edward K Ream
Edward K Ream added the comment: This post: https://groups.google.com/d/msg/leo-editor/DpZ2cMS03WE/5X8IDzpgEAAJ discusses unit testing. The summary states: "I've done the heavy lifting on issue 38663. Python devs should handle the details of testing and packaging." I'll leave it at that. In

[issue38663] Untokenize does not round-trip ws before bs-nl

2019-11-03 Thread Edward K Ream
Edward K Ream added the comment: This post https://groups.google.com/d/msg/leo-editor/DpZ2cMS03WE/VPqtB9lTEAAJ discusses a complete rewrite of tokenizer.untokenize. To quote from the post: I have "discovered" a spectacular replacement for Untokenizer.untokenize in python's tokenize library m

[issue38663] Untokenize does not round-trip ws before bs-nl

2019-11-03 Thread Edward K Ream
Edward K Ream added the comment: The original bug report used a Leo-only function, g.toUnicode. To fix this, replace:

    result = g.toUnicode(tokenize.untokenize(tokens))

by:

    result_b = tokenize.untokenize(tokens)
    result = result_b.decode('utf-8', 'strict')
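For readers who want to try the check without Leo, the following is a minimal sketch of a round-trip test that uses only the standard library and the two replacement lines given above. The helper name `round_trip` and the two sample strings are assumptions made for illustration; they are modeled on the report's description (whitespace before a backslash-newline) and are not the report's own test table, which is truncated in this archive. The result of the second sample may differ between Python versions, so no expected output is shown.

```python
from io import BytesIO
import tokenize


def round_trip(source):
    """Tokenize `source`, untokenize the tokens, and return the text as str.

    `round_trip` is a hypothetical helper name, not part of the tokenize module.
    """
    readline = BytesIO(source.encode('utf-8')).readline
    tokens = list(tokenize.tokenize(readline))
    # tokenize.tokenize() emits an ENCODING token first, so untokenize()
    # returns bytes here; decode as suggested in the comment above.
    result_b = tokenize.untokenize(tokens)
    return result_b.decode('utf-8', 'strict')


# Illustrative samples (assumed, not copied from the report):
# the second one has a space before the backslash-newline.
samples = (
    'print\\\n("abc")\n',
    'print \\\n("abc")\n',
)

for source in samples:
    result = round_trip(source)
    status = 'ok' if result == source else 'FAIL'
    print(status, repr(source), '->', repr(result))
```

Comparing `repr()` values makes a lost space before the backslash visible, which is the symptom the report describes.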

[issue38663] Untokenize does not round-trip ws before bs-nl

2019-11-01 Thread Edward K Ream
New submission from Edward K Ream: Tested on 3.6. tokenize.untokenize does not round-trip whitespace before backslash-newlines outside of strings: from io import BytesIO import tokenize # Round tripping fails on the second string. table = ( r''' print\ ("abc") ''', r''' print \ ("ab