New submission from Evan:

The changes to shlex due to land in 3.6 use a predefined set of characters to 
"augment" wordchars, but this set is incomplete. For example, 'foo,bar' 
should be parsed as a single token, as the shell does, but it is split on the comma:

$ echo foo,bar
foo,bar

>>> import shlex
>>> list(shlex.shlex('foo,bar', punctuation_chars=True))
['foo', ',', 'bar']

(For context on where this was encountered, see 
https://github.com/kislyuk/argcomplete/issues/161)

Instead of trying to enumerate all possible wordchars, I think a more robust 
solution is to use whitespace_split to include *all* characters not otherwise 
considered special.
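
As a rough illustration of the idea (not the attached patch itself), the 
sketch below compares the two modes using the existing whitespace_split 
attribute. The helper name `tokenize` is hypothetical and exists only for 
this example; it covers only the 'foo,bar' case from this report.

```python
import shlex

def tokenize(text, whitespace_split):
    # Hypothetical helper for illustration; not part of shlex or the patch.
    lexer = shlex.shlex(text, punctuation_chars=True)
    # When True, any character that is not otherwise special stays in the word.
    lexer.whitespace_split = whitespace_split
    return list(lexer)

# Relying on the augmented wordchars: the comma is not in the set, so it splits.
default_tokens = tokenize('foo,bar', whitespace_split=False)

# Relying on whitespace_split: the comma is kept as part of the word.
split_tokens = tokenize('foo,bar', whitespace_split=True)

print(default_tokens)
print(split_tokens)
```

The second call keeps the comma inside the token without anyone having to 
add ',' to a list of word characters, which is the point of the proposal.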

Ideally this would be fixed before 3.6 is released to avoid needing to maintain 
backwards compatibility with the current behaviour, although I understand the 
timeline may make this difficult.

I've attached a patch with proposed changes, including updates to the tests to 
demonstrate the effective difference. I can make the corresponding 
documentation changes if we want this merged.

(I've added everyone to the nosy list from http://bugs.python.org/issue1521950 
where these changes originated.)

----------
components: Library (Lib)
files: without_augmenting_chars.diff
keywords: patch
messages: 279980
nosy: Andrey.Kislyuk, cvrebert, eric.araujo, eric.smith, evan_, ezio.melotti, 
python-dev, r.david.murray, robodan, vinay.sajip
priority: normal
severity: normal
status: open
title: shlex.split should not augment wordchars
type: behavior
versions: Python 3.6
Added file: http://bugs.python.org/file45333/without_augmenting_chars.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28595>
_______________________________________