Andrew Jewett added the comment:
Proposed solution and patch to follow. Please let me know if I am posting it
in the wrong place.
The main problem with shlex is that its interface is inadequate for handling
Unicode. Specifically, it is no longer feasible to provide a list of every
Andrew Jewett added the comment:
Not to get side-tracked, but on a related note, it would be nice if there were a
Python module that defined sets of Unicode characters corresponding to
different categories (similar to the categories listed here:
http://www.fileformat.info/info/unicode
Andrew Jewett added the comment:
> That can be done programmatically using the unicodedata module.
> The regex module (that will hopefully be included in 3.3) is
> also able to match characters that belong to specific categories.
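For illustration, here is a minimal sketch of the programmatic approach mentioned in the quote above. It assumes Python 3 and uses only the standard unicodedata module. The helper name chars_by_category and the choice to stop at the end of the Basic Multilingual Plane are made up for this example and are not part of any existing module.

```python
import sys
import unicodedata
from collections import defaultdict


def chars_by_category(limit=sys.maxunicode + 1):
    """Map each Unicode general category (e.g. 'Lu', 'Zs') to a set of characters.

    Hypothetical helper for illustration; 'limit' is an exclusive upper bound
    on the code points to scan.
    """
    categories = defaultdict(set)
    for codepoint in range(limit):
        ch = chr(codepoint)
        categories[unicodedata.category(ch)].add(ch)
    return categories


# Scan only the Basic Multilingual Plane to keep the example quick.
cats = chars_by_category(0x10000)

# All characters whose general category starts with 'L' (letters).
letters = set().union(*(chars for cat, chars in cats.items() if cat.startswith("L")))
```

Sets built this way could then be used wherever a tokenizer wants an explicit character set, though scanning the full code point range on every start-up may be slower than one would like.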
Ezio: Thanks. (New to me, actually) Is this what
Andrew Jewett added the comment:
After posting that, I noticed that the second example I listed in my previous
post (a language where words contain any non-whitespace, non-parenthesis
character) can now be implemented in the current version of shlex.py by setting
"whitespace_true
Andrew Jewett added the comment:
Alright. I'll think about it a little more and post my suggestion there,
perhaps. Thanks Victor.
--
___
Python tracker
<https://bugs.python.org/i