[issue17668] re.split loses characters matching ungrouped parts of a pattern

Tomasz J. Kotarba Mon, 08 Apr 2013 21:59:58 -0700

Tomasz J. Kotarba added the comment:

Hi,
I can still see that one piece of functionality I mentioned is missing. Using
my first example, even with '^(>(.*))$' one cannot get ['', '>Homo
sapiens catenin (cadherin-associated)', '']: re.split returns a four-element
list, and one has to deal with its third element (i.e. the match for the inner
group).  The parameter I described before, which would give output similar to
what one gets for groups but for the whole pattern (and only that), would be
very convenient in some scenarios.  One example is a procedure which processes
texts using different regex patterns (unknown at the time of writing the
procedure) that use a variable number of groups, where the pattern as a whole
is also used to perform the split.
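
To make the behaviour described above concrete, here is a minimal sketch using
only the pattern and string from my first example (the variable names are just
for illustration):

```python
import re

header = '>Homo sapiens catenin (cadherin-associated)'

# The pattern has two capturing groups, so re.split inserts the text of
# both groups between the (empty) pieces before and after the match.
parts = re.split('^(>(.*))$', header)

print(len(parts))  # 4
print(parts[1])    # outer group: the whole header line
print(parts[2])    # inner group: the header without the leading '>'
```

The extra element at index 2 is the one a caller has to know about and skip.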
Of course this can be worked around in many different ways but still, as I
said at the start, I believe such a parameter would be useful (and would not
break compatibility).  Another possible solution (i.e. different from the one
I suggested at the start) would be a parameter telling re.split to ignore the
groups (or, going even further, to select which groups to ignore).  Anyway, I
am not a developer of this module, so if you feel it would be too much bother
to add such a parameter just for the sake of convenience then, by all means,
please feel free to disregard my comments and close this report.
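
One of the possible workarounds, as a rough sketch only: split_keep_match is a
hypothetical helper name (nothing like it exists in the re module), and it does
not try to reproduce maxsplit or re.split's exact handling of empty matches.
It walks the matches with re.finditer and keeps only the whole match,
whatever groups the pattern happens to contain:

```python
import re

def split_keep_match(pattern, string, flags=0):
    """Hypothetical helper: split on pattern, keeping only the whole match
    (group 0) between the pieces and ignoring any capturing groups."""
    parts = []
    last_end = 0
    for m in re.finditer(pattern, string, flags):
        parts.append(string[last_end:m.start()])  # text before this match
        parts.append(m.group(0))                  # the match as a whole
        last_end = m.end()
    parts.append(string[last_end:])               # text after the last match
    return parts

header = '>Homo sapiens catenin (cadherin-associated)'
result = split_keep_match('^(>(.*))$', header)
print(result)  # three elements: '', the whole header, ''
```

Because the helper never looks at the groups, the calling procedure does not
need to know how many of them a given pattern uses.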
Cheers,
T
P.S.  It is very late, so I can only hope I have been sane enough to express
my thoughts clearly.  Apologies if not.


----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17668>
_______________________________________
