[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-10 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: The example I gave was the simplest possible to illustrate my point but yes, you are correct, I often match the whole string as I do recursive matches. I do use non-capturing groups but they would not solve the problem I talked about. Anyway, I had

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: Hi, I can still see one piece of functionality I have mentioned missing. Using my first example, even when one uses '^(>(.*))$' one cannot get ['', '>Homo sapiens catenin (cadherin-associated)', ''] as one will
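
A minimal sketch of the behaviour described in this comment. Wrapping the whole pattern in an outer group returns the full match, but the inner group's text is returned as well. The helper `split_keep_match` is hypothetical (it is not part of `re`) and only illustrates one way to get the list the comment asks for, assuming the pattern never produces empty matches.

```python
import re

line = ">Homo sapiens catenin (cadherin-associated)"

# Every capturing group contributes an item, so the nested group
# adds a second copy of the text without the leading '>'.
nested = re.split(r"^(>(.*))$", line)
print(nested)


def split_keep_match(pattern, string):
    """Hypothetical helper: split on pattern, keeping each whole match (group 0)."""
    parts = []
    pos = 0
    for m in re.finditer(pattern, string):
        parts.append(string[pos:m.start()])  # text before this match
        parts.append(m.group(0))             # the whole match, not its groups
        pos = m.end()
    parts.append(string[pos:])               # text after the last match
    return parts


kept = split_keep_match(r"^>(.*)$", line)
print(kept)
```

Here `nested` has four items (two empty strings around the two group texts), while `kept` has three, with the whole matched line in the middle.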

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: I agree that introducing an example like that plus making some slight changes in wording would be a welcome change to the docs to clearly explain the current behaviour. Still, I maintain it would be useful to give users the option I described to allow

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: Marking as open till I get your response. I hope you reconsider. -- resolution: invalid -> status: closed -> open ___ Python tracker <http://bugs.python.org/i

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: Hi R. David Murray, Thanks for your reply. I just explained in my previous message to Matthew that the documentation does actually support my view (i.e. it is an issue according to the documentation). Re. the issue you mentioned (discarding information

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
Tomasz J. Kotarba added the comment: Hi Matthew, Thanks for such a quick reply. I know I can get the > by putting it in grouping parentheses. That's not the issue here. The documentation you quoted says that it splits the string by the occurrences _OF_PATTERN_ and that texts of al
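
For reference, the workaround mentioned in this comment (putting the '>' in grouping parentheses) can be sketched as follows. The input string is a stand-in chosen for illustration, not the one from the report.

```python
import re

# Stand-in input, assumed for illustration only.
header = ">sample header"

# With '>' in its own capturing group, re.split returns it as a
# separate list item next to the text of the second group.
parts = re.split(r"^(>)(.*)$", header)
print(parts)  # ['', '>', 'sample header', '']
```

The '>' is recovered, but as its own element rather than as part of the matched line, which is the distinction the comment draws.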

[issue17668] re.split loses characters matching ungrouped parts of a pattern

2013-04-08 Thread Tomasz J. Kotarba
New submission from Tomasz J. Kotarba: Tested in 2.7 but possibly affects the other versions as well. A real life example (note the first character '>' being lost): >>> import re >>> re.split(r'^>(.*)$', '>Homo sapiens catenin (cadherin-
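
The reported behaviour can be reproduced with a short sketch. The input below is a stand-in, since the string in the original report is truncated in this listing.

```python
import re

# Stand-in input; the string from the original report is cut off above.
header = ">sample header"

# The '>' is matched by the pattern but lies outside the capturing group,
# so re.split treats it as part of the separator and drops it.
parts = re.split(r"^>(.*)$", header)
print(parts)  # ['', 'sample header', '']
```

Only the text of the capturing group is returned between the two empty strings. The leading '>' is consumed by the match and does not appear anywhere in the result.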