New submission from Alexander Schmolck <a.schmolck+...@gmail.com>:

In certain cases a zero-width \Z match that should be replaced isn't.
An example might help:

    re.compile('(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)').subn(lambda m:next('<'+k+'>' for k,v in m.groupdict().items() if v is not None), 'foobar ')

this gives

    ('foobar<trailing_ws>', 1)

I would have expected

    ('foobar<trailing_ws><no_final_newline>', 2)

Contrast this with the following behavior:

    [m.span() for m in re.compile('(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)', re.M).finditer('foobar ')]

gives

    [(6, 7), (7, 7)]

The matches are clearly not overlapping, and the re module docs for sub say "Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.", so I would have expected two replacements.

This seems to be what Perl is doing:

    echo -n 'foobar ' | perl -pe 's/(?m)(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)/<$&>/g'

gives

    foobar< ><>%

----------
components: Regular Expressions
messages: 120499
nosy: Alexander.Schmolck
priority: normal
severity: normal
status: open
title: re.sub[n] doesn't seem to handle /Z replacements correctly in all cases
type: behavior
versions: Python 2.6, Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10328>
_______________________________________
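For anyone wanting to check the behavior on their own interpreter, the comparison in the report can be put into one small script. This is only a sketch: the names PATTERN, label and compare are made up for illustration and are not part of the report, and raw strings are used for the pattern, which should be equivalent to the original. It prints the spans found by finditer next to the result of subn, so a mismatch between the number of matches and the number of replacements is easy to spot. The replacement count may differ between Python versions, so no expected output is given here.

```python
import re

# Same pattern as in the report, written as a raw string with re.M
# passed as a flag instead of the inline (?m).
PATTERN = re.compile(
    r'(?P<trailing_ws>[ \t]+\r*$)|(?P<no_final_newline>(?<=[^\n])\Z)',
    re.M,
)


def label(m):
    # Return the name of the group that took part in the match, in <>.
    # (Hypothetical helper; mirrors the lambda used in the report.)
    return next('<' + k + '>' for k, v in m.groupdict().items() if v is not None)


def compare(text):
    # Collect the match spans from finditer and the result of subn
    # for the same pattern and input, so they can be compared directly.
    spans = [m.span() for m in PATTERN.finditer(text)]
    replaced, count = PATTERN.subn(label, text)
    return spans, replaced, count


spans, replaced, count = compare('foobar ')
print('finditer spans:', spans)
print('subn result:   ', (replaced, count))
if count != len(spans):
    print('subn made fewer replacements than finditer found matches')
```

If the last line is printed, the interpreter shows the behavior described in the report.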