New submission from William Budd:

import re

pattern = re.compile('<div>(<p>.*?</p>)</div>', flags=re.DOTALL)

----------------------------------------------------------------

# This works as expected in the following case:
print(re.sub(pattern, '\\1', '<div><p>foo</p></div>\n'
                             '<div><p>bar</p>123456789</div>\n'))

# which outputs:
<p>foo</p>
<div><p>bar</p>123456789</div>

----------------------------------------------------------------

# However, it does NOT work as I expect in this case:
print(re.sub(pattern, '\\1', '<div><p>foo</p>123456789</div>\n'
                             '<div><p>bar</p></div>\n'))

# actual output:
<p>foo</p>123456789</div>
<div><p>bar</p>

# expected output:
<div><p>foo</p>123456789</div>
<p>bar</p>

----------------------------------------------------------------

It seems that pattern matching/substitution iterations only go haywire once the matching iteration immediately prior to it turned out not to be a match. Maybe some internal variable is not cleaned up properly in an edge(?) case triggered by the example above?

----------
components: Regular Expressions
messages: 296506
nosy: William Budd, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.sub substitution match group contains wrong value after unmatched pattern was processed
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30720>
_______________________________________
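
The output reported above looks consistent with how a lazy quantifier backtracks, rather than with stale internal state. With re.DOTALL, `.*?` may cross newlines, so when `</div>` fails to match right after the first `</p>`, the engine keeps extending `.*?` until it reaches a `</p>` that is immediately followed by `</div>`. The sketch below inspects the match object to show this. The second pattern (named `stricter` here) is not from the report; it is one possible workaround, assuming the intent is that the group must never run past the first `</p>`.

```python
import re

pattern = re.compile('<div>(<p>.*?</p>)</div>', flags=re.DOTALL)
text = ('<div><p>foo</p>123456789</div>\n'
        '<div><p>bar</p></div>\n')

# Only ONE match is found, and it starts at the first <div>.
m = pattern.search(text)
print(m.span())          # (0, 52): covers both <div> elements
print(repr(m.group(1)))  # '<p>foo</p>123456789</div>\n<div><p>bar</p>'

# Hypothetical workaround (an assumption, not the reporter's code):
# each character consumed inside the group must not begin a '</p>',
# so the group cannot extend beyond the first closing tag.
stricter = re.compile('<div>(<p>(?:(?!</p>).)*</p>)</div>', flags=re.DOTALL)
result = stricter.sub('\\1', text)
print(result)
# <div><p>foo</p>123456789</div>
# <p>bar</p>
```

The negative lookahead `(?!</p>)` is checked before every consumed character, so a failed attempt at the first `<div>` cannot be rescued by consuming text from the following element. The engine then moves on and matches the second `<div>` by itself.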