New submission from Elias Tarhini <elt...@gmail.com>:
I believe I've found a bug in the `re` module -- specifically, in the 3.7+ support for splitting on zero-width patterns.

Compare Java's behavior...

jshell> "1211".split("(?<=(\\d))(?!\\1)(?=\\d)");
$1 ==> String[3] { "1", "2", "11" }

...with Python's:

>>> re.split(r'(?<=(\d))(?!\1)(?=\d)', '1211')
['1', '1', '2', '2', '11']

(The pattern itself is pretty straightforward in design, but regex syntax can cloud things, so to be totally clear: it finds any point that follows a digit and precedes a *different* digit.)

* Tested on 3.7.1 win10 and 3.7.0 linux.

----------
components: Regular Expressions
messages: 338581
nosy: Elias Tarhini, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.split() incorrectly splitting on zero-width pattern
type: behavior
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36397>
_______________________________________
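
A minimal sketch of the expected split, assuming the zero-width matches are simply taken as cut points: it uses `re.finditer` to list where the pattern matches in '1211' (and what group 1 captured there), then slices the string at those positions by hand. The names `matches`, `bounds`, and `pieces` are illustrative only.

```python
import re

pattern = re.compile(r'(?<=(\d))(?!\1)(?=\d)')
text = '1211'

# Each match is zero-width; record its position and the digit captured by group 1.
matches = [(m.start(), m.group(1)) for m in pattern.finditer(text)]

# Cut the string at the match positions by hand, ignoring the captured group.
bounds = [0] + [pos for pos, _ in matches] + [len(text)]
pieces = [text[a:b] for a, b in zip(bounds, bounds[1:])]

print(matches)  # [(1, '1'), (2, '2')]
print(pieces)   # ['1', '2', '11']
```

Comparing `matches` with the `re.split` output above may help narrow down where the extra elements come from; the `re.split` documentation describes how capturing groups in the pattern affect the returned list.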