Charles T. Smith wrote:

> What the original snippet does is parse *and consume* a string - actually,
> to avoid maintaining a cursor traverse the string. The perl feature is
> that substitute allows the found pattern to be replaced, but retains the
> group after the expression is complete.

That is too technical for my taste. When is your "paradigm" more useful than
a simple re.finditer(), re.findall(), or re.split()?

>> things = []
>> while some_str != tail:
>>     m = re.match(pattern_str, some_str)
>>     things.append(some_str[:m.end()])
>>     some_str = some_str[m.end():]

If that were common (or even ever occurred) I'd write a helper which avoids
the brittle some_str != tail comparison and exposes the functionality in a
for loop:

import re

class MissingTailError(ValueError):
    pass

class UnparsedRestError(ValueError):
    pass

def shave_off(regex, text, tail=None):
    """
    >>> for s in shave_off(r"[a-z]+ \\d+\\s*",
    ...                    "foo 12 bar 34 baz", tail="baz"):
    ...     s
    'foo 12 '
    'bar 34 '
    """
    # Work out where the parsable part of the text ends.
    if tail is not None:
        if text.endswith(tail):
            end = len(text) - len(tail)
        else:
            raise MissingTailError("%r does not end with %r" % (text, tail))
    else:
        end = len(text)
    start = 0
    r = re.compile(regex)
    # Note: a pattern that can match the empty string never advances start.
    while start != end:
        # Match at the current offset instead of slicing the string.
        m = r.match(text, start, end)
        if m is None:
            raise UnparsedRestError(
                "%r does not match pattern %r" % (text[start:end], r.pattern))
        yield text[m.start():m.end()]
        start = m.end()
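
For comparison, here is a rough sketch of what plain re.finditer() gives you on the same sample input as the docstring. The function name chunks_with_finditer and the "??" garbage string are made up for illustration. As far as I can see, the one thing the match-at-offset loop buys you is that it notices text the pattern cannot parse, whereas finditer() quietly skips over it:

```python
import re

def chunks_with_finditer(regex, text):
    # Hypothetical helper: collect every match that finditer() finds.
    # Anything between matches is skipped without complaint.
    return [m.group() for m in re.finditer(regex, text)]

pattern = r"[a-z]+ \d+\s*"

# Clean input: same chunks as the consuming loop would produce.
print(chunks_with_finditer(pattern, "foo 12 bar 34 baz"))

# Input with unparsable garbage in the middle: finditer() still returns
# the same two chunks, so the "?? " goes unnoticed.
print(chunks_with_finditer(pattern, "foo 12 ?? bar 34 baz"))
```

Both calls print ['foo 12 ', 'bar 34 ']. If silently skipping the unmatched part is acceptable, finditer() is the simpler tool; if it should be an error, a loop that matches at an explicit offset is one way to catch it.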