Paddy wrote:
> George Sakkis wrote:
> > It's always struck me as odd that you can express negation of a single
> > character in regexps, but not any more complex expression. Is there a
> > general way around this shortcoming ? Here's an example to illustrate a
> > use case:
> >
> > >>> import re
> > # split with '@' as delimiter
> > >>> [g.group() for g in re.finditer('[^@]+', 'This @ is a @ test ')]
> > ['This ', ' is a ', ' test ']
> >
> > Is it possible to use finditer to split the string if the delimiter was
> > more than one char long (say 'XYZ') ? [yes, I'm aware of re.split, but
> > that's not the point; this is just an example. Besides, re.split returns
> > a list, not an iterator]
> >
> > George
>
> If you're willing to use groups then the following will split
>
> >>> [g.group(1) for g in re.finditer(r'(.+?)(?:@#|$)', 'This @# is a @# test ')]
> ['This ', ' is a ', ' test ']

Nice! This covers the most common case, that is, non-consecutive
delimiters in the middle of the string. There are three edge cases:
consecutive delimiters, delimiter(s) at the beginning and delimiter(s) at
the end. The regexp r'(.*?)(?:@#|$)' would match re.split's behavior if
it weren't for the extra empty string it returns at the end:

>>> s = '@# This @# is a @#@# test '
>>> re.split(r'@#', s)
['', ' This ', ' is a ', '', ' test ']
>>> [g.group(1) for g in re.finditer(r'(.*?)(?:@#|$)', s)]
['', ' This ', ' is a ', '', ' test ', '']

Any ideas ?

George
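
One possible way around the trailing empty string, offered only as a sketch: rather than matching the pieces, let finditer match the delimiter and yield the slices between consecutive matches. The helper name iter_split is made up for this example and is not part of the re module. It assumes the delimiter pattern never matches the empty string and contains no capturing groups; outside those assumptions it is not claimed to agree with re.split.

```python
import re

def iter_split(pattern, string):
    """Lazily yield the pieces of `string` lying between matches of `pattern`.

    Hypothetical helper, not part of `re`. Assumes `pattern` matches only
    non-empty delimiters and has no capturing groups.
    """
    start = 0
    for m in re.finditer(pattern, string):
        # Everything from the end of the previous delimiter up to this one.
        yield string[start:m.start()]
        start = m.end()
    # Whatever follows the last delimiter (possibly the empty string).
    yield string[start:]

s = '@# This @# is a @#@# test '
pieces = list(iter_split(r'@#', s))
print(pieces)
```

On the string s from the example above, this yields the same five pieces as re.split(r'@#', s), covering leading and consecutive delimiters, while still producing them one at a time from a generator.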