Re: Negation in regular expressions

George Sakkis Fri, 08 Sep 2006 07:32:07 -0700

Paddy wrote:

> George Sakkis wrote:
> > It's always struck me as odd that you can express negation of a single
> > character in regexps, but not any more complex expression. Is there a
> > general way around this shortcoming ? Here's an example to illustrate a
> > use case:
> >
> > >>> import re
> > # split with '@' as delimiter
> > >>> [g.group() for g in re.finditer('[^@]+', 'This @ is a @ test ')]
> > ['This ', ' is a ', ' test ']
> >
> > Is it possible to use finditer to split the string if the delimiter was
> > more than one char long (say 'XYZ') ? [yes, I'm aware of re.split, but
> > that's not the point; this is just an example. Besides re.split returns
> > a list, not an iterator]
> >
> > George
>
> If you're willing to use groups, then the following will split:
>
> >>> [g.group(1) for g in re.finditer(r'(.+?)(?:@#|$)', 'This @# is a @# test ')]
> ['This ', ' is a ', ' test ']


Nice! This covers the most common case, that is, non-consecutive
delimiters in the middle of the string. There are three edge cases it
doesn't handle: consecutive delimiters, delimiter(s) at the beginning
and delimiter(s) at the end.
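
To make the three edge cases concrete, here is a small sketch that runs
the pattern above on one sample string per case. The helper name
`pieces` and the sample strings are made up for illustration; the
comments show what re.split(r'@#', ...) returns for the same input.

```python
import re

# Paddy's pattern: a non-empty lazy run, followed by the delimiter or end of string.
pattern = re.compile(r'(.+?)(?:@#|$)')

def pieces(s):
    """Hypothetical helper: collect group(1) of every match in s."""
    return [m.group(1) for m in pattern.finditer(s)]

common = pieces('This @# is a @# test ')   # ['This ', ' is a ', ' test ']
consecutive = pieces('a@#@#b')             # ['a', '@#b']   re.split: ['a', '', 'b']
leading = pieces('@#a@#b')                 # ['@#a', 'b']   re.split: ['', 'a', 'b']
trailing = pieces('a@#b@#')                # ['a', 'b']     re.split: ['a', 'b', '']
```

Because `.+?` has to consume at least one character, a delimiter that
sits where a piece should start gets swallowed into that piece, and the
empty pieces that re.split reports are lost.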

The regexp r'(.*?)(?:@#|$)' would match re.split's behavior if it
weren't for the extra empty string it returns at the end:
>>> s = '@# This @# is a @#@# test '
>>> re.split(r'@#', s)
['', ' This ', ' is a ', '', ' test ']
>>> [g.group(1) for g in re.finditer(r'(.*?)(?:@#|$)', s)]
['', ' This ', ' is a ', '', ' test ', '']

Any ideas ?
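
One possible direction, offered only as a sketch: instead of matching
the pieces, match the delimiter with finditer and yield the slices
between consecutive matches. The generator name `isplit` is made up
here, and the sketch assumes the delimiter pattern has no capturing
groups and never matches the empty string.

```python
import re

def isplit(pattern, string):
    """Hypothetical lazy counterpart of re.split(pattern, string).

    Assumes pattern has no capturing groups and never matches
    the empty string.
    """
    start = 0
    for m in re.finditer(pattern, string):
        # Everything between the previous delimiter and this one is a piece.
        yield string[start:m.start()]
        start = m.end()
    # Whatever follows the last delimiter (possibly '') is the final piece.
    yield string[start:]

s = '@# This @# is a @#@# test '
result = list(isplit(r'@#', s))   # ['', ' This ', ' is a ', '', ' test ']
```

This sidesteps the negation problem rather than solving it: the regexp
only ever describes the delimiter, so no "anything but XYZ" expression
is needed, and the result is an iterator rather than a list.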

George

-- 
http://mail.python.org/mailman/listinfo/python-list
