Re: RE Help

J. Cliff Dyer Fri, 21 Sep 2007 14:56:15 -0700

Thomas Jollans wrote:
> On Friday 21 September 2007, [EMAIL PROTECTED] wrote:
>   
>> Not specific to Python, but it will be implemented in it... how do I
>> compile a RE to catch everything between two known values? Here's what
>> I've tried (but failed) to accomplish... the knowns here are START and
>> END:
>>
>> data = "asdfasgSTARTpruyerfghdfjENDhfawrgbqfgsfgsdfg"
>> x = re.compile('START.END', re.DOTALL)
>>
>> x.findall(data)
>>     
>
> I'm not sure finding a variable number of occurrences can be done with re. How 
> about
>
> # data = the string
> strings = []
> for s in data.split('START')[1:]:
>     strings.append(s.split('END')[0])
>   
Nice.  I've noticed that since I switched from Perl to Python, I hardly
ever use regular expressions anymore.  In Perl, they're so easy to fire
up that they become the first tool out of the toolbox, but when you make
the barrier to access just a tiny bit higher (import re/re.compile) you
start noticing how easy it is to accomplish most of those feats without
regexes, and much more readably, too.


Of course, it should be noted that the different implementations
suggested behave differently, which could also affect the choice of
method.  If you have "abcSTARTdefSTARTghiEND", your version will spit
out strings = ['def', 'ghi'], but a regex, depending on whether it is
greedy or non-greedy, will either spit out ['STARTdefSTARTghiEND'] or
['STARTghiEND'].

Correction: it will spit out the first one, whether greedy or not,
because either way the match begins at the first START.  The difference
only shows up when more than one END follows a START.
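
To make that concrete, here is a rough sketch comparing the approaches.
The patterns 'START.*END' and 'START.*?END' are my guess at what a
working regex would look like (the original post did not get that far),
and the second test string and all variable names are made up for
illustration:

```python
import re

# The example string from above: two START tags, one END tag.
data = "abcSTARTdefSTARTghiEND"

# Split-based version from the quoted message.
strings = [s.split('END')[0] for s in data.split('START')[1:]]
print(strings)        # ['def', 'ghi']

# Assumed patterns: greedy and non-greedy variants of the same regex.
greedy = re.compile('START.*END', re.DOTALL)
lazy = re.compile('START.*?END', re.DOTALL)

# Both begin matching at the first START, so the results are identical.
greedy_one_end = greedy.findall(data)
lazy_one_end = lazy.findall(data)
print(greedy_one_end)  # ['STARTdefSTARTghiEND']
print(lazy_one_end)    # ['STARTdefSTARTghiEND']

# Hypothetical second string: one START followed by two END tags.
data2 = "abcSTARTdefENDghiEND"
greedy_two_ends = greedy.findall(data2)
lazy_two_ends = lazy.findall(data2)
print(greedy_two_ends)  # ['STARTdefENDghiEND']
print(lazy_two_ends)    # ['STARTdefEND']
```

Note that findall returns the whole match here, tags included, because
the patterns contain no groups.  Wrapping the middle part in parentheses,
as in 'START(.*?)END', should return only the text between the tags.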


Cheers,
Cliff
-- 
http://mail.python.org/mailman/listinfo/python-list
