Thomas Jollans wrote:
> On Friday 21 September 2007, [EMAIL PROTECTED] wrote:
>> Not specific to Python, but it will be implemented in it... how do I
>> compile a RE to catch everything between two known values? Here's what
>> I've tried (but failed) to accomplish... the knowns here are START and
>> END:
>>
>> data = "asdfasgSTARTpruyerfghdfjENDhfawrgbqfgsfgsdfg"
>> x = re.compile('START.END', re.DOTALL)
>>
>> x.findall(data)
>
> I'm not sure finding a variable number of occurrences can be done with
> re. How about
>
> # data = the string
> strings = []
> for s in data.split('START')[1:]:
>     strings.append(s.split('END')[0])

Nice. I've noticed that since I switched from Perl to Python, I hardly
ever use regular expressions anymore. In Perl, they're so easy to fire
up that they become the first tool out of the toolbox, but when you make
the barrier to access just a tiny bit higher (import re/re.compile) you
start noticing how easy it is to accomplish most of those feats without
regexes, and much more readably, too.
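For what it's worth, here is a minimal sketch putting the two approaches
side by side on the original data. The function names between_split and
between_regex are made up for illustration, and the START(.*?)END pattern
with a non-greedy capture group is my assumption about what a working
regex might look like, not something taken from the posts above.

```python
import re

data = "asdfasgSTARTpruyerfghdfjENDhfawrgbqfgsfgsdfg"


def between_split(text, start="START", end="END"):
    """The split-based loop from the quoted reply, as a list comprehension."""
    # [1:] drops everything before the first start marker; each remaining
    # chunk is then cut at its first end marker.
    return [chunk.split(end)[0] for chunk in text.split(start)[1:]]


def between_regex(text, start="START", end="END"):
    """Hypothetical regex version: non-greedy capture between the markers."""
    pattern = re.compile(re.escape(start) + "(.*?)" + re.escape(end), re.DOTALL)
    # With a single capture group, findall returns only the captured text.
    return pattern.findall(text)


print(between_split(data))
print(between_regex(data))
```

On this input both calls print ['pruyerfghdfj'].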
Of course, it should be noted that the suggested implementations behave
differently, which could also affect the choice of method. If you have
"abcSTARTdefSTARTghiEND", your version will spit out
strings = ['def', 'ghi'], but a regex will spit out
['STARTdefSTARTghiEND'] whether it is greedy or non-greedy, because the
match begins at the first START either way. The difference between
greedy and non-greedy comes with two END tags in a row: the greedy
version runs on to the last END, while the non-greedy version stops at
the first.

Cheers,
Cliff
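P.S. A quick sketch to check the behavior described above. The patterns
START.*END and START.*?END are my assumption of the greedy and non-greedy
regexes being compared, and the second test string with two END tags is
made up for illustration.

```python
import re

greedy = re.compile("START.*END", re.DOTALL)
lazy = re.compile("START.*?END", re.DOTALL)

nested = "abcSTARTdefSTARTghiEND"
# A match always begins at the leftmost START, so both patterns agree here.
greedy_nested = greedy.findall(nested)
lazy_nested = lazy.findall(nested)

two_ends = "abcSTARTdefENDghiEND"  # hypothetical input with two END tags
# Greedy runs on to the last END; non-greedy stops at the first one.
greedy_two = greedy.findall(two_ends)
lazy_two = lazy.findall(two_ends)

print(greedy_nested, lazy_nested)
print(greedy_two, lazy_two)
```

The first line printed shows the same single match twice; the second
shows ['STARTdefENDghiEND'] for the greedy pattern and ['STARTdefEND']
for the non-greedy one.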