I'm still shaky on some of sre's syntax. Here's the task: I've got strings (never longer than about a dozen characters) that are guaranteed to be made only of characters 'x' and '/'. In each string I want to find the longest continuous stretch of pairs whose first character is 'x' and the second is either mark. So if marks = '/xx/xxx///', the "winning" stretch begins at position 2 and is 6 characters long ('x/xxx/'), which requires finding a second match that overlaps the first match (which is just 'xx' in position 1). (When there are multiple "winning" stretches, I might want to adjudicate among them, but that's a separate problem.) I hope this is clear enough.
Here's the best I've come up with so far:
pat = sre.compile('(x[x/])+')
(longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start())
                               for i in range(len(marks))
                               for fnd in pat.finditer(marks, i)])
It's pretty simple to put re.search() into a loop where each subsequent search starts one character after the position where the previous match began, so that overlapping matches are found too. Here is a solution that uses a general-purpose longest-match function:
import re

# RE solution
def longestMatch(rx, s):
    '''
    Find the longest match for rx in s.
    Returns (start, length) for the match or (None, None) if no match found.
    '''
    start = length = current = 0

    while True:
        m = rx.search(s, current)
        if not m:
            break

        mStart, mEnd = m.span()
        # Resume one character past the start of this match so that
        # overlapping matches are also considered.
        current = mStart + 1

        if (mEnd - mStart) > length:
            start = mStart
            length = mEnd - mStart

    if length:
        return start, length

    return None, None

pairsRe = re.compile(r'(x[x/])+')

for s in [ '/xx/xxx///', '//////xx//' ]:
    print(s, longestMatch(pairsRe, s))
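Another way to get at the overlapping matches, offered only as a sketch: wrap the pattern in a lookahead with a capturing group, so that finditer() tries every starting position in a single pass. The name longest_overlapping and the pattern OVERLAP_RE are made up for this example and are not from the posts above. On ties it keeps the earliest stretch.

```python
import re

# Assumption for this sketch: the pair pattern is wrapped in a zero-width
# lookahead. The lookahead consumes nothing, so finditer() moves forward one
# character at a time, and group 1 holds the greedy run starting there.
OVERLAP_RE = re.compile(r'(?=((?:x[x/])+))')

def longest_overlapping(s):
    """Return (start, length) of the longest run of pairs, or (None, None)."""
    best_start, best_len = None, 0
    for m in OVERLAP_RE.finditer(s):
        run_len = len(m.group(1))
        # Strict '>' keeps the earliest run when several have the same length.
        if run_len > best_len:
            best_start, best_len = m.start(1), run_len
    if best_len:
        return best_start, best_len
    return None, None

for s in ['/xx/xxx///', '//////xx//']:
    print(s, longest_overlapping(s))
```

This should agree with longestMatch() on the two sample strings; whether it reads better than the explicit search loop is a matter of taste.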