Charles Hartman wrote:
I'm still shaky on some of sre's syntax. Here's the task: I've got strings (never longer than about a dozen characters) that are guaranteed to be made only of characters 'x' and '/'. In each string I want to find the longest continuous stretch of pairs whose first character is 'x' and the second is either mark. So if marks = '/xx/xxx///', the "winning" stretch begins at position 2 and is 6 characters long ('x/xxx/'), which requires finding a second match that overlaps the first match (which is just 'xx' in position 1). (When there are multiple "winning" stretches, I might want to adjudicate among them, but that's a separate problem.) I hope this is clear enough.

Here's the best I've come up with so far:

pat = sre.compile('(x[x/])+')
(longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start()) for i in range(len(marks))
for fnd in pat.finditer(marks,i)])

It's pretty simple to put re.search() into a loop where subsequent searches start from the character after where the previous one matched. Here is a solution that uses a general-purpose longest match function:


import re

# RE solution
def longestMatch(rx, s):
    ''' Find the longest match for rx in s.
        Returns (start, length) for the match or (None, None) if no match found.
    '''

    start = length = current = 0

    while True:
        m = rx.search(s, current)
        if not m:
            break

        mStart, mEnd = m.span()
        current = mStart + 1

        if (mEnd - mStart) > length:
            start = mStart
            length = mEnd - mStart

    if length:
        return start, length

    return None, None


pairsRe = re.compile(r'(x[x/])+')

for s in [ '/xx/xxx///', '//////xx//' ]:
    print s, longestMatch(pairsRe, s)
--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to