I'm still shaky on some of sre's syntax. Here's the task: I've got strings (never longer than about a dozen characters) that are guaranteed to be made only of characters 'x' and '/'. In each string I want to find the longest continuous stretch of pairs whose first character is 'x' and the second is either mark. So if marks = '/xx/xxx///', the "winning" stretch begins at position 2 and is 6 characters long ('x/xxx/'), which requires finding a second match that overlaps the first match (which is just 'xx' in position 1). (When there are multiple "winning" stretches, I might want to adjudicate among them, but that's a separate problem.) I hope this is clear enough.

Here's the best I've come up with so far:

pat = sre.compile('(x[x/])+')
(longest, startlongest) = max([(fnd.end()-fnd.start(), fnd.start()) for i in range(len(marks))
for fnd in pat.finditer(marks,i)])


It seems to work, but it's not very smart. Given the string '//////xx//', it returns a list whose first seven tuples are all '(2,6)', followed by a single '(2,7)', which gets returned because of max's secondary attention to the second element in the tuple. In other words this code searches from position 0 and finds (2,6), then searches from position 1 and finds (2,6) . . .

Obviously it ought to adjust the second argument to pat.finditer according to what it finds, instead of blindly stepping. (By the way, I can't give 'range' a stepsize of 2 because the "winning" stretch could begin with an odd-numbered character.) That sounds like a job for a generator. But (1) I haven't been able to work out the details (I'm even shakier on generators than I am on regexes!) and (2) that *seems* likely to be less efficient, maybe by a large enough factor that the obnoxiousness of getting the first search-return seven times is something I should just swallow.

What magic am I missing?

Charles Hartman
Professor of English, Poet in Residence
http://cherry.conncoll.edu/cohar
http://villex.blogspot.com

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to