> Say I have some string that begins with an arbitrary
> sequence of characters and then alternates repeating the
> letters 'a' and 'b' any number of times, e.g.
> "xyz123aaabbaabbbbababbbbaaabb"
>
> I'm looking for a regular expression that matches the
> first, and only the first, sequence of the letter 'a', and
> only if the length of the sequence is exactly 3.
>
> Does such a regular expression exist? If so, any ideas as
> to what it could be?
>
I'm not quite sure what your intent is here, as the resulting
match would obviously be "aaa", of length 3. If you mean that
you want to test a number of strings and find only those where
"aaa" is the first run of "a" on the line, you might try
something like

    import re

    listOfStringsToTest = [
        'helloworld',
        'xyz123aaabbaabababbab',
        'cantalopeaaabababa',
        'baabbbaaabbbbb',
        'xyzaa123aaabbabbabababaa']

    # skip everything that isn't an "a", then require exactly
    # three "a"s followed by at least one "b"
    r = re.compile("[^a]*(a{3})b+(a+b+)*")
    matches = [s for s in listOfStringsToTest if r.match(s)]
    print(repr(matches))

If you just want the *first* triad of "aaa", you can change the
regexp to

    r = re.compile(".*?(a{3})b+(a+b+)*")

With a little more detail as to the gist of the problem, perhaps
a better solution can be found. In particular, are there items
in listOfStringsToTest that should be found but aren't with
either of the regexps?

-tkc
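
For comparison, here is a small sketch that runs both regexps from the reply above over the same test strings. The extra string 'xyz123aaaabb' and the names strict, loose and matching are assumptions added for illustration, not part of the original post. It suggests that the second regexp also accepts a run of four "a"s, because ".*?" can absorb the leading "a", so it may not be what is wanted if "exactly 3" is a hard requirement.

```python
import re

# Both patterns are copied from the reply above.
strict = re.compile(r"[^a]*(a{3})b+(a+b+)*")   # first run of "a" must be exactly 3 long
loose = re.compile(r".*?(a{3})b+(a+b+)*")      # first "aaa" followed by "b", anywhere

samples = [
    'helloworld',
    'xyz123aaabbaabababbab',
    'cantalopeaaabababa',
    'baabbbaaabbbbb',
    'xyzaa123aaabbabbabababaa',
    'xyz123aaaabb',   # hypothetical extra case: the first run has four "a"s
]


def matching(pattern, strings):
    """Return the strings that the pattern matches from the start."""
    return [s for s in strings if pattern.match(s)]


strict_hits = matching(strict, samples)
loose_hits = matching(loose, samples)

print(strict_hits)
print(loose_hits)

# Group 1 holds the "aaa" itself; start(1) is its offset in the string.
m = strict.match('xyz123aaabbaabababbab')
print(m.group(1), m.start(1))
```

Only the first regexp rejects 'xyz123aaaabb'; whether that is the desired behaviour depends on the details asked about above.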