A brute-force pyparsing approach: define an alternation of all possible Words, each made up of a single repeated letter. A second version picks out just the repeats and reports their locations in the input string:

    from pyparsing import ZeroOrMore, MatchFirst, Word, alphas

    print "group string by character repeats"
    repeats = ZeroOrMore( MatchFirst( [ Word(a) for a in alphas ] ) )
    test = "foo ooobaaazZZ"
    print repeats.parseString(test)
    print

    print "find just the repeated characters"
    repeats = MatchFirst( [ Word(a,min=2) for a in alphas ] )
    test = "foo ooobaaazZZ"
    for toks,loc,endloc in repeats.scanString(test):
        print toks,loc

Gives:

    group string by character repeats
    ['f', 'oo', 'ooo', 'b', 'aaa', 'z', 'ZZ']

    find just the repeated characters
    ['oo'] 1
    ['ooo'] 4
    ['aaa'] 8
    ['ZZ'] 12

(pyparsing implicitly ignores whitespace, that's why there is no ' ' entry in the first list)

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul
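
P.S. For comparison, here is a rough sketch of the same two results without pyparsing, using itertools.groupby from the standard library instead. This is a different technique, not what the pyparsing code does internally, and the function names group_repeats and find_repeats are made up for illustration. It also assumes that anything that is not an ASCII letter (spaces, digits, punctuation) should simply be skipped, whereas the pyparsing grammar above only skips whitespace.

```python
from itertools import groupby
import string


def group_repeats(text):
    """Return runs of identical ASCII letters, skipping every other character."""
    # groupby yields (char, iterator-over-run) for each run of equal characters
    return ["".join(run) for ch, run in groupby(text)
            if ch in string.ascii_letters]


def find_repeats(text, min_len=2):
    """Return (run, start_index) pairs for letter runs of at least min_len."""
    found = []
    pos = 0  # index in text where the current run starts
    for ch, run in groupby(text):
        run = "".join(run)
        if ch in string.ascii_letters and len(run) >= min_len:
            found.append((run, pos))
        pos += len(run)
    return found


test = "foo ooobaaazZZ"
print(group_repeats(test))
print(find_repeats(test))
```

On the test string this should give the same groups and start positions as the pyparsing version; the pyparsing grammar is still the easier one to extend if the rules get more involved.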