On Apr 8, 3:40 pm, gry <georgeryo...@gmail.com> wrote:
> > >>> s='555tHe-rain.in#=1234'
> > >>> import re
> > >>> r=re.compile(r'([a-zA-Z]+|\d+|.)')
> > >>> r.findall(s)
> > ['555', 'tHe', '-', 'rain', '.', 'in', '#', '=', '1234']
>
> This is nice and simple and has the invertible property that Patrick
> mentioned above. Thanks much!

Yes, like using split(), this is invertible. But you will see a
difference (and for a given task, you might prefer one way or the
other) if, for example, you put a few consecutive spaces in the middle
of your string: this pattern with findall() will return each space
individually, while split() will return them all together.

You *can* fix up the pattern for findall() so that it has the same
properties as the split(), but it will almost always be a more
complicated pattern than the one for the equivalent split().

Another thing you can do with split(): if you *think* you have a
pattern that fully covers every string you expect to throw at it, but
would like to verify this, you can make use of the fact that split()
returns a string between each pair of matches (and before the first
match and after the last match). So if you expect every character in
your string to be part of a match, you can do something like:

    strings = splitter(s)
    tokens = strings[1::2]
    assert not ''.join(strings[::2])

Regards,
Pat
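
A self-contained sketch of the points above, for anyone who wants to try them. The finder pattern is the one quoted at the top. The splitter pattern is only an assumed split() counterpart (the pattern actually used earlier in the thread may differ), and tokenize and gappy are illustrative names, not anything from the thread.

```python
import re

s = '555tHe-rain.in#=1234'

# Pattern quoted at the top of this message, used with findall().
finder = re.compile(r'([a-zA-Z]+|\d+|.)')

# Assumed split() counterpart (hypothetical, for illustration only).
# Because the whole pattern is one capturing group, split() returns the
# matched tokens at odd indexes and the text between them at even indexes.
splitter = re.compile(r'([a-zA-Z]+|\d+|[^a-zA-Z\d]+)').split


def tokenize(text, split=splitter):
    """Return the tokens, asserting that the pattern covered all of text."""
    strings = split(text)
    leftovers = ''.join(strings[::2])   # text that no match consumed
    assert not leftovers, 'unmatched text: %r' % leftovers
    return strings[1::2]                # the matched tokens


# Consecutive spaces: findall() yields them one by one,
# the split() version yields them as a single token.
spaced = 'the   rain'
print(finder.findall(spaced))   # ['the', ' ', ' ', ' ', 'rain']
print(tokenize(spaced))         # ['the', '   ', 'rain']

# Both approaches are invertible on this input.
assert ''.join(finder.findall(s)) == s
assert ''.join(splitter(s)) == s

# A pattern that does NOT cover everything trips the assertion.
gappy = re.compile(r'([a-zA-Z]+|\d+)').split
try:
    tokenize(s, gappy)
except AssertionError as err:
    print(err)                  # unmatched text: '-.#='
```

One caveat on the findall() pattern: '.' does not match a newline unless re.DOTALL is set, so the round trip through findall() only holds for strings without newlines. The split() result always joins back to the original string, since the gaps are kept as well.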