Thank you Bengt Richter and Sybren Stuvel for your comments. My little procedure can be improved in many ways; it was just a quickly written first version (but it can be enough for basic usage).
Bengt Richter:

>good way to prepare for split

Maybe there is a better way: putting in just the accepted letters and accented letters, instead of the symbols that are not accepted.

>I suspect it's not possible to get '' in the list from somestring.split()

You are probably right; the algorithm used is different if you don't give a splitting string:

>>> ' abc a '.split()
['abc', 'a']
>>> '.abc..a.'.split(".")
['', 'abc', '', 'a', '']

>does that beat the try and get versions? I.e., (untested)

Yes.

>countDict[w] = countDict.get(w, 0) + 1

I think the if-else version is the fastest. I tested it a long time ago... You can easily do a speed test to see if I am wrong.

Bear hugs,
bearophile
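
The speed test mentioned above could be sketched roughly as follows. This is only an illustration, not the code from the original thread: the function names, the sample word list and the repeat counts are assumptions made up for the example, and it is written for current Python 3 with the standard timeit module. It times the three counting variants discussed (if-else, dict.get, try/except) on the same input. The ranking may differ between Python versions and between inputs with many or few repeated words, so no expected timings are given.

```python
import timeit

# Hypothetical sample input; any list of words will do.
WORDS = "the quick brown fox jumps over the lazy dog the fox".split()


def count_if_else(words):
    """Count words with an explicit membership test."""
    counts = {}
    for w in words:
        if w in counts:
            counts[w] += 1
        else:
            counts[w] = 1
    return counts


def count_get(words):
    """Count words with dict.get and a default of 0."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts


def count_try(words):
    """Count words by catching KeyError on first occurrence."""
    counts = {}
    for w in words:
        try:
            counts[w] += 1
        except KeyError:
            counts[w] = 1
    return counts


def time_all(words, number=1000):
    """Return {function name: total seconds for `number` runs}."""
    results = {}
    for func in (count_if_else, count_get, count_try):
        # Bind func as a default argument so each lambda keeps its own function.
        results[func.__name__] = timeit.timeit(
            lambda f=func: f(words), number=number
        )
    return results


if __name__ == "__main__":
    # Repeat the sample list to get a less trivial workload.
    for name, seconds in time_all(WORDS * 1000, number=20).items():
        print(f"{name}: {seconds:.4f} s")
```

All three functions return the same dictionary, so only the timings differ. Running the script several times, and with a word list closer to the real text, gives a fairer comparison than a single run.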