Having all that whitespace in the 'wrong' spot breaks the idea of splitting words based on their being surrounded by whitespace. So get rid of __all__ whitespace. Then use other logic to find what you want. E.g. if you want the 'word' following the 'word' processor, find the first occurrence of 'processor' in the string (which contains the whole file), then look at each following character one at a time to see if it meets the criteria for being in the next word. E.g. if the following word must be a number and the word after that is not a number, take each successive character until you hit one that's not a digit, and there you have your target word.
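Here is a rough sketch of that idea in Python. The function name word_after, the default marker and the sample string are all made up for illustration, and it assumes the target word consists only of digits and comes immediately after the marker once the whitespace is gone.

```python
def word_after(text, marker="processor"):
    """Return the run of digits that follows the first `marker`, or None.

    All whitespace is removed first, so it does not matter where the
    mangled file happens to break its words. (Hypothetical helper.)
    """
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces back together drops every space, tab and newline.
    squeezed = "".join(text.split())

    start = squeezed.find(marker)
    if start == -1:
        return None  # marker never appears

    # Walk forward one character at a time, collecting digits until
    # the first character that is not a digit.
    pos = start + len(marker)
    digits = []
    while pos < len(squeezed) and squeezed[pos].isdigit():
        digits.append(squeezed[pos])
        pos += 1

    return "".join(digits) or None


# Made-up sample with whitespace in all the 'wrong' spots.
mangled = "pro ces\nsor 12 3 Gen uine"
print(word_after(mangled))
```

If something like a colon sits between the marker and the number, the loop would need to skip it first; that depends on what the real files look like.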
This would be easy in Python but since I don't do RE I couldn't begin to solve it using anything else. I still like the idea of fixing the source of these mangled files. Doug.