On Wed, 28 May 2008, Don Saklad wrote:
> a. What are a dozen or so of the most frequently used strings of characters in spam messages?... like rolex, maxgain, ...?
Define "string." If you mean "word," then here are the 12 most common words in the TREC 2005 corpus, with the number of times they appear:

enron     94799
message   38187
subject   34751
please    31261
company   31257
original  29529
energy    28476
would     28449
power     23643
about     20734
which     19533
there     16392

The data's a little old, but it's sufficient to make the point of why SpamAssassin doesn't just do naive word matching (and why you shouldn't, either).

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
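
For anyone who wants to run this kind of count on their own mail, here is a minimal sketch in Python. It is not the script used for the numbers above. The function name top_words, the lowercase-letters-only tokenizer, and the min_len cutoff are all my own assumptions; the post does not say how the words were split or filtered.

```python
import re
from collections import Counter


def top_words(messages, n=12, min_len=5):
    """Return the n most common words across an iterable of message strings.

    Assumptions (not from the original post): words are runs of ASCII
    letters, case is ignored, and words shorter than min_len are skipped.
    """
    counts = Counter()
    for text in messages:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if len(w) >= min_len)
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical toy input; a real run would read message bodies from disk.
    sample = [
        "Please buy a rolex, please",
        "Enron energy message about rolex",
    ]
    for word, count in top_words(sample, n=5):
        print(f"{word:10} {count}")
```

Run over a whole corpus, a count like this tends to surface ordinary vocabulary rather than anything that marks a message as spam, which is the point being made above.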