You might want to check www.ix.de/nixspam. Their system calculates a checksum from, among other things, the number of words per line.
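
To give a rough idea of why such a checksum catches near-duplicates, here is a minimal sketch that hashes only the sequence of per-line word counts. This is not NiXSpam's actual algorithm (which uses other inputs as well); the function name line_shape_signature and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


def line_shape_signature(body: str) -> str:
    """Hypothetical fuzzy signature: hash the word count of each non-blank line.

    Two messages with different wording but the same line layout
    produce the same signature.
    """
    # Count whitespace-separated words on every non-blank line.
    counts = [len(line.split()) for line in body.splitlines() if line.strip()]
    # Serialise the counts, e.g. "4,3", and hash that string.
    shape = ",".join(str(n) for n in counts)
    return hashlib.sha256(shape.encode("ascii")).hexdigest()


if __name__ == "__main__":
    a = "Buy cheap pills now\nClick here today\n"
    b = "Get cheap meds now\nVisit us today\n"
    print(line_shape_signature(a) == line_shape_signature(b))
```

A signature this coarse will also collide on unrelated messages that happen to share a layout, which is presumably why the real system combines several features.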
Wolfgang Hamann

>> Dear All,
>>
>> I have spent some time searching for something I assumed existed,
>> which is a method of detecting when large quantities of the same message,
>> or messages with nearly the same content (body & subject) are passing
>> through an MTA within a specified time period. It seems to me this could
>> be a useful way to detect not only spam but other types of problems as
>> well - error conditions, mail bombs, etc. Clearly, one could not block
>> such messages until a certain number of them had already been delivered,
>> but still it could be useful. A whitelist function would obviously be
>> used to allow legitimate traffic, but otherwise a threshold could be set
>> and when enough messages with the same or mostly the same body content
>> are detected, any further could be quarantined or tagged.
>>
>> Just curious if anyone knows of any method of doing this, either
>> with spamassassin or with another tool.
>>
>> Thanks.
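
The threshold-plus-whitelist scheme described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name BulkDetector, its parameters, and the default values are hypothetical, the message signature is assumed to come from some separate fuzzy checksum, and nothing here is tied to spamassassin or any particular MTA.

```python
import time
from collections import defaultdict, deque


class BulkDetector:
    """Hypothetical sliding-window counter keyed by message signature."""

    def __init__(self, threshold=50, window_seconds=600, whitelist=()):
        self.threshold = threshold          # copies allowed per window
        self.window = window_seconds        # length of the time period
        self.whitelist = set(whitelist)     # senders that are never counted
        self.seen = defaultdict(deque)      # signature -> arrival timestamps

    def check(self, signature, sender, now=None):
        """Return "deliver" or "quarantine" for one incoming message."""
        if sender in self.whitelist:
            return "deliver"
        now = time.time() if now is None else now
        times = self.seen[signature]
        # Forget sightings that have fallen out of the time window.
        while times and now - times[0] > self.window:
            times.popleft()
        times.append(now)
        # The first `threshold` copies pass; later ones are held back.
        return "quarantine" if len(times) > self.threshold else "deliver"


if __name__ == "__main__":
    detector = BulkDetector(threshold=3, window_seconds=60)
    for t in range(5):
        print(t, detector.check("sig-1", "someone@example.com", now=t))
```

As the quoted message notes, the first copies up to the threshold are necessarily delivered before anything can be flagged.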