Some friends and I have noticed that some spam arrives as multipart/alternative messages whose parts don't match (the parts are supposed to carry the same content in different formats, such as text/plain vs. text/html). In particular, the text/plain part seems to be gibberish and the text/html part seems to be the ad.
It seems like a test could be done along these lines: convert the non-text parts to plain text, strip whitespace, and then compare the multipart/alternative parts to see whether they're mostly the same. If they differ by more than a certain percentage, the message gets a certain number of points added to its score (perhaps with different percentage thresholds yielding different scores).
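To make the idea concrete, here is a rough sketch of the test in Python (standard library only). It is only an illustration of what I'm describing, not how SA works internally; the function names, the use of difflib for the comparison, and the thresholds and point values are all made up for the example.

```python
import difflib
import email
import re
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def html_to_text(html):
    """Very crude HTML-to-text conversion: keep only the text nodes."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def normalize(text):
    """Remove all whitespace and lowercase, so layout differences don't count."""
    return re.sub(r"\s+", "", text).lower()


def alternative_mismatch(raw_message):
    """Return how much the text/plain and text/html alternatives differ,
    as a fraction from 0.0 (identical) to 1.0 (nothing in common).
    Returns None if no multipart/alternative with both parts is found."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() != "multipart/alternative":
            continue
        texts = {}
        for sub in part.iter_parts():
            ctype = sub.get_content_type()
            if ctype in ("text/plain", "text/html"):
                texts[ctype] = sub.get_content()
        if len(texts) == 2:
            plain = normalize(texts["text/plain"])
            html = normalize(html_to_text(texts["text/html"]))
            matcher = difflib.SequenceMatcher(None, plain, html, autojunk=False)
            return 1.0 - matcher.ratio()
    return None


def mismatch_score(mismatch):
    """Map a mismatch fraction to points. Thresholds and points are invented."""
    if mismatch is None:
        return 0.0
    for threshold, points in ((0.75, 2.0), (0.50, 1.0), (0.25, 0.5)):
        if mismatch >= threshold:
            return points
    return 0.0
```

You would call alternative_mismatch() on the raw message text and feed the result to mismatch_score(). A real test would need to be more forgiving than this, since legitimate mailers often put link URLs, image alt text, or a "view this in a browser" note in only one of the parts.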
Does SA do anything about the general issue (scoring for multipart segments that don't match)?
If so, does SA do it the way I describe, or a different way? (and if different, what does SA do?)