On Sat, 6 Dec 2003 22:04:15 +0000 (GMT), Martin Radford <[EMAIL PROTECTED]> writes:
> Hi all,
>
> I don't know how new this trick is, but I've not seen it before -- the
> spammer is using HTML tables to break up the message content. Also,
> most of the interesting words are mis-spelled. It does at least hit
> on HG_HORMONE.

Cute!

I'd say let bayes deal with it. One of the purposes of bayes is to make sure that this sort of mangling can only work once. Tokens like 'Humaan' 'Exprets', 'Hoormone' 'therrappy' 'youungerr' 'follow' 'linnk' 'todaay' 'intenret' 'discuont' 'frree' have just been learned as spam, and it's probably safe to say that they can be pretty much permanently considered to be spam-signs. This one may have gone through, and perhaps the next one too, but not too many more.

It may, though, be useful to encode a few of these manglings directly as rules, for instance 'MISSPELLED_FREE'/'MISSPELLED_DISCOUNT' with a point each.

On the other hand, the use of a table is more disquieting. It's an example of the collage attack on watermarking systems: break a message up into small pieces that a real client reassembles in a fashion that would be difficult for a scanner to reconstruct. With the sort of intelligent client that an HTML engine is, this class of attack on content-checkers may be nearly unblockable.

On the good side, it might, JUST might, kill off HTML email.

Scott
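
To illustrate the 'MISSPELLED_FREE'/'MISSPELLED_DISCOUNT' idea, here is a minimal sketch in Python of how such a check might work. It is not SpamAssassin rule syntax, and the watch list and function names (WATCH_WORDS, squeeze, adjacent_swaps, mangled_target) are made up for the example. It only covers two kinds of mangling seen in the tokens above: stuttered letters ('frree', 'Hoormone') and two neighbouring letters swapped ('discuont').

```python
import re

# Hypothetical watch list; a real rule set would choose its own words.
WATCH_WORDS = {"free", "discount", "hormone", "therapy"}


def squeeze(word):
    """Collapse every run of a repeated character to a single character."""
    return re.sub(r"(.)\1+", r"\1", word.lower())


def adjacent_swaps(word):
    """Yield every string formed by swapping two neighbouring letters."""
    for i in range(len(word) - 1):
        yield word[:i] + word[i + 1] + word[i] + word[i + 2:]


def mangled_target(token):
    """Return the watched word that token appears to mangle, or None."""
    token = token.lower()
    if token in WATCH_WORDS:
        return None  # correctly spelled, so not a mangling
    for word in WATCH_WORDS:
        if squeeze(token) == squeeze(word):
            return word  # stuttered letters, e.g. 'frree'
        if token in adjacent_swaps(word):
            return word  # swapped neighbours, e.g. 'discuont'
    return None
```

A scanner could call mangled_target() on each token and add a point per hit. Comparing squeezed forms is deliberately loose, so it would also flag a shortened spelling such as 'fre'; whether that is acceptable depends on the word list.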