> -----Original Message-----
> From: Brad Wilkin
> Sent: Wednesday, December 10, 2003 9:23 AM
[...]
> Has anyone had success writing tests that can catch this sort of
> trickery?  It seems if you could come up with a level of punctuation
> WITHIN words or simply remove common punctuation from the
> subject/body before doing the pattern matching, SA will be able to
> identify these.

I was looking at this and was thinking that counting the ratio
of punctuation to letters might be one way to go. Alternatively,
the punctuation often has only letters on each side, and that is
unusual for punctuation marks like ';'.
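
To make those two ideas concrete, here is a rough Python sketch (not
SpamAssassin code). The function names and the set of "suspicious"
marks are my own assumptions, picked only for illustration:

```python
import re

# Hypothetical set of marks that rarely sit between two letters in
# normal text. Apostrophes and hyphens are left out on purpose.
INTRA_WORD = re.compile(r"(?<=[A-Za-z])[;:_!|](?=[A-Za-z])")


def punctuation_ratio(text):
    """Return punctuation characters divided by letters (0.0 if no letters)."""
    letters = sum(1 for ch in text if ch.isalpha())
    punct = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return punct / letters if letters else 0.0


def intra_word_punctuation(text):
    """Count suspicious marks that have a letter directly on both sides."""
    return len(INTRA_WORD.findall(text))


if __name__ == "__main__":
    sample = ("We do the_work for you. By subrn;itting your infor;mation "
              "across_to hundreds of L;enders, we can_get you the_best "
              "int;erest r;ates around.")
    print(punctuation_ratio(sample))
    print(intra_word_punctuation(sample))
```

On the spam line quoted below, the second function counts nine marks
(five ';' and four '_'), while ordinary prose such as "Hello; world."
scores zero. Any threshold for either number would have to be tuned
against real ham and spam.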

Here's a line from a recent spam, as an example:
  We do the_work for you. By subrn;itting your infor;mation across_to
hundreds of L;enders, we can_get you the_best int;erest r;ates around.

A pattern like the following:
   /([a-z][;][a-z]+.*){5}/i
might get some traction, since it only fires when a ';' sits between
letters at least five times. This has to be run after the HTML is stripped.
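
As a quick sanity check outside SA, the pattern can be tried in Python,
whose regex syntax accepts it unchanged. This is only a sketch: the
sample is joined onto one line by hand, and the second, lazy variant is
my own assumption about how to cut down on backtracking, not something
tested against a corpus.

```python
import re

# The pattern from above, with /i expressed as re.IGNORECASE.
ORIGINAL = re.compile(r"([a-z][;][a-z]+.*){5}", re.IGNORECASE)

# Hypothetical variant: non-capturing group and a lazy ".*?".
LAZY = re.compile(r"(?:[a-z];[a-z].*?){5}", re.IGNORECASE)

SPAM = ("We do the_work for you. By subrn;itting your infor;mation "
        "across_to hundreds of L;enders, we can_get you the_best "
        "int;erest r;ates around.")

# Normal usage: every ';' is followed by a space, never by a letter.
HAM = "First, check the logs; then restart; if that fails; call me; thanks;"


def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL), ("lazy", LAZY)):
        print(name, hits(pattern, SPAM), hits(pattern, HAM))
```

Both forms match the spam line, which has exactly five letter;letter
occurrences, and neither matches the ham line. Note that '.' does not
cross newlines here, so text wrapped across lines would need to be
joined first.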






_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
