-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Daniel,

Friday, July 25, 2003, 1:41:54 PM, you wrote:

DC> Here is a thought:  We could test for n consecutive consonants.  The
DC> more consecutive consonants, the more likely it is to be spam.

Sounds like a good idea.

DC> body      MY_CONSONANT_4  /[^aeiou]{4}/
DC> describe  MY_CONSONANT_4  Body contains 4 consecutive consonants.
DC> score     MY_CONSONANT_4  0.15
DC> Likewise, we can go on with more consonants:
DC> score     MY_CONSONANT_4  0.15
DC> score     MY_CONSONANT_5  0.30
DC> score     MY_CONSONANT_6  0.60
DC> score     MY_CONSONANT_7  1.20
DC> score     MY_CONSONANT_8  2.40

I'm more conservative than you, so I've lowered the scores somewhat, but
otherwise I like the idea. I've also added rules for 5 or more
consecutive vowels.
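
For anyone who wants to experiment, here is a rough sketch of how such
rules might be written. The rule names, the MY_VOWEL_5 rule, and the
scores are illustrative assumptions, not rules taken from this thread.
The sketch uses an explicit consonant class with the /i modifier, since
a negated class such as [^aeiou] also matches spaces, digits,
punctuation and uppercase vowels.

```
# Hypothetical sketch: names, patterns and scores are placeholders.
# Explicit consonant class; "y" is deliberately left out.
body      MY_CONSONANT_4  /[bcdfghjklmnpqrstvwxz]{4}/i
describe  MY_CONSONANT_4  Body contains 4 consecutive consonants
score     MY_CONSONANT_4  0.10

# Companion rule for runs of vowels (assumed form).
body      MY_VOWEL_5      /[aeiou]{5}/i
describe  MY_VOWEL_5      Body contains 5 consecutive vowels
score     MY_VOWEL_5      0.10
```

Ordinary words such as "strength" still contain four consonants in a
row, which is one more reason to keep the scores on the shorter runs
low.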

I know I'll get some matches in ham, siiiiiiigh, but that ham will
usually score well under half my required hits threshold, so I'm not
worried.

One thing I've been meaning to mention concerning these -- while I
usually do manually feed confirmed spam under or just barely over my
required hits threshold into Bayes, I do NOT do so with these spams -- I
don't want to garbage up my tokens database with unique
never-to-be-seen-again tokens.

(On rare occasion I have actually manually edited the mailbox file and
removed this garbage from the spam, then called upon sa-learn, but
usually I haven't had to bother with this extreme action.)

Bob Menschel

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBPyHvP5ebK8E4qh1HEQIWqACfYdX8ARQQ60R67Hfei5nZIglJrosAoPAS
Am5r6+ezP3korqXv3/wbLYnD
=Mhnb
-----END PGP SIGNATURE-----

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk