I was actually thinking of removing the 3-in-a-line restriction, but
splitting the rule into at least two pieces: naughty words, and definite
signs of porn spam.  "fuck" falls in the former category, "cum" in the
latter.  I'll basically check each word's frequency in the corpus and
separate the words at some ratio of spam:nonspam presence.
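
A minimal sketch of that split, in Python rather than as an actual
SpamAssassin rule.  The corpus format (plain lists of message strings),
the function names, the add-one smoothing and the threshold value are
all assumptions made for illustration; none of them come from the rule
set itself.

```python
from collections import Counter


def doc_frequency(messages, words):
    """Count how many messages contain each candidate word (case-insensitive)."""
    counts = Counter()
    for text in messages:
        tokens = set(text.lower().split())
        for w in words:
            if w in tokens:
                counts[w] += 1
    return counts


def split_by_ratio(words, spam, nonspam, threshold=10.0):
    """Split words into (naughty, definite) by their spam:nonspam ratio.

    threshold is a hypothetical cutoff; a real value would have to be
    picked by looking at the corpus.
    """
    spam_hits = doc_frequency(spam, words)
    ham_hits = doc_frequency(nonspam, words)
    naughty, definite = [], []
    for w in words:
        # Per-message rates with add-one smoothing, so a word that never
        # appears in nonspam does not cause a division by zero.
        spam_rate = (spam_hits[w] + 1) / (len(spam) + 1)
        ham_rate = (ham_hits[w] + 1) / (len(nonspam) + 1)
        ratio = spam_rate / ham_rate
        (definite if ratio >= threshold else naughty).append(w)
    return naughty, definite
```

Words landing in the "definite" list would go into the higher-scoring
rule, the rest into the lower-scoring one.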

C

On Thu, 2002-02-07 at 07:12, Shane Williams wrote:
> I was looking at the porn expressions and scoring, and thought of an
> idea to shoot by everybody.
> 
> If I'm reading the PORN_3 rule correctly, you must have three of the
> listed strings within 15 characters of each other, and this scores .7
> if caught.
> 
> Two things seem strange about this.  First, how often would two of
> these strings in close proximity not be pretty spammy?  And if there
> are actually three of them in a row, shouldn't it score higher than .7?
> 
> What I was thinking was to reproduce the rule in full, except change
> the 3 to a 2 and then score the rules as such.  Two of these strings
> in proximity would score slightly lower than 3 in proximity.
> 
> -- 
> Public key #7BBC68D9 at            |                 Shane Williams
> http://pgp.mit.edu/                |
> =----------------------------------+-------------------------------
> All syllogisms contain three lines |              [EMAIL PROTECTED]
> Therefore this is not a syllogism  |   www.gslis.utexas.edu/~shanew
> 
> 
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
> 
> 


