On Thursday 07 March 2002 02:53 am, Matt Sergeant wrote:
> On Thu, 7 Mar 2002, Bart Schaefer wrote:

> > On Thu, 7 Mar 2002, Matt Sergeant wrote:

> > > Yep, I'm seeing this stuff too (though not in huge numbers yet). I'm
> > > going to examine the body rules in a bit more detail, and if it makes
> > > sense, to basically remove all punctuation chars (everything except
> > > whitespace and letters/numbers) from the body tests.

> > I've been wondering about just adding a test that looks for oddly placed
> > punctuation in the midst of words.  Something like the GAPPY_TEXT rule,
> > except that it'd match even if there were only one or two "misplaced"
> > punctuations but require that there be several such words (not
> > necessarily consecutive).

> The trouble with that is it sounds like it might hit false positives quite
> highly. You'd have to have some sort of second test like LINE_OF_YELLING
> does that checks how many times it got hit.

Alright, here's a first pass at it:

-----

body     DOT_HIDING             eval:check_for_dot_hiding()
describe DOT_HIDING             Might try to hide phrases with dots

body     DOT_HIDING_3           eval:check_for_num_dot_hiding_suspects("3")
describe DOT_HIDING_3           Good chance it's dot hiding

body     DOT_HIDING_5           eval:check_for_num_dot_hiding_suspects("5")
describe DOT_HIDING_5           Real good chance it's dot hiding

score    DOT_HIDING             1.0
score    DOT_HIDING_3           2.0
score    DOT_HIDING_5           2.0

------

# Spammers are trying to hide phrases by putting dots in them
sub check_for_dot_hiding {
    my ($self, $body) = @_;

    # Grab all words with a dot in them
    my @suspects = map(/(\S*\w\.\w\S*)/g, @{$body});

    # Get rid of URIs
    @suspects = grep(!/:\/\//, @suspects);

    # Get rid of email addresses
    @suspects = grep(!/\@/, @suspects);

    # Get rid of acronyms
    @suspects = grep(!/^([A-Z]\.)+[A-Z]\.?$/, @suspects);

    # Get rid of numbers, including IP addresses
    @suspects = grep(!/^[\d\.\,]+$/, @suspects);

    my $num_suspects = scalar @suspects;

    $self->{num_dot_hiding_suspects} = $num_suspects;

    return ($num_suspects > 0);
}

sub check_for_num_dot_hiding_suspects {
    my ($self, $body, $threshold) = @_;

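    # Relies on check_for_dot_hiding() having run first; treat a missing
    # count as zero so we don't compare against undef.
    return (($self->{num_dot_hiding_suspects} || 0) > $threshold);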
}
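
For anyone who wants to poke at the filters outside of SpamAssassin, here is a minimal standalone sketch. It assumes nothing beyond core Perl; the helper name dot_hiding_suspects and the sample lines are made up for illustration and are not part of SpamAssassin. It applies the same chain of filters as check_for_dot_hiding() above and returns the surviving words instead of storing a count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): applies the same
# filters as check_for_dot_hiding() to a list of body lines and
# returns the words that survive them.
sub dot_hiding_suspects {
    my @lines = @_;

    # Words containing a dot between two word characters
    my @suspects = map { /(\S*\w\.\w\S*)/g } @lines;

    @suspects = grep { !m{://} } @suspects;                  # URIs
    @suspects = grep { !/\@/ } @suspects;                    # email addresses
    @suspects = grep { !/^([A-Z]\.)+[A-Z]\.?$/ } @suspects;  # acronyms
    @suspects = grep { !/^[\d.,]+$/ } @suspects;             # numbers, IPs

    return @suspects;
}

# Made-up sample lines: one obfuscated, two that should be filtered out
my @sample = (
    'Get your v.i.a.g.r.a and f.r.e.e m.o.n.e.y today',
    'Visit http://example.com/index.html or mail bob@example.com',
    'The U.S.A. office is at 192.168.0.1 and costs 1,000.50',
);

my @found = dot_hiding_suspects(@sample);
print scalar(@found), " suspects: @found\n";
```

On this sample only the three obfuscated words from the first line should survive. Note that with the "more than N" comparison in check_for_num_dot_hiding_suspects, three suspects would trigger DOT_HIDING but not DOT_HIDING_3.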

-- 
Visit http://dmoz.org, the world's   | Give a man a match, and he'll be warm
largest human edited web directory.  | for a minute, but set him on fire, and
                                     | he'll be warm for the rest of his life.
[EMAIL PROTECTED]  ICQ: 132152059 |

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
