>
>> > I've been using 2.54 for a while, and I'm now in the process of
>> > upgrading to 2.55.  I am using the default set of rules.
>> >
>> > Over the past several weeks, I've noticed an increasing amount of
>> > spam that is getting through SpamAssassin with scores in the 4.0-4.9
>> > range. This makes me wonder if perhaps some spammers have started to
>> > tailor their spams as follows: run the default version of
>> > SpamAssassin, feed their messages through it, and keep tweaking the
>> > messages until SpamAssassin lets them through.
>> >
>> > Does anyone else think that this could be possible?  It seems to me
>> > that this process could be easily automated, or at least
>> > semi-automated.
>> > If indeed this is going on, what can we do about it?
>>
>> Train bayes.  Everyone has a different bayes db, and they can't
>> work around that centrally.
>
> The problem I'm seeing is that I'm getting messages with a Bayes
> probability of 90% that still slip through with scores of 4.5-5.
>
> But, keep it in proportion. I'm still trapping over 98%.

I don't think there's anything sinister going on. It's simply that as new
spam techniques evolve, SA falls behind until the next release comes out,
a bit like a virus scanner gradually getting out of date before getting
the latest updates.

I see some spams like those you mention with scores between 3 and 5 (our
default threshold is 7) and from looking at the tests it is apparent that
only "generic" tests are being triggered on such spams, such as the HTML
font size tests, sometimes RBL tests, and sometimes BAYES, but no tests
specific to that kind of spam.
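
To make the arithmetic concrete, here is a minimal sketch of how a few
generic hits add up and still land under the threshold. The rule names and
per-rule scores are made up for illustration and are not taken from any SA
release; the only assumption about SA itself is that a message is flagged
when its total reaches the threshold.

```python
# Hypothetical rule names and scores, chosen only for illustration.
GENERIC_HITS = {
    "HTML_FONT_SIZE": 1.0,   # generic HTML formatting test
    "RBL_LISTED": 1.5,       # generic relay blocklist test
    "BAYES_90": 2.0,         # Bayes classifier says roughly 90% spam
}


def total_score(hits):
    """Sum the scores of every rule that fired on a message."""
    return sum(hits.values())


def is_spam(hits, threshold):
    """Assume a message is flagged when its total reaches the threshold."""
    return total_score(hits) >= threshold
```

With these made-up numbers the three generic hits total 4.5, which stays
under a threshold of 7 (or even 5). A single rule specific to that kind of
spam would be enough to push such a message over.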

The good thing about this game of catch-up, however, is that even if
spammers are modifying their techniques to try to get around SA, they're
slowly painting themselves into a corner with regard to the kinds of things
they can say and put in messages without being detected as spam.

The only kinds of spam that I see as fundamentally problematic for any
kind of scanner like SA are:

* URL-only messages, like some of the current porn ones, that say nothing
suspicious you can reliably trigger on (or say nothing at all) but include
a URL to a dubious site. Perhaps the answer here is to check the URL
against an RBL of websites referenced from spam, rather than having every
copy of SA try to fetch the site and analyze its content.

* Image-only spams (text in the images). This is a very tough one to crack,
and perhaps impossible to solve, although if the images are loaded from an
image server somewhere, the same RBL technique mentioned above could be
applied: spams with image tags pointing to servers listed in that RBL of
"dubious sites" would also score highly.

Perhaps, if no RBL exists of sites referenced in links and image tags from
spam, someone could start one up?
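
As a rough sketch of what such a lookup might look like on the scanner
side: pull the hostnames out of links and image tags, then ask a DNS zone
whether each one is listed. The zone name "uribl.example.org" is a
placeholder (no such list is assumed to exist), and the function names are
invented for this example. It assumes the list would follow the usual DNSBL
convention of answering with an address record for listed names and failing
to resolve for everything else.

```python
import re
import socket
from urllib.parse import urlparse

# Placeholder zone for a hypothetical list of hosts referenced from spam.
URI_BLOCKLIST_ZONE = "uribl.example.org"

# Matches http/https URLs in plain text and inside href="..." / src="...".
URL_PATTERN = re.compile(r"""https?://[^\s"'<>]+""", re.IGNORECASE)


def extract_hosts(body):
    """Return the set of hostnames referenced by URLs in a message body."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed URL, nothing to look up
        if host:
            hosts.add(host.lower())
    return hosts


def query_name(host, zone=URI_BLOCKLIST_ZONE):
    """Build the DNS name to look up: the host prepended to the list zone."""
    return f"{host}.{zone}"


def is_listed(host, zone=URI_BLOCKLIST_ZONE, resolver=socket.gethostbyname):
    """Treat any successful resolution as 'listed', any failure as 'not'."""
    try:
        resolver(query_name(host, zone))
        return True
    except OSError:
        return False


def listed_hosts(body, zone=URI_BLOCKLIST_ZONE, resolver=socket.gethostbyname):
    """Return the hosts in the body that the list knows about, sorted."""
    return sorted(h for h in extract_hosts(body) if is_listed(h, zone, resolver))
```

The resolver is passed in as a parameter so the logic can be tried out
without a real list behind it. A scanner would then add some score for each
listed host, which covers both the URL-only and the image-only cases with
one DNS query per host instead of fetching the site.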

Thoughts, anyone?

Regards,
Simon


