i gather you are looking to reduce the cost of spam filtering
by cutting out inefficient or costly tests.

my impression (not from measuring, just from looking at spam) is that
the checksum databases are relatively inefficient because spammers
have started to routinely introduce random components into the message
body such as customized greetings or random numbers.
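
to illustrate why a random component hurts an exact-match checksum, here
is a minimal sketch. the template, names and numbers are made up, and
real checksum services may use fuzzier signatures than a plain hash, which
this does not model.

```python
import hashlib


def exact_checksum(body: str) -> str:
    """Checksum of the raw body, as a naive exact-match database might store it."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_spam(name: str, token: str) -> str:
    """Hypothetical spam template with a customized greeting and a random number."""
    return f"Dear {name},\nBuy cheap pills now!\nRef: {token}\n"


# two copies of the "same" spam, differing only in greeting and random number
a = make_spam("alice", "48213")
b = make_spam("bob", "90577")

same_checksum = exact_checksum(a) == exact_checksum(b)
print("same checksum:", same_checksum)
```

a database keyed on the first copy's checksum would not recognize the
second copy, even though a human would call them the same message.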

i believe the database checks have high scores because a hit is highly
definitive (a match almost always means spam), not because the checks
catch a large share of the spam.
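
the difference between "definitive" and "efficient" can be put in numbers.
the counts below are invented purely for illustration (i have not measured
anything), and the function names are my own.

```python
def precision(true_hits: int, false_hits: int) -> float:
    """Fraction of hits that really were spam (how definitive a hit is)."""
    return true_hits / (true_hits + false_hits)


def hit_rate(true_hits: int, total_spam: int) -> float:
    """Fraction of all spam that the test fired on (how much it catches)."""
    return true_hits / total_spam


# made-up counts, for illustration only
total_spam = 1000   # spam messages seen
true_hits = 150     # spam messages the test fired on
false_hits = 1      # legitimate messages the test fired on

p = precision(true_hits, false_hits)
r = hit_rate(true_hits, total_spam)
print(f"precision {p:.3f}, hit rate {r:.3f}")
```

a test like this deserves a high score when it fires, yet it still lets
most spam through, so the score alone says little about whether the
network round trip is worth paying for.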

these tests introduce network latency and a dependency on outside
services, so i haven't used them for high-volume, mission-critical
services.

the same costs apply to a dns-based blocking list.
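
for reference, a dns-based list is queried by reversing the octets of the
client address and looking the result up under the list's zone. a rough
sketch, where the zone name is a placeholder and not a real list:

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name looked up for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """One blocking DNS round trip per check; this is the latency cost."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True   # any A record is taken to mean "listed"
    except socket.gaierror:
        return False  # no such name (or a failed lookup) is treated as "not listed"


# placeholder address and zone; no network access happens on this line
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

every call to `is_listed` waits on a resolver outside the mail server,
which is exactly the latency and dependency i mean.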

of course, i could reduce those costs with local copies, if i thought
they were efficient.

i have a hand-tuned list of domains, ip addresses, and header strings
for postfix, which it uses to turn away email.  refusing mail outright
is an excellent way to avoid processing overhead. (previously 55% of mail
was marked as spam. now only 15% is, and around 5% of that is
subsequently identified and treated as graylisted -- opt-in content
that just happens to closely resemble spam.)
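
the idea, stripped of postfix specifics, looks roughly like the sketch
below. postfix does this with its own access tables and header checks;
this is not its syntax, and every entry and name here is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RejectList:
    """Hand-maintained lists consulted before a message is accepted."""
    domains: set = field(default_factory=set)
    ips: set = field(default_factory=set)
    header_strings: list = field(default_factory=list)

    def should_reject(self, sender_domain: str, client_ip: str, headers: str) -> bool:
        """Return True if any list matches; no scoring, no network lookups."""
        if sender_domain.lower() in self.domains:
            return True
        if client_ip in self.ips:
            return True
        lowered = headers.lower()
        return any(s.lower() in lowered for s in self.header_strings)


# hypothetical entries, for illustration only
rules = RejectList(
    domains={"spam.example"},
    ips={"203.0.113.7"},
    header_strings=["X-Mailer: BulkBlaster"],
)

print(rules.should_reject("spam.example", "198.51.100.1", "Subject: hi\n"))
```

anything rejected at this stage never reaches the content filter, which
is where the savings come from.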

this is admittedly labor-intensive.

On Thu, Jan 09, 2003 at 01:31:51AM +0100, Kai Schaetzl wrote:
> I set up a spamd for testing about a week ago and would like to 
> fine-tune it now. Especially, I want to get rid of unnecessary 
> network tests. Currently, it uses dns bls as with the default config, 
> no checksum tests like razor and dcc.
> Looking at the scores in 50_scores.cf and taking them as some sort of 
> "proof" shows that only a few dns bl tests really give good clues 
> about spam, but the checksum tests show quite high scores (and I 
> interpret this as them being correct in most cases). Going from that 
> I would just stop (most) rbl checking and use a checksum test? But 
> which one? A high score doesn't give any clue of how many checks 
> actually were "successful" (did they find a spam message at all?). It 
> doesn't help if a checksum system or bl system is correct all the 
> time, but has too few data, so many checks simply will fail because 
> it got no clue about the host or the message (yet).
> 
> So, my basic question probably is, is it worth it to use BL or 
> checksum testing at all and if so, which ones?
> Also, how do I disable checks? The comment text in the scores.cf 
> seems to imply that just setting to 0 disables the test. Does it 
> really *disable* the test, so that it's not done at all or just stop 
> it from scoring? Of course, it doesn't make sense to me to carry out 
> a test but then discard it with a scoring of 0. Then I wouldn't carry 
> it out in the first place.
> 
> Thanks!
> 
> 
> Kai
>
> -------------------------------------------------------
> This SF.NET email is sponsored by:
> SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
> http://www.vasoftware.com
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk

