Florin Andrei wrote:
> I'm using SA since... well, a long time ago, and one thing that i
> noticed was a pattern in the way its efficiency varies: it's pretty
> good soon after a new release, then it gets continuously worse; then
> a new release and all of a sudden it's good again, then it starts
> "decaying" again...

I noticed this for several releases up to the 2.4x series; and to a
lesser degree into 2.5x and 2.6x. However, I've reached a fairly stable
state with 2.64 (with the SpamCopURI "plugin"/patch) where I see maybe
two or three spams a week slipping through - at most. I move those
messages to a "missed-spam" folder, and sa-learn that folder manually
every so often.

Bayes and the SURBL checks have *REALLY* made a noticeable difference in
long-term accuracy. If it weren't for SURBL alone, actually, I probably
would have upgraded to 3.x by now. I also happen to maintain a
"local-use-only" DNS zone that I refer to with the SURBL check; but I
haven't added anything to it in several months.

> Well, it's been a while since the last release, and it's already
> noticeably worse. I know this has been discussed before, i am aware
> of the VirusScannerTypeUpdates FAQ entry, but you know what, from an
> end-user's point of view, it does not matter. All that matters is
> that, despite brilliant technical discussions, the efficiency is
> going down and, if a new version is not released soon enough, the
> users start to complain. This is what's happening right now.

This WILL HAPPEN if you rely entirely on static rules - spammers adjust
their tactics to avoid those rules. That's why dynamic rules or systems
such as Bayes and SURBL are so important. The program and rules
themselves don't have to change; just the data source they work with.

Manual feedback is NECESSARY for a well-adjusted Bayes system; without
that feedback there's no way to guarantee that it won't behave
incorrectly on your email stream.

The SA devs could, in theory, release updated rules much more quickly...
but then they'd be spending most of their time maintaining and creating
new rules, then going through the score-balancing process to maximize
spam detection while minimizing FPs across the official ruleset - this
is a much faster process these days, but it's still a week-long process
IIRC.
(As compared to ~6 weeks up until ~2.63 IIRC.)

The most common detail in most other reports like yours (you don't say
much beyond "It's broke. Fix it.") is that spam is hitting BAYES_99...
and nothing else. In 2.6x, this wasn't a problem; BAYES_99 scored over
the threshold of 5 in the default setup, so spam that hit only that rule
was still correctly tagged. With 3.x, the BAYES_nn scores have been
rather reduced, and a number of people have reported good results from
just copying the 2.64 BAYES_nn scores.

> I guess something has to change. "Then change it yourself" type of
> advices will go straight to /dev/null, thank you, because as far as
> SA is concerned, i'm just a user. I am merely pointing out the
> problem.

I'm a little puzzled what you're asking for, then; addon rulesets are
available from SARE, and somewhere there's a tool to automatically check
for updates on those rules. ISP mail administrators should at least be
able to whitelist/blacklist email addresses (or provide a way for users
to do so for themselves), and better ones will have a way for users to
submit missed spam or FPs back to be whitelisted/blacklisted/learned by
Bayes/manually poked for possible local rules.

The core SA development team spends more time developing the code that
dissects the message and pulls out specific parts; with 3.x anyone can
now (more) easily add more complex "rules" that aren't "just" simple
pattern matching but do things like counting occurrences of words or
letters - or more complex checks. Quite a few SA "rules" rely on code
like this; that code *can't* be quickly updated in the same way that the
SARE rulesets (for instance) can.

If you're really not interested in tweaking your SA setup, look into a
mail client with its own spam filter - Netscape/Mozilla of recent
versions have one that's pretty good, Apple Mail is supposedly pretty
good, IIRC KMail has one.

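As a rough illustration of the BAYES_99 point above: a message is tagged
once the sum of the scores of the rules it hit reaches the threshold, so
whether a BAYES_99-only spam gets caught depends entirely on that one
score. This is only a sketch in Python, not SA's own code; the function
names are made up for the example, and the two BAYES_99 scores are
placeholder numbers chosen to sit on either side of 5, NOT the values
shipped with any SA release.

```python
# Illustrative sketch only. BAYES_99 is a real SA rule name, but the
# scores below are hypothetical placeholders, not shipped values.
THRESHOLD = 5.0  # the default threshold of 5 mentioned above


def total_score(hits, scores):
    """Sum the scores of every rule the message hit (unknown rules add 0)."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def is_tagged(hits, scores, threshold=THRESHOLD):
    """A message is tagged as spam once its total reaches the threshold."""
    return total_score(hits, scores) >= threshold


high_bayes = {"BAYES_99": 5.4}  # hypothetical score above the threshold
low_bayes = {"BAYES_99": 3.5}   # hypothetical reduced score

hits = ["BAYES_99"]  # spam that hits BAYES_99 and nothing else

print(is_tagged(hits, high_bayes))  # True: one rule alone clears 5
print(is_tagged(hits, low_bayes))   # False: same spam slips through
```

With the lower score, the same message needs at least one more rule hit
(or a locally raised BAYES_99 score) before it is tagged.
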
But ANY spam filter needs feedback on whether the filter is working
correctly - in the case of a mail program, it's usually a few mouse
clicks compared to the regex tweaking or arcane command line magic
required for SA.

If you're not the administrator of the system running SA on your mail,
talk to the person/organization that is and complain.

-kgd
--
Get your mouse off of there! You don't know where that email has been!