From: "kalin mintchev" <[EMAIL PROTECTED]>

hi all...

using qmail/vpopmail.
so far users have individual .spamassassin under their vpopmail
directories using mailfilter to pipe incoming mail through spamd for each
user. this apparently has the disadvantage of not being able to run per
user sa-learn. and not all users have the mailfilter in place. i was
thinking that if there are two spamds running - one using qmailscanner to
get the vast amount of spam away with a system wide score of, let's say, 5 -
this one will use the system databases that can use sa-learn periodically
with a maildir where untagged spam is being sent to by users; and one
running under vpopmail for individual accounts with different scores,
black and white lists that might contain domains that should not be blocked
system wide.
i hope that a set up like that can make filtering more accurate.
i wake up to about 20 spam messages every day using spamd with rbl and a
score of 3 in my preferences.

is that a viable setup?

For the sake of the person most qualified to help, please add a little
more data. About how many users are there, to the nearest, say, 50%?
What is the average daily mail flow at a rough estimate? These can
alter the recommendations rather dramatically.

Right off the acronym SARE comes to mind. If you do not know what that
means visit http://www.rulesemporium.com/. Visit the "Rules" tab. Read
the descriptions of the rule sets. Install ones that meet the level of
conservative to reckless abandon you need.

As for a score of 3 and ONLY SBL? No, In My Characteristically Arrogant
Opinion, that score is way too low. And you should use several BLs with
scores appropriate to their accuracy.

3 will tag way too many hams as spam in a setup with a well-trained
Bayes.

To get an estimate of your Bayes training, check out this tool from
SARE: http://www.rulesemporium.com/programs/sa-stats.txt

If BAYES_99 is not hitting almost exclusively on spam the database
is hosed. If BAYES_00 is not hitting almost exclusively on ham the
database is poop. You MAY have to junk it and start training over.
In that case I'd cull a bunch of known ham that matches your user
needs and a bunch of known spam, say 500 of each. Then manually run
them through the learn process. In a corporate or home environment
with a proper IT policy keep both the spam and ham samples around for
future retraining if something corrupts the database. For an ISP you
have a little more of a problem getting the initial training done.
You have to respect email privacy there.
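
If sa-stats is not handy, the BAYES_99 / BAYES_00 sanity check described
above can be roughed out by hand. This is only a sketch, not what sa-stats
itself does: it assumes you already have the X-Spam-Status header values
from a pile of known spam and a pile of known ham, and that the header
carries a "tests=" list in the usual SpamAssassin layout. The function
names and the sample headers are made up for illustration.

```python
import re


def rules_hit(status_header):
    """Return the set of rule names from the tests= list of one header value.

    Assumes the usual layout, e.g.
    'Yes, score=12.3 required=5.0 tests=BAYES_99,HTML_MESSAGE autolearn=no'.
    Folded (multi-line) test lists are handled as well.
    """
    match = re.search(r"tests=(.*?)(?=\s+\w+=|$)", status_header, re.S)
    if not match:
        return set()
    names = re.split(r"[,\s]+", match.group(1))
    return {name for name in names if name and name != "none"}


def bayes_breakdown(spam_headers, ham_headers, rules=("BAYES_00", "BAYES_99")):
    """Count how often each rule fired on known spam versus known ham."""
    report = {}
    for rule in rules:
        on_spam = sum(rule in rules_hit(h) for h in spam_headers)
        on_ham = sum(rule in rules_hit(h) for h in ham_headers)
        total = on_spam + on_ham
        report[rule] = {
            "spam": on_spam,
            "ham": on_ham,
            # Share of this rule's hits that landed on spam; None if it never fired.
            "spam_share": on_spam / total if total else None,
        }
    return report


if __name__ == "__main__":
    # Hypothetical sample headers, for illustration only.
    known_spam = [
        "Yes, score=12.3 required=5.0 tests=BAYES_99,HTML_MESSAGE,\n"
        "\tRCVD_IN_SBL autolearn=no version=3.1.0",
    ]
    known_ham = [
        "No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham",
    ]
    for rule, row in bayes_breakdown(known_spam, known_ham).items():
        print(rule, row)
```

A healthy database should show spam_share close to 1.0 for BAYES_99 and
close to 0.0 for BAYES_00. If the numbers are badly mixed, that is the
signal to retrain.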

{^_^}   Beyond that I cannot go. I don't know QMail other than that
       at my present level of need and knowledge I'd not touch it.
